From 33888950698bd96615f4176d01742e8e6c885292 Mon Sep 17 00:00:00 2001
From: Jiakai Li
Date: Wed, 18 Dec 2024 14:57:45 +1300
Subject: [PATCH] Update README.md - docker compose

---
 README.md | 106 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/README.md b/README.md
index 8b0adde..265615a 100644
--- a/README.md
+++ b/README.md
@@ -97,3 +97,109 @@ And finally [bandit](https://github.com/PyCQA/bandit) is used to perform securit
 ```bash
 (.venv) $ python -m bandit -r src/
 ```
+
+## Dockerize Web Application
+
+One good practice is to create and switch to a regular user without administrative privileges as soon as you no longer need them.
+```dockerfile
+RUN useradd --create-home realpython
+USER realpython
+WORKDIR /home/realpython
+```
+
+Another suggested good practice is to use a dedicated virtual environment even within the container, because installing packages globally risks interfering with the container’s own system tools.
+
+>Unfortunately, many Linux distributions rely on the global Python installation to run smoothly. If you start installing packages directly into the global Python environment, then you open the door for potential version conflicts.
+
+The suggestion is to modify the `PATH` environment variable directly:
+```dockerfile
+ENV VIRTUALENV=/home/realpython/venv
+RUN python3 -m venv $VIRTUALENV
+
+# Put $VIRTUALENV/bin before $PATH to prioritize it
+ENV PATH="$VIRTUALENV/bin:$PATH"
+```
+
+The reasons for doing it this way are:
+- Activating your environment in the usual way would only be temporary and wouldn’t affect Docker containers derived from your image.
+- If you activated the virtual environment using Dockerfile’s `RUN` instruction, then it would only last until the next instruction in your Dockerfile because each one starts a new shell session.
+
+The third suggested good practice is to leverage layer caching by installing the dependencies before copying the source code and running the tests:
+```dockerfile
+# Copy dependency files first
+COPY --chown=pagetracker pyproject.toml constraints.txt ./
+RUN python -m pip install --upgrade pip setuptools && \
+    python -m pip install --no-cache-dir -c constraints.txt ".[dev]"
+
+# Copy source files after the cached dependency layer
+COPY --chown=pagetracker src/ src/
+COPY --chown=pagetracker test/ test/
+
+# Run tests (install the project first)
+# Combine the individual commands in one RUN instruction to reduce the number of layers to cache
+RUN python -m pip install . -c constraints.txt && \
+    python -m pytest test/unit/ && \
+    python -m flake8 src/ && \
+    python -m isort src/ --check && \
+    python -m black src/ --check --quiet && \
+    python -m pylint src/ --disable=C0114,C0116,R1705 && \
+    python -m bandit -r src/ --quiet
+```
+
+## Multi-Stage Builds
+```dockerfile
+FROM python:3.11.2-slim-bullseye AS builder
+# ...
+
+# Building a distribution package
+RUN python -m pip wheel --wheel-dir dist/ -c constraints.txt .
+
+FROM python:3.11.2-slim-bullseye AS target
+
+RUN apt-get update && \
+    apt-get upgrade -y
+
+RUN useradd --create-home pagetracker
+USER pagetracker
+WORKDIR /home/pagetracker
+
+ENV VIRTUALENV=/home/pagetracker/venv
+RUN python -m venv $VIRTUALENV
+ENV PATH="$VIRTUALENV/bin:$PATH"
+
+# Copy the distribution package
+COPY --from=builder /home/pagetracker/dist/page_tracker*.whl /home/pagetracker
+
+RUN python -m pip install --upgrade pip setuptools && \
+    python -m pip install --no-cache-dir page_tracker*.whl
+```
+
+## Version Docker Image
+
+Three versioning strategies:
+- **Semantic versioning** uses three numbers delimited with a dot to indicate the major, minor, and patch versions.
+- **Git commit hash** uses the SHA-1 hash of a Git commit tied to the source code in your image. For example:
+  ```bash
+  $ docker build -t page-tracker:$(git rev-parse --short HEAD) .
+  ```
+- **Timestamp** uses temporal information, such as Unix time, to indicate when the image was built.
+
+## Multi-Container Docker Application
+
+Docker Compose coordinates multiple containers so that they run together as one application:
+```yaml
+services:
+  redis:
+    image: "redis:7.0.10-bullseye"
+    # ...
+
+  web:
+    build: ./web
+    # ...
+    command: "gunicorn page_tracker.app:app --bind 0.0.0.0:8000"
+```
+
+The `command` override makes sure that a production-grade web server is used for deployment. The web server bundled with Flask is not production grade for the following reasons ([reference](https://stackoverflow.com/questions/12269537/is-the-server-bundled-with-flask-safe-to-use-in-production)):
+- It will not handle more than one request at a time by default.
+- If you leave debug mode on and an error pops up, it opens up a shell that allows for arbitrary code to be executed on your server (think `os.system('rm -rf /')`).
+- The development server doesn't scale well.
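
The `page_tracker.app:app` argument in the Compose `command` follows gunicorn's `module:variable` convention: gunicorn imports the module and serves the WSGI callable bound to that name. The sketch below is only an illustration of that contract, not the project's code. The `app` function is a hypothetical stand-in (the real project exposes a Flask application object, which is itself a WSGI callable), and `call_once` is a hypothetical helper that invokes the callable in-process, roughly as a server worker would for each request.

```python
"""Hypothetical stand-in for page_tracker/app.py, using only the standard library."""
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI callable; gunicorn would look it up by the name after the colon."""
    body = b"page views placeholder\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def call_once(wsgi_app, path="/"):
    """Call a WSGI app once with a synthetic request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    chunks = wsgi_app(environ, start_response)
    return captured["status"], b"".join(chunks)


if __name__ == "__main__":
    print(call_once(app))
```

Because the server only depends on this callable interface, swapping the Flask development server for gunicorn needs no change to the application module, only the different `command`.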