A few days ago, I created a Docker build for Flask with PostgreSQL (both with Alpine Linux and with Debian Linux).
Installing psycopg2-binary (required for Postgres) can require building the package from source, which means installing a compiler and other build tools.
As a result, the Docker image grows in size, because it still contains those build tools and artifacts.
The solution? Multi-stage Docker builds.
Let’s say we have the following docker-compose.yml file. There are two services: a Flask API called users and a Postgres database called users-db.
version: '3.7'
services:
  users:
    build:
      context: .
      dockerfile: Dockerfile
    entrypoint: ['/usr/src/app/entrypoint.sh']
    volumes:
      - '.:/usr/src/app'
    ports:
      - 5001:5000
    environment:
      - FLASK_ENV=development
      - APP_SETTINGS=project.config.DevelopmentConfig
      - DATABASE_URL=postgresql://postgres:postgres@users-db:5432/users_dev
      - DATABASE_TEST_URL=postgresql://postgres:postgres@users-db:5432/users_test
    depends_on:
      - users-db
  users-db:
    build:
      context: ./project/db
      dockerfile: Dockerfile
    expose:
      - 5432
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
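The environment section is how the users container learns which settings class to load and where the database lives. The original project.config module is not shown, so the following is only a sketch of how such a module might read those variables. The class layout is an assumption; SQLALCHEMY_DATABASE_URI and SQLALCHEMY_TRACK_MODIFICATIONS are the setting names Flask-SQLAlchemy reads.

```python
import os


class BaseConfig:
    """Settings shared by every environment (hypothetical layout)."""

    TESTING = False
    SQLALCHEMY_TRACK_MODIFICATIONS = False


class DevelopmentConfig(BaseConfig):
    """Would be selected by APP_SETTINGS=project.config.DevelopmentConfig."""

    # Read once, when the module is imported inside the container.
    SQLALCHEMY_DATABASE_URI = os.environ.get("DATABASE_URL")


class TestingConfig(BaseConfig):
    """Points the test suite at the separate users_test database."""

    TESTING = True
    SQLALCHEMY_DATABASE_URI = os.environ.get("DATABASE_TEST_URL")
```

Note that the host in both URLs is users-db, the Compose service name, which resolves to the database container on the Compose network.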
We have a requirements.in file with the following dependencies:
Flask==1.1.1
Flask-RESTful==0.3.7
Flask-SQLAlchemy==2.4.0
psycopg2-binary==2.8.3
pytest==5.0.1
We’ll need psycopg2-binary, the PostgreSQL driver for Python, to connect to the database.
Let’s create a multi-stage Dockerfile that:
- installs the build tools and builds the required Python packages into a virtual environment in the first stage (compile image)
- starts a clean second stage that copies the virtual environment over and installs only the run-time dependencies (run-time image)
## base image
FROM python:3.7.5-slim-buster AS compile-image
## install dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends gcc
## virtualenv
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
## add and install requirements
RUN pip install --upgrade pip && pip install pip-tools
COPY ./requirements.in .
RUN pip-compile requirements.in > requirements.txt && pip-sync
RUN pip install -r requirements.txt
## runtime image
FROM python:3.7.5-slim-buster AS runtime-image
## install nc
RUN apt-get update && \
apt-get install -y --no-install-recommends netcat-openbsd
## copy Python dependencies from build image
COPY --from=compile-image /opt/venv /opt/venv
## set working directory
WORKDIR /usr/src/app
## add user
RUN addgroup --system user && adduser --system --no-create-home --group user
RUN chown -R user:user /usr/src/app && chmod -R 755 /usr/src/app
## add entrypoint.sh
COPY ./entrypoint.sh /usr/src/app/entrypoint.sh
RUN chmod +x /usr/src/app/entrypoint.sh
## switch to non-root user
USER user
## add app
COPY . /usr/src/app
## set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PATH="/opt/venv/bin:$PATH"
## run server
CMD python manage.py run -h 0.0.0.0
The first stage installs gcc, which we need to build psycopg2-binary. Then we create a virtual environment, upgrade pip (the package manager for Python), and install pip-tools. pip-tools is a way to pin dependencies, so you can be sure what gets installed in your container: pip-compile generates a fully pinned requirements.txt from requirements.in, and pip-sync installs exactly those packages into the virtual environment.
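To make the idea of pinning concrete, here is a small sketch that flags requirement lines without an exact `==` version. The helper name unpinned_requirements is hypothetical and not part of pip-tools; it only illustrates the property that a compiled requirements.txt is expected to have.

```python
def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt" or "--hash=...".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    sample = [
        "Flask==1.1.1",
        "psycopg2-binary==2.8.3",
        "pytest>=5.0  # a range, not a pin",
    ]
    print(unpinned_requirements(sample))
```

Run against the requirements.in from this post, the helper should report nothing, since every entry already carries an exact version. The pins for transitive dependencies are what pip-compile adds on top.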
In the second stage, we start fresh from the same base image (a Python 3.7 Debian image).
The Flask app needs netcat, so we’ll use apt-get to install the package.
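The post does not show entrypoint.sh, so the following rests on an assumption: a common reason to install netcat in this kind of setup is to poll the database port (for example with `nc -z users-db 5432`) before starting the app. A rough Python equivalent of that polling loop, with a hypothetical function name and default values, could look like this:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.1):
    """Poll host:port until a TCP connection succeeds.

    Returns True once the port accepts connections, or False if
    `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a connection is enough
            # to know that something is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused, host not resolvable yet, etc.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # "users-db" and 5432 come from docker-compose.yml above.
    if wait_for_port("users-db", 5432):
        print("PostgreSQL is accepting connections")
    else:
        print("Gave up waiting for PostgreSQL")
```

A check like this matters because depends_on only controls start order; it does not wait until Postgres is ready to accept connections.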
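Now it gets interesting. We copy the whole virtual environment, including the installed packages, from the compile stage into the run-time stage with COPY --from=compile-image. The compiler and other build tools stay behind in the first stage.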
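Copying the virtual environment works here because both stages use the same base image and the same path (/opt/venv), and putting /opt/venv/bin first on PATH has the same effect as activating it. One way to check this from inside the running container is a small sketch like the one below; the function names are hypothetical.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs from a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Collect the details that show which Python is actually running."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtualenv(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

In the run-time image, the reported prefix should be /opt/venv. If it points at the system installation instead, the PATH setting is probably missing or comes too late in the Dockerfile.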
The remaining steps are standard Docker steps: set a working directory, add a non-root user, copy the entrypoint script and the application source, set environment variables, and run the app.
Further Reading
- Test-Driven Development with Python, Flask, and Docker by Michael Herman
- Advanced multi-stage build patterns by Tõnis Tiigi
- 9 Common Dockerfile Mistakes by Jorge Silva
- Multi-stage builds #1: Smaller images for compiled code by Itamar Turner-Trauring
- Multi-stage builds #2: Python specifics—virtualenv, –user, and other methods by Itamar Turner-Trauring
- Python Application Dependency Management in 2018 by Hynek Schlawack