Why Docker for ML?
Docker solves the "it works on my machine" problem. It packages your model, code, and all dependencies into a container that runs the same way on any machine with a Docker runtime.
- Reproducibility: Same environment in dev, test, and production
- Portability: Run anywhere Docker runs
- Isolation: No dependency conflicts
- Scalability: Easy to scale with Kubernetes
Docker Basics
# Check Docker installation
docker --version
# Pull an image
docker pull python:3.11-slim
# Run a container
docker run -it python:3.11-slim python --version
# List running containers
docker ps
# List all containers
docker ps -a
# Stop a container
docker stop container_id
# Remove a container
docker rm container_id
# List images
docker images
# Remove an image
docker rmi image_id
Basic ML Dockerfile
# Dockerfile
FROM python:3.11-slim
# Set working directory
WORKDIR /app
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Install system dependencies
RUN apt-get update && apt-get install -y \
build-essential \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements first (for caching)
COPY requirements.txt .
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
# Expose port
EXPOSE 8000
# Run the application
CMD ["python", "app.py"]
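The Dockerfile above expects an app.py at the root of the build context. As a minimal stand-in, here is a framework-free sketch using only the standard library (the /health route and handler names are illustrative; a real service would typically use FastAPI or Flask):

```python
# app.py -- hypothetical minimal stand-in for the service CMD runs.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # suppress per-request logging noise

if __name__ == "__main__":
    # Bind 0.0.0.0 so the port published with `docker run -p 8000:8000`
    # is reachable from outside the container.
    HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()
```

With this file in place, `docker build -t ml-api . && docker run -p 8000:8000 ml-api` gives you a container answering on http://localhost:8000/health.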
Multi-Stage Builds for Smaller Images
# Multi-stage Dockerfile for production
FROM python:3.11-slim AS builder
WORKDIR /app
# Install build dependencies
RUN apt-get update && apt-get install -y build-essential
# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Production stage
FROM python:3.11-slim AS production
WORKDIR /app
# Copy virtual environment from builder
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# Copy application
COPY . .
# Create non-root user
RUN useradd -m appuser && chown -R appuser:appuser /app
USER appuser
EXPOSE 8000
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:8000", "app:app"]
GPU Support with NVIDIA Docker
# GPU-enabled Dockerfile
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
# Install Python (Ubuntu 22.04 ships Python 3.10; add the deadsnakes PPA if you need 3.11)
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python3", "train.py"]
# Build
docker build -t ml-gpu .
# Run with GPU access
docker run --gpus all ml-gpu
# Or specific GPUs
docker run --gpus '"device=0,1"' ml-gpu
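Inside the container, train.py can fail fast when no GPU is actually visible. A sketch (the `gpu_available` helper is hypothetical, not part of any library) that shells out to `nvidia-smi`, which is only present when the container runs with `--gpus` under the NVIDIA runtime:

```python
# Hypothetical GPU sanity check for train.py.
import shutil
import subprocess

def gpu_available() -> bool:
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # no NVIDIA runtime mounted into this container
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # prints one line per visible GPU
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout

if __name__ == "__main__":
    print("GPU visible:", gpu_available())
```

Running the same image with and without `--gpus all` makes the difference visible immediately, instead of failing deep inside a training loop.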
Docker Compose for ML Workflows
# docker-compose.yml
version: '3.8'
services:
  api:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "8000:8000"
    volumes:
      - ./models:/app/models
    environment:
      - MODEL_PATH=/app/models/model.pkl
      - LOG_LEVEL=INFO
    depends_on:
      - redis
    healthcheck:
      # slim base images do not ship curl, so probe with Python instead
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
  worker:
    build: .
    command: celery -A tasks worker --loglevel=info
    depends_on:
      - redis
  mlflow:
    image: ghcr.io/mlflow/mlflow:v2.9.0
    ports:
      - "5000:5000"
    volumes:
      - ./mlruns:/mlruns
    command: mlflow server --host 0.0.0.0 --backend-store-uri sqlite:///mlflow.db
# Commands
docker-compose up -d
docker-compose logs -f api
docker-compose down
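On the application side, the api service reads its configuration from the environment variables the compose file injects (MODEL_PATH, LOG_LEVEL). A sketch (`Settings` and `load_settings` are illustrative names) that falls back to defaults, so the same code also runs outside Docker:

```python
# Hypothetical config loader for the `api` service.
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    model_path: str
    log_level: str

def load_settings(env=os.environ) -> Settings:
    """Read settings from environment variables with local-dev defaults."""
    return Settings(
        model_path=env.get("MODEL_PATH", "models/model.pkl"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing the environment as a parameter keeps the loader easy to unit-test without touching the real process environment.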
Optimizing Docker Images for ML
# .dockerignore
__pycache__
*.pyc
*.pyo
.git
.gitignore
*.md
.env
.venv
venv
notebooks/
tests/
*.ipynb
.pytest_cache
.coverage
htmlcov/
data/raw/
*.log
# Optimized requirements installation
# Install heavy packages first (better caching)
COPY requirements-base.txt .
RUN pip install --no-cache-dir -r requirements-base.txt
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Use slim or alpine images when possible
# (Dockerfile does not allow inline comments after an instruction)
# slim is ~150MB vs ~900MB for the full image
FROM python:3.11-slim
# alpine is ~50MB, but its musl libc often breaks prebuilt wheels (numpy, scipy)
FROM python:3.11-alpine
# Clear pip cache
RUN pip install --no-cache-dir -r requirements.txt
# Combine RUN commands
RUN apt-get update && apt-get install -y \
package1 \
package2 \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean
Development vs Production
# Dockerfile.dev
FROM python:3.11-slim
WORKDIR /app
COPY requirements-dev.txt .
RUN pip install --no-cache-dir -r requirements-dev.txt
# Mount code as volume for hot reload
CMD ["uvicorn", "app:app", "--reload", "--host", "0.0.0.0"]
# docker-compose.dev.yml
version: '3.8'
services:
  api:
    build:
      context: .
      dockerfile: Dockerfile.dev
    volumes:
      - .:/app       # Mount for hot reload
      - /app/.venv   # Exclude venv
    ports:
      - "8000:8000"
    environment:
      - DEBUG=true
# Run development environment
docker-compose -f docker-compose.dev.yml up
Serving Models with Docker
# Complete ML API Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy model and code
COPY models/ ./models/
COPY src/ ./src/
COPY app.py .
# Set environment variables
ENV MODEL_PATH=/app/models/model.joblib
ENV PORT=8000
EXPOSE $PORT
# Health check (curl is not included in slim images, so probe with Python)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import os,urllib.request; urllib.request.urlopen('http://localhost:%s/health' % os.environ.get('PORT','8000'))" || exit 1
# Run with Gunicorn
CMD gunicorn app:app \
--workers 4 \
--worker-class uvicorn.workers.UvicornWorker \
--bind 0.0.0.0:$PORT \
--timeout 120 \
--access-logfile - \
--error-logfile -
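The Gunicorn command above uses uvicorn.workers.UvicornWorker, which expects app.py to expose an ASGI callable named `app`. A minimal framework-free sketch of the /health endpoint the HEALTHCHECK polls (a real API would usually be a FastAPI application, which is also an ASGI app):

```python
# app.py -- minimal ASGI callable loadable by
# `gunicorn -k uvicorn.workers.UvicornWorker app:app` (illustrative sketch).
import json

async def app(scope, receive, send):
    assert scope["type"] == "http"
    if scope["path"] == "/health":
        status, body = 200, json.dumps({"status": "ok"}).encode()
    else:
        status, body = 404, json.dumps({"detail": "not found"}).encode()
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": body})
```

Because ASGI apps are plain async callables, they can be exercised in tests by passing fake `scope`/`receive`/`send` objects, with no server process at all.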
Container Registry and Deployment
# Build and tag
docker build -t myapp:v1.0 .
# Tag for registry
docker tag myapp:v1.0 registry.example.com/myapp:v1.0
# Push to registry
docker push registry.example.com/myapp:v1.0
# AWS ECR
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
docker tag myapp:v1.0 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.0
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.0
# Google Container Registry
gcloud auth configure-docker
docker tag myapp:v1.0 gcr.io/my-project/myapp:v1.0
docker push gcr.io/my-project/myapp:v1.0
# Docker Hub
docker login
docker tag myapp:v1.0 username/myapp:v1.0
docker push username/myapp:v1.0
Best Practices
- Pin versions: Use specific image tags, not :latest
- Non-root user: Run containers as non-root for security
- Health checks: Always include health check endpoints
- Logging: Log to stdout/stderr for container orchestrators
- Secrets: Use environment variables or secrets management
- Layer caching: Order Dockerfile commands by change frequency
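The stdout-logging practice takes only a few lines to wire up. A sketch (`configure_logging` and the "ml-api" logger name are illustrative) that honors the LOG_LEVEL variable used in the compose example:

```python
# Hypothetical logging setup for a containerized service: write to stdout
# so `docker logs` and orchestrators can collect output, with the level
# controlled by the LOG_LEVEL environment variable.
import logging
import os
import sys

def configure_logging() -> logging.Logger:
    level = os.environ.get("LOG_LEVEL", "INFO").upper()
    logging.basicConfig(
        stream=sys.stdout,  # never log to files inside the container
        level=getattr(logging, level, logging.INFO),
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    return logging.getLogger("ml-api")

logger = configure_logging()
logger.info("model server starting")
```

Writing to stdout keeps the container stateless: log collection becomes the orchestrator's job rather than the application's.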
Master ML Deployment
Our Data Science program covers Docker, Kubernetes, and cloud deployment. Learn to deploy production-ready ML systems.
Explore Data Science Program