Docker Multistage Builds
Introduction
Docker multistage builds are a powerful feature that allows you to create smaller, more efficient Docker images without sacrificing build tools or debugging capabilities. In traditional Docker builds, you often face a dilemma: include all build tools for a complete build environment (resulting in large images) or create complex build scripts to minimize image size. Multistage builds solve this problem by allowing you to use multiple FROM statements in your Dockerfile, where each FROM instruction starts a new build stage.
In this guide, you'll learn:
- What multistage builds are and why they matter
- How to implement multistage builds step by step
- Real-world examples and best practices
- Advanced techniques to optimize your Docker workflow
The Problem with Single-Stage Builds
Before diving into multistage builds, let's understand why they're needed. Consider this single-stage Dockerfile for a Node.js application:
FROM node:18
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
While this works, it creates an image containing:
- The Node.js runtime
- All npm dependencies (including dev dependencies)
- Build tools and source code
- The compiled application
The result? A bloated image with unnecessary components that:
- Consumes more disk space and network bandwidth
- Has a larger attack surface
- Takes longer to deploy
Multistage Builds: The Solution
Multistage builds let you use multiple stages in a single Dockerfile, copying only the necessary artifacts from one stage to another. Here's how we can improve the previous example:
# Build stage
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
# Production stage
FROM node:18-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=build /app/package*.json ./
RUN npm install --omit=dev
EXPOSE 3000
CMD ["npm", "start"]
In this example:
- The first stage (named build) installs all dependencies and builds the application
- The second stage starts with a slim Node.js image
- Only the built assets and package manifests are copied from the build stage; production dependencies are then installed fresh in the final stage
- The final image is significantly smaller and contains only what's needed to run the application
How Multistage Builds Work
Let's break down the key components of multistage builds:
Multiple FROM Instructions
Each FROM instruction in a Dockerfile starts a new build stage. You can name stages using the AS keyword:
FROM golang:1.18 AS builder
# ...build stage instructions
FROM alpine:latest
# ...production stage instructions
Copying Files Between Stages
The COPY --from=<stage> instruction copies files from a previous stage to the current stage:
COPY --from=builder /go/src/app/main ./
This copies the main binary from the builder stage to the current stage.
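Besides a stage name, COPY --from also accepts a numeric stage index (stages are numbered from 0 in the order their FROM instructions appear) or a reference to an external image. The following is a minimal sketch, assuming a Go module whose sources sit in the build context; the nginx file and the /tmp destination are purely illustrative.

```dockerfile
# Stage 0 (unnamed): build the binary; the project layout is an assumption
FROM golang:1.18
WORKDIR /go/src/app
COPY . .
# CGO is disabled so the binary is static and runs on Alpine
RUN CGO_ENABLED=0 go build -o main .

FROM alpine:3.15
# Reference the first stage by its index instead of a name
COPY --from=0 /go/src/app/main ./
# Copy a file straight out of another image, without declaring a stage for it
COPY --from=nginx:latest /etc/nginx/nginx.conf /tmp/nginx.conf
```

Index references break silently when stages are reordered, which is one reason named stages are usually preferred.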
Selective Execution
You can build up to a specific stage using the --target flag:
docker build --target builder -t myapp:build .
This is useful for debugging or creating different image variants.
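One way to get image variants out of --target is to layer a debug stage on top of the production stage. This is only a sketch: the stage names production and debug, the binary path, and the choice of tools are assumptions for illustration.

```dockerfile
FROM golang:1.18 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o main .

# Production variant
# Build it with: docker build --target production -t myapp .
FROM alpine:3.15 AS production
COPY --from=builder /app/main /usr/local/bin/main
CMD ["main"]

# Debug variant: the same image plus troubleshooting tools
# Build it with: docker build --target debug -t myapp:debug .
FROM production AS debug
RUN apk --no-cache add curl strace
```

Because debug is the last stage here, a build without --target would produce the debug image, so the production build should name its target explicitly.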
Step-by-Step Example: A Go Web Application
Let's walk through creating a multistage Dockerfile for a Go web application:
# Stage 1: Build the application
FROM golang:1.18 AS builder
WORKDIR /app
# Copy go mod and sum files
COPY go.mod go.sum ./
# Download dependencies
RUN go mod download
# Copy source code
COPY . .
# Build the application
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .
# Stage 2: Create the minimal runtime image
FROM alpine:3.15
RUN apk --no-cache add ca-certificates
WORKDIR /root/
# Copy the binary from builder stage
COPY --from=builder /app/main .
# Copy any config files
COPY --from=builder /app/config ./config
# Expose port
EXPOSE 8080
# Run the application
CMD ["./main"]
This Dockerfile:
- Uses a full Go image to build the application
- Creates a tiny Alpine-based image for the runtime
- Includes only the compiled binary and configuration files
- Results in an image that's often less than 20MB (compared to 300MB+ for the full Go image)
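Because the binary is built with CGO_ENABLED=0, it is statically linked and can in principle run without any base OS at all. Below is a hedged sketch of an alternative final stage on scratch, assuming the application needs TLS root certificates but no shell and no config directory.

```dockerfile
# Build stage: same idea as Stage 1 above
FROM golang:1.18 AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o main .

# Final stage: an empty image containing only what is copied in
FROM scratch
# scratch ships no CA bundle, so reuse the one from the Debian-based builder image
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/main /main
EXPOSE 8080
# Exec form is required because scratch has no shell
ENTRYPOINT ["/main"]
```

The trade-off is debuggability: with no shell or package manager in the image, you cannot exec into a running container to inspect it.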
Real-World Application: React Frontend with Node.js Backend
Here's a more complex example of a multistage build for a full-stack application:
# Stage 1: Build the React frontend
FROM node:18 AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm install
COPY frontend/ ./
RUN npm run build
# Stage 2: Build the Node.js backend
FROM node:18 AS backend-builder
WORKDIR /app/backend
COPY backend/package*.json ./
RUN npm install
COPY backend/ ./
RUN npm run build
# Stage 3: Production image
FROM node:18-slim
WORKDIR /app
# Copy backend build
COPY --from=backend-builder /app/backend/dist ./dist
COPY --from=backend-builder /app/backend/package*.json ./
RUN npm install --omit=dev
# Copy frontend build to the public directory
COPY --from=frontend-builder /app/frontend/build ./public
EXPOSE 8080
CMD ["node", "dist/server.js"]
This Dockerfile:
- Builds the React frontend in one stage
- Builds the Node.js backend in another stage
- Creates a final production image with just the compiled assets
- Significantly reduces the image size by excluding build tools and source code
Best Practices for Multistage Builds
1. Name Your Stages
Always name your build stages for clarity:
FROM golang:1.18 AS builder
This makes your Dockerfile more readable and lets you reference stages by name instead of by numeric index, so COPY --from references keep working when stages are added or reordered.
2. Order Dependencies for Caching
Place instructions that change less frequently at the beginning of each stage:
COPY package.json package-lock.json ./
RUN npm install
COPY . .
Docker caches the results of each step. By copying just the package files first, you can avoid reinstalling dependencies when only your source code changes.
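Layer caching can be complemented with a BuildKit cache mount, which keeps npm's download cache between builds even when the package files do change. The sketch below assumes BuildKit is enabled and that the build runs as root, so npm's cache lives at /root/.npm.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:18 AS build
WORKDIR /app
# Copy only the dependency manifests so this layer is reused until they change
COPY package.json package-lock.json ./
# The cache mount persists across builds but is not part of the image
RUN --mount=type=cache,target=/root/.npm npm ci
# Source changes invalidate only the layers from here on
COPY . .
RUN npm run build
```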
3. Use Distroless or Minimal Base Images
For the final stage, use the smallest possible base image:
FROM gcr.io/distroless/java:11
# or
FROM alpine:3.15
# or, for compiled languages that don't need an OS
FROM scratch
4. Keep Only What You Need
Only copy artifacts that are required for runtime:
# Bad practice
COPY --from=builder /app ./
# Good practice
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/config ./config
5. Utilize Build Arguments
Use build arguments to create flexible builds:
ARG NODE_ENV=production
FROM node:18 AS builder
ARG NODE_ENV
ENV NODE_ENV=${NODE_ENV}
# ...rest of Dockerfile
This allows you to build different image variants:
docker build --build-arg NODE_ENV=development -t myapp:dev .
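An ARG declared before the first FROM can also parameterize the base image itself. The sketch below assumes a hypothetical NODE_VERSION argument; it is not part of the example above.

```dockerfile
# Declared before the first FROM, so it is visible to FROM lines only
ARG NODE_VERSION=18

FROM node:${NODE_VERSION} AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

FROM node:${NODE_VERSION}-slim
# To read the value inside a stage, redeclare it without a default
ARG NODE_VERSION
RUN echo "Runtime image based on Node ${NODE_VERSION}"
WORKDIR /app
COPY --from=builder /app/dist ./dist
```

Building with --build-arg NODE_VERSION=20 would then switch both stages to Node 20 without editing the Dockerfile.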
Advanced Multistage Techniques
Conditional Stages
You can conditionally include stages based on build arguments:
# TARGET must be declared before the first FROM to be usable in FROM lines
ARG TARGET=production
FROM base AS common
# ...common setup
FROM common AS development
# ...development-specific setup
FROM common AS production
# ...production-specific setup
FROM ${TARGET} AS final
# Final stage uses either development or production
Build with:
docker build --build-arg TARGET=development -t myapp:dev .
Parallelized Builds
With Docker BuildKit, you can parallelize independent stages:
# These stages can build in parallel
FROM golang:1.18 AS api-builder
# ...build API
FROM node:18 AS frontend-builder
# ...build frontend
# Final stage depends on both
FROM alpine:3.15
COPY --from=api-builder /app/api ./
COPY --from=frontend-builder /app/dist ./public
BuildKit is the default builder in Docker Engine 23.0 and later. On older versions, enable it with:
DOCKER_BUILDKIT=1 docker build -t myapp .
Testing Stages
You can include testing stages in your build:
FROM node:18 AS builder
# ...build application
FROM builder AS tester
RUN npm test
FROM node:18-slim AS production
COPY --from=builder /app/dist ./dist
# ...rest of production setup
Run tests during build with:
docker build --target tester .
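Note that BuildKit only builds the stages the requested target depends on, so a plain docker build of the Dockerfile above would skip the tester stage entirely. One way to make the production image depend on passing tests is sketched here; the marker file /tmp/tests-passed is a hypothetical name chosen for illustration.

```dockerfile
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

FROM builder AS tester
# If the tests fail, this RUN fails and no marker file is written
RUN npm test && touch /tmp/tests-passed

FROM node:18-slim AS production
WORKDIR /app
# Copying from tester forces BuildKit to build (and pass) that stage first
COPY --from=tester /tmp/tests-passed /tmp/tests-passed
COPY --from=builder /app/dist ./dist
```

The cost is that every production build now runs the test suite, unless the tester layers are served from cache.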
Visualizing Multistage Builds
[Diagram: flow of a typical multistage build]
Summary and Key Takeaways
Docker multistage builds are a powerful technique for creating optimized container images by:
- Separating build-time and runtime environments
- Including only necessary artifacts in the final image
- Reducing image size, attack surface, and deployment time
- Improving build caching and development workflows
Key benefits include:
- Smaller image sizes (often 10-50x smaller)
- Improved security through reduced attack surface
- Faster deployments due to smaller image transfers
- Simplified Dockerfiles that don't require external build scripts
Exercises
- Basic Exercise: Convert a single-stage Dockerfile for a simple web application to use multistage builds.
- Intermediate Exercise: Create a multistage Dockerfile for a compiled language (Go, Rust, C++) that produces a final image under 20MB.
- Advanced Exercise: Implement a multistage build with three stages: development, testing, and production, using build arguments to select the appropriate stage.
Happy containerizing!