
Building Docker Images

Learn to create efficient, secure, and production-ready Docker images using Dockerfiles.

The Dockerfile

A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.

Basic Structure

# 1. Base Image
FROM node:18-alpine

# 2. Working Directory
WORKDIR /app

# 3. Copy Dependencies
COPY package*.json ./

# 4. Install Dependencies
RUN npm ci --omit=dev

# 5. Copy Source Code
COPY . .

# 6. Expose Port
EXPOSE 3000

# 7. Define User (Security)
USER node

# 8. Startup Command
CMD ["node", "server.js"]

Key Instructions

| Instruction | Description | Example |
|---|---|---|
| FROM | Base image to start from | FROM ubuntu:22.04 |
| WORKDIR | Sets working directory | WORKDIR /app |
| COPY | Copies files from host to image | COPY . . |
| RUN | Executes command during build | RUN apt-get update |
| ENV | Sets environment variables | ENV NODE_ENV=production |
| EXPOSE | Documents listening ports | EXPOSE 80 |
| CMD | Default command to run | CMD ["npm", "start"] |
| ENTRYPOINT | Main executable | ENTRYPOINT ["python"] |

Image Layers & Caching

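Docker images are built from layers. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer, and Docker caches the result of every instruction. Order matters: once one instruction's cache is invalidated, every instruction after it is rebuilt, so put the least frequently changed instructions at the top to maximize cache hits.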
# BAD: Re-installs dependencies every time code changes
COPY . .
RUN npm install

# GOOD: Uses cache if package.json hasn't changed
COPY package*.json ./
RUN npm install
COPY . .
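The ordering rule can also be checked mechanically. The following is a minimal sketch, not a real Dockerfile parser: the helper name check_copy_order and its npm-only heuristic are assumptions made for illustration. It reports whether a broad COPY . . appears before the first npm ci or npm install step.

```shell
# check_copy_order FILE
# Prints "cache-unfriendly" if "COPY . ." appears before the first
# npm ci / npm install RUN step in FILE, otherwise prints "ok".
# Heuristic only: it does not handle COPY flags or line continuations.
check_copy_order() {
    awk '
        toupper($1) == "COPY" && $2 == "." { seen_copy_all = 1 }
        toupper($1) == "RUN" && /npm (ci|install)/ {
            print (seen_copy_all ? "cache-unfriendly" : "ok")
            found = 1
            exit
        }
        END { if (!found) print "ok" }
    ' "$1"
}
```

Running check_copy_order Dockerfile in a pre-commit hook or CI step is one way to catch a reordering that silently disables the dependency cache.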

Multi-Stage Builds

Drastically reduce image size by separating build tools from runtime artifacts.

Example: Go Application

# Stage 1: Builder
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp main.go

# Stage 2: Runtime
FROM alpine:3.19
WORKDIR /root/
# Copy only the binary from the builder stage
COPY --from=builder /app/myapp .
CMD ["./myapp"]

Result:
  • Builder image: hundreds of MB (contains Go compiler, source code)
  • Runtime image: ~15MB (contains only binary and minimal OS)
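To check size figures like these on your own machine, one approach is to tag the builder stage separately with docker build --target, read both sizes in bytes with docker image inspect, and compute the saving. This is a sketch: the helper name percent_saved and the tags myapp:builder and myapp:latest are assumptions, not part of the example above.

```shell
# percent_saved BUILDER_BYTES RUNTIME_BYTES
# Prints the whole-number percentage by which the runtime image is smaller
# than the builder image (integer division, rounded down).
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Usage (requires Docker; the tags are hypothetical):
#   docker build --target builder -t myapp:builder .
#   docker build -t myapp:latest .
#   percent_saved "$(docker image inspect --format '{{.Size}}' myapp:builder)" \
#                 "$(docker image inspect --format '{{.Size}}' myapp:latest)"
```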

Building & Tagging

# Build with default tag (latest)
docker build -t myapp .

# Build with specific tag
docker build -t myapp:1.0 .

# Build with multiple tags
docker build -t myapp:1.0 -t myapp:latest .

# Build from specific file
docker build -f Dockerfile.prod -t myapp:prod .

# Build without cache (if needed)
docker build --no-cache -t myapp:clean .
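Releases are often tagged at several levels of precision so that consumers can pin as tightly as they like. That convention is an assumption here, not something the commands above require. A minimal sketch with a hypothetical helper image_tags that derives the tag list from a MAJOR.MINOR.PATCH version:

```shell
# image_tags NAME VERSION
# For VERSION in MAJOR.MINOR.PATCH form, prints four tags, one per line:
# NAME:MAJOR.MINOR.PATCH, NAME:MAJOR.MINOR, NAME:MAJOR, NAME:latest
image_tags() {
    name=$1
    version=$2
    major=${version%%.*}    # text before the first dot
    rest=${version#*.}      # text after the first dot
    minor=${rest%%.*}       # text before the next dot
    printf '%s\n' "${name}:${version}" "${name}:${major}.${minor}" \
        "${name}:${major}" "${name}:latest"
}

# Usage (requires Docker): turn each tag into a -t argument.
#   set --
#   for tag in $(image_tags myapp 1.4.2); do set -- "$@" -t "$tag"; done
#   docker build "$@" .
```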

Managing Images

# List images
docker images

# Remove image
docker rmi myapp:1.0

# Remove dangling images (untagged, <none>)
docker image prune

# Save image to tarball
docker save -o myapp.tar myapp:1.0

# Load image from tarball
docker load -i myapp.tar

Best Practices

Start with Alpine-based images (e.g., node:alpine, python:alpine) to keep images small and reduce the attack surface.
Create a non-root user and switch to it with the USER instruction.
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
Exclude files like node_modules, .git, and secrets from the build context.
# .dockerignore
node_modules
.git
.env
Dockerfile
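A quick way to confirm that a project's .dockerignore covers the sensitive or bulky entries is to test for them line by line. The sketch below assumes a hypothetical helper missing_ignores and a fixed list of three entries (node_modules, .git, .env); it only matches exact lines, so a pattern such as node_modules/ would be reported as missing.

```shell
# missing_ignores FILE
# Prints each expected entry that has no exactly matching line in FILE.
# Prints nothing when all entries are present.
missing_ignores() {
    file=$1
    for entry in node_modules .git .env; do
        grep -qxF -- "$entry" "$file" 2>/dev/null || echo "$entry"
    done
}
```

For example, missing_ignores .dockerignore prints .env if secrets are not yet excluded from the build context.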

Advanced Dockerfile Techniques

BuildKit Features

BuildKit is the default builder in Docker Engine 23.0 and later. On older versions, enable it explicitly to use the features below:
DOCKER_BUILDKIT=1 docker build -t myapp .

Cache Mounts (Speed Up Builds)

Mount package manager caches to speed up repeated builds:
# Node.js with npm cache
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

# Python with pip cache
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Go with module cache
RUN --mount=type=cache,target=/go/pkg/mod \
    go build -o app .

Secret Mounts (Don’t Bake Secrets!)

Access secrets during the build without storing them in any image layer:
# syntax=docker/dockerfile:1.4
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm ci --omit=dev

Pass the secret file on the command line when building:
docker build --secret id=npmrc,src=.npmrc .

SSH Mounts (Clone Private Repos)

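# Replace your-git-host with the SSH host of your Git server
RUN --mount=type=ssh \
    git clone git@your-git-host:private/repo.git

Forward your local SSH agent when building:
docker build --ssh default .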

Image Security Scanning

Docker Scout

# Analyze local image
docker scout cves myapp:latest

# View recommendations
docker scout recommendations myapp:latest

Trivy

# Install trivy
brew install trivy  # macOS

# Scan image
trivy image myapp:latest

# Scan with severity filter
trivy image --severity HIGH,CRITICAL myapp:latest
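In a CI pipeline the scan is usually wired up as a gate that fails the job. A minimal sketch, assuming Trivy is installed and using its --exit-code flag so that findings produce a non-zero status; the function name scan_gate and the messages are hypothetical.

```shell
# scan_gate IMAGE
# Returns non-zero when Trivy reports HIGH or CRITICAL vulnerabilities
# in IMAGE (or when the scan itself fails).
scan_gate() {
    image=$1
    if trivy image --exit-code 1 --severity HIGH,CRITICAL "$image"; then
        echo "scan passed: $image"
    else
        echo "scan failed: $image" >&2
        return 1
    fi
}

# Usage (requires Trivy and a local image):
#   scan_gate myapp:latest && docker push myapp:latest
```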

Best Practices for Secure Images

# 1. Use specific versions
FROM node:18.19.0-alpine3.19

# 2. Run as non-root
RUN addgroup -S app && adduser -S app -G app
USER app

# 3. Minimize attack surface - use distroless
# (assumes an earlier stage named "builder"; the image's entrypoint is node)
FROM gcr.io/distroless/nodejs18-debian12
COPY --from=builder /app /app
WORKDIR /app
CMD ["server.js"]

# 4. Don't install unnecessary packages
RUN apk add --no-cache curl  # --no-cache reduces size

# 5. Use COPY instead of ADD (ADD has extra features you rarely need)
COPY package*.json ./
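Version pinning is easy to lint for. The following sketch lists FROM lines whose image has no tag or uses latest; the helper name find_unpinned_bases is an assumption, and the parsing is a heuristic that skips scratch, digest-pinned images, and references to earlier build stages.

```shell
# find_unpinned_bases FILE
# Prints "LINE: IMAGE" for every FROM whose image is untagged or tagged latest.
find_unpinned_bases() {
    awk '
        toupper($1) == "FROM" {
            i = 2
            if ($i ~ /^--platform=/) i++          # skip optional --platform flag
            image = $i
            alias = ""
            if (NF >= i + 2 && toupper($(i + 1)) == "AS") alias = $(i + 2)
            if (!(image in stages) && image != "scratch" && image !~ /@sha256:/) {
                name = image
                sub(/.*\//, "", name)             # drop registry/repository path
                if (name !~ /:/ || name ~ /:latest$/) print NR ": " image
            }
            if (alias != "") stages[alias] = 1    # remember stage names
        }
    ' "$1"
}
```

An empty result means every base image carries an explicit, non-latest tag or a digest.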

Distroless Images

Minimal images containing only your app and runtime dependencies. No shell, no package manager.
# Build stage
FROM golang:1.21 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o server

# Runtime stage - distroless
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /server
CMD ["/server"]

| Base Image | Size | Attack Surface |
|---|---|---|
| ubuntu:22.04 | ~77MB | High |
| alpine:3.19 | ~7MB | Medium |
| distroless/static | ~2MB | Minimal |
| scratch | 0MB | None (just your binary) |

Image Optimization Checklist

Combine RUN commands to reduce layers:
# Bad: 3 layers
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# Good: 1 layer
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
Put least-changing commands first:
# Package files change less often than code
COPY package*.json ./
RUN npm ci

# Code changes frequently - cache invalidates here
COPY . .
Keep build tools out of final image:
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html

Interview Questions & Answers

What are multi-stage builds?
Multi-stage builds use multiple FROM statements in a single Dockerfile:
  • Build stage: Has compilers, dev dependencies
  • Runtime stage: Has only the built artifact
Benefits:
  • Smaller final image (MBs vs GBs)
  • Fewer vulnerabilities (no build tools)
  • Single Dockerfile for build and runtime
How does layer caching work?
Each instruction creates a layer. Docker caches layers and reuses them if:
  • The instruction hasn’t changed (and, for COPY and ADD, neither have the copied files)
  • All previous layers are cached
Cache busting: If a layer changes, all subsequent layers are rebuilt.
Optimize by:
  • Putting changing content (COPY .) last
  • Copying dependency files separately before code
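What is the difference between COPY and ADD?

| Feature | COPY | ADD |
|---|---|---|
| Copy local files | Yes | Yes |
| Auto-extract tar | No | Yes |
| Download URLs | No | Yes |
| Preferred | Yes | No |

Best Practice: Use COPY unless you need tar extraction.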
How do you reduce image size?
  1. Use Alpine/distroless base images
  2. Multi-stage builds to exclude build tools
  3. Combine RUN commands to reduce layers
  4. Clean up in the same layer (rm -rf /var/cache/*)
  5. Use .dockerignore to exclude unnecessary files
  6. Don’t install debugging tools in production
What is a dangling image?
An image with no tag (shown as <none>:<none>).
Causes:
  • Rebuilding with same tag (old image becomes dangling)
  • Intermediate build stages
Clean up:
docker image prune        # Remove dangling only
docker image prune -a     # Remove all unused
How do you handle secrets in Docker builds?
Never do this:
# Stored in image history!
ENV API_KEY=secret123
# Baked into a layer!
COPY .env .
Do this instead:
  • Use BuildKit secret mounts
  • Pass at runtime: docker run -e API_KEY=secret
  • Use Docker secrets (Swarm) or Kubernetes secrets
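A simple lint can catch the first anti-pattern before it reaches an image. This sketch flags ENV and ARG lines whose variable name looks like a credential; the helper name find_baked_secrets and the keyword list (KEY, TOKEN, PASSWORD, SECRET) are assumptions, and a clean result does not prove that no secret is present.

```shell
# find_baked_secrets FILE
# Prints "LINE:TEXT" for ENV/ARG instructions whose variable name contains
# KEY, TOKEN, PASSWORD, or SECRET (case-insensitive).
# Exit status is 0 when something was found, 1 when nothing matched.
find_baked_secrets() {
    grep -nEi '^[[:space:]]*(ENV|ARG)[[:space:]]+[A-Za-z0-9_]*(KEY|TOKEN|PASSWORD|SECRET)[A-Za-z0-9_]*' "$1"
}

# Usage: fail a CI step when a suspicious line exists.
#   if find_baked_secrets Dockerfile; then exit 1; fi
```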

Common Pitfalls

1. COPY . . Before Dependencies: Invalidates cache on every code change. Copy package.json first.
2. Not Cleaning Up in Same Layer: RUN apt-get install && rm cache saves space; separate RUN commands don’t.
3. Secrets in Build Args: Build args are visible in image history. Use secret mounts.
4. Using latest Base Image: Builds aren’t reproducible. Pin specific versions.
5. Large Build Context: .dockerignore should exclude node_modules, .git, etc.
6. Root User in Production: Security risk. Always create and use a non-root user.
