Building Docker Images
Learn to create efficient, secure, and production-ready Docker images using Dockerfiles.
The Dockerfile
A Dockerfile is a text document that contains all the commands a user could call on the command line to assemble an image.
Basic Structure
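As a minimal sketch of what a typical Dockerfile looks like, the example below packages a small Python application. The file names `requirements.txt` and `app.py`, the port, and the base image tag are assumptions chosen for illustration, not taken from this guide.

```dockerfile
# Base image to start from
FROM python:3.12-slim

# All following instructions run relative to /app
WORKDIR /app

# Copy the dependency list first so this step can be cached
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application source
COPY . .

# Runtime configuration and documentation of the listening port
ENV PYTHONUNBUFFERED=1
EXPOSE 8000

# Default command when a container starts
CMD ["python", "app.py"]
```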
Key Instructions
| Instruction | Description | Example |
|---|---|---|
| `FROM` | Base image to start from | `FROM ubuntu:22.04` |
| `WORKDIR` | Sets working directory | `WORKDIR /app` |
| `COPY` | Copies files from host to image | `COPY . .` |
| `RUN` | Executes command during build | `RUN apt-get update` |
| `ENV` | Sets environment variables | `ENV NODE_ENV=production` |
| `EXPOSE` | Documents listening ports | `EXPOSE 80` |
| `CMD` | Default command to run | `CMD ["npm", "start"]` |
| `ENTRYPOINT` | Main executable | `ENTRYPOINT ["python"]` |
Image Layers & Caching
Docker images are built from layers. Each Dockerfile instruction is a build step that Docker caches, and instructions that change the filesystem (`RUN`, `COPY`, `ADD`) add a new layer. When Docker encounters an instruction that has not changed (and all preceding layers are also cached), it reuses the cached result instead of re-executing the instruction. For `COPY` and `ADD`, the contents of the copied files are part of that check, so editing a copied file invalidates the step.
Order matters! Once a layer is invalidated, every layer after it must be rebuilt. Think of it like a stack of pancakes: you cannot swap one from the middle without removing everything above it.
Multi-Stage Builds
Drastically reduce image size by separating build tools from runtime artifacts.
Example: Go Application
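The following is a sketch of a two-stage build for a Go service, assuming a module with `go.mod` and `go.sum` at the repository root and a `main` package in the same directory. The image tags, stage name, and binary path are illustrative choices.

```dockerfile
# Stage 1: full Go toolchain, used only to compile
FROM golang:1.22 AS builder
WORKDIR /src

# Download modules first so this step is cached until go.mod/go.sum change
COPY go.mod go.sum ./
RUN go mod download

# Build a statically linked binary (no libc dependency)
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: minimal runtime image, receives only the compiled binary
FROM alpine:3.19
COPY --from=builder /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

Only the last stage ends up in the tagged image; the compiler, source code, and module cache stay behind in the builder stage.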
- Builder image: ~800MB (contains Go compiler, source code, all modules)
- Runtime image: ~15MB (contains only the binary and minimal Alpine OS)
Building & Tagging
Managing Images
Best Practices
Use Alpine Images
Use `alpine`-based images (e.g., `node:alpine`, `python:alpine`) to keep images small and secure.
Don't Run as Root
Run the container process as a non-root user by switching to one with the `USER` instruction.
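A minimal sketch of dropping root privileges, assuming a Node.js app whose entry point is a hypothetical `server.js`. The official `node` images ship with a pre-created unprivileged `node` user, which this example relies on.

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Give the unprivileged user ownership of the application files
COPY --chown=node:node . .

# Every instruction after this line, and the container itself, runs as "node"
USER node

CMD ["node", "server.js"]
```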
Use .dockerignore
Exclude `node_modules`, `.git`, and secrets from the build context.
Advanced Dockerfile Techniques
BuildKit Features
Enable BuildKit for modern features:
Cache Mounts (Speed Up Builds)
Cache mounts persist data between builds without baking it into the image layer. This is like having a shared tool shed between builds: packages downloaded once are reused on the next build, but the cache never ships in the final image.
Secret Mounts (Don’t Bake Secrets!)
Access secrets during build without storing them in any image layer. This is critical because anyone with `docker history` or access to your registry can inspect layer contents. A secret mount is like a temporary sticky note that disappears after the build step.
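As an illustration, the sketch below mounts an npm credentials file for a single `RUN` step. The secret id `npmrc` is an arbitrary name chosen here; it would be supplied at build time with something like `docker build --secret id=npmrc,src=$HOME/.npmrc .`.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./

# The secret file exists at /root/.npmrc only while this RUN step executes;
# it is not written to the resulting layer
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci
```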
SSH Mounts (Clone Private Repos)
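A hedged sketch of cloning a private repository through the host's SSH agent. The repository URL `git@github.com:example-org/private-repo.git` is a placeholder, and the build is assumed to be started with `docker build --ssh default .` so that the agent socket is forwarded.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.19
RUN apk add --no-cache git openssh-client

# Trust the Git host's key so the clone does not prompt interactively
RUN mkdir -p -m 0700 ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts

# The SSH agent is available only during this step; no private key is copied into the image
RUN --mount=type=ssh git clone git@github.com:example-org/private-repo.git /src
```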
Image Security Scanning
Docker Scout
Trivy
Best Practices for Secure Images
Distroless Images
Minimal images containing only your app and runtime dependencies. No shell, no package manager.
| Base Image | Size | Attack Surface |
|---|---|---|
| `ubuntu:22.04` | ~77MB | High |
| `alpine:3.19` | ~7MB | Medium |
| `distroless/static` | ~2MB | Minimal |
| `scratch` | 0MB | None (just your binary) |
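One possible way to target a distroless runtime is sketched below for a statically linked Go binary. The specific image `gcr.io/distroless/static-debian12:nonroot` and the paths are assumptions for illustration; pick the variant that matches your runtime.

```dockerfile
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# Static binary, so the runtime image needs no libc
RUN CGO_ENABLED=0 go build -o /out/app .

# No shell or package manager; the "nonroot" tag runs as an unprivileged user
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=build /out/app /app
# Exec form is required because there is no shell to interpret a command string
ENTRYPOINT ["/app"]
```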
Image Optimization Checklist
Reduce Layer Count
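A small sketch of the idea, assuming a Debian or Ubuntu base: chaining related commands in one `RUN` produces a single layer, and cleaning up in that same instruction keeps the package index out of the image. The installed packages are examples only.

```dockerfile
FROM ubuntu:22.04

# One layer: update, install, and remove the apt lists together
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*
```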
Order Commands for Cache
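The ordering principle can be sketched for a Node.js project as follows, assuming `package.json` (and optionally a lockfile) at the project root. Instructions go from least to most frequently changed.

```dockerfile
FROM node:20-alpine
WORKDIR /app

# Dependency manifests change rarely: copy them first
COPY package*.json ./

# Re-runs only when the manifests above change
RUN npm ci

# Source code changes often: copy it last so earlier steps stay cached
COPY . .

CMD ["npm", "start"]
```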
Use Multi-Stage Builds
Interview Questions & Answers
What is a multi-stage build and why use it?
A multi-stage build uses more than one `FROM` stage in a single Dockerfile:
- Build stage: Has compilers, dev dependencies
- Runtime stage: Has only the built artifact
Benefits:
- Smaller final image (MBs vs GBs)
- Fewer vulnerabilities (no build tools)
- Single Dockerfile for build and runtime
How does Docker layer caching work?
Docker reuses a cached layer when:
- The instruction hasn’t changed
- All previous layers are cached
Optimize by:
- Putting changing content (`COPY . .`) last
- Copying dependency files separately before code
What is the difference between ADD and COPY?
| Feature | COPY | ADD |
|---|---|---|
| Copy local files | ✓ | ✓ |
| Auto-extract tar | ✗ | ✓ |
| Download URLs | ✗ | ✓ |
| Preferred | ✓ | ✗ |
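To make the auto-extract difference concrete, here is a minimal sketch using a hypothetical local archive named `vendor.tar.gz` in the build context.

```dockerfile
FROM alpine:3.19

# COPY places the archive in the image unchanged
COPY vendor.tar.gz /tmp/vendor.tar.gz

# ADD unpacks a local tar archive into the destination directory
ADD vendor.tar.gz /opt/vendor/
```

Because `ADD` has this implicit behavior, `COPY` is usually the clearer choice unless extraction is what you want.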
How do you reduce Docker image size?
- Use Alpine/distroless base images
- Multi-stage builds to exclude build tools
- Combine RUN commands to reduce layers
- Clean up in the same layer (`rm -rf /var/cache/*`)
- Use `.dockerignore` to exclude unnecessary files
- Don’t install debugging tools in production
What is a dangling image?
An image that no longer has a tag (shown as `<none>:<none>`).
Causes:
- Rebuilding with same tag (old image becomes dangling)
- Intermediate build stages
How do you handle secrets in Docker builds?
- Use BuildKit secret mounts
- Pass at runtime: `docker run -e API_KEY=secret`
- Use Docker secrets (Swarm) or Kubernetes secrets
Common Pitfalls
Interview Deep-Dive
You have a Node.js application where Docker builds take 8 minutes in CI. The team is frustrated. Walk me through how you would diagnose and fix the build performance.
- First, I would examine the Dockerfile instruction order. The most common cause of slow builds is cache invalidation too early in the layer chain. If `COPY . .` comes before `RUN npm install`, then every source code change triggers a full `npm install` (~2-4 minutes). Moving `COPY package*.json ./` and `RUN npm ci` above `COPY . .` means npm only re-runs when dependencies actually change.
- Second, I would check the `.dockerignore` file. Without one, Docker sends the entire build context to the daemon, including `node_modules` (potentially 300MB+), `.git` history, test fixtures, and IDE files. Adding these to `.dockerignore` can cut context transfer from minutes to seconds.
- Third, I would enable BuildKit (`DOCKER_BUILDKIT=1`) and use cache mounts for npm: `RUN --mount=type=cache,target=/root/.npm npm ci --only=production`. This persists npm’s download cache between builds, so packages that were already downloaded are not re-fetched.
- Fourth, I would check if the CI runner is using Docker layer caching. Many CI systems (GitHub Actions, GitLab CI) discard the layer cache between runs by default. Using `--cache-from` with a registry-based cache (e.g., `docker build --cache-from myapp:cache --build-arg BUILDKIT_INLINE_CACHE=1`) can recover cached layers across CI runs.
- Fifth, if this is a multi-stage build, I would check if independent stages are building in parallel. BuildKit builds independent stages concurrently, which can cut wall-clock time significantly for builds with separate frontend and backend stages.
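The first and third steps above could be combined roughly as follows. This is a sketch under the assumption of a standard npm project; the base image tag and start command are illustrative.

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app

# Manifests first, so the install step is cached until dependencies change
COPY package*.json ./

# npm's download cache lives in a BuildKit cache mount and survives
# layer invalidation, but is never stored in the image
RUN --mount=type=cache,target=/root/.npm npm ci

# Frequently changing source goes last
COPY . .

CMD ["npm", "start"]
```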
`docker pull myapp:cache`) before building, or by using BuildKit’s registry-based cache (`--cache-from type=registry,ref=myapp:cache`). Some teams also distribute a “warm cache” image in onboarding docs. Once the first build completes, subsequent builds will be fast because the cache is populated locally.
Explain what happens inside Docker when a layer is 'invalidated' during a build. Why does invalidating one layer force all subsequent layers to rebuild?
- Docker’s layer model is a content-addressable stack. Each layer’s identity (its hash) is computed from both its own content and the identity of all layers below it. This means a layer’s hash is determined by its instruction, its inputs, and its parent layer’s hash.
- When a layer changes (e.g., you modify a file that is COPY’d), its hash changes. Since the next layer’s hash depends on the previous layer’s hash, the next layer’s identity also changes — even if its own instruction and inputs are identical. This cascades through every subsequent layer like a chain reaction.
- Concretely: if you have `COPY . .` followed by `RUN npm install`, and you change a source file, the COPY layer gets a new hash. The npm install layer’s cache key includes the parent layer hash, so it no longer matches the cached version, and npm install runs again from scratch, even though `package.json` did not change.
- This is an intentional design choice for correctness. If Docker reused a cached layer whose parent changed, you could get inconsistent builds where the upper layer was built against different content than what is currently below it. The trade-off is build speed, which you recover by ordering instructions from least-frequently-changed to most-frequently-changed.
`--mount=type=cache,target=/path`) is orthogonal to the layer cache. It persists a directory across builds without baking it into any layer. Even when a layer is invalidated and re-executes, the cache mount still contains data from the previous run. For package managers (npm, pip, go mod), this means downloaded packages survive layer cache invalidation. The key insight is that the layer cache answers “did this instruction change?” while the cache mount answers “do I still have the downloaded artifacts?” They solve different problems and work best together.
Your Go microservice Dockerfile uses 'FROM scratch' for the runtime stage. A teammate switches it to Alpine because they need to debug in production. What are the trade-offs, and what alternatives would you suggest?
- The trade-off is security surface versus debuggability. `scratch` is a zero-byte base image: no shell, no package manager, no libc, no CA certificates, nothing. The image adds essentially no attack surface beyond your own binary, because there are no tools for an attacker to use even if they achieve code execution. Alpine adds a shell (`/bin/sh`), a package manager (`apk`), musl libc, and base utilities, roughly 7MB of additional attack surface.
- For production, I would push back on switching the runtime image and instead propose alternatives for debugging. First, ephemeral debug containers in Kubernetes (`kubectl debug -it pod-name --image=nicolaka/netshoot`) let you attach a debug container to a running pod’s network namespace, and with `--target` to a container’s process namespace, without modifying the production image. Second, a multi-stage Dockerfile with a `debug` target: `FROM alpine AS debug` that includes tools, and `FROM scratch AS prod` that does not. CI builds the `prod` target; developers can build the `debug` target locally.
- Third, for Go specifically, you can compile with debug symbols and use Delve remotely: the debugger runs on the developer’s machine and connects to the Go process over a port. No shell needed in the container.
- If the team absolutely needs Alpine in production, I would add `--cap-drop=ALL`, run as a non-root user, and use a read-only filesystem to mitigate the increased surface area. But my strong preference is to keep production images minimal and debug through external tooling.
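The `debug` and `prod` targets mentioned above might look like the sketch below. It assumes a Go module with `go.mod` and `go.sum` at the root; the stage names follow the answer, while the debug tools installed are an arbitrary example.

```dockerfile
FROM golang:1.22 AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Static binary so it can run on scratch
RUN CGO_ENABLED=0 go build -o /out/app .

# Debug target: built explicitly with `docker build --target debug`
FROM alpine:3.19 AS debug
RUN apk add --no-cache curl busybox-extras
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]

# Production target: the last stage, so it is what a plain `docker build` produces
FROM scratch AS prod
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
```

Both targets reuse the same build stage, so the binary that gets debugged locally is compiled the same way as the one shipped to production.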
`scratch`, there is no libc, so the binary fails at startup with a cryptic error about missing shared libraries. The options are: use Alpine (which includes musl libc) or a Distroless variant that includes glibc. Alternatively, you can statically link with musl using `CC=musl-gcc CGO_ENABLED=1 go build -ldflags '-linkmode external -extldflags "-static"'`, which produces a static binary that still runs on `scratch`. The musl approach requires the musl toolchain in the build stage but keeps the runtime minimal.