
Chapter 5: Container Images

Container images are the portable, versioned packages that contain everything needed to run an application. Let’s understand the OCI format and implement image pulling!
Prerequisites: Chapter 3: Filesystem
Further Reading: System Design: Distributed Storage
Time: 3-4 hours
Outcome: Pull and run images from Docker Hub

OCI Image Specification

┌─────────────────────────────────────────────────────────────────────────────┐
│                        OCI IMAGE STRUCTURE                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   IMAGE MANIFEST (application/vnd.oci.image.manifest.v1+json)               │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │  {                                                                   │  │
│   │    "schemaVersion": 2,                                               │  │
│   │    "mediaType": "application/vnd.oci.image.manifest.v1+json",       │  │
│   │    "config": {                         ← Image configuration        │  │
│   │      "digest": "sha256:abc123...",                                   │  │
│   │      "size": 7023                                                    │  │
│   │    },                                                                │  │
│   │    "layers": [                         ← Filesystem layers          │  │
│   │      { "digest": "sha256:layer1...", "size": 32654848 },            │  │
│   │      { "digest": "sha256:layer2...", "size": 16724992 },            │  │
│   │      { "digest": "sha256:layer3...", "size": 73109 }                │  │
│   │    ]                                                                 │  │
│   │  }                                                                   │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│   CONFIG BLOB (application/vnd.oci.image.config.v1+json)                   │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │  {                                                                   │  │
│   │    "architecture": "amd64",                                          │  │
│   │    "os": "linux",                                                    │  │
│   │    "config": {                                                       │  │
│   │      "Env": ["PATH=/usr/local/bin:/usr/bin"],                       │  │
│   │      "Cmd": ["/bin/sh"],                                             │  │
│   │      "WorkingDir": "/"                                               │  │
│   │    },                                                                │  │
│   │    "rootfs": {                                                       │  │
│   │      "type": "layers",                                               │  │
│   │      "diff_ids": ["sha256:...", "sha256:...", "sha256:..."]         │  │
│   │    },                                                                │  │
│   │    "history": [...]                                                  │  │
│   │  }                                                                   │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
│   LAYER BLOBS (application/vnd.oci.image.layer.v1.tar+gzip)                │
│   ┌─────────────────────────────────────────────────────────────────────┐  │
│   │  Layer 1: Base OS files (compressed tar)                            │  │
│   │  Layer 2: Runtime/Dependencies (compressed tar)                     │  │
│   │  Layer 3: Application code (compressed tar)                         │  │
│   └─────────────────────────────────────────────────────────────────────┘  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
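
The diagram shows two kinds of identifiers that are easy to confuse: each entry in the manifest's "layers" carries the SHA-256 digest of the blob as stored (the compressed tar), while "diff_ids" in the config hold the SHA-256 of the uncompressed tar. The following is a minimal sketch of that relationship using only the JDK. The class name LayerDigests and the three-byte stand-in for a tar archive are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not one of the tutorial's classes.
final class LayerDigests {

    /** Returns the digest string "sha256:<hex>" for the given bytes. */
    static String sha256(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder("sha256:");
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** Compresses bytes the way a tar+gzip layer blob is stored. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Recovers the uncompressed bytes from a gzip blob. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real uncompressed layer tar (assumption for the demo).
        byte[] uncompressedTar = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] blob = gzip(uncompressedTar);
        System.out.println("manifest layer digest: " + sha256(blob));
        System.out.println("config diff_id:        " + sha256(gunzip(blob)));
    }
}
```

The two printed values differ, which is why a layer's manifest digest cannot be compared directly against its diff_id.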

Image Registry Protocol

┌─────────────────────────────────────────────────────────────────────────────┐
│                    DOCKER REGISTRY V2 API                                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   STEP 1: Get Authentication Token                                          │
│   ───────────────────────────────                                           │
│   GET https://auth.docker.io/token?                                         │
│       service=registry.docker.io&                                           │
│       scope=repository:library/alpine:pull                                  │
│                                                                              │
│   Response: { "token": "eyJ0eXAiOi..." }                                    │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────     │
│                                                                              │
│   STEP 2: Fetch Image Manifest                                              │
│   ────────────────────────────                                              │
│   GET https://registry-1.docker.io/v2/library/alpine/manifests/latest      │
│   Authorization: Bearer eyJ0eXAiOi...                                       │
│   Accept: application/vnd.oci.image.manifest.v1+json                        │
│                                                                              │
│   Response: (the manifest JSON)                                             │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────     │
│                                                                              │
│   STEP 3: Fetch Config Blob                                                 │
│   ─────────────────────────                                                 │
│   GET https://registry-1.docker.io/v2/library/alpine/blobs/sha256:abc...   │
│                                                                              │
│   ─────────────────────────────────────────────────────────────────────     │
│                                                                              │
│   STEP 4: Fetch Layer Blobs (for each layer)                               │
│   ─────────────────────────────────────────                                 │
│   GET https://registry-1.docker.io/v2/library/alpine/blobs/sha256:layer... │
│                                                                              │
│   Response: (gzipped tar file)                                              │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
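
Step 1 uses Docker Hub's token endpoint directly. Registries that use token authentication advertise that endpoint themselves: an unauthenticated request is answered with 401 and a WWW-Authenticate header naming the realm, service, and scope. Below is a rough sketch of parsing such a header and building the token URL from it. The class BearerChallenge and its method names are hypothetical, and the parser only handles the simple quoted key="value" form.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not one of the tutorial's classes.
final class BearerChallenge {

    // Matches key="value" pairs such as realm="https://auth.docker.io/token".
    private static final Pattern PARAM = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    /** Parses the parameters of a "Bearer ..." WWW-Authenticate header value. */
    static Map<String, String> parse(String header) {
        if (header == null || !header.regionMatches(true, 0, "Bearer ", 0, 7)) {
            throw new IllegalArgumentException("Not a Bearer challenge: " + header);
        }
        Map<String, String> params = new LinkedHashMap<>();
        Matcher m = PARAM.matcher(header);
        while (m.find()) {
            params.put(m.group(1), m.group(2));
        }
        return params;
    }

    /** Builds the token URL from the challenge, URL-encoding each query value. */
    static String tokenUrl(Map<String, String> challenge) {
        String realm = challenge.get("realm");
        if (realm == null) {
            throw new IllegalArgumentException("Challenge has no realm");
        }
        StringBuilder url = new StringBuilder(realm);
        char separator = '?';
        for (String key : new String[] {"service", "scope"}) {
            String value = challenge.get(key);
            if (value != null) {
                url.append(separator).append(key).append('=')
                   .append(URLEncoder.encode(value, StandardCharsets.UTF_8));
                separator = '&';
            }
        }
        return url.toString();
    }

    public static void main(String[] args) {
        String header = "Bearer realm=\"https://auth.docker.io/token\","
            + "service=\"registry.docker.io\",scope=\"repository:library/alpine:pull\"";
        System.out.println(tokenUrl(parse(header)));
    }
}
```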

Part 1: Registry Client

src/main/java/com/minidocker/image/RegistryClient.java
package com.minidocker.image;

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

/**
 * Client for Docker Registry V2 API.
 * 
 * Handles:
 * - Authentication (Bearer tokens)
 * - Manifest fetching
 * - Blob (layer) downloading
 */
public class RegistryClient {
    
    private static final String DOCKER_HUB = "https://registry-1.docker.io";
    private static final String AUTH_URL = "https://auth.docker.io/token";
    
    private final HttpClient httpClient;
    private final ObjectMapper objectMapper;
    
    public RegistryClient() {
        this.httpClient = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.ALWAYS)
            .build();
        this.objectMapper = new ObjectMapper();
    }
    
    /**
     * Parses image reference (e.g., "alpine:3.18" or "nginx:latest").
     */
    public ImageReference parseReference(String image) {
        String registry = DOCKER_HUB;
        String repository;
        String tag = "latest";
        
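        // Handle explicit registry: the first path component is a registry host
        // if it contains a dot or a port, or is "localhost"
        if (image.contains("/")) {
            String[] parts = image.split("/", 2);
            if (parts[0].contains(".") || parts[0].contains(":") 
                    || parts[0].equals("localhost")) {
                registry = "https://" + parts[0];
                image = parts[1];
            }
        }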
        
        // Handle tag
        if (image.contains(":")) {
            String[] parts = image.split(":", 2);
            repository = parts[0];
            tag = parts[1];
        } else {
            repository = image;
        }
        
        // Add "library/" prefix for Docker Hub official images
        if (registry.equals(DOCKER_HUB) && !repository.contains("/")) {
            repository = "library/" + repository;
        }
        
        return new ImageReference(registry, repository, tag);
    }
    
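    /**
     * Gets an anonymous pull token for a repository.
     * 
     * Note: the token endpoint and service name are hardcoded for Docker Hub.
     * Other registries advertise their own in the WWW-Authenticate header.
     */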
    public String getToken(ImageReference ref) throws IOException, InterruptedException {
        String scope = "repository:" + ref.repository() + ":pull";
        String url = AUTH_URL + "?service=registry.docker.io&scope=" + scope;
        
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .GET()
            .build();
        
        HttpResponse<String> response = httpClient.send(request, 
            HttpResponse.BodyHandlers.ofString());
        
        if (response.statusCode() != 200) {
            throw new IOException("Failed to get token: " + response.statusCode());
        }
        
        JsonNode json = objectMapper.readTree(response.body());
        return json.get("token").asText();
    }
    
    /**
     * Fetches the image manifest.
     */
    public ImageManifest getManifest(ImageReference ref, String token) 
            throws IOException, InterruptedException {
        
        String url = ref.registry() + "/v2/" + ref.repository() + 
                    "/manifests/" + ref.tag();
        
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Authorization", "Bearer " + token)
            .header("Accept", "application/vnd.docker.distribution.manifest.v2+json")
            .header("Accept", "application/vnd.oci.image.manifest.v1+json")
            .GET()
            .build();
        
        HttpResponse<String> response = httpClient.send(request,
            HttpResponse.BodyHandlers.ofString());
        
        if (response.statusCode() != 200) {
            throw new IOException("Failed to get manifest: " + response.statusCode());
        }
        
        return ImageManifest.parse(response.body(), objectMapper);
    }
    
    /**
     * Fetches a blob (config or layer) and saves to disk.
     */
    public Path downloadBlob(ImageReference ref, String token, String digest, Path destDir)
            throws IOException, InterruptedException {
        
        Path destPath = destDir.resolve(digest.replace(":", "_"));
        
        // Check if already downloaded
        if (Files.exists(destPath)) {
            System.out.println("Layer cached: " + digest.substring(0, 19) + "...");
            return destPath;
        }
        
        String url = ref.registry() + "/v2/" + ref.repository() + "/blobs/" + digest;
        
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Authorization", "Bearer " + token)
            .GET()
            .build();
        
        System.out.println("Downloading: " + digest.substring(0, 19) + "...");
        
        HttpResponse<InputStream> response = httpClient.send(request,
            HttpResponse.BodyHandlers.ofInputStream());
        
        if (response.statusCode() != 200) {
            throw new IOException("Failed to download blob: " + response.statusCode());
        }
        
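        Files.createDirectories(destDir);
        
        // Write to a temporary file first so an interrupted download is never
        // mistaken for a cached blob on the next pull
        Path tmpPath = Files.createTempFile(destDir, "blob-", ".partial");
        try (InputStream in = response.body()) {
            Files.copy(in, tmpPath, java.nio.file.StandardCopyOption.REPLACE_EXISTING);
        }
        Files.move(tmpPath, destPath, java.nio.file.StandardCopyOption.REPLACE_EXISTING);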
        
        return destPath;
    }
    
    /**
     * Image reference components.
     */
    public record ImageReference(String registry, String repository, String tag) {
        @Override
        public String toString() {
            return repository + ":" + tag;
        }
    }
}
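
downloadBlob trusts whatever bytes the registry returns. Because a digest is the hash of the content, the client can check each blob while it streams to disk. The sketch below shows one way to do that with DigestInputStream, assuming sha256 digests only. VerifiedBlobStore and its store method are hypothetical names, and the input stream stands in for response.body().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not one of the tutorial's classes.
final class VerifiedBlobStore {

    /** Streams a blob into destDir and keeps it only if its SHA-256 matches. */
    static Path store(InputStream in, String expectedDigest, Path destDir) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
        Files.createDirectories(destDir);
        Path tmp = Files.createTempFile(destDir, "blob-", ".partial");
        try {
            // Hash the bytes as they are copied, so the blob is read only once.
            try (InputStream hashing = new DigestInputStream(in, md)) {
                Files.copy(hashing, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            String actual = "sha256:" + toHex(md.digest());
            if (!actual.equals(expectedDigest)) {
                throw new IOException("Digest mismatch: expected " + expectedDigest
                    + " but got " + actual);
            }
            Path dest = destDir.resolve(expectedDigest.replace(":", "_"));
            return Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        Path dir = Files.createTempDirectory("blobs");
        Path stored = store(new ByteArrayInputStream(data),
            "sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad", dir);
        System.out.println("Stored verified blob at " + stored);
    }
}
```

Writing to a temporary file and renaming it only after verification means a corrupt or truncated download never appears under its final, cache-visible name.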

Part 2: Image Manifest

src/main/java/com/minidocker/image/ImageManifest.java
package com.minidocker.image;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Represents an OCI/Docker image manifest.
 */
public class ImageManifest {
    
    private final int schemaVersion;
    private final String configDigest;
    private final long configSize;
    private final List<Layer> layers;
    
    public ImageManifest(int schemaVersion, String configDigest, long configSize, 
                        List<Layer> layers) {
        this.schemaVersion = schemaVersion;
        this.configDigest = configDigest;
        this.configSize = configSize;
        this.layers = layers;
    }
    
    // Declared as IOException (not Exception) so RegistryClient.getManifest compiles
    public static ImageManifest parse(String json, ObjectMapper mapper) throws IOException {
        JsonNode root = mapper.readTree(json);
        
        int schemaVersion = root.get("schemaVersion").asInt();
        
        // A multi-platform image index (manifest list) has "manifests" instead of
        // "config" and "layers"; fail clearly rather than with a NullPointerException
        JsonNode config = root.get("config");
        if (config == null || !root.has("layers")) {
            throw new IOException("Not a single-image manifest (mediaType: "
                + root.path("mediaType").asText("unknown") + ")");
        }
        String configDigest = config.get("digest").asText();
        long configSize = config.get("size").asLong();
        
        List<Layer> layers = new ArrayList<>();
        for (JsonNode layer : root.get("layers")) {
            layers.add(new Layer(
                layer.get("mediaType").asText(),
                layer.get("digest").asText(),
                layer.get("size").asLong()
            ));
        }
        
        return new ImageManifest(schemaVersion, configDigest, configSize, layers);
    }
    
    public int getSchemaVersion() { return schemaVersion; }
    public String getConfigDigest() { return configDigest; }
    public long getConfigSize() { return configSize; }
    public List<Layer> getLayers() { return layers; }
    
    public long getTotalSize() {
        return configSize + layers.stream().mapToLong(Layer::size).sum();
    }
    
    /**
     * Represents a filesystem layer.
     */
    public record Layer(String mediaType, String digest, long size) {
        public boolean isGzipped() {
            return mediaType.contains("gzip");
        }
    }
}
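
ImageManifest models a single-platform manifest. For multi-platform images a registry may instead return an image index (media type application/vnd.oci.image.index.v1+json, or Docker's application/vnd.docker.distribution.manifest.list.v2+json), whose "manifests" array lists one manifest digest per platform. The client then picks the entry for its platform and fetches that manifest by digest. Below is a minimal sketch of the selection step. PlatformSelector and its Entry record are hypothetical and assume the index has already been parsed.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not one of the tutorial's classes.
final class PlatformSelector {

    /** One entry of an index's "manifests" array, reduced to the fields used here. */
    record Entry(String digest, String os, String architecture) {}

    /** Maps common JVM os.arch values to the architecture names used in images. */
    static String ociArch(String jvmArch) {
        return switch (jvmArch) {
            case "x86_64", "amd64" -> "amd64";
            case "aarch64", "arm64" -> "arm64";
            default -> jvmArch; // pass anything else through unchanged
        };
    }

    /** Returns the first entry matching the requested OS and architecture. */
    static Optional<Entry> select(List<Entry> entries, String os, String arch) {
        return entries.stream()
            .filter(e -> os.equals(e.os()) && arch.equals(e.architecture()))
            .findFirst();
    }

    public static void main(String[] args) {
        // Placeholder digests, for illustration only.
        List<Entry> index = List.of(
            new Entry("sha256:aaa", "linux", "amd64"),
            new Entry("sha256:bbb", "linux", "arm64"));
        String arch = ociArch(System.getProperty("os.arch"));
        System.out.println(select(index, "linux", arch)
            .map(Entry::digest)
            .orElse("no manifest for linux/" + arch));
    }
}
```

The selected digest can be used in place of the tag in the manifests URL, since the registry API accepts either a tag or a digest as the reference.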

Part 3: Image Config

src/main/java/com/minidocker/image/ImageConfig.java
package com.minidocker.image;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Represents image configuration (CMD, ENV, etc.).
 */
public class ImageConfig {
    
    private final String architecture;
    private final String os;
    private final List<String> env;
    private final List<String> cmd;
    private final List<String> entrypoint;
    private final String workingDir;
    private final String user;
    private final List<String> exposedPorts;
    
    public ImageConfig(String architecture, String os, List<String> env,
                       List<String> cmd, List<String> entrypoint, String workingDir,
                       String user, List<String> exposedPorts) {
        this.architecture = architecture;
        this.os = os;
        this.env = env;
        this.cmd = cmd;
        this.entrypoint = entrypoint;
        this.workingDir = workingDir;
        this.user = user;
        this.exposedPorts = exposedPorts;
    }
    
    public static ImageConfig load(Path configPath) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        String json = Files.readString(configPath);
        JsonNode root = mapper.readTree(json);
        
        String architecture = root.get("architecture").asText();
        String os = root.get("os").asText();
        
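        // "config" is optional in the image spec; path() returns an empty
        // placeholder node instead of null when it is absent
        JsonNode config = root.path("config");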
        
        List<String> env = new ArrayList<>();
        if (config.has("Env")) {
            for (JsonNode e : config.get("Env")) {
                env.add(e.asText());
            }
        }
        
        List<String> cmd = new ArrayList<>();
        if (config.has("Cmd")) {
            for (JsonNode c : config.get("Cmd")) {
                cmd.add(c.asText());
            }
        }
        
        List<String> entrypoint = new ArrayList<>();
        if (config.has("Entrypoint")) {
            for (JsonNode e : config.get("Entrypoint")) {
                entrypoint.add(e.asText());
            }
        }
        
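        // Images often carry an empty or null WorkingDir; treat it like a missing one
        String workingDir = config.path("WorkingDir").asText("");
        if (workingDir.isEmpty()) {
            workingDir = "/";
        }
            
        // asText("") also maps an explicit JSON null to "" rather than "null"
        String user = config.path("User").asText("");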
            
        List<String> exposedPorts = new ArrayList<>();
        if (config.has("ExposedPorts")) {
            config.get("ExposedPorts").fieldNames().forEachRemaining(exposedPorts::add);
        }
        
        return new ImageConfig(architecture, os, env, cmd, entrypoint, 
                              workingDir, user, exposedPorts);
    }
    
    public String getArchitecture() { return architecture; }
    public String getOs() { return os; }
    public List<String> getEnv() { return env; }
    public List<String> getCmd() { return cmd; }
    public List<String> getEntrypoint() { return entrypoint; }
    public String getWorkingDir() { return workingDir; }
    public String getUser() { return user; }
    public List<String> getExposedPorts() { return exposedPorts; }
    
    /**
     * Gets the effective command to run.
     */
    public String[] getCommand() {
        List<String> command = new ArrayList<>();
        command.addAll(entrypoint);
        command.addAll(cmd);
        return command.toArray(new String[0]);
    }
}
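
getCommand() covers the case where the image runs with its defaults. When a user supplies arguments at run time, Docker keeps the Entrypoint and replaces Cmd with those arguments. The Env entries also need splitting into key/value pairs before they can be handed to a process. The sketch below shows both steps. RuntimeSettings and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not one of the tutorial's classes.
final class RuntimeSettings {

    /** Entrypoint is kept; user arguments, when given, replace the image's Cmd. */
    static List<String> effectiveCommand(List<String> entrypoint, List<String> cmd,
                                         List<String> userArgs) {
        List<String> command = new ArrayList<>(entrypoint);
        command.addAll(userArgs.isEmpty() ? cmd : userArgs);
        return command;
    }

    /** Splits "KEY=value" entries on the first '=' only, so values may contain '='. */
    static Map<String, String> parseEnv(List<String> env) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : env) {
            int eq = entry.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed entries that have no key
            }
            result.put(entry.substring(0, eq), entry.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> entrypoint = List.of("/docker-entrypoint.sh");
        List<String> cmd = List.of("nginx", "-g", "daemon off;");
        System.out.println(effectiveCommand(entrypoint, cmd, List.of()));
        System.out.println(effectiveCommand(entrypoint, cmd, List.of("sh")));
        System.out.println(parseEnv(List.of("PATH=/usr/local/bin:/usr/bin")));
    }
}
```

The resulting map can be copied into ProcessBuilder.environment() when the container process is started.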

Part 4: Image Puller

src/main/java/com/minidocker/image/ImagePuller.java
package com.minidocker.image;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Pulls container images from registries.
 */
public class ImagePuller {
    
    private final RegistryClient client;
    private final Path storageDir;
    
    public ImagePuller(Path storageDir) {
        this.client = new RegistryClient();
        this.storageDir = storageDir;
    }
    
    /**
     * Pulls an image from a registry.
     * 
     * @param image Image name (e.g., "alpine:3.18")
     * @return The image directory, the extracted layer paths (base layer first),
     *         and the parsed config
     */
    public PulledImage pull(String image) throws Exception {
        System.out.println("Pulling image: " + image);
        
        // Parse image reference
        RegistryClient.ImageReference ref = client.parseReference(image);
        System.out.println("Repository: " + ref.repository());
        System.out.println("Tag: " + ref.tag());
        
        // Get authentication token
        String token = client.getToken(ref);
        System.out.println("✓ Authenticated");
        
        // Get manifest
        ImageManifest manifest = client.getManifest(ref, token);
        System.out.println("✓ Fetched manifest (" + manifest.getLayers().size() + " layers)");
        System.out.println("  Total size: " + formatBytes(manifest.getTotalSize()));
        
        // Create directories
        Path blobsDir = storageDir.resolve("blobs");
        Path layersDir = storageDir.resolve("layers");
        Path imageDir = storageDir.resolve("images").resolve(
            ref.repository().replace("/", "_") + "_" + ref.tag());
        
        Files.createDirectories(blobsDir);
        Files.createDirectories(layersDir);
        Files.createDirectories(imageDir);
        
        // Download config
        Path configPath = client.downloadBlob(ref, token, 
            manifest.getConfigDigest(), blobsDir);
        
        ImageConfig config = ImageConfig.load(configPath);
        System.out.println("✓ Config loaded (arch: " + config.getArchitecture() + 
                          ", os: " + config.getOs() + ")");
        
        // Download and extract layers
        List<Path> extractedLayers = new ArrayList<>();
        
        for (int i = 0; i < manifest.getLayers().size(); i++) {
            ImageManifest.Layer layer = manifest.getLayers().get(i);
            
            System.out.println("Layer " + (i + 1) + "/" + manifest.getLayers().size() + 
                              " (" + formatBytes(layer.size()) + ")");
            
            // Download layer blob
            Path layerBlob = client.downloadBlob(ref, token, layer.digest(), blobsDir);
            
            // Extract layer
            Path extractedLayer = layersDir.resolve(layer.digest().replace(":", "_"));
            
            if (!Files.exists(extractedLayer)) {
                extractLayer(layerBlob, extractedLayer, layer.isGzipped());
                System.out.println("  ✓ Extracted");
            } else {
                System.out.println("  ✓ Cached");
            }
            
            extractedLayers.add(extractedLayer);
        }
        
        // Write image metadata
        writeImageMetadata(imageDir, ref, manifest, config);
        
        System.out.println("✓ Image pulled successfully: " + image);
        
        return new PulledImage(imageDir, extractedLayers, config);
    }
    
    private void extractLayer(Path tarball, Path destDir, boolean gzipped) 
            throws IOException, InterruptedException {
        
        Files.createDirectories(destDir);
        
        ProcessBuilder pb;
        if (gzipped) {
            pb = new ProcessBuilder("tar", "-xzf", tarball.toString(),
                                   "-C", destDir.toString());
        } else {
            pb = new ProcessBuilder("tar", "-xf", tarball.toString(),
                                   "-C", destDir.toString());
        }
        pb.inheritIO();
        
        int exitCode = pb.start().waitFor();
        if (exitCode != 0) {
            throw new IOException("tar extraction failed");
        }
    }
    
    private void writeImageMetadata(Path imageDir, RegistryClient.ImageReference ref,
                                   ImageManifest manifest, ImageConfig config) 
            throws IOException {
        
        // Write simple metadata file
        StringBuilder meta = new StringBuilder();
        meta.append("repository=").append(ref.repository()).append("\n");
        meta.append("tag=").append(ref.tag()).append("\n");
        meta.append("architecture=").append(config.getArchitecture()).append("\n");
        meta.append("os=").append(config.getOs()).append("\n");
        
        if (!config.getCmd().isEmpty()) {
            meta.append("cmd=").append(String.join(" ", config.getCmd())).append("\n");
        }
        if (!config.getEntrypoint().isEmpty()) {
            meta.append("entrypoint=").append(String.join(" ", config.getEntrypoint()))
                .append("\n");
        }
        
        Files.writeString(imageDir.resolve("metadata"), meta.toString());
    }
    
    private String formatBytes(long bytes) {
        if (bytes < 1024) return bytes + " B";
        if (bytes < 1024 * 1024) return String.format("%.1f KB", bytes / 1024.0);
        if (bytes < 1024 * 1024 * 1024) return String.format("%.1f MB", bytes / (1024.0 * 1024));
        return String.format("%.2f GB", bytes / (1024.0 * 1024 * 1024));
    }
    
    /**
     * Result of pulling an image.
     */
    public record PulledImage(Path imageDir, List<Path> layers, ImageConfig config) {}
}
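
extractLayer unpacks every tar entry as an ordinary file. OCI layers can also record deletions: an entry named .wh.<name> marks <name> as removed from the layers below, and an entry named .wh..wh..opq marks its directory as opaque, hiding everything the lower layers placed in it. The sketch below only classifies entry paths. Whiteouts and its nested types are hypothetical names, and actually applying the markers (for example as overlayfs whiteouts) is left out.

```java
// Hypothetical helper, not one of the tutorial's classes.
final class Whiteouts {

    static final String PREFIX = ".wh.";
    static final String OPAQUE_MARKER = ".wh..wh..opq";

    enum Kind { REGULAR, WHITEOUT, OPAQUE_DIR }

    /** What a tar entry means, and the path it applies to. */
    record Entry(Kind kind, String target) {}

    /** Classifies a tar entry path such as "etc/.wh.passwd". */
    static Entry classify(String tarPath) {
        int slash = tarPath.lastIndexOf('/');
        String parent = slash < 0 ? "" : tarPath.substring(0, slash);
        String name = tarPath.substring(slash + 1);

        // Check the opaque marker first because it also starts with ".wh."
        if (name.equals(OPAQUE_MARKER)) {
            return new Entry(Kind.OPAQUE_DIR, parent);
        }
        if (name.startsWith(PREFIX)) {
            String removed = name.substring(PREFIX.length());
            return new Entry(Kind.WHITEOUT, parent.isEmpty() ? removed : parent + "/" + removed);
        }
        return new Entry(Kind.REGULAR, tarPath);
    }

    public static void main(String[] args) {
        System.out.println(classify("etc/.wh.passwd"));
        System.out.println(classify("var/cache/.wh..wh..opq"));
        System.out.println(classify("bin/sh"));
    }
}
```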

Part 5: Using Pulled Images

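import java.nio.file.Path;

import com.minidocker.image.ImagePuller;
import com.minidocker.image.ImagePuller.PulledImage;

// The Container constructor, run(), and ResourceLimits come from earlier chapters
public class Container {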
    
    public static void main(String[] args) throws Exception {
        Path storageDir = Path.of("/var/lib/minidocker");
        ImagePuller puller = new ImagePuller(storageDir);
        
        // Pull the image
        PulledImage image = puller.pull("alpine:3.18");
        
        // Get the command to run
        String[] command = image.config().getCommand();
        if (command.length == 0) {
            command = new String[]{"/bin/sh"};
        }
        
        // Create and run container
        Container container = new Container(
            image.layers(),
            "alpine-container",
            command,
            ResourceLimits.defaults(),
            storageDir
        );
        
        container.run();
    }
}
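
One detail matters when the pulled layers reach the filesystem code. The manifest lists layers base first, whereas overlayfs treats the leftmost lowerdir entry as the topmost layer, so the list has to be reversed when the mount options are built. The sketch below assumes the container mounts its rootfs with overlayfs using the upper and work directories shown in the storage layout. OverlayOptions is a hypothetical helper.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not one of the tutorial's classes.
final class OverlayOptions {

    /** Builds the overlayfs option string from layers given in manifest order. */
    static String build(List<Path> layersBaseFirst, Path upper, Path work) {
        // overlayfs expects the topmost lower layer first, so reverse a copy.
        List<Path> topFirst = new ArrayList<>(layersBaseFirst);
        Collections.reverse(topFirst);
        String lower = topFirst.stream()
            .map(Path::toString)
            .collect(Collectors.joining(":"));
        return "lowerdir=" + lower + ",upperdir=" + upper + ",workdir=" + work;
    }

    public static void main(String[] args) {
        // Example paths are placeholders for the extracted layer directories.
        List<Path> layers = List.of(
            Path.of("/var/lib/minidocker/layers/base"),
            Path.of("/var/lib/minidocker/layers/app"));
        System.out.println(build(layers,
            Path.of("/var/lib/minidocker/containers/c1/upper"),
            Path.of("/var/lib/minidocker/containers/c1/work")));
    }
}
```

The resulting string is what would be passed after -o to mount -t overlay, with the merged directory as the mount point.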

Image Storage Structure

/var/lib/minidocker/
├── blobs/                              # Downloaded blobs (shared)
│   ├── sha256_abc123...                # Config blob
│   ├── sha256_layer1...                # Layer blob (compressed)
│   ├── sha256_layer2...                # Layer blob (compressed)
│   └── sha256_layer3...                # Layer blob (compressed)
│
├── layers/                             # Extracted layers (shared)
│   ├── sha256_layer1.../               # Extracted layer 1
│   │   ├── bin/
│   │   ├── etc/
│   │   └── lib/
│   ├── sha256_layer2.../               # Extracted layer 2
│   └── sha256_layer3.../               # Extracted layer 3
│
├── images/                             # Image metadata
│   ├── library_alpine_3.18/
│   │   └── metadata
│   └── library_nginx_latest/
│       └── metadata
│
└── containers/                         # Running containers
    └── abc123.../
        ├── upper/                      # Container writes
        ├── work/                       # Overlay work dir
        └── merged/                     # Merged view (rootfs)
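
Every name under blobs/ and layers/ is derived from a digest that arrives in registry JSON, so it is worth validating the digest before turning it into a path. The sketch below accepts only well-formed sha256 digests and maps them to the layout above. BlobPaths is a hypothetical helper, and restricting to sha256 is an assumption that matches the digests used in this chapter.

```java
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical helper, not one of the tutorial's classes.
final class BlobPaths {

    // "sha256:" followed by exactly 64 lowercase hex characters.
    private static final Pattern SHA256 = Pattern.compile("sha256:[a-f0-9]{64}");

    /** Returns the file name used on disk, e.g. "sha256_ab12...". */
    static String fileName(String digest) {
        if (digest == null || !SHA256.matcher(digest).matches()) {
            throw new IllegalArgumentException("Unsupported or malformed digest: " + digest);
        }
        return digest.replace(':', '_');
    }

    /** Path of the downloaded blob for a digest. */
    static Path blobPath(Path storageDir, String digest) {
        return storageDir.resolve("blobs").resolve(fileName(digest));
    }

    /** Path of the extracted layer directory for a digest. */
    static Path layerPath(Path storageDir, String digest) {
        return storageDir.resolve("layers").resolve(fileName(digest));
    }

    public static void main(String[] args) {
        String digest =
            "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
        Path storageDir = Path.of("/var/lib/minidocker");
        System.out.println(blobPath(storageDir, digest));
        System.out.println(layerPath(storageDir, digest));
    }
}
```

A digest such as sha256:../../etc/passwd is rejected before it can be resolved against the storage directory.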

Exercises

Add a command to list local images:
// minidocker images
// REPOSITORY          TAG       SIZE
// library/alpine      3.18      7.2 MB
// library/nginx       latest    142 MB
Add a command to remove unused images:
// minidocker rmi alpine:3.18
// 1. Check if any containers use this image
// 2. Remove image metadata
// 3. Garbage collect unused layers
Build images from a Dockerfile:
// FROM alpine:3.18
// RUN apk add --no-cache python3
// COPY app.py /app/
// CMD ["python3", "/app/app.py"]

// 1. Pull base image
// 2. Run each instruction in container
// 3. Commit changes as new layer
// 4. Save manifest and config

Key Takeaways

Content Addressable

Every blob (config and layers) is identified by the SHA-256 digest of its contents

Layer Sharing

Common base layers shared between images

Manifest + Config

Manifest lists layers; Config has runtime settings

Incremental Pull

Only download layers not already cached

Congratulations! 🎉

You’ve built a working container runtime with:
  • ✅ Linux namespaces for isolation
  • ✅ Cgroups for resource limits
  • ✅ Overlay filesystem with copy-on-write
  • ✅ Bridge networking with NAT
  • ✅ OCI-compatible image pulling

Docker Project Complete!

You now understand how containers work at the kernel level!
