Chapter 9: The Container Workflow - Artifact Registry and Cloud Build

Before a container ever reaches a Kubernetes cluster, it must be built, secured, and stored. In Google Cloud, this “Supply Chain” is handled by Cloud Build and Artifact Registry, integrated with deep security features like Artifact Analysis and Binary Authorization.

1. Artifact Registry: Beyond Docker

Artifact Registry is the next generation of Google Container Registry (GCR). It is a fully managed service that supports container images, language packages, and OS packages.

Key Architectural Features

  • Regional Repositories: Unlike the legacy GCR (which used multi-regional buckets), Artifact Registry allows you to store images in specific regions (e.g., europe-west1). This reduces latency for your GKE clusters and eliminates cross-region egress costs.
  • Virtual Repositories: You can create a single “Virtual” endpoint that aggregates multiple upstream repositories. This allows developers to use one URL while pulling from both internal private repos and public mirrors.
  • Remote Repositories: Act as a caching proxy for public registries like Docker Hub or npmjs.org. This protects your builds from upstream outages (like the “left-pad” incident).

2. Cloud Build: Serverless CI/CD

Cloud Build is a serverless execution engine. It executes your build as a series of Build Steps, where each step is a container.

2.1 Custom Build Steps

If the standard Google-managed steps (gcloud, docker, git) aren’t enough, you can create your own.
  • Concept: Any container image can be a build step.
  • Example: Create a custom Go-based container that performs specialized security audits or database migrations.
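A custom step is referenced like any other build step. The sketch below assumes a hypothetical audit image (my-audit-step) that you have built and pushed to your own repository; the paths are illustrative:

```yaml
# Hypothetical cloudbuild.yaml mixing a Google-managed step with a custom one.
steps:
# Standard Google-managed step: build the application image.
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/my-app:$SHORT_SHA', '.']
# Custom step: any container image can act as a step.
# 'my-audit-step' is an illustrative image you maintain yourself.
- name: 'us-central1-docker.pkg.dev/$PROJECT_ID/build-tools/my-audit-step:latest'
  args: ['--target', 'my-app']
```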

2.2 Secret Manager Integration

Never hardcode API keys or credentials in your cloudbuild.yaml.
  • The Right Way: Store secrets in Secret Manager and reference them in your build configuration.
Example (Injecting a GitHub Token):
availableSecrets:
  secretManager:
  - versionName: projects/$PROJECT_ID/secrets/github-token/versions/latest
    env: 'GITHUB_TOKEN'

steps:
- name: 'gcr.io/cloud-builders/git'
  entrypoint: 'bash'
  args:
  - '-c'
  - |
    git clone https://$$GITHUB_TOKEN@github.com/my-org/private-repo.git
  secretEnv: ['GITHUB_TOKEN']

2.3 Layer Caching with GCS

To speed up builds, Cloud Build can reuse state from previous runs.
  • Mechanism: For Docker layers, use the Kaniko build step (gcr.io/kaniko-project/executor) with --cache=true, or pull the previous image and build with docker build --cache-from. For arbitrary directories (e.g., dependency folders), copy them to and from a GCS bucket in dedicated build steps.
  • Benefit: Reduces build times by 50-80% for complex apps where only the application code changes, not the base OS or dependencies.
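One sketch of the --cache-from approach (image paths are illustrative; Kaniko, shown later in the lab, is the alternative):

```yaml
# Sketch: Docker layer caching via --cache-from.
steps:
# Pull the previous image; '|| exit 0' keeps the build going on a cold cache.
- name: 'gcr.io/cloud-builders/docker'
  entrypoint: 'bash'
  args: ['-c', 'docker pull us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/my-app:latest || exit 0']
# Build, reusing unchanged layers from the pulled image.
- name: 'gcr.io/cloud-builders/docker'
  args: ['build',
         '-t', 'us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/my-app:$SHORT_SHA',
         '--cache-from', 'us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/my-app:latest',
         '.']
```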

3. The “Hardened Container” Stack

Security in containerization is about layers. GCP provides a unique stack to ensure your containers are isolated and minimal.

Container-Optimized OS (COS)

The default node image for GKE.
  • Minimal Footprint: It includes only the essential components to run Docker/containerd.
  • Read-Only Rootfs: The root file system is mounted read-only, so even if a container escapes to the host, it cannot easily persist changes to the OS.
  • Auto-Updates: It includes a built-in mechanism for safe, automatic security updates.

gVisor (GKE Sandbox)

Standard containers share the host’s Linux kernel. If a container exploits a kernel vulnerability, it can take over the entire machine.
  • The Solution: gVisor is an open-source “user-space kernel” written in Go.
  • How it works: It intercepts system calls from the container and handles them in a sandbox, providing a strong boundary between the container and the host kernel.
  • Use Case: Running untrusted code or multi-tenant workloads.
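In GKE, sandboxing is requested per Pod via a RuntimeClass. A minimal sketch, assuming a node pool created with GKE Sandbox enabled (the Pod name and image path are illustrative):

```yaml
# Pod requesting gVisor isolation via the 'gvisor' RuntimeClass
# (available on GKE node pools created with --sandbox type=gvisor).
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-workload        # illustrative name
spec:
  runtimeClassName: gvisor        # syscalls are handled by gVisor, not the host kernel
  containers:
  - name: app
    image: us-central1-docker.pkg.dev/my-project/prod-images/my-app:v1
```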

4. Advanced Supply Chain: Remote Repositories and Private Pools

4.1 Artifact Registry: Remote and Virtual Repositories

Artifact Registry isn’t just a place to push images; it’s a proxy for the world.
  • Remote Repositories: You can point a repository to Docker Hub or npmjs.org. When you pull an image, Artifact Registry caches it locally. This protects you from upstream outages and reduces latency.
  • Virtual Repositories: Aggregate multiple repositories (standard, remote, and other virtual) into a single endpoint. This is the “Gold Standard” for enterprise package management, providing a unified URL for developers.
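As a sketch, both modes are created with gcloud; the repository names below are illustrative, and flags should be verified against your gcloud version:

```shell
# Remote repository: a pull-through cache for Docker Hub.
gcloud artifacts repositories create dockerhub-cache \
    --repository-format=docker \
    --location=us-central1 \
    --mode=remote-repository \
    --remote-docker-repo=DOCKER-HUB

# Virtual repository: one endpoint aggregating private + cached upstreams.
# upstreams.json lists member repositories with priorities, e.g.:
# [{"id": "private", "repository": "projects/my-project/locations/us-central1/repositories/team-repo", "priority": 100},
#  {"id": "cache",   "repository": "projects/my-project/locations/us-central1/repositories/dockerhub-cache", "priority": 50}]
gcloud artifacts repositories create all-images \
    --repository-format=docker \
    --location=us-central1 \
    --mode=virtual-repository \
    --upstream-policy-file=upstreams.json
```

Higher-priority upstreams are consulted first, so an internal package can shadow a public one of the same name.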

4.2 Cloud Build Private Pools: VPC Integration

For enterprise security, you often need to build code that resides in a private network (e.g., a private GitLab or a Bitbucket server).
  • VPC Peering: Private Pools require a VPC peering connection between your network and the Google-managed “Service Networking” project.
  • No Internet Access: You can configure a Private Pool to have zero external internet access, ensuring that your source code and build artifacts never leave your private perimeter.
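A minimal sketch of creating and using such a pool (project, network, and pool names are illustrative; it assumes the Service Networking peering to the VPC already exists):

```shell
# Private Pool peered to your VPC, with external egress disabled.
gcloud builds worker-pools create secure-pool \
    --region=us-central1 \
    --peered-network=projects/my-project/global/networks/my-vpc \
    --no-public-egress

# Run a build on the pool instead of the default shared workers.
gcloud builds submit . \
    --config=cloudbuild.yaml \
    --worker-pool=projects/my-project/locations/us-central1/workerPools/secure-pool
```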

5. Security: Artifact Analysis and Registry Hygiene

5.1 Artifact Analysis & On-Demand Scanning

Once an image is in Artifact Registry, Artifact Analysis kicks in.
  • Continuous Scanning: It continuously scans images for new CVEs (Common Vulnerabilities and Exposures). If a new vulnerability is discovered in nginx:latest, you’ll see an alert in your registry.
  • On-Demand Scanning: For CI/CD pipelines, you can use the On-Demand Scanning API (part of Artifact Analysis) to scan an image before it is deployed.
    • Workflow: Cloud Build builds the image -> triggers an on-demand scan -> if the “Critical” count > 0 -> the build fails.
# Example: Triggering an on-demand scan via CLI
gcloud artifacts docker images scan us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/my-app:v1
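A sketch of wiring that workflow into a build step (it assumes jq is available in the step image, and that field names match the current scan output; verify against your gcloud version):

```shell
# "Break the build" gate using on-demand scanning.
# 1. Trigger the scan and capture the scan resource name.
SCAN=$(gcloud artifacts docker images scan \
    us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/my-app:v1 \
    --format='value(response.scan)')

# 2. Count CRITICAL findings and fail the step if any exist.
CRITICAL=$(gcloud artifacts docker images list-vulnerabilities "$SCAN" \
    --format='json' \
    | jq '[.[] | select(.vulnerability.effectiveSeverity == "CRITICAL")] | length')

if [ "$CRITICAL" -gt 0 ]; then
  echo "Blocking deploy: $CRITICAL critical CVEs found"
  exit 1
fi
```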

5.2 Artifact Registry Cleanup Policies

Storing every version of every image forever is expensive. Cleanup Policies allow you to automate the deletion of old or unused artifacts.
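A sketch of applying such a policy (the policy file schema is abbreviated here; check the Artifact Registry cleanup-policy docs for the full format):

```shell
# policy.json: delete untagged images older than 30 days, keep the 5 newest versions.
# [{"name": "delete-stale", "action": {"type": "Delete"},
#   "condition": {"tagState": "UNTAGGED", "olderThan": "30d"}},
#  {"name": "keep-recent", "action": {"type": "Keep"},
#   "mostRecentVersions": {"keepCount": 5}}]
gcloud artifacts repositories set-cleanup-policies prod-images \
    --location=us-central1 \
    --policy=policy.json \
    --dry-run   # preview what would be deleted before enforcing
```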

Tag Immutability: Protecting Production

A critical security best practice is Tag Immutability.
  • The Problem: Someone accidentally pushes a broken image as v1.0.0, overwriting the stable production image.
  • The Solution: Configure Artifact Registry to make tags immutable. Once v1.0.0 is pushed, it cannot be overwritten. Updates must be pushed as v1.0.1.
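As a sketch, immutability is a repository setting (flag name per current gcloud docs; verify for your version):

```shell
# Enable tag immutability on an existing repository.
gcloud artifacts repositories update prod-images \
    --location=us-central1 \
    --immutable-tags
# Subsequent pushes to an existing tag (e.g. v1.0.0) are rejected;
# fixes must ship under a new tag such as v1.0.1.
```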

5.3 Severity Levels

Each vulnerability Artifact Analysis finds is classified by severity.
  • Severity Levels:
    • CRITICAL: Remote code execution or complete system takeover.
    • HIGH: Significant security risk (e.g., privilege escalation).
    • MEDIUM/LOW: Minor risks requiring authentication or physical access.
  • CVSS Score: A numerical value (0-10) indicating the risk. SRE teams typically set a “Break-the-Build” threshold at 7.0 or higher.

6. Security: Binary Authorization (BinAuthz)

Binary Authorization is the ultimate gatekeeper for GKE.

6.1 The “Chain of Trust”

  1. Policy: You define a policy in BinAuthz stating that every image must be signed by an Attestor.
  2. Attestation: After Cloud Build successfully runs security scans and tests, it signs the image digest with a cryptographic key (typically managed in Cloud KMS; PGP is also supported), producing an attestation.
  3. Enforcement: When you attempt to kubectl apply a deployment, the GKE admission controller checks BinAuthz. If the image doesn’t have a valid signature from the required Attestor, the deployment is rejected.
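A minimal policy sketch enforcing step 1 (the project and attestor names are illustrative; the file is typically applied with gcloud container binauthz policy import):

```yaml
# policy.yaml: reject any image lacking an attestation from the named Attestor.
globalPolicyEvaluationMode: ENABLE   # trust Google-maintained system images
defaultAdmissionRule:
  evaluationMode: REQUIRE_ATTESTATION
  enforcementMode: ENFORCED_BLOCK_AND_AUDIT_LOG
  requireAttestationsBy:
  - projects/my-project/attestors/built-by-cloud-build
```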

7. Interview Preparation: Architectural Deep Dive

1. Q: Why should an organization migrate from Container Registry (GCR) to Artifact Registry? A: Artifact Registry is the successor to GCR and offers several enterprise features:
  • Regional Repositories: GCR is multi-regional (e.g., us.gcr.io), leading to egress costs when GKE clusters pull across regions. Artifact Registry is regional (e.g., us-central1-docker.pkg.dev), eliminating those costs.
  • Cleanup Policies: Automated deletion of old images based on tags or age.
  • Multiple Formats: Supports Docker, npm, Maven, Python, and OS packages (Apt/Yum) in a single service.
  • VPC Service Controls: Native support for security perimeters.
2. Q: Explain the security benefit of gVisor (GKE Sandbox). A: Standard containers share the host’s Linux kernel. If a container escapes (e.g., via a kernel exploit), it could gain control of the host node. gVisor acts as a user-space kernel that intercepts and filters system calls. It provides a much stronger isolation boundary, ensuring that even if a container is compromised, it cannot interact directly with the host kernel, making it ideal for running untrusted code or multi-tenant workloads.
3. Q: How does Binary Authorization integrate with Cloud Build? A: Binary Authorization is a deploy-time security gate. During the CI/CD process, Cloud Build (or a scanning tool) creates an Attestation (a digital signature) for a specific image hash. The Binary Authorization policy in GKE is configured to only allow images that have a valid signature from a specific “Attestor.” This ensures that only code that has passed security scans and automated tests can reach production.
4. Q: What is a Cloud Build “Private Pool” and why use it? A: By default, Cloud Build runs in a Google-managed multi-tenant network. A Private Pool allows you to run build workers inside a private VPC. This is critical for builds that need to access internal resources like a private Bitbucket server, a Cloud SQL instance with only a private IP, or an on-premise artifact repository via Interconnect, all while keeping the traffic entirely off the public internet.
5. Q: What is On-Demand Scanning and how is it used in a pipeline? A: It is the programmatic scanning feature of Artifact Analysis. Unlike the continuous scan that happens in the background of Artifact Registry, the On-Demand Scanning API allows you to trigger a scan before an image is stored or deployed. In a CI/CD pipeline, you can call this API to get a report on CVEs and fail the build if the number of “Critical” or “High” vulnerabilities exceeds your organization’s threshold.

Implementation: The “Secure Build” Lab

Creating a Secure Repository and Build Pipeline

# 1. Create a Docker repository (vulnerability scanning runs automatically
#    once the Container Scanning API is enabled on the project)
gcloud artifacts repositories create prod-images \
    --repository-format=docker \
    --location=us-central1 \
    --description="Production hardened images"

# 2. Submit a build that uses Kaniko for caching
gcloud builds submit . \
    --config=cloudbuild.yaml \
    --substitutions=_IMAGE_NAME=my-app,_VERSION=v1

# 3. cloudbuild.yaml example
# steps:
# - name: 'gcr.io/kaniko-project/executor:latest'
#   args:
#   - --destination=us-central1-docker.pkg.dev/$PROJECT_ID/prod-images/$_IMAGE_NAME:$_VERSION
#   - --cache=true
#   - --cache-ttl=24h

Pro-Tip: Vulnerability Alerts

Don’t just scan; act. Artifact Analysis publishes notifications to Pub/Sub (the container-analysis-occurrences-v1 topic). Every time a new vulnerability occurrence is created, the message can trigger a Cloud Function that automatically kicks off a re-build or alerts your SRE team on Slack.