Kubernetes Fundamentals

Master the core concepts of Kubernetes (K8s) container orchestration and understand its architecture.

What is Kubernetes?

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery.

Orchestration

Manages container lifecycle, scheduling, and health

Scaling

Automatically scales apps up or down based on demand

Self-healing

Restarts failed containers, and kills and replaces containers that fail their health checks

Load Balancing

Distributes network traffic to maintain stability

Kubernetes Architecture

A Kubernetes cluster consists of a Control Plane and a set of Worker Nodes.

Control Plane Components

The “brain” of the cluster. Think of it like the management team of a large warehouse: the API Server is the front desk that takes all orders, etcd is the filing cabinet that stores the master records, the Scheduler is the floor manager who assigns work to available workers, and the Controller Manager is the quality inspector who constantly checks that reality matches the plan.
  • API Server: The frontend for the K8s control plane. Exposes the Kubernetes API. Every kubectl command, every controller, every other component talks through the API Server — it is the single point of entry.
  • etcd: Consistent and highly-available key-value store for all cluster data. If etcd is lost and unrecoverable, your entire cluster state is gone. Back it up.
  • Scheduler: Watches for newly created Pods with no assigned node, and selects a node for them to run on based on resource availability, constraints, and affinity rules.
  • Controller Manager: Runs controller processes (e.g., Node Controller, Job Controller). Each controller is an infinite reconciliation loop that watches for drift between desired state and actual state and corrects it.
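
To make the reconciliation idea concrete, here is a minimal Python sketch of a single reconcile step. It is not real controller code: the `reconcile` function, the pod naming scheme, and the action strings are all hypothetical, and a real controller would watch the API Server rather than receive plain lists.

```python
from __future__ import annotations


def reconcile(desired_replicas: int, running: list[str]) -> list[str]:
    """Compare desired and actual state; return the corrective actions."""
    actions: list[str] = []
    missing = desired_replicas - len(running)
    if missing > 0:
        # Too few pods: create the missing ones (hypothetical naming scheme).
        for i in range(missing):
            actions.append(f"create pod-{len(running) + i}")
    elif missing < 0:
        # Too many pods: delete the surplus from the end of the list.
        for name in running[desired_replicas:]:
            actions.append(f"delete {name}")
    return actions  # an empty list means there is no drift to correct


print(reconcile(3, ["pod-0"]))           # two pods need to be created
print(reconcile(1, ["pod-0", "pod-1"]))  # one pod needs to be deleted
print(reconcile(2, ["pod-0", "pod-1"]))  # nothing to do
```

A controller runs a step like this repeatedly, so a pod that disappears between two passes is simply recreated on the next one.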

Node Components

Run on every node, maintaining running pods and providing the Kubernetes runtime environment.
  • Kubelet: An agent that runs on each node. It ensures that the containers described in each Pod’s spec are running and healthy.
  • Kube-Proxy: Maintains network rules on nodes. Allows network communication to your Pods.
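  • Container Runtime: The software responsible for running containers (e.g., containerd, CRI-O). Docker Engine is no longer supported directly since dockershim was removed in Kubernetes 1.24; it requires the cri-dockerd adapter.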

Core Objects

1. Pods

The smallest deployable unit in Kubernetes. A Pod is not a container — it is a wrapper around one or more containers that share a network and storage context. Think of a Pod as a shared apartment: each container (roommate) has its own room (filesystem), but they share the kitchen (network namespace) and living room (volumes).
  • Represents a single instance of a running process.
  • Can contain one or more containers (usually one, but sidecar patterns are common).
  • Containers in a Pod share:
    • Network: Same IP address and port space (can talk via localhost).
    • Storage: Shared volumes.

2. Namespaces

Virtual clusters backed by the same physical cluster.
  • Used to divide cluster resources between multiple users/teams.
  • Examples: default, kube-system, dev, prod.

kubectl Basics

kubectl is the command-line tool for communicating with the Kubernetes API server.

Cluster Info & Navigation

# Check cluster status
kubectl cluster-info

# List all nodes
kubectl get nodes

# List all namespaces
kubectl get namespaces

# Set default namespace context
kubectl config set-context --current --namespace=dev

Viewing Resources

# List pods in current namespace
kubectl get pods

# List pods with more details (IP, Node)
kubectl get pods -o wide

# List pods in all namespaces
kubectl get pods -A

# Describe a specific pod (Crucial for debugging!)
kubectl describe pod my-pod

# View pod logs
kubectl logs my-pod
kubectl logs my-pod -c my-container  # If multi-container
kubectl logs -f my-pod  # Follow logs

Interacting with Pods

# Execute command inside a container
kubectl exec -it my-pod -- /bin/bash
kubectl exec -it my-pod -- /bin/sh  # If bash isn't available

# Port forward (Access pod from localhost)
kubectl port-forward my-pod 8080:80

Creating Your First Pod

Kubernetes objects are typically defined in YAML files.

Imperative (CLI)

Quick for testing, but not recommended for production.
kubectl run nginx --image=nginx:latest --restart=Never

Declarative (YAML)

The “Infrastructure as Code” way.
# nginx-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: web
    env: dev
spec:
  containers:
  - name: nginx
    image: nginx:1.21
    ports:
    - containerPort: 80
    resources:
      limits:
        memory: "128Mi"
        cpu: "500m"
Apply the configuration:
# Create/Update resource
kubectl apply -f nginx-pod.yaml

# Verify
kubectl get pods -l app=web

# Delete
kubectl delete -f nginx-pod.yaml

Pod Lifecycle

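  1. Pending: Pod accepted by the cluster, but one or more containers are not yet created or running. This includes time spent waiting to be scheduled and pulling images.
  2. Running: Pod bound to a node, all containers created, and at least one is running or in the process of starting or restarting.
  3. Succeeded: All containers terminated successfully (exit code 0).
  4. Failed: All containers terminated, at least one with failure (non-zero exit code or killed by the system).
  5. Unknown: Pod state cannot be obtained, usually because of a communication error with the node.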
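
The phase rules above can be sketched as a small Python function. This is a simplification under stated assumptions: the state labels `not_created`, `running`, and `terminated` are hypothetical, restart policies are ignored, and the Unknown phase is left out because it depends on node communication rather than container state.

```python
def pod_phase(containers):
    """Derive a Pod phase from a list of (state, exit_code) pairs.

    state is one of "not_created", "running", "terminated"
    (hypothetical labels chosen for this sketch).
    """
    states = [state for state, _ in containers]
    if not containers or "not_created" in states:
        # At least one container does not exist yet.
        return "Pending"
    if all(state == "terminated" for state in states):
        # Everything has exited: success only if every exit code is 0.
        all_ok = all(code == 0 for _, code in containers)
        return "Succeeded" if all_ok else "Failed"
    # All containers exist and at least one has not terminated.
    return "Running"


print(pod_phase([("not_created", None)]))
print(pod_phase([("running", None), ("terminated", 0)]))
print(pod_phase([("terminated", 0), ("terminated", 137)]))
```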

Resource Management

Every container should have resource requests and limits defined.

Requests vs Limits

This is one of the most important concepts in Kubernetes resource management. Requests are your reservation — “I need at least this much.” Limits are your ceiling — “I can never use more than this.” The analogy: a request is like reserving a table at a restaurant (guaranteed capacity), and a limit is the maximum tab you can run up.
| Concept | Description | Interview Insight |
| --- | --- | --- |
| Request | Minimum resources guaranteed | Used by Scheduler to place pods. If a node does not have enough unrequested capacity, the pod will not be scheduled there. |
| Limit | Maximum resources allowed | Enforced at runtime. Memory limit exceeded = OOMKilled. CPU limit exceeded = throttled (not killed). |
resources:
  requests:
    memory: "64Mi"     # Scheduler reserves 64Mi on the node for this container
    cpu: "250m"        # 0.25 CPU cores (250 millicores). 1000m = 1 full core.
  limits:
    memory: "128Mi"    # Container is OOMKilled if it exceeds 128Mi
    cpu: "500m"        # Container is throttled (not killed) above 0.5 cores
Common Interview Question: What happens when a container exceeds its memory limit?
  • The container is OOMKilled (Out of Memory Killed) by the kernel.
  • If CPU limit is exceeded, the container is throttled, not killed.
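
The quantity notation used above (`250m`, `128Mi`) can be made concrete with a short Python sketch. It assumes only the suffixes shown here; real Kubernetes quantities accept more forms (for example decimal suffixes such as `M` and `G`), and the helper names, including `fits_on_node`, are hypothetical.

```python
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "250m" or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)  # 1 core = 1000 millicores


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" to bytes (binary suffixes only)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already plain bytes


def fits_on_node(allocatable_m: int, already_requested_m: int, cpu_request: str) -> bool:
    """Scheduler-style check: is there enough unrequested CPU for this request?"""
    return cpu_to_millicores(cpu_request) <= allocatable_m - already_requested_m


print(cpu_to_millicores("250m"), cpu_to_millicores("0.5"))
print(memory_to_bytes("64Mi"), memory_to_bytes("128Mi"))
print(fits_on_node(4000, 3800, "250m"))  # only 200m left unrequested
```

Note that the check uses requests, not actual usage: a node can look idle and still refuse a pod if its capacity is already reserved.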

Quality of Service (QoS) Classes

Kubernetes assigns QoS classes based on resource settings:
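| QoS Class | Condition | Eviction Priority |
| --- | --- | --- |
| Guaranteed | Every container has requests = limits for both CPU and memory | Last to be evicted |
| Burstable | At least one request or limit is set, but the Guaranteed criteria are not met (e.g., requests < limits, or only one is set) | Middle priority |
| BestEffort | No requests or limits on any container | First to be evicted |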
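
As a rough illustration of how these classes follow from the resource settings, here is a Python sketch of the classification. The `qos_class` function and the dict layout are assumptions made for this example; quantities are compared as literal strings, so `"500m"` and `"0.5"` would count as different here even though Kubernetes treats them as equal.

```python
def qos_class(containers):
    """Classify a Pod from its containers' resource settings.

    containers: list of dicts with optional "requests" and "limits" dicts
    keyed by "cpu" and "memory" (hypothetical layout for this sketch).
    """
    any_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # A missing request defaults to the limit, so mirror that here.
        requests = {**limits, **container.get("requests", {})}
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"limits": {"cpu": "500m", "memory": "128Mi"}}]))
print(qos_class([{"requests": {"cpu": "250m", "memory": "64Mi"},
                  "limits": {"cpu": "500m", "memory": "128Mi"}}]))
print(qos_class([{}]))
```

The first call mirrors a Pod that sets only limits: because requests default to the limits, it lands in Guaranteed rather than Burstable.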

Health Probes (Critical for Interviews!)

Probes allow Kubernetes to know when to restart or route traffic to a container. Without probes, Kubernetes only knows if the main process exited — it has no way to detect a deadlocked application or a service that is running but unable to handle requests.

Liveness Probe

“Is the container alive?” - If it fails, the container is restarted. Use this to detect deadlocks, infinite loops, or corrupted state where the process is running but not functional.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  failureThreshold: 3

Readiness Probe

“Is the container ready to receive traffic?” - If it fails, the Pod is removed from Service endpoints (no traffic routed to it), but the container is NOT restarted. Use this for temporary conditions like warming caches or waiting for a downstream dependency.
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Startup Probe

“Has the application started?” - For slow-starting apps (JVM warmup, large ML model loading, database migrations). Disables liveness/readiness probes until it succeeds, preventing premature restarts during boot.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 30
  periodSeconds: 10
Interview Tip: Always explain the difference between liveness and readiness probes. Liveness restarts containers; Readiness controls traffic routing.
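
When tuning probe settings it helps to estimate how long a container has before a probe gives up. The Python sketch below is a rough estimate only: `probe_budget_seconds` is a hypothetical helper, and it ignores `timeoutSeconds` and the exact timing of the first probe.

```python
def probe_budget_seconds(failure_threshold, period_seconds, initial_delay_seconds=0):
    """Approximate time before a constantly failing probe is declared failed."""
    return initial_delay_seconds + failure_threshold * period_seconds


# Values taken from the startup and liveness probe examples above.
startup_budget = probe_budget_seconds(failure_threshold=30, period_seconds=10)
liveness_budget = probe_budget_seconds(3, 10, initial_delay_seconds=15)

print(startup_budget)   # seconds the app has to finish starting
print(liveness_budget)  # seconds before a hung container is restarted
```

With these settings the startup probe gives the application roughly five minutes to boot, while the liveness probe alone would restart it after well under a minute.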

etcd Deep Dive

etcd is the “source of truth” for Kubernetes. Understanding it is crucial for interviews.

Key Facts

  • Distributed key-value store using Raft consensus
  • Stores all cluster state: Pods, Services, Secrets, ConfigMaps
  • Strongly consistent - reads return the most recent write
  • Typically runs as a 3 or 5 node cluster (odd numbers for quorum)

Common Interview Questions

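Question: What happens if etcd goes down?
  • Existing workloads continue running (kubelet manages local pods)
  • No changes to cluster state are possible (no scheduling, no writes through the API Server)
  • The cluster is read-only at best until etcd recovers
Question: How do you back up etcd?
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
Question: What is quorum?
Quorum = floor(n/2) + 1. For 3 nodes, quorum is 2, so the cluster tolerates one member failure. If quorum is lost, etcd stops accepting writes until a majority of members is available again.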
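
The quorum arithmetic is easy to check for a few cluster sizes. This is a small Python sketch with hypothetical helper names; it shows why odd member counts are preferred, since adding a fourth member raises the quorum without tolerating any more failures.

```python
def quorum(members: int) -> int:
    """Smallest majority of an etcd cluster with the given member count."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


for n in (1, 2, 3, 4, 5):
    print(n, quorum(n), fault_tolerance(n))
```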

RBAC Basics

Role-Based Access Control (RBAC) regulates access to Kubernetes resources.

Key Components

| Resource | Scope | Description |
| --- | --- | --- |
| Role | Namespace | Defines permissions within a namespace |
| ClusterRole | Cluster-wide | Defines permissions across all namespaces |
| RoleBinding | Namespace | Binds Role to users/groups/service accounts |
| ClusterRoleBinding | Cluster-wide | Binds ClusterRole cluster-wide |
# Role: Can read pods in "dev" namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
# RoleBinding: Bind to user "jane"
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: dev
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
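
To show how a rule list like the one in pod-reader is evaluated, here is a simplified Python sketch of the matching logic. It is not the actual authorizer: `is_allowed` and `_matches` are hypothetical helpers, and the sketch ignores namespaces, subresources, resourceNames, and non-resource URLs.

```python
def _matches(allowed, value):
    """A rule field matches if it lists the value or the "*" wildcard."""
    return "*" in allowed or value in allowed


def is_allowed(rules, api_group, resource, verb):
    """Return True if any rule grants the verb on the resource."""
    for rule in rules:
        if (_matches(rule["apiGroups"], api_group)
                and _matches(rule["resources"], resource)
                and _matches(rule["verbs"], verb)):
            return True
    return False  # RBAC is purely additive: no matching rule means deny


# The same rules as the pod-reader Role above.
pod_reader = [{"apiGroups": [""], "resources": ["pods"],
               "verbs": ["get", "watch", "list"]}]

print(is_allowed(pod_reader, "", "pods", "list"))
print(is_allowed(pod_reader, "", "pods", "delete"))
print(is_allowed(pod_reader, "", "secrets", "get"))
```

Because there are no deny rules, the only way to restrict a subject is to grant less in the first place.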

Interview Questions & Answers

Question: What is the difference between a Pod and a container?
A container is an isolated process (or group of processes) with its own filesystem. A Pod is a Kubernetes abstraction that wraps one or more containers that share:
  • Network namespace (same IP, communicate via localhost)
  • Storage volumes
  • Lifecycle (created and destroyed together)
Question: How does the Scheduler decide where to place a Pod?
  1. Watches for unscheduled Pods (via API Server)
  2. Filtering: Eliminates nodes that don’t meet requirements (resources, taints, nodeSelector)
  3. Scoring: Ranks remaining nodes based on priorities (least utilized, affinity rules)
  4. Binding: Assigns Pod to the highest-scoring node
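
The filter-then-score flow can be sketched in a few lines of Python. This is a toy model, not the real scheduler: the `schedule` function, the dict fields, and the single "most free CPU" scoring rule are assumptions made for illustration, whereas the real scheduler combines many filter and scoring plugins.

```python
def schedule(pod, nodes):
    """Pick a node name for the pod, or None if no node is feasible."""
    # Filtering: drop nodes that lack resources or carry untolerated taints.
    feasible = [
        node for node in nodes
        if node["free_cpu_m"] >= pod["cpu_m"]
        and node["free_mem_mi"] >= pod["mem_mi"]
        and set(node.get("taints", [])) <= set(pod.get("tolerations", []))
    ]
    if not feasible:
        return None  # the pod would stay Pending
    # Scoring: prefer the node with the most CPU left after placement.
    best = max(feasible, key=lambda node: node["free_cpu_m"] - pod["cpu_m"])
    # Binding: the chosen node name is what gets recorded on the pod.
    return best["name"]


nodes = [
    {"name": "node-a", "free_cpu_m": 300, "free_mem_mi": 512},
    {"name": "node-b", "free_cpu_m": 2000, "free_mem_mi": 4096, "taints": ["gpu"]},
    {"name": "node-c", "free_cpu_m": 1000, "free_mem_mi": 2048},
]

print(schedule({"cpu_m": 500, "mem_mi": 128}, nodes))
print(schedule({"cpu_m": 500, "mem_mi": 128, "tolerations": ["gpu"]}, nodes))
print(schedule({"cpu_m": 5000, "mem_mi": 128}, nodes))
```

In the first call node-a fails the CPU filter and node-b fails the taint filter, so node-c wins by default; adding the toleration makes node-b feasible and it then scores highest.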
Question: What is the difference between kubectl create and kubectl apply?
  • create: Creates a resource. Fails if it already exists.
  • apply: Creates or updates a resource. Idempotent. Recommended for GitOps workflows.
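Question: What happens when a node fails?
  1. The Node Controller marks the node as NotReady once it has missed heartbeats for the node-monitor-grace-period (40s by default in most releases)
  2. The node is tainted as not-ready or unreachable, and pods are evicted once their toleration for that taint expires (5 minutes by default; older releases used the pod-eviction-timeout flag for this)
  3. Deployment/ReplicaSet controllers create replacement pods on healthy nodes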
Question: What is a sidecar container?
A container that runs alongside the main application container in the same Pod to provide supporting functionality:
  • Logging: Collects and ships logs (e.g., Fluentd sidecar)
  • Service Mesh: Handles networking (e.g., Envoy proxy in Istio)
  • Security: Handles TLS termination or secrets injection

Key Takeaways

  • Control Plane manages the cluster; Nodes run the applications.
  • Pods are the atomic unit of scheduling.
  • Use Declarative (YAML) configuration for reproducibility.
  • Always define resource requests and limits.
  • Implement liveness and readiness probes for production workloads.
  • kubectl describe and kubectl logs are your best friends for debugging.

Interview Deep-Dive

Question: What happens, step by step, when you create an nginx Pod with kubectl?
Strong Answer:
  • kubectl serializes the command into an API request (a Pod object in JSON) and sends it to the API server over HTTPS.
  • The API server authenticates the request (client certificate or bearer token), authorizes it (RBAC check — does this user have permission to create pods in this namespace?), and runs it through admission controllers (mutating webhooks might inject a sidecar, validating webhooks might enforce a label requirement).
  • The validated Pod object is persisted to etcd. At this point, the Pod exists in the cluster state but has no node assigned — its nodeName field is empty.
  • The scheduler watches the API server (it never talks to etcd directly) for pods with no node assignment. It picks up this pod, runs its filtering phase (which nodes have enough CPU and memory, match nodeSelector, tolerate taints), then its scoring phase (which of the remaining nodes is the best fit), and records the chosen node by creating a Binding through the API server, which sets the pod’s nodeName and persists it to etcd.
  • The kubelet on the selected node is also watching for pods assigned to its node. It sees the new pod, pulls the nginx image through the container runtime (containerd via CRI), creates the pause container to hold network namespaces, then starts the nginx container inside that pod sandbox.
  • The kubelet reports the pod’s status back to the API server, which updates etcd. Now kubectl get pods shows the pod as Running.
Follow-up: If the scheduler cannot find any node for the pod, what happens? How would you debug it?
The pod stays in Pending state indefinitely. Running kubectl describe pod <name> shows FailedScheduling events with a message like “0/3 nodes are available: 1 Insufficient cpu, 2 node(s) had taint that the pod didn’t tolerate.” The fix depends on the root cause: if it is resources, you either scale down other workloads, add nodes, or reduce the pod’s resource requests. If it is taints, you either add a toleration to the pod or remove the taint from the node. I have seen teams burn hours on missing pods when the real issue was a ResourceQuota rejecting pods without explicit resource requests; in that case the pod is rejected at admission and never reaches Pending, so the error shows up in the events of the owning ReplicaSet instead.
Question: A pod keeps getting OOMKilled. How do you debug it?
Strong Answer:
  • First, I would check kubectl describe pod <name> to confirm the exit code is 137 (SIGKILL from OOM) and see which container is being killed. The Last State section shows the exact reason.
  • Next, I would look at actual memory usage over time, not just the limit. If the cluster runs Prometheus, I would query container_memory_working_set_bytes for the container over the past hour to see whether usage is steadily climbing (memory leak) or spiking under load (insufficient allocation).
  • If it is a gradual climb, the application has a memory leak. Increasing the limit only delays the inevitable. The fix is in the application code — profiling with language-specific tools (Java heap dumps, Go pprof, Python tracemalloc).
  • If usage is spiking under load, the limit is genuinely too low. I would check whether the requests and limits are set correctly. A common mistake is setting requests equal to limits (Guaranteed QoS) at a value that is too low. The pod gets exactly what it asks for and no more, so any burst kills it.
  • I would also check if the pod is running a JVM-based application, because the JVM has its own memory management. If MaxRAMPercentage is not set correctly, the JVM might try to use more heap than the container limit allows, and the kernel kills the container before the JVM even knows it is out of memory.
Follow-up: What is the difference between the memory metric in kubectl top pod and what the OOM killer actually uses?
kubectl top pod reports the working set: the cgroup’s memory usage minus inactive file-backed pages (page cache the kernel can reclaim easily). The cgroup memory limit is charged against total usage, which includes the page cache, but the kernel reclaims page cache before it resorts to the OOM killer. An application that reads large files can therefore show cgroup usage near the limit without being at risk, while a working set close to the limit is a real warning sign. This is why container_memory_working_set_bytes (from cAdvisor/Prometheus) is a better metric than raw RSS or total usage for understanding OOM risk.
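Question: What is the difference between liveness, readiness, and startup probes, and when would you use each?
Strong Answer: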
  • Liveness probe answers “is this container alive?” If it fails, kubelet kills and restarts the container. Use it when your application can get into a deadlocked or hung state that only a restart can fix.
  • Readiness probe answers “is this container ready to receive traffic?” If it fails, the pod is removed from Service endpoints but NOT restarted. Use it when the application needs time to warm up (load a cache, connect to a database) or when it is temporarily overloaded.
  • Startup probe answers “has this application finished starting?” It runs before liveness and readiness probes. While it is running, those other probes are disabled. Use it for slow-starting applications (large Java apps, ML model loading) to avoid liveness probes killing the container before it finishes booting.
  • A real scenario: a team configured a liveness probe with initialDelaySeconds: 5 on a Java application that took 90 seconds to start. During every deployment, the liveness probe failed because the app was not ready yet, so kubelet kept killing and restarting the container, which restarted the 90-second boot, which failed the liveness check again. The pod went into CrashLoopBackOff and never became healthy. The fix was adding a startup probe with a high failureThreshold (30 retries at 10-second intervals = 5-minute window to start), which disabled the liveness probe during boot.
Follow-up: Should a readiness probe check downstream dependencies like the database, or only the application itself?
This is a trade-off with strong opinions on both sides. If the readiness probe checks the database and the database goes down, ALL pods become unready simultaneously, which means the Service has zero endpoints and returns 503 to every request. That is often worse than serving degraded responses. My preference is to have the readiness probe check only the application’s own health, and handle downstream failures with circuit breakers and graceful degradation in the application code. The exception is during startup: checking that the database connection is established before accepting traffic is reasonable.
