Kubernetes Fundamentals
Master the core concepts of Kubernetes (K8s) container orchestration and understand its architecture.
What is Kubernetes?
Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery.
- Orchestration
- Scaling
- Self-healing
- Load Balancing
Kubernetes Architecture
A Kubernetes cluster consists of a Control Plane and a set of Worker Nodes.
Control Plane Components
The “brain” of the cluster. Think of it like the management team of a large warehouse: the API Server is the front desk that takes all orders, etcd is the filing cabinet that stores the master records, the Scheduler is the floor manager who assigns work to available workers, and the Controller Manager is the quality inspector who constantly checks that reality matches the plan.
- API Server: The frontend for the K8s control plane. Exposes the Kubernetes API. Every `kubectl` command, every controller, and every other component talks through the API Server; it is the single point of entry.
- etcd: Consistent and highly available key-value store for all cluster data. If etcd is lost and unrecoverable, your entire cluster state is gone. Back it up.
- Scheduler: Watches for newly created Pods with no assigned node, and selects a node for them to run on based on resource availability, constraints, and affinity rules.
- Controller Manager: Runs controller processes (e.g., Node Controller, Job Controller). Each controller is an infinite reconciliation loop that watches for drift between desired state and actual state and corrects it.
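The reconciliation idea can be illustrated with a minimal Python sketch. This is not Kubernetes source code: the `reconcile` function, the replica count, and the Pod names are hypothetical, and a real controller would repeat this step every time it observes a change through the API Server.

```python
# Illustrative sketch of one reconciliation step (not Kubernetes source code).
# `desired_replicas` and the Pod names are hypothetical inputs.
def reconcile(desired_replicas: int, running_pods: list[str]) -> list[str]:
    """Return the actions needed to move actual state toward desired state."""
    actual = len(running_pods)
    if actual < desired_replicas:
        # Too few Pods: create the missing ones.
        return ["create"] * (desired_replicas - actual)
    if actual > desired_replicas:
        # Too many Pods: delete the surplus (here, the last ones listed).
        return [f"delete {name}" for name in running_pods[desired_replicas:]]
    return []  # No drift between desired and actual state: nothing to do.


if __name__ == "__main__":
    print(reconcile(3, ["web-a"]))                    # two "create" actions
    print(reconcile(1, ["web-a", "web-b", "web-c"]))  # deletes web-b and web-c
    print(reconcile(2, ["web-a", "web-b"]))           # empty list, no drift
```

The function is deliberately stateless: it only compares what should exist with what does exist, which is why a controller can crash, restart, and still converge on the right result.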
Node Components
Run on every node, maintaining running pods and providing the Kubernetes runtime environment.
- Kubelet: An agent that runs on each node. It ensures that the containers described in each Pod assigned to the node are running and healthy.
- Kube-Proxy: Maintains network rules on nodes so that traffic addressed to a Service reaches the right Pods.
- Container Runtime: The software responsible for running containers (e.g., containerd, CRI-O; Docker Engine needs the cri-dockerd adapter since Kubernetes 1.24).
Core Objects
1. Pods
The smallest deployable unit in Kubernetes. A Pod is not a container; it is a wrapper around one or more containers that share a network and storage context. Think of a Pod as a shared apartment: each container (roommate) has its own room (filesystem), but they share the kitchen (network namespace) and living room (volumes).
- Represents a single instance of a running process.
- Can contain one or more containers (usually one, but sidecar patterns are common).
- Containers in a Pod share:
  - Network: Same IP address and port space (can talk via `localhost`).
  - Storage: Shared volumes.
2. Namespaces
Virtual clusters backed by the same physical cluster.
- Used to divide cluster resources between multiple users/teams.
- Examples: `default`, `kube-system`, `dev`, `prod`.
kubectl Basics
kubectl is the command-line tool for communicating with the Kubernetes API server.
Cluster Info & Navigation
Viewing Resources
Interacting with Pods
Creating Your First Pod
Kubernetes objects are typically defined in YAML files.
Imperative (CLI)
Quick for testing, but not recommended for production.
Declarative (YAML)
The “Infrastructure as Code” way.
Pod Lifecycle
- Pending: Pod accepted by the cluster, but one or more containers are not yet running. This includes time spent waiting to be scheduled and pulling images.
- Running: Pod bound to a node, all containers created, at least one running.
- Succeeded: All containers terminated successfully (exit code 0).
- Failed: All containers terminated, at least one with failure.
- Unknown: State cannot be obtained.
Resource Management
Every container should have resource requests and limits defined.
Requests vs Limits
This is one of the most important concepts in Kubernetes resource management. Requests are your reservation: “I need at least this much.” Limits are your ceiling: “I can never use more than this.” The analogy: a request is like reserving a table at a restaurant (guaranteed capacity), and a limit is the maximum tab you can run up.
| Concept | Description | Interview Insight |
|---|---|---|
| Request | Minimum resources guaranteed | Used by Scheduler to place pods. If a node does not have enough unrequested capacity, the pod will not be scheduled there. |
| Limit | Maximum resources allowed | Enforced at runtime. Memory limit exceeded = OOMKilled. CPU limit exceeded = throttled (not killed). |
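The two rows of the table can be sketched in a few lines of Python. The function names and the numbers are hypothetical; CPU is expressed in millicores and memory in MiB purely for illustration.

```python
# Illustrative sketch: requests drive scheduling, limits drive runtime behaviour.
# All names and numbers are hypothetical.
def fits_on_node(allocatable_m: int, requested_m: list[int], new_request_m: int) -> bool:
    """Scheduling compares the new request with capacity not yet requested."""
    unrequested = allocatable_m - sum(requested_m)
    return new_request_m <= unrequested


def runtime_outcome(resource: str, demand: int, limit: int) -> str:
    """Limits are enforced at runtime, differently for memory and CPU."""
    if demand <= limit:
        return "ok"
    if resource == "memory":
        return "OOMKilled"   # memory cannot be throttled, so the container is killed
    if resource == "cpu":
        return "throttled"   # CPU is compressible, so the container is slowed down
    raise ValueError(f"unsupported resource: {resource}")


if __name__ == "__main__":
    # A 4000m node with 3500m already requested has room for 500m but not 600m.
    print(fits_on_node(4000, [1500, 2000], 500))
    print(fits_on_node(4000, [1500, 2000], 600))
    print(runtime_outcome("memory", 600, 512))
    print(runtime_outcome("cpu", 1500, 1000))
```

Note that `fits_on_node` looks only at requests, never at actual usage: a node can be nearly idle and still be "full" from the Scheduler's point of view.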
Quality of Service (QoS) Classes
Kubernetes assigns QoS classes based on resource settings:
| QoS Class | Condition | Eviction Priority |
|---|---|---|
| Guaranteed | Every container sets CPU and memory requests and limits, with requests = limits | Last to be evicted |
| Burstable | At least one container sets a request or limit, but the Guaranteed condition is not met | Middle priority |
| BestEffort | No requests or limits on any container | First to be evicted |
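As a rough sketch of how the table's conditions combine, the following Python function classifies a Pod from its containers' resource settings. It is a simplification under stated assumptions: the dictionaries hold only `cpu` and `memory` keys, quantities are compared as plain strings (so `"500m"` and `"0.5"` would not match here), and a missing request falls back to the limit, mirroring Kubernetes defaulting.

```python
# Illustrative sketch of QoS classification (simplified, not Kubernetes source).
RESOURCES = ("cpu", "memory")


def qos_class(containers: list[dict]) -> str:
    """containers: [{"requests": {...}, "limits": {...}}, ...]"""
    any_set = False
    guaranteed = True
    for c in containers:
        limits = c.get("limits", {})
        # Assumption mirroring Kubernetes defaulting: when a request is
        # omitted but a limit is set, the request equals the limit.
        requests = {**limits, **c.get("requests", {})}
        if requests or limits:
            any_set = True
        for r in RESOURCES:
            if r not in limits or requests.get(r) != limits[r]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    full = {"cpu": "500m", "memory": "256Mi"}
    print(qos_class([{"requests": full, "limits": full}]))  # Guaranteed
    print(qos_class([{"requests": {"cpu": "100m"}}]))       # Burstable
    print(qos_class([{}]))                                   # BestEffort
```

A Pod with one fully specified container and one container without any settings comes out as Burstable, because the Guaranteed condition must hold for every container.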
Health Probes (Critical for Interviews!)
Probes allow Kubernetes to know when to restart or route traffic to a container. Without probes, Kubernetes only knows if the main process exited; it has no way to detect a deadlocked application or a service that is running but unable to handle requests.
Liveness Probe
“Is the container alive?” - If it fails, the container is restarted. Use this to detect deadlocks, infinite loops, or corrupted state where the process is running but not functional.
Readiness Probe
“Is the container ready to receive traffic?” - If it fails, the Pod is removed from Service endpoints (no traffic routed to it), but the container is NOT restarted. Use this for temporary conditions like warming caches or waiting for a downstream dependency.
Startup Probe
“Has the application started?” - For slow-starting apps (JVM warmup, large ML model loading, database migrations). Disables liveness/readiness probes until it succeeds, preventing premature restarts during boot.
etcd Deep Dive
etcd is the “source of truth” for Kubernetes. Understanding it is crucial for interviews.
Key Facts
- Distributed key-value store using Raft consensus
- Stores all cluster state: Pods, Services, Secrets, ConfigMaps
- Strongly consistent - reads return the most recent write
- Typically runs as a 3 or 5 node cluster (odd numbers for quorum)
Common Interview Questions
What happens if etcd goes down?
- Existing workloads continue running (the kubelet keeps managing the Pods already on its node)
- The API server cannot persist changes, so nothing new is scheduled, scaled, or rescheduled
- The control plane is effectively frozen until etcd recovers
How is etcd backed up?
- Take regular snapshots with `etcdctl snapshot save` and store them outside the cluster
What is the quorum?
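Quorum is the majority of members that must agree before a write is committed. In a Raft cluster of n members it is floor(n/2) + 1, which is why odd sizes are preferred: a fourth member raises the quorum without tolerating any additional failure. A small Python sketch of the arithmetic (the function names are illustrative):

```python
# Quorum arithmetic for a majority-based (Raft) cluster.
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 3, 4, 5):
        print(f"members={n} quorum={quorum(n)} tolerated_failures={fault_tolerance(n)}")
```

Running it shows that 3 and 4 members both tolerate a single failure, while 5 members tolerate two.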
RBAC Basics
Role-Based Access Control (RBAC) regulates access to Kubernetes resources.
Key Components
| Resource | Scope | Description |
|---|---|---|
| Role | Namespace | Defines permissions within a namespace |
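| ClusterRole | Cluster-wide | Defines permissions across all namespaces and on cluster-scoped resources such as nodes |
| RoleBinding | Namespace | Binds a Role (or a ClusterRole) to users/groups/service accounts within one namespace |
| ClusterRoleBinding | Cluster-wide | Binds a ClusterRole to users/groups/service accounts across the whole cluster |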
Interview Questions & Answers
What is the difference between a Pod and a Container?
A container is a single packaged process; a Pod is the Kubernetes wrapper around one or more containers that share:
- Network namespace (same IP, communicate via localhost)
- Storage volumes
- Lifecycle (created and destroyed together)
How does the Kubernetes Scheduler work?
- Watches for unscheduled Pods (via API Server)
- Filtering: Eliminates nodes that don’t meet requirements (resources, taints, nodeSelector)
- Scoring: Ranks remaining nodes based on priorities (least utilized, affinity rules)
- Binding: Assigns Pod to the highest-scoring node
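The filter, score, and bind steps above can be sketched as follows. This is a toy model, not the real scheduler: the node and Pod dictionaries, the field names, and the "most CPU left over" scoring rule are all assumptions made for illustration.

```python
# Toy model of the scheduler's filter -> score -> bind flow.
# Field names and the scoring rule are hypothetical.
from typing import Optional


def filter_nodes(pod: dict, nodes: list[dict]) -> list[dict]:
    """Keep only nodes that satisfy resources, taints, and nodeSelector."""
    feasible = []
    for node in nodes:
        enough_cpu = node["free_cpu_m"] >= pod["cpu_request_m"]
        # Every taint on the node must be tolerated by the Pod.
        tolerated = set(node.get("taints", [])) <= set(pod.get("tolerations", []))
        labels_ok = all(
            node.get("labels", {}).get(key) == value
            for key, value in pod.get("node_selector", {}).items()
        )
        if enough_cpu and tolerated and labels_ok:
            feasible.append(node)
    return feasible


def score(pod: dict, node: dict) -> int:
    """Assumed 'least utilized' rule: more CPU left after placement is better."""
    return node["free_cpu_m"] - pod["cpu_request_m"]


def schedule(pod: dict, nodes: list[dict]) -> Optional[str]:
    feasible = filter_nodes(pod, nodes)
    if not feasible:
        return None  # No feasible node: the Pod stays Pending.
    best = max(feasible, key=lambda node: score(pod, node))
    return best["name"]  # "Binding": the chosen node name.


if __name__ == "__main__":
    nodes = [
        {"name": "node-a", "free_cpu_m": 500, "labels": {"disk": "ssd"}},
        {"name": "node-b", "free_cpu_m": 3000, "taints": ["gpu-only"], "labels": {"disk": "ssd"}},
        {"name": "node-c", "free_cpu_m": 2000, "labels": {"disk": "ssd"}},
    ]
    print(schedule({"cpu_request_m": 250, "node_selector": {"disk": "ssd"}}, nodes))
```

In this example node-b has the most free CPU but is filtered out by its taint, so the Pod lands on node-c. Filtering always runs before scoring, so an attractive but infeasible node never wins.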
What is the difference between kubectl apply and kubectl create?
- create: Creates a resource. Fails if it already exists.
- apply: Creates or updates a resource. Idempotent. Recommended for GitOps workflows.
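How does Kubernetes handle node failures?
- The Node Controller marks the node as `NotReady` after about 40s without a heartbeat
- After `pod-eviction-timeout` (default 5min), pods are evicted. Recent releases implement this wait through taint-based eviction: Pods tolerate the `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable` taints for 300 seconds by default
- Deployment/ReplicaSet controllers create replacement pods on healthy nodes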
What is a sidecar container?
A sidecar is a helper container that runs alongside the main container in the same Pod. Common uses:
- Logging: Collects and ships logs (e.g., Fluentd sidecar)
- Service Mesh: Handles networking (e.g., Envoy proxy in Istio)
- Security: Handles TLS termination or secrets injection
Key Takeaways
- Control Plane manages the cluster; Nodes run the applications.
- Pods are the atomic unit of scheduling.
- Use Declarative (YAML) configuration for reproducibility.
- Always define resource requests and limits.
- Implement liveness and readiness probes for production workloads.
- `kubectl describe` and `kubectl logs` are your best friends for debugging.
Interview Deep-Dive
Walk me through exactly what happens from the moment you type 'kubectl run nginx --image=nginx' until the container is actually running on a node.
- kubectl serializes the command into an API request (a Pod object in JSON) and sends it to the API server over HTTPS.
- The API server authenticates the request (client certificate or bearer token), authorizes it (RBAC check: does this user have permission to create pods in this namespace?), and runs it through admission controllers (mutating webhooks might inject a sidecar, validating webhooks might enforce a label requirement).
- The validated Pod object is persisted to etcd. At this point, the Pod exists in the cluster state but has no node assigned: its `nodeName` field is empty.
- The scheduler watches the API server (it never talks to etcd directly) for pods with no node assignment. It picks up this pod, runs its filtering phase (which nodes have enough CPU and memory, match nodeSelector, tolerate taints), then its scoring phase (which of the remaining nodes is the best fit), and binds the pod by sending the selected node name to the API server, which persists it to etcd.
- The kubelet on the selected node is also watching for pods assigned to its node. It sees the new pod and, through the container runtime (containerd via CRI), creates the pod sandbox with its pause container to hold the network namespace, pulls the nginx image, and starts the nginx container inside that sandbox.
- The kubelet reports the pod’s status back to the API server, which updates etcd. Now `kubectl get pods` shows the pod as Running.
If the Pod stays Pending instead, `kubectl describe pod <name>` shows FailedScheduling events with a message like “0/3 nodes are available: 1 Insufficient cpu, 2 node(s) had taint that the pod didn’t tolerate.” The fix depends on the root cause: if it is resources, you either scale down other workloads, add nodes, or reduce the pod’s resource requests. If it is taints, you either add a toleration to the pod or remove the taint from the node. I have seen teams burn hours on Pending pods when the real issue was a ResourceQuota rejecting pods without explicit resource requests. (A quota rejection happens at admission, so the Pod is never created; look for FailedCreate events on the ReplicaSet.)
A production pod is getting OOMKilled repeatedly. How do you diagnose and fix this without just blindly increasing the memory limit?
- First, I would check `kubectl describe pod <name>` to confirm the exit code is 137 (SIGKILL from OOM) and see which container is being killed. The Last State section shows the exact reason.
- Next, I would look at actual memory usage over time, not just the limit. If the cluster runs Prometheus, I would query `container_memory_working_set_bytes` for the container over the past hour to see whether usage is steadily climbing (memory leak) or spiking under load (insufficient allocation).
- If it is a gradual climb, the application has a memory leak. Increasing the limit only delays the inevitable. The fix is in the application code: profiling with language-specific tools (Java heap dumps, Go pprof, Python tracemalloc).
- If usage is spiking under load, the limit is genuinely too low. I would check whether the requests and limits are set correctly. A common mistake is setting requests equal to limits (Guaranteed QoS) at a value that is too low. The pod gets exactly what it asks for and no more, so any burst kills it.
- I would also check if the pod is running a JVM-based application, because the JVM has its own memory management. If `MaxRAMPercentage` is not set correctly, the JVM might try to use more heap than the container limit allows, and the kernel kills the container before the JVM even knows it is out of memory.
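The exit code in the first step follows the usual shell convention of 128 plus the signal number, so 137 means signal 9 (SIGKILL). The helper below is a hypothetical sketch for decoding such codes on a POSIX system; it only identifies the signal, so the `OOMKilled` reason in the Last State section is still needed to confirm that the kill came from the OOM killer.

```python
# Hypothetical helper: decode a container exit code using the
# "128 + signal number" convention (assumes a POSIX system).
import signal


def describe_exit_code(code: int) -> str:
    if code == 0:
        return "exited normally"
    if code > 128:
        try:
            sig = signal.Signals(code - 128)
        except ValueError:
            return f"exited with code {code}"
        return f"killed by {sig.name}"
    return f"exited with error {code}"


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, describe_exit_code(code))
```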
What is the difference between the memory reported by `kubectl top pod` and what the OOM killer actually uses?
`kubectl top pod` reports the working set: cgroup memory usage minus inactive file-backed page cache, which the kernel can reclaim easily. The memory limit is enforced against total cgroup usage, page cache included, but the kernel reclaims cache before it resorts to the OOM killer. If your application reads large files, total usage can sit near the limit while the working set stays well below it. This is why `container_memory_working_set_bytes` (from cAdvisor/Prometheus) is a better metric than raw RSS for understanding OOM risk.
Explain the difference between liveness, readiness, and startup probes. Give me a real scenario where misconfiguring one of them caused a production outage.
- Liveness probe answers “is this container alive?” If it fails, kubelet kills and restarts the container. Use it when your application can get into a deadlocked or hung state that only a restart can fix.
- Readiness probe answers “is this container ready to receive traffic?” If it fails, the pod is removed from Service endpoints but NOT restarted. Use it when the application needs time to warm up (load a cache, connect to a database) or when it is temporarily overloaded.
- Startup probe answers “has this application finished starting?” It runs before liveness and readiness probes. While it is running, those other probes are disabled. Use it for slow-starting applications (large Java apps, ML model loading) to avoid liveness probes killing the container before it finishes booting.
- A real scenario: a team configured a liveness probe with `initialDelaySeconds: 5` on a Java application that took 90 seconds to start. During every deployment, the liveness probe failed because the app was not ready yet, so kubelet kept killing and restarting the container, which restarted the 90-second boot, which failed the liveness check again. The pod went into CrashLoopBackOff and never became healthy. The fix was adding a startup probe with a high `failureThreshold` (30 retries at 10-second intervals = 5-minute window to start), which disabled the liveness probe during boot.
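The arithmetic behind that scenario can be checked with a short Python sketch. The function names are illustrative, and the liveness estimate is an approximation that assumes the Kubernetes defaults of `periodSeconds: 10` and `failureThreshold: 3` and ignores probe timeouts.

```python
# Rough probe timing arithmetic (approximation; ignores timeoutSeconds).
def startup_window_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Maximum time a startup probe gives the application to come up."""
    return failure_threshold * period_seconds


def seconds_until_liveness_restart(
    initial_delay_seconds: int, period_seconds: int = 10, failure_threshold: int = 3
) -> int:
    """Approximate time of the failure that triggers a restart.

    The first probe runs after the initial delay, then once per period;
    the container is restarted after `failure_threshold` consecutive failures.
    """
    return initial_delay_seconds + (failure_threshold - 1) * period_seconds


if __name__ == "__main__":
    boot_time = 90  # seconds the Java application needs to start
    restart_at = seconds_until_liveness_restart(initial_delay_seconds=5)
    print(f"liveness restart after ~{restart_at}s, app needs {boot_time}s")
    print(f"startup probe window: {startup_window_seconds(30, 10)}s")
```

With these numbers the liveness probe gives up after roughly 25 seconds, far short of the 90-second boot, while the startup probe allows up to 300 seconds.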