
Kubernetes Interview Questions (50+ Detailed Q&A)

1. Architecture & Components

Q: Explain the Kubernetes architecture and its main components.
Answer:
  • Control Plane (Master):
    • API Server: Gateway for all requests. The only component that talks to etcd.
    • etcd: Key-value store. Source of truth for cluster state.
    • Scheduler: Assigns Pods to Nodes.
    • Controller Manager: Reconciles state (ReplicaSet, Node, etc.).
    • Cloud Controller Manager: Integrates with the cloud provider (load balancers, disks).
  • Worker Node:
    • Kubelet: Node agent that talks to the API Server. Manages Pods.
    • Kube-proxy: Programs network rules (iptables/IPVS) for Services.
    • Container runtime: containerd / CRI-O (Docker via dockershim was removed in 1.24).
Request Flow (Creating a Deployment):
  1. kubectl apply -f deployment.yaml → API Server
  2. API Server validates, writes to etcd
  3. Deployment Controller sees new Deployment → creates ReplicaSet
  4. ReplicaSet Controller sees new RS → creates Pod specs
  5. Scheduler sees unscheduled Pods → assigns to Nodes
  6. Kubelet on Node sees new Pod assignment → pulls image, starts container
  7. Kube-proxy updates iptables rules for the Service
Component Failure Scenarios:
  • API Server down: Cluster unmanageable (but existing Pods keep running)
  • etcd down: Cluster state unavailable; the API Server cannot read or write. Losing etcd data without a backup is catastrophic
  • Scheduler down: New Pods stay Pending
  • Kubelet down: Node marked NotReady, Pods evicted after timeout
Q: What is etcd and why is it critical?
Answer: Distributed key-value store. Stores cluster state (config, Secrets, metadata). Strongly consistent (Raft consensus). Critical: if etcd data is lost, the cluster is lost. Needs high availability (3-5 members, an odd number for quorum).
Q: What are the kube-proxy modes?
Answer:
  • iptables: Default. Fast. Kernel-level rules.
  • IPVS: For massive scale (thousands of Services). Hash-table based.
  • Userspace: Legacy and slow; removed in newer releases.
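The mode is set in kube-proxy's own configuration (typically stored in the kube-proxy ConfigMap in kube-system). A sketch of the relevant fragment:

```yaml
# KubeProxyConfiguration fragment; an empty mode falls back to iptables on Linux
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"   # round-robin; other IPVS schedulers exist (lc, sh, ...)
```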
Q: What does the API Server do with an incoming request?
Answer: Authentication -> Authorization -> Admission Control -> Write to etcd. Stateless, so it can be scaled horizontally.
Q: How does the scheduler pick a node for a Pod?
Answer:
  1. Filtering: Which nodes meet the requirements? (RAM, CPU, taints).
  2. Scoring: Rank the valid nodes (least load, image locality).
  3. Binding: Tell the API Server the chosen node.
Q: How does a controller's reconciliation loop work?
Answer: Infinite loop: watch current state -> compare with desired state -> make changes. Example: a ReplicaSet sees 2 Pods but wants 3, so it creates 1.
Q: What are CRI, CNI, and CSI?
Answer: Plugin interfaces that keep Kubernetes modular.
  • CRI (runtime): Swap Docker for containerd/CRI-O.
  • CNI (network): Swap Flannel for Calico.
  • CSI (storage): Swap EBS for PD-SSD.
Q: What is the pause container?
Answer: Small container that holds the network namespace for the Pod. If the app container dies and restarts, pause stays alive so the Pod keeps its IP.
Q: What are the Pod lifecycle phases?
Answer: Pending -> ContainerCreating -> Running -> Succeeded/Failed. CrashLoopBackOff is not a phase but a container status: repeated failures with backed-off restarts.
Q: What are static Pods?
Answer: Managed directly by the kubelet (manifests in /etc/kubernetes/manifests by default). Not managed by the API Server or scheduler. Used for control plane components (etcd, apiserver) in kubeadm-style clusters.
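A minimal static Pod sketch — dropping this file into the kubelet's manifest directory (the kubeadm default path; it can differ per cluster) makes the kubelet run it with no API-side controller involved:

```yaml
# /etc/kubernetes/manifests/static-web.yaml — the kubelet watches this directory
apiVersion: v1
kind: Pod
metadata:
  name: static-web
spec:
  containers:
  - name: web
    image: nginx:1.25
    ports:
    - containerPort: 80
```

The API Server then shows a read-only "mirror Pod" for it, with the node name appended to the Pod name.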

2. Workloads & Scheduling

Q: Deployment vs StatefulSet vs DaemonSet?
Answer:
  • Deployment: Stateless. Random pod-name suffixes (app-xyz). Easy update/rollback.
  • StatefulSet: Sticky identity (app-0, app-1). Ordered startup. Per-pod persistent storage.
  • DaemonSet: One Pod per node (logs, monitoring).
Q: Job vs CronJob?
Answer:
  • Job: Run to completion (batch). Retries on failure.
  • CronJob: Creates Jobs on a time-based schedule.
Q: What are taints and tolerations?
Answer: The node says “stay away” (taint); the Pod says “I can handle it” (toleration). Use cases: dedicated hardware (GPU nodes), control plane nodes (NoSchedule).
Q: nodeSelector vs node affinity?
Answer:
  • nodeSelector: Simple equality (disk=ssd). Hard rule only.
  • Affinity: Expressive operators (NotIn, Exists). Supports soft rules (preferredDuringScheduling).
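For comparison, a sketch of a soft (preferred) node affinity rule that a plain node selector cannot express — the `disktype` label is an illustrative assumption:

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100              # scheduler prefers, but does not require, a match
        preference:
          matchExpressions:
          - key: disktype        # example node label; adjust to your cluster
            operator: In
            values: ["ssd"]
```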
Q: What are init containers?
Answer: Run before the main containers, sequentially. Uses: wait for a DB service, download assets. If one fails, the Pod is restarted (subject to restartPolicy).
Q: What is a sidecar container?
Answer: Helper container running alongside the main app. Uses: log shipping (Fluentd), proxy (Envoy/Istio), config watcher.
Q: Resource requests vs limits?
Answer:
  • Request: Guaranteed minimum; used for scheduling.
  • Limit: Hard cap. CPU over the limit -> throttled. RAM over the limit -> OOMKilled.
Q: What is a PodDisruptionBudget (PDB)?
Answer: Limits how many Pods can be down simultaneously during voluntary disruptions (node drain/upgrade). “Always keep at least 2 Pods up.”
Q: RollingUpdate vs Recreate strategy?
Answer:
  • RollingUpdate: New Pod up -> old Pod down, incrementally. Zero downtime.
  • Recreate: All old down -> all new up. Downtime.
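The rolling behavior is tunable on the Deployment; a sketch with zero-downtime settings:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 extra Pod during the rollout
      maxUnavailable: 0    # never drop below the desired replica count
```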
Q: What are the QoS classes?
Answer:
  1. Guaranteed: limits == requests for all containers. (Last to be evicted.)
  2. Burstable: requests < limits.
  3. BestEffort: no requests or limits. (First to be evicted.)
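QoS is not set directly; Kubernetes derives it from the resource fields. A sketch of a container that lands in the Guaranteed class (the image name is a placeholder):

```yaml
# Guaranteed: limits == requests for every container in the Pod
spec:
  containers:
  - name: app
    image: myapp          # hypothetical image
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "256Mi"
```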

3. Networking & Service Discovery

Q: What are the rules of the Kubernetes network model?
Answer:
  1. Every Pod gets its own IP.
  2. All Pods can reach all Pods without NAT.
Q: What are the Service types?
Answer:
  • ClusterIP: Internal virtual IP.
  • NodePort: Port from the 30000-32767 range on every node's IP.
  • LoadBalancer: Cloud load balancer.
  • ExternalName: DNS alias (CNAME) to an external service.
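A NodePort Service sketch (names are illustrative); omitting `nodePort` lets Kubernetes pick one from the 30000-32767 range:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - port: 80          # ClusterIP port
    targetPort: 8080  # container port
    nodePort: 30080   # must fall inside the node port range
```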
Q: How does in-cluster DNS work?
Answer: CoreDNS watches the API for new Services and adds records like my-svc.my-ns.svc.cluster.local (an A record pointing to the ClusterIP).
Q: Ingress vs Ingress Controller?
Answer:
  • Ingress: The rule (a resource). “Route /api to Service B.”
  • Controller: The implementation (e.g. an NGINX Pod). Reads the rules, updates nginx.conf.
Q: What is a NetworkPolicy?
Answer: The Kubernetes firewall for Pod traffic. Once a policy selects a Pod, everything not explicitly allowed is denied. Enforced by the CNI plugin (e.g. Calico); not every CNI supports it.
Q: What is a headless Service?
Answer: clusterIP: None. DNS returns the list of Pod IPs directly instead of a VIP. Used with StatefulSets (direct peer addressing).
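A headless Service sketch; DNS for `mysql.<namespace>.svc.cluster.local` resolves to the individual Pod IPs, and each StatefulSet Pod additionally gets a stable `mysql-0.mysql...` name:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None     # headless: no VIP, DNS returns Pod IPs
  selector:
    app: mysql
  ports:
  - port: 3306
```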
Q: What is a service mesh?
Answer: A sidecar proxy (Envoy) in every Pod. Features: mTLS, canary routing, circuit breaking, tracing.
Q: Flannel vs Calico?
Answer:
  • Flannel: VXLAN overlay. Simple. No NetworkPolicy support.
  • Calico: Layer 3 / BGP. More complex. Supports NetworkPolicy.
Q: What is the Gateway API?
Answer: The next-generation Ingress: a standardized, more expressive routing API with role separation (infra teams own Gateways, app teams own Routes).
Q: What does kubectl port-forward do?
Answer: Dev tool. Tunnels a local port to a Pod/Service port through the API Server.

4. Storage & Config

Q: PV vs PVC?
Answer:
  • PV (PersistentVolume): The actual storage (a disk). Created by an admin or a provisioner.
  • PVC (PersistentVolumeClaim): A request for storage. Created by the user; binds to a matching PV.
Q: What is a StorageClass?
Answer: Dynamic provisioning. A PVC asks for class “standard”; the StorageClass's provisioner talks to the cloud (e.g. AWS) -> creates an EBS volume -> creates a PV -> binds it to the PVC.
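A sketch of that chain; the provisioner and parameters are AWS-flavored assumptions:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: ebs.csi.aws.com      # the CSI driver does the actual provisioning
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  storageClassName: standard      # triggers dynamic provisioning
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```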
Q: What are the volume access modes?
Answer:
  • RWO: ReadWriteOnce (one node). Typical block storage.
  • RWX: ReadWriteMany (multiple nodes). NFS/EFS.
  (Also ROX: ReadOnlyMany, and RWOP: ReadWriteOncePod.)
Q: ConfigMap vs Secret?
Answer:
  • ConfigMap: Plain text. Consumed as env vars or files.
  • Secret: Base64-encoded, not encrypted. Encrypted at rest only if an encryption provider is configured.
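The base64 point is easy to demonstrate locally — it is reversible encoding, not encryption:

```shell
# base64 can be decoded by anyone; it provides no secrecy
echo -n 'admin' | base64      # YWRtaW4=
echo 'YWRtaW4=' | base64 -d   # admin
```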
Q: What is the Downward API?
Answer: Exposes Pod info (name, IP, namespace, labels) to the container as env vars or files.
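A sketch exposing Pod metadata as environment variables via `fieldRef` (the image name is a placeholder):

```yaml
spec:
  containers:
  - name: app
    image: myapp              # hypothetical image
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
```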
Q: What is an emptyDir volume?
Answer: Ephemeral volume. Starts empty, is shared between the Pod's containers, and is deleted with the Pod. Good for scratch space/cache.

5. Troubleshooting & Security (Deep Dive)

Q: How do you debug a CrashLoopBackOff?
Answer:
  1. kubectl logs --previous (app error from the last crashed run).
  2. kubectl describe pod (liveness probe failing? OOMKilled?).
  3. kubectl get events.
Q: How do you debug ImagePullBackOff?
Answer: Bad image tag? Missing private-registry auth (imagePullSecrets)? Network or registry unreachable?
Q: Why is a Pod stuck in Pending?
Answer: The scheduler cannot find a node. Insufficient CPU/RAM? Taints? Affinity rules? PVC still pending?
Q: Why is a Pod stuck in Terminating?
Answer: Finalizers not removed? Unresponsive storage or node? Last resort: kubectl delete pod <name> --grace-period=0 --force.
Q: Role vs ClusterRole?
Answer:
  • Role: Namespaced (“can read Pods in dev”).
  • ClusterRole: Cluster-wide (“can read Nodes”).
  • RoleBinding/ClusterRoleBinding: Connects a user, group, or ServiceAccount to a Role.
Q: What is a ServiceAccount?
Answer: An identity for Pods. Pods use the ServiceAccount token to authenticate to the API Server.
Q: What is a securityContext?
Answer: Defines privileges at the Pod or container level, e.g. runAsUser: 1000, readOnlyRootFilesystem: true.
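A sketch combining Pod-level and container-level settings (the image name is a placeholder):

```yaml
spec:
  securityContext:            # Pod level: applies to all containers
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: app
    image: myapp              # hypothetical image
    securityContext:          # container level: overrides/extends the Pod level
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```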
Q: What are admission webhooks?
Answer: Interceptors that run before an object is persisted.
  • Validating: “No, that violates policy.” (OPA Gatekeeper.)
  • Mutating: “I'll add a sidecar automatically.”
Q: What is OPA/Gatekeeper?
Answer: Policy as code. “Images must come from the internal registry”, “Ingress must use HTTPS”.
Q: Are Secrets encrypted in etcd?
Answer: Not by default — they are only base64-encoded. You must enable an encryption-at-rest provider for the API Server to encrypt Secret data in etcd.
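A sketch of the file passed to the API Server via `--encryption-provider-config`; the key material is a placeholder:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources: ["secrets"]
  providers:
  - aescbc:                   # the first provider is used for new writes
      keys:
      - name: key1
        secret: <base64-encoded 32-byte key>   # placeholder
  - identity: {}              # fallback so existing plaintext data stays readable
```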
Q: How do you harden Pod networking?
Answer: Traffic is allow-all by default. First step: a deny-all ingress NetworkPolicy per namespace, then whitelist what is needed.
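That first step looks like this — an empty `podSelector` matches every Pod in the namespace:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}             # selects all Pods in this namespace
  policyTypes: ["Ingress"]    # no ingress rules listed => all ingress denied
```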
Q: How do you get stronger container isolation?
Answer: gVisor / Kata Containers (selected via a RuntimeClass). Sandboxed containers with their own kernel layer for high isolation.
Q: How do you upgrade a cluster safely?
Answer:
  1. Upgrade control plane components.
  2. Drain each node (evict Pods).
  3. Upgrade the kubelet.
  4. Uncordon the node.
Q: Helm vs Kustomize?
Answer:
  • Helm: Templating ({{ .Values }}), package management (charts, releases). More powerful, more complex.
  • Kustomize: Overlays/patching. Built into kubectl (-k). Cleaner: no templates.
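A minimal Kustomize sketch with a hypothetical base/overlay layout; file names are assumptions:

```yaml
# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base                  # the shared manifests
patches:
- path: replicas-patch.yaml   # strategic-merge patch applied for prod
images:
- name: myapp
  newTag: v2.0.1              # pin the prod image tag
```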

6. Kubernetes Medium Level Questions

Q: What is a DaemonSet? Show an example.
Answer: Ensures one Pod per node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluentd:latest
Use cases: Logging agents, monitoring agents, network plugins.
Q: When do you use a StatefulSet?
Answer: For stateful applications that need stable network identity and per-pod storage.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
Features: Ordered deployment, stable pod names (mysql-0, mysql-1).
Q: Show Job and CronJob manifests.
Answer:
# Job: run once
apiVersion: batch/v1
kind: Job
metadata:
  name: backup
spec:
  template:
    spec:
      containers:
      - name: backup
        image: backup:latest
      restartPolicy: Never
  backoffLimit: 3

# CronJob: scheduled
apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup:latest
          restartPolicy: Never
Q: Show an init container example.
Answer: Init containers run before the app containers.
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 1; done']
  containers:
  - name: app
    image: myapp
Q: Show a sidecar example.
Answer: A helper container alongside the main container, sharing a volume.
spec:
  containers:
  - name: app
    image: myapp
  - name: log-shipper
    image: fluentd
    volumeMounts:
    - name: logs
      mountPath: /var/log
  volumes:
  - name: logs
    emptyDir: {}
Q: How do you set resource requests and limits?
Answer:
resources:
  requests:
    memory: "256Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "1000m"
  • Requests: Minimum guaranteed
  • Limits: Maximum allowed
Q: How do you define a PodDisruptionBudget?
Answer: Ensures minimum availability during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: myapp-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
Q: How do you write a NetworkPolicy?
Answer: Controls Pod-to-Pod traffic. This example allows only frontend Pods to reach backend Pods on TCP 8080:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
Q: How do you expose a Service via Ingress?
Answer:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp
            port:
              number: 80
Q: What does a service mesh provide? Show a canary route.
Answer: Traffic management, security, observability. The Istio VirtualService below splits traffic 90/10 between two subsets:
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp
  http:
  - route:
    - destination:
        host: myapp
        subset: v1
      weight: 90
    - destination:
        host: myapp
        subset: v2
      weight: 10

7. Kubernetes Advanced Level Questions

Q: What is a CustomResourceDefinition (CRD)?
Answer: Extends the Kubernetes API with new resource types.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              size:
                type: string
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
Q: What is an Operator?
Answer: A controller plus CRDs that automate application management (install, upgrade, backup, failover).
// Reconcile loop
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    var db examplev1.Database
    if err := r.Get(ctx, req.NamespacedName, &db); err != nil {
        // The object may already be deleted; don't treat NotFound as an error
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }
    
    // Create/update resources based on spec
    // ...
    
    return ctrl.Result{}, nil
}
Q: How do you register an admission webhook?
Answer: A webhook configuration intercepts API requests before persistence.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy
webhooks:
- name: validate.example.com
  clientConfig:
    service:
      name: webhook
      namespace: default
      path: /validate
  rules:
  - operations: ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
Q: How do you enforce Pod Security Standards?
Answer: Via Pod Security Admission labels on the Namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
Levels: Privileged, Baseline, Restricted.
Q: How do you reuse a ClusterRole in a single namespace?
Answer: Define the ClusterRole once and bind it with a namespaced RoleBinding; the rules then apply only in that namespace.
# ClusterRole for cross-namespace access
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: secret-reader
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list"]
  resourceNames: ["my-secret"]

# RoleBinding in specific namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-secrets
  namespace: production
subjects:
- kind: ServiceAccount
  name: myapp
  namespace: production
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
Q: What is the Cluster Autoscaler?
Answer: Automatically adjusts the number of nodes based on pending Pods and underutilization.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.25.0
        command:
        - ./cluster-autoscaler
        - --cloud-provider=gce
        - --nodes=1:10:node-pool-1
Q: How do you prioritize critical Pods?
Answer:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "High priority pods"

---
apiVersion: v1
kind: Pod
metadata:
  name: critical-app
spec:
  priorityClassName: high-priority
  containers:
  - name: app
    image: myapp
Q: How do you apply taints and tolerations?
Answer:
# Taint node
kubectl taint nodes node1 key=value:NoSchedule

# Pod with toleration
spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
Q: How do you spread replicas across nodes?
Answer: Pod anti-affinity keyed on the hostname:
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - myapp
        topologyKey: kubernetes.io/hostname
Use case: Spread pods across nodes for HA.
Q: What are the essential debugging commands?
Answer:
# Pod logs
kubectl logs pod-name -c container-name --previous

# Exec into pod
kubectl exec -it pod-name -- /bin/sh

# Describe for events
kubectl describe pod pod-name

# Port forward
kubectl port-forward pod-name 8080:80

# Debug with ephemeral container
kubectl debug pod-name -it --image=busybox

# Node issues
kubectl get nodes
kubectl describe node node-name
kubectl top nodes

# Network debugging
kubectl run tmp --image=nicolaka/netshoot -it --rm