Kubernetes Storage

Stateful applications require storage that persists beyond the lifecycle of a Pod. Containers are ephemeral by design — everything inside a container’s filesystem vanishes when the container restarts. For a stateless web server, that is fine. For a database, that is a catastrophe. Kubernetes storage abstractions solve this by separating “I need 100GB of fast disk” (what developers ask for) from “here is an AWS EBS volume in us-east-1a” (what infrastructure provides).

The Storage Abstraction

Kubernetes separates storage infrastructure (Admin) from storage consumption (User). Think of it like renting an apartment: the developer (tenant) says “I need a 2-bedroom apartment” (PVC), the cloud provider (landlord) offers available units (PVs), and the StorageClass is the real estate agency that matches requests to inventory — or builds new units on demand.

1. Volumes (Ephemeral)

Tied to the Pod’s lifecycle. If the Pod dies, the data is lost (except for hostPath).

emptyDir

Creates a temporary directory that exists for the lifetime of the Pod. When the Pod is deleted, the data is gone. Think of it as a whiteboard in a meeting room — useful during the meeting, erased when the room is freed.
  • Use case: Cache, scratch space, sharing data between containers in a Pod (e.g., a sidecar writes logs to an emptyDir that the main container reads from).
volumes:
- name: cache-volume
  emptyDir: {}       # Uses disk by default. Set medium: Memory for tmpfs (faster, counts against memory limits)
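The sidecar use case above can be sketched as a single Pod in which one container writes to an emptyDir and a second container reads from it. The Pod name, container names, image tag, and file path are illustrative assumptions, not part of any real deployment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: emptydir-demo            # hypothetical name
spec:
  containers:
  - name: writer                 # appends a timestamp every 5 seconds
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader                 # follows the file written by the other container
    image: busybox:1.36
    command: ["sh", "-c", "tail -F /data/out.log"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
      readOnly: true             # the reader never needs to write
  volumes:
  - name: shared-data
    emptyDir: {}                 # removed together with the Pod
```

Both containers see the same directory because they mount the same named volume. A container restart keeps the data, but deleting the Pod discards it.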

hostPath

Mounts a file/directory from the Node’s filesystem directly into the Pod.
  • Use case: Node agents (logging, monitoring), accessing Docker socket, or node-level configuration.
  • Warning: Pods become tied to that specific node. If the pod is rescheduled to a different node, it sees different (or no) data. This also creates a security risk — a pod with hostPath access can read sensitive files from the node filesystem.
Production gotcha: Never use hostPath for application data in production. It breaks portability, prevents rescheduling, and creates a single point of failure. The only legitimate uses are node-level agents (DaemonSets for logging/monitoring) and very specific debugging scenarios.
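For the legitimate node-agent case, a minimal sketch might look like the DaemonSet below, which mounts the node's log directory read-only. The names, namespace, and busybox image are placeholders standing in for a real log shipper, and the log file path is an assumption that varies by node OS.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-log-agent           # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-log-agent
  template:
    metadata:
      labels:
        app: node-log-agent
    spec:
      containers:
      - name: agent
        image: busybox:1.36      # placeholder for a real log shipper image
        command: ["sh", "-c", "tail -F /host/var/log/syslog"]  # assumed log file
        volumeMounts:
        - name: varlog
          mountPath: /host/var/log
          readOnly: true         # the agent only reads node logs
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
          type: Directory        # fail fast if the path is missing on the node
```

Because a DaemonSet runs one pod per node, the node affinity that makes hostPath a problem for application data is exactly what this workload wants. Mounting read-only limits the security exposure.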

2. Persistent Storage (Durable)

PersistentVolume (PV)

A piece of storage in the cluster (e.g., AWS EBS, NFS share).
  • Managed by Admins.
  • Exists independently of any Pod.

PersistentVolumeClaim (PVC)

A request for storage by a user.
  • “I need 10Gi of ReadWriteOnce storage.”
  • Kubernetes finds a matching PV and binds them.
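As a rough illustration of the static (admin-managed) path, the PV below would satisfy the 10Gi ReadWriteOnce request described above, provided the claim asks for the same storageClassName. The NFS server, export path, and the class name `manual` are assumptions made up for this sketch.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-10g               # hypothetical name
spec:
  capacity:
    storage: 10Gi                # a claim for 10Gi or less can bind to this PV
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual       # assumed label; the PVC must request the same class
  nfs:
    server: nfs.example.internal # placeholder NFS server
    path: /exports/data          # placeholder export path
```

Binding matches on capacity, access mode, and class name. A PV larger than the request can still be bound, in which case the claim receives the whole volume.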

StorageClass (SC)

Enables Dynamic Provisioning. This is the modern way to manage storage — instead of manually creating PVs for every request (imagine doing that for 500 PVCs), the StorageClass automatically provisions new volumes on demand through a CSI driver. Most cloud-managed Kubernetes clusters come with a default StorageClass pre-configured.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: ebs.csi.aws.com           # CSI driver that provisions the actual storage
parameters:
  type: gp2                             # AWS EBS volume type (gp2, gp3, io1, io2)
reclaimPolicy: Delete                   # What happens to the PV when PVC is deleted
                                        # (Delete = remove PV and underlying disk)
allowVolumeExpansion: true              # Allow PVCs to grow (but never shrink!)

Practical Example: Database Storage

Step 1: Create PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: standard  # References the StorageClass above for dynamic provisioning.
                              # If omitted, the cluster's default StorageClass is used.
  accessModes:
    - ReadWriteOnce           # Single node can mount as read-write. The usual mode for block storage.
  resources:
    requests:
      storage: 20Gi           # Requested size. Cloud providers round up to the nearest
                               # supported size (e.g., AWS EBS minimum is 1 GiB).

Step 2: Use PVC in Pod

apiVersion: v1
kind: Pod
metadata:
  name: mysql
spec:
  containers:
  - name: mysql
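    image: mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD   # The mysql image refuses to initialize without a root password setting.
      valueFrom:
        secretKeyRef:
          name: mysql-secret      # Assumed to exist; any Secret with a "password" key works.
          key: password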
    volumeMounts:
    - name: mysql-persistent-storage
      mountPath: /var/lib/mysql   # MySQL's default data directory.
                                  # The PVC data survives pod restarts and rescheduling.
  volumes:
  - name: mysql-persistent-storage
    persistentVolumeClaim:
      claimName: mysql-pv-claim   # Must match the PVC name in the same namespace
Production gotcha: In production, you would use a StatefulSet instead of a bare Pod for MySQL. StatefulSets give you stable network identities, ordered deployment, and automatic PVC creation via volumeClaimTemplates. A bare Pod with a PVC works for learning, but the Pod will not be rescheduled if the node fails — only controllers (Deployments, StatefulSets) provide self-healing.
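A minimal sketch of the StatefulSet variant is shown below. It assumes a headless Service named `mysql` and a Secret named `mysql-secret` already exist; both names are hypothetical.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql             # assumes a headless Service named "mysql" exists
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret # hypothetical Secret holding the root password
              key: password
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data                 # produces a PVC named data-mysql-0
    spec:
      storageClassName: standard
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
```

Each replica gets its own PVC, named from the template name and the pod name. If the pod is recreated, it reattaches to the same claim.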

Access Modes

| Mode | Description | Use Case |
| --- | --- | --- |
| ReadWriteOnce (RWO) | Mounted read-write by a single node | Block storage (AWS EBS, Azure Disk) |
| ReadOnlyMany (ROX) | Mounted read-only by multiple nodes | Static content (NFS) |
| ReadWriteMany (RWX) | Mounted read-write by multiple nodes | Shared filesystems (NFS, EFS) |
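For comparison with the RWO claim used earlier, a shared-filesystem claim might look like the sketch below. The StorageClass name `efs-sc` is an assumption; it stands for any class backed by a driver that supports RWX, such as efs.csi.aws.com.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads           # hypothetical name
spec:
  storageClassName: efs-sc       # assumed class backed by a shared filesystem
  accessModes:
    - ReadWriteMany              # pods on different nodes can mount read-write
  resources:
    requests:
      storage: 5Gi               # required field; some file backends do not enforce it
```

Requesting ReadWriteMany against a block-storage class typically leaves the claim Pending, because the backend cannot attach one disk to several nodes.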

Reclaim Policies

What happens to the PV when the PVC is deleted? This is one of the most consequential configuration decisions in Kubernetes storage, and getting it wrong is how teams accidentally delete production databases.
  • Retain: PV remains in a “Released” state. Data is safe. An administrator must manually reclaim it (delete the PV, re-create it, or clean up the underlying storage). This is the safe choice for production.
  • Delete: PV and underlying storage (e.g., EBS volume) are deleted immediately and irreversibly. This is the default for most dynamic StorageClasses. Convenient for dev/test, dangerous for production.
  • Recycle: (Deprecated) Performs rm -rf on the volume. Do not use.
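A production-oriented StorageClass that opts out of the Delete default could be sketched as follows. The class name is hypothetical, and the example assumes the AWS EBS CSI driver is installed.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-retain           # hypothetical name
provisioner: ebs.csi.aws.com     # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Retain            # PV and EBS volume survive deletion of the PVC
volumeBindingMode: WaitForFirstConsumer  # provision in the zone where the pod lands
allowVolumeExpansion: true
```

PVs created through this class inherit `Retain`. After a PVC is deleted, the PV shows up as Released and the disk keeps accruing cost until someone removes it deliberately.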

CSI (Container Storage Interface)

CSI is the standard interface between Kubernetes and storage providers.
| Provider | Driver | Use Case |
| --- | --- | --- |
| AWS EBS | ebs.csi.aws.com | Block storage on AWS |
| AWS EFS | efs.csi.aws.com | Shared filesystem (RWX) |
| GCP PD | pd.csi.storage.gke.io | Block storage on GCP |
| Azure Disk | disk.csi.azure.com | Block storage on Azure |
| Longhorn | driver.longhorn.io | On-prem distributed storage |
| Ceph | rbd.csi.ceph.com | On-prem enterprise storage |

Installing a CSI Driver (AWS EBS Example)

# Install via Helm
helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
  --namespace kube-system

Volume Snapshots

Take point-in-time snapshots of PVCs for backup or cloning.

VolumeSnapshotClass

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshot-class
driver: ebs.csi.aws.com           # Must match the CSI driver for the storage
deletionPolicy: Delete             # Delete the cloud snapshot when the VolumeSnapshot object is deleted.
                                   # Use "Retain" if you want snapshots to survive deletion of the K8s resource.

Create a Snapshot

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-snapshot
spec:
  volumeSnapshotClassName: ebs-snapshot-class
  source:
    persistentVolumeClaimName: mysql-pv-claim  # Source PVC to snapshot
    # NOTE: The PVC must be backed by a CSI driver that supports snapshots.
    # Snapshots are crash-consistent, not application-consistent. For databases,
    # flush data to disk (e.g., FLUSH TABLES WITH READ LOCK for MySQL) before snapshotting.

Restore from Snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-restored
spec:
  storageClassName: standard
  dataSource:
    name: mysql-snapshot              # Reference to the VolumeSnapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi     # Must be >= the original PVC size. Cannot shrink during restore.

Volume Expansion

Increase PVC size without data loss (if StorageClass supports it).
# StorageClass must have
allowVolumeExpansion: true
# Edit PVC to increase size
kubectl edit pvc mysql-pv-claim
# Change resources.requests.storage from 20Gi to 50Gi
Interview Tip: Volume expansion only works if the StorageClass has allowVolumeExpansion: true. You cannot shrink volumes.
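If you manage manifests declaratively, the same expansion can be expressed by re-applying the PVC with a larger request instead of editing it interactively. This sketch reuses the claim from the database example and assumes its StorageClass allows expansion.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi              # raised from 20Gi; a smaller value is rejected
```

Every other field must stay identical to the existing claim, since most of a bound PVC's spec is immutable.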

Storage Best Practices

Use dynamic provisioning. Don’t manually create PVs; let the StorageClass provision them automatically.
Pick the access mode that matches the workload:
  • RWO: Most block storage (EBS, Azure Disk)
  • RWX: Shared filesystems (EFS, NFS), needed when pods on multiple nodes write to the same volume
Pick the reclaim policy that matches the environment:
  • Delete: For dev/test (auto-cleanup)
  • Retain: For production (prevent accidental data loss)
Plan for backups:
  • Use VolumeSnapshots for cloud storage
  • Use Velero for cluster-wide backup/restore
  • Test your restore process regularly!

Ephemeral Volumes

For temporary storage that doesn’t need to persist.

emptyDir with Memory Backend

volumes:
- name: cache
  emptyDir:
    medium: Memory    # Use RAM
    sizeLimit: 256Mi

Generic Ephemeral Volumes

Dynamically provisioned volumes tied to pod lifecycle:
volumes:
- name: scratch
  ephemeral:
    volumeClaimTemplate:
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 10Gi

Interview Questions & Answers

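Question: What is the difference between a PersistentVolume and a PersistentVolumeClaim?
| Aspect | PersistentVolume (PV) | PersistentVolumeClaim (PVC) |
| --- | --- | --- |
| Created by | Admin (or dynamically by a provisioner) | Developer |
| Represents | Actual storage resource | Request for storage |
| Scope | Cluster-scoped | Namespace-scoped |
| Analogy | Hotel room | Reservation |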
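Question: What happens to a PV when its PVC is deleted?
It depends on the reclaim policy of the PV:
  • Delete: PV and underlying storage are deleted
  • Retain: PV becomes Released (data preserved, but the PV cannot be bound by a new claim until an administrator reclaims it)
The PV moves through the phases Available → Bound → Released.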
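Question: How does dynamic provisioning work?
  1. User creates a PVC referencing a StorageClass
  2. The provisioner named in the StorageClass (for CSI, the driver's external-provisioner controller) sees the PVC
  3. The controller calls the CSI driver to provision storage
  4. The CSI driver creates the actual storage (e.g., an AWS EBS volume)
  5. A PV is automatically created and bound to the PVC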
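Question: What are the access modes, and when do you need RWX?
| Mode | Meaning | Example Storage |
| --- | --- | --- |
| RWO | Single node R/W | AWS EBS, Azure Disk |
| ROX | Multiple nodes read-only | NFS, S3-backed |
| RWX | Multiple nodes R/W | AWS EFS, NFS, CephFS |
Use RWX when pods on different nodes need to write to the same volume.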
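Question: How do you copy data from one PVC to another?
Option 1: Pod-based copy
kubectl run copy-pod --image=busybox -- sleep infinity
# kubectl run alone does not mount any PVCs. Declare both PVCs under
# volumes/volumeMounts (Pod manifest or --overrides), then copy with cp
# (or rsync, using an image that ships it)
Option 2: VolumeSnapshot
  • Create snapshot of source PVC
  • Create new PVC from snapshot
Option 3: Velero
  • Backup PVC with Velero
  • Restore to new PVC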
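The pod-based copy in Option 1 can be sketched as a one-shot Pod that mounts both claims and copies everything across. The destination claim name `mysql-pv-claim-new` is a stand-in for whichever PVC you created as the target.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-copy                 # hypothetical name
spec:
  restartPolicy: Never           # run the copy once, then stop
  containers:
  - name: copy
    image: busybox:1.36
    command: ["sh", "-c", "cp -a /source/. /target/"]  # preserves ownership and modes
    volumeMounts:
    - name: source
      mountPath: /source
      readOnly: true
    - name: target
      mountPath: /target
  volumes:
  - name: source
    persistentVolumeClaim:
      claimName: mysql-pv-claim      # source PVC from the database example
      readOnly: true
  - name: target
    persistentVolumeClaim:
      claimName: mysql-pv-claim-new  # assumed destination PVC
```

With RWO block storage, both volumes must be attachable to the same node, so the source should not be in use by a pod elsewhere and both disks need to live in the same zone. Stop the database first so the files are not changing during the copy.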
Question: What is CSI, and why does it matter?
The Container Storage Interface is a standard API for storage drivers.
Benefits:
  • Vendor-neutral storage integration
  • Supports any storage system with a CSI driver
  • Features: snapshots, cloning, resizing
  • Decouples storage from Kubernetes release cycle

Common Pitfalls

  1. Wrong Access Mode: Using RWO when you need pods on multiple nodes to write. Check if your storage supports RWX.
  2. Delete Reclaim Policy in Production: Accidentally deleting a PVC deletes your data. Use Retain for production.
  3. Not Testing Restore: Having snapshots is useless if you’ve never tested restoring from them.
  4. Undersizing Volumes: Disk full = application crash. Use monitoring and volume expansion.
  5. StatefulSet Storage Orphans: Deleting a StatefulSet doesn’t delete its PVCs by default. Clean them up manually, or set persistentVolumeClaimRetentionPolicy on the StatefulSet so they are removed automatically.

Key Takeaways

  • Volumes are ephemeral (Pod lifecycle).
  • PV/PVC provides durable storage.
  • StorageClass automates PV creation (Dynamic Provisioning).
  • Access Modes determine how many nodes can mount the volume.
  • Use VolumeSnapshots for backup and cloning.
  • CSI drivers enable storage vendor integration.
  • Always test your backup and restore procedures!

Interview Deep-Dive

Question: How would you design storage for a production database running on Kubernetes?
Strong Answer:
  • First concern is data durability. I would use a StatefulSet with a volumeClaimTemplate backed by cloud block storage (AWS EBS gp3, GCP PD-SSD). The reclaim policy must be Retain so accidental PVC deletion does not destroy the volume. I have seen a team lose a production database because they deleted a StatefulSet and its PVCs during cleanup, and the Delete reclaim policy destroyed the underlying volumes.
  • Second concern is performance. Database I/O patterns are random read/write with fsync calls. I would use SSD-backed storage and test actual IOPS with fio before going live. Undersized EBS volumes might throttle at 3000 IOPS, which is fatal for a transaction-heavy database.
  • Third concern is backup and recovery. I would use VolumeSnapshots for point-in-time backups (crash-consistent, not application-consistent). For application-consistent backups, configure WAL archiving to ship write-ahead logs to S3.
  • Fourth concern is failover time. EBS volumes are ReadWriteOnce. If the node fails, the volume must detach before reattaching to another node. This takes 1-5 minutes, during which the database is unavailable.
Follow-up: What is the difference between a crash-consistent and an application-consistent snapshot?
A crash-consistent snapshot captures exactly what is on disk at a point in time, as if the power had been cut; anything still sitting in memory buffers is not included. PostgreSQL can recover from this using WAL replay, but it takes time and uncommitted transactions are rolled back. An application-consistent snapshot quiesces the application first (for example, CHECKPOINT in PostgreSQL) so that buffers are flushed to disk, then takes the snapshot. The database starts clean with little or no recovery needed. VolumeSnapshots are crash-consistent by default. For application consistency, you need a pre-snapshot hook (which Velero supports via pod annotations).
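As a hedged sketch of the Velero hook mentioned above, the annotations below ask Velero to run a command in the `postgres` container before it backs up the pod. The Pod, Secret, and PVC names are hypothetical, and the CHECKPOINT command only flushes buffers; writes that arrive afterward still rely on WAL replay, so treat this as narrowing the recovery window rather than a full quiesce.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres                 # hypothetical pod
  annotations:
    pre.hook.backup.velero.io/container: postgres
    pre.hook.backup.velero.io/command: '["/bin/sh", "-c", "psql -U postgres -c CHECKPOINT"]'
    pre.hook.backup.velero.io/timeout: 60s
spec:
  containers:
  - name: postgres
    image: postgres:16
    env:
    - name: POSTGRES_PASSWORD
      valueFrom:
        secretKeyRef:
          name: postgres-secret  # hypothetical Secret
          key: password
    - name: PGDATA               # use a subdirectory so lost+found does not block initdb
      value: /var/lib/postgresql/data/pgdata
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: postgres-data   # hypothetical PVC
```

In a real deployment these annotations would go on the pod template of the StatefulSet rather than on a bare Pod.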
Question: A PVC is stuck in Pending. How do you debug it?
Strong Answer:
  • Start with kubectl describe pvc <name> and read the Events section:
  • “waiting for first consumer to be created before binding” — The StorageClass has volumeBindingMode: WaitForFirstConsumer. The PV is not provisioned until a pod actually schedules. If the pod is also Pending, check the pod’s events separately.
  • “no persistent volumes available for this claim and no storage class is set” — The PVC has no storageClassName, and there is no default StorageClass. Fix: set the storageClassName or mark a StorageClass as default.
  • “failed to provision volume: the CSI driver is not found” — The CSI driver is not installed or its pods are down. Check kubectl get pods -n kube-system | grep csi.
  • “exceeded quota: requested storage exceeds namespace quota” — The namespace ResourceQuota for storage has been hit.
  • I would also verify the StorageClass exists (kubectl get sc), check that the provisioner is healthy, and confirm the requested access mode is supported by the storage backend (you cannot get RWX from EBS).
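Two of the causes above are configuration issues that can be illustrated directly: a missing default StorageClass and a namespace storage quota. The sketch below uses hypothetical names (`default-gp3`, `storage-quota`, `team-a`) and assumes the AWS EBS CSI driver is installed.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default-gp3              # hypothetical name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"  # used when a PVC omits storageClassName
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer  # PVCs stay Pending until a pod is scheduled
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota            # hypothetical name
  namespace: team-a              # hypothetical namespace
spec:
  hard:
    requests.storage: 500Gi      # total requested across all PVCs in the namespace
    persistentvolumeclaims: "20" # maximum number of PVCs in the namespace
```

Only one StorageClass should carry the default annotation. A new PVC that would push the namespace past either quota limit is rejected at creation time.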
Follow-up: What happens to data when a StatefulSet is scaled down? Are the PVCs deleted?
No, not by default. When you scale from 3 to 2, pod mysql-2 is terminated and deleted, but PVC data-mysql-2 is deliberately preserved. This is a safety feature: Kubernetes assumes you might scale back up and want the data intact. The downside is orphaned PVCs accumulating storage costs. You must manually delete them after confirming the data is no longer needed. Some teams run periodic scripts that identify PVCs that have not been mounted by any pod for longer than a threshold.
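On recent Kubernetes versions, the retention behavior can also be declared on the StatefulSet itself through `persistentVolumeClaimRetentionPolicy`. The sketch below is one possible setting, not a recommendation; the Service and Secret names are assumptions.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql             # assumes a headless Service named "mysql"
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenScaled: Delete           # remove data-mysql-N when replica N is scaled away
    whenDeleted: Retain          # keep every PVC if the StatefulSet itself is deleted
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret # hypothetical Secret
              key: password
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
```

Both fields default to `Retain`, which matches the behavior described above. Whether the underlying disk is removed once a PVC is deleted still depends on the PV's reclaim policy.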
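Question: What is the difference between block, file, and object storage, and when would you use each on Kubernetes?
Strong Answer: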
  • Block storage (EBS, Azure Disk, GCP PD): Attached to a single node (ReadWriteOnce). Lowest latency, highest IOPS. Best for databases and workloads needing fast random I/O. Limitation is RWO — the pod is tied to a specific node until the volume detaches.
  • File storage (EFS, NFS, CephFS): Shared filesystem for multiple pods simultaneously (ReadWriteMany). Higher latency than block storage, but enables shared access. Best for legacy applications needing a shared filesystem, or when multiple pods write to the same directory.
  • Object storage (S3, GCS, MinIO): Accessed via HTTP API, not POSIX operations. Virtually unlimited scalability, very low cost per GB. Best for storing artifacts (images, videos, logs, backups) accessed by application code via SDK. Not suitable for database storage.
  • Typical pattern: database uses block storage (PV with EBS), application pods store user uploads in object storage (S3 via SDK), and a legacy reporting tool uses file storage (EFS mounted as RWX).
Follow-up: Can you use S3 as a PersistentVolume? What are the trade-offs?
Yes, through the Mountpoint for Amazon S3 CSI driver or FUSE-based tools such as s3fs-fuse. These mount an S3 bucket as a filesystem inside the pod. The trade-off is severe: S3 is an object store, not a POSIX filesystem. Directory listing is slow, and renaming or appending to a file is either unsupported or requires rewriting the whole object, depending on the tool. Random I/O performance is orders of magnitude slower than block storage. I would only use this for sequential reads of large files (like ML training data). For anything transactional, use EBS.
