Kubernetes Storage

Stateful applications require storage that persists beyond the lifecycle of a Pod.

The Storage Abstraction

Kubernetes separates storage provisioning, which is the cluster administrator's concern, from storage consumption, which is the application developer's concern.

1. Volumes (Ephemeral)

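These volumes are tied to the Pod's lifecycle. The data survives container restarts, but it is lost when the Pod is deleted or removed from its node (hostPath is the exception, because the files live on the node itself).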

emptyDir

Creates an empty directory when the Pod is assigned to a node and deletes it when the Pod is removed from that node.
  • Use case: Cache, scratch space, sharing data between containers in a Pod.
volumes:
- name: cache-volume
  emptyDir: {}
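
To illustrate the "sharing data between containers" use case, here is a minimal sketch of a Pod in which two containers mount the same emptyDir. The Pod name, container names, image, and file paths are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-cache-demo        # hypothetical name
spec:
  containers:
  - name: writer
    image: busybox
    # Appends a timestamp to a file in the shared directory every 5 seconds
    command: ["sh", "-c", "while true; do date >> /cache/out.txt; sleep 5; done"]
    volumeMounts:
    - name: cache-volume
      mountPath: /cache
  - name: reader
    image: busybox
    # Follows the file that the writer container produces
    command: ["sh", "-c", "touch /data/out.txt; tail -f /data/out.txt"]
    volumeMounts:
    - name: cache-volume
      mountPath: /data
  volumes:
  - name: cache-volume
    emptyDir: {}
```

Each container can mount the volume at a different path; both paths point at the same directory on the node.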

hostPath

Mounts a file/directory from the Node’s filesystem.
  • Use case: Node agents (logging, monitoring).
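  • Warning: The data lives on one node only, so a Pod that is rescheduled to another node will not see it. hostPath also exposes the node's filesystem to the container, which is a security risk.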
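
As a rough sketch of the node-agent use case, the Pod below mounts the node's /var/log directory read-only. The Pod name, image, and command are placeholders; a real agent would normally run as a DaemonSet so that one copy lands on every node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: log-reader               # hypothetical name
spec:
  containers:
  - name: agent
    image: busybox
    # Placeholder command: list the node's log directory, then idle
    command: ["sh", "-c", "ls /host/var/log; sleep 3600"]
    volumeMounts:
    - name: node-logs
      mountPath: /host/var/log
      readOnly: true             # the agent only needs to read the node's logs
  volumes:
  - name: node-logs
    hostPath:
      path: /var/log
      type: Directory            # fail if the path does not already exist on the node
```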

2. Persistent Storage (Durable)

PersistentVolume (PV)

A piece of storage in the cluster (e.g., AWS EBS, NFS share).
  • Managed by Admins.
  • Exists independently of any Pod.

PersistentVolumeClaim (PVC)

A request for storage by a user.
  • “I need 10Gi of ReadWriteOnce storage.”
  • Kubernetes finds a matching PV and binds them.
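
A minimal sketch of static binding, assuming an NFS export is already available: the admin creates the PV, the user creates the PVC, and Kubernetes binds them because capacity, access mode, and storage class all match. The names, server address, and export path are placeholders.

```yaml
# Created by the admin: describes storage that already exists
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv                   # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # no class: only claims that also ask for "" can bind
  nfs:
    server: nfs.example.internal # placeholder address
    path: /exports/data          # placeholder export path
---
# Created by the user: a request that the PV above satisfies
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim               # hypothetical name
spec:
  storageClassName: ""           # disables dynamic provisioning for this claim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

A claim can bind to a PV that is larger than requested, but never to one that is smaller.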

StorageClass (SC)

Enables Dynamic Provisioning.
  • Instead of an admin creating PVs manually, the provisioner named in the SC creates them on demand when a PVC is created.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: ebs.csi.aws.com  # CSI driver; the in-tree kubernetes.io/aws-ebs provisioner is deprecated
parameters:
  type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true

Practical Example: Database Storage

Step 1: Create PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: standard  # Uses dynamic provisioning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi

Step 2: Use PVC in Pod

apiVersion: v1
kind: Pod
metadata:
  name: mysql
spec:
  containers:
  - name: mysql
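    image: mysql:8.0
    env:
    - name: MYSQL_ROOT_PASSWORD   # the mysql image exits at startup without a root password setting
      valueFrom:
        secretKeyRef:
          name: mysql-secret      # hypothetical Secret; create it before the Pod
          key: password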
    volumeMounts:
    - name: mysql-persistent-storage
      mountPath: /var/lib/mysql
  volumes:
  - name: mysql-persistent-storage
    persistentVolumeClaim:
      claimName: mysql-pv-claim

Access Modes

| Mode | Description | Use Case |
| --- | --- | --- |
| ReadWriteOnce (RWO) | Mounted by a single node as read-write | Block storage (AWS EBS, Azure Disk) |
| ReadOnlyMany (ROX) | Mounted by multiple nodes as read-only | Static content (NFS) |
| ReadWriteMany (RWX) | Mounted by multiple nodes as read-write | Shared filesystems (NFS, EFS) |
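
As an illustration of when RWX matters, the sketch below has two replicas of a Deployment mount the same claim. It assumes a StorageClass named shared-fs (a hypothetical name) backed by a driver that supports ReadWriteMany, such as NFS or EFS. With an RWO volume, replicas scheduled onto different nodes would fail to attach it.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content           # hypothetical name
spec:
  storageClassName: shared-fs    # hypothetical class backed by an RWX-capable driver
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2                    # both replicas mount the same volume
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
        volumeMounts:
        - name: content
          mountPath: /usr/share/nginx/html
      volumes:
      - name: content
        persistentVolumeClaim:
          claimName: shared-content
```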

Reclaim Policies

What happens to the PV when the PVC is deleted?
  • Retain: PV remains (Released state). Data is safe. Admin must manually reclaim.
  • Delete: PV and underlying storage (e.g., EBS volume) are deleted.
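  • Recycle: (Deprecated) Runs a basic scrub (rm -rf on the volume's contents) and makes the PV available for a new claim. Use dynamic provisioning instead.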
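
Dynamically provisioned PVs inherit their reclaim policy from the StorageClass, which defaults to Delete. A sketch of a class that keeps the data instead, assuming the AWS EBS CSI driver is installed; the class name and volume type are illustrative.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-retain          # hypothetical name
provisioner: ebs.csi.aws.com     # assumes the EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Retain            # PVs created from this class survive PVC deletion
allowVolumeExpansion: true
```

With Retain, the underlying disk keeps costing money after the PVC is gone, so someone has to clean up Released PVs by hand.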

CSI (Container Storage Interface)

CSI is the standard interface between Kubernetes and storage providers.
| Provider | Driver | Use Case |
| --- | --- | --- |
| AWS EBS | ebs.csi.aws.com | Block storage on AWS |
| AWS EFS | efs.csi.aws.com | Shared filesystem (RWX) |
| GCP PD | pd.csi.storage.gke.io | Block storage on GCP |
| Azure Disk | disk.csi.azure.com | Block storage on Azure |
| Longhorn | driver.longhorn.io | On-prem distributed storage |
| Ceph | rbd.csi.ceph.com | On-prem enterprise storage |

Installing a CSI Driver (AWS EBS Example)

# Install via Helm
helm repo add aws-ebs-csi-driver https://kubernetes-sigs.github.io/aws-ebs-csi-driver
helm install aws-ebs-csi-driver aws-ebs-csi-driver/aws-ebs-csi-driver \
  --namespace kube-system

Volume Snapshots

Take point-in-time snapshots of PVCs for backup or cloning.

VolumeSnapshotClass

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapshot-class
driver: ebs.csi.aws.com
deletionPolicy: Delete

Create a Snapshot

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mysql-snapshot
spec:
  volumeSnapshotClassName: ebs-snapshot-class
  source:
    persistentVolumeClaimName: mysql-pv-claim

Restore from Snapshot

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-restored
spec:
  storageClassName: standard
  dataSource:
    name: mysql-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi

Volume Expansion

Increase PVC size without data loss (if StorageClass supports it).
# StorageClass must have
allowVolumeExpansion: true
# Edit PVC to increase size
kubectl edit pvc mysql-pv-claim
# Change resources.requests.storage from 20Gi to 50Gi
Interview Tip: Volume expansion only works if the StorageClass has allowVolumeExpansion: true and the storage driver supports resizing. You cannot shrink volumes.
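
If you keep manifests in version control, the same resize can be expressed declaratively rather than through an interactive edit. A sketch, reusing the claim from the database example with only the requested size changed:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi              # was 20Gi; the size can only grow
```

Re-applying this manifest with kubectl apply -f triggers the expansion. Depending on the driver, the filesystem may only be resized once the volume is mounted by a Pod.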

Storage Best Practices

Use dynamic provisioning: Don't manually create PVs. Let the StorageClass provision them automatically.
Choose the access mode that matches the storage:
  • RWO: Most block storage (EBS, Azure Disk)
  • RWX: Shared filesystems (EFS, NFS), needed when multiple Pods write to the same volume
Choose the reclaim policy per environment:
  • Delete: For dev/test (auto-cleanup)
  • Retain: For production (prevent accidental data loss)
Plan for backups:
  • Use VolumeSnapshots for cloud storage
  • Use Velero for cluster-wide backup/restore
  • Test your restore process regularly!

Ephemeral Volumes

For temporary storage that doesn’t need to persist.

emptyDir with Memory Backend

volumes:
- name: cache
  emptyDir:
    medium: Memory    # Use RAM
    sizeLimit: 256Mi

Generic Ephemeral Volumes

Dynamically provisioned volumes tied to pod lifecycle:
volumes:
- name: scratch
  ephemeral:
    volumeClaimTemplate:
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 10Gi

Interview Questions & Answers

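Q: What is the difference between a PV and a PVC?
| Aspect | PersistentVolume (PV) | PersistentVolumeClaim (PVC) |
| --- | --- | --- |
| Created by | Admin (or dynamically) | Developer |
| Represents | Actual storage resource | Request for storage |
| Scope | Cluster-scoped | Namespace-scoped |
| Analogy | Hotel room | Reservation |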
Q: What happens to the PV when the PVC is deleted?
Depends on the Reclaim Policy of the PV:
  • Delete: PV and underlying storage are deleted
  • Retain: PV becomes Released (data preserved, but not reusable until an admin reclaims it)
The PV goes through states: Available → Bound → Released
Q: How does dynamic provisioning work?
  1. User creates a PVC referencing a StorageClass
  2. The provisioner controller for that StorageClass sees the PVC
  3. The controller calls the CSI driver to provision storage
  4. The CSI driver creates the actual storage (e.g., an AWS EBS volume)
  5. A PV is automatically created and bound to the PVC
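Q: What are the access modes, and when do you need RWX?
| Mode | Meaning | Example Storage |
| --- | --- | --- |
| RWO | Single node R/W | AWS EBS, Azure Disk |
| ROX | Multiple nodes read-only | NFS, S3-backed |
| RWX | Multiple nodes R/W | AWS EFS, NFS, CephFS |
Use RWX when Pods on multiple nodes need to write to the same volume.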
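Q: How do you migrate data from one PVC to another?
Option 1: Pod-based copy
# Run a helper Pod whose spec mounts both PVCs (kubectl run alone does not attach PVCs; use a manifest or --overrides)
kubectl run copy-pod --image=busybox -- sleep infinity
# Then copy the data inside the Pod with cp (or rsync, if the image provides it)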
Option 2: VolumeSnapshot
  • Create snapshot of source PVC
  • Create new PVC from snapshot
Option 3: Velero
  • Backup PVC with Velero
  • Restore to new PVC
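
For the Pod-based copy in Option 1, the helper Pod has to declare both claims in its spec. A minimal sketch, assuming claims named source-pvc and dest-pvc (placeholder names) in the same namespace:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: copy-pod
spec:
  restartPolicy: Never           # run the copy once, then stop
  containers:
  - name: copy
    image: busybox
    # Copies everything, preserving ownership and permissions
    command: ["sh", "-c", "cp -a /source/. /dest/"]
    volumeMounts:
    - name: source
      mountPath: /source
      readOnly: true
    - name: dest
      mountPath: /dest
  volumes:
  - name: source
    persistentVolumeClaim:
      claimName: source-pvc      # placeholder: the PVC to copy from
      readOnly: true
  - name: dest
    persistentVolumeClaim:
      claimName: dest-pvc        # placeholder: the PVC to copy into
```

With RWO volumes, stop the workload that uses the source claim first; otherwise the volume may still be attached to a different node and the copy Pod will not start.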
Q: What is CSI?
Container Storage Interface (CSI) is a standard API for storage drivers.
Benefits:
  • Vendor-neutral storage integration
  • Supports any storage system with a CSI driver
  • Features: snapshots, cloning, resizing
  • Decouples storage from Kubernetes release cycle

Common Pitfalls

  1. Wrong Access Mode: Using RWO when you need Pods on multiple nodes to write. Check if your storage supports RWX.
  2. Delete Reclaim Policy in Production: Accidentally deleting a PVC deletes your data. Use Retain for production.
  3. Not Testing Restore: Having snapshots is useless if you've never tested restoring from them.
  4. Undersizing Volumes: Disk full = application crash. Use monitoring and volume expansion.
  5. StatefulSet Storage Orphans: Deleting a StatefulSet doesn't delete its PVCs. Clean them up manually or set persistentVolumeClaimRetentionPolicy on the StatefulSet.
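
To make the StatefulSet pitfall concrete, here is a sketch of a StatefulSet that creates one PVC per replica through volumeClaimTemplates and opts in to automatic cleanup. The persistentVolumeClaimRetentionPolicy field is only available in recent Kubernetes releases, so check your cluster version before relying on it. The Service and Secret names are assumptions.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql             # assumes a headless Service with this name exists
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Delete          # remove the PVCs when the StatefulSet is deleted
    whenScaled: Retain           # keep the PVCs when replicas are scaled down
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret # hypothetical Secret holding the password
              key: password
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:          # one PVC per replica, named data-mysql-0, data-mysql-1, ...
  - metadata:
      name: data
    spec:
      storageClassName: standard
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
```

Both settings default to Retain, which is the behaviour the pitfall describes. Whether deleting the PVC also deletes the data still depends on the PV's reclaim policy.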

Key Takeaways

  • Volumes are ephemeral (Pod lifecycle).
  • PV/PVC provides durable storage.
  • StorageClass automates PV creation (Dynamic Provisioning).
  • Access Modes determine how many nodes can mount the volume.
  • Use VolumeSnapshots for backup and cloning.
  • CSI drivers enable storage vendor integration.
  • Always test your backup and restore procedures!
