Kubernetes Storage
Stateful applications require storage that persists beyond the lifecycle of a Pod.
The Storage Abstraction
Kubernetes separates storage infrastructure (managed by admins) from storage consumption (requested by users).
1. Volumes (Ephemeral)
Tied to the Pod’s lifecycle. If the Pod dies, the data is lost (except for hostPath, whose data stays on the node).
emptyDir
Creates a temporary directory.
- Use case: Cache, scratch space, sharing data between containers in a Pod.
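A minimal sketch of the sharing use case, assuming two busybox containers in one Pod. The Pod name, container names, image tag, and paths are illustrative and not from the original text.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo              # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox:1.36
      # Appends a timestamp to a file on the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /scratch/out.log; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: reader
      image: busybox:1.36
      # Follows the same file through its own mount of the volume.
      command: ["sh", "-c", "touch /scratch/out.log; tail -f /scratch/out.log"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}                # removed when the Pod is deleted
```

Both containers see the same directory, and its contents disappear with the Pod.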
hostPath
Mounts a file/directory from the Node’s filesystem.
- Use case: Node agents (logging, monitoring).
- Warning: The data lives on one node, so a Pod that depends on it is effectively tied to that node.
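As a rough illustration of the node-agent use case, the Pod below mounts the node's /var/log read-only. The names, image, and host path are assumptions for the example. Real agents usually run as a DaemonSet so that one copy lands on every node.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: log-agent                 # hypothetical name
spec:
  containers:
    - name: agent
      image: busybox:1.36
      # Placeholder workload: list the node's log directory, then idle.
      command: ["sh", "-c", "ls /host/var/log; sleep 3600"]
      volumeMounts:
        - name: varlog
          mountPath: /host/var/log
          readOnly: true          # read-only limits what the Pod can do to the node
  volumes:
    - name: varlog
      hostPath:
        path: /var/log            # assumed path on the node
        type: Directory           # fail if the directory does not exist
```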
2. Persistent Storage (Durable)
PersistentVolume (PV)
A piece of storage in the cluster (e.g., AWS EBS, NFS share).
- Managed by Admins.
- Exists independently of any Pod.
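A statically provisioned PV might look like the sketch below, assuming an NFS share. The PV name, server address, export path, and the "manual" class label are placeholders rather than values from the original text.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv                         # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain  # keep the data if the claim is deleted
  storageClassName: manual             # a claim must request the same class name to bind
  nfs:
    server: nfs.example.internal       # placeholder NFS server
    path: /exports/data                # placeholder export path
```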
PersistentVolumeClaim (PVC)
A request for storage by a user.
- “I need 10Gi of ReadWriteOnce storage.”
- Kubernetes finds a matching PV and binds them.
StorageClass (SC)
Enables Dynamic Provisioning.
- Instead of manually creating PVs, the provisioner named in the SC creates them on demand when a PVC is created.
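A minimal StorageClass sketch, assuming the AWS EBS CSI driver is already installed in the cluster. The class name "ebs-gp3" and the parameter values are illustrative choices.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3                         # hypothetical name
provisioner: ebs.csi.aws.com            # the CSI driver that creates the volumes
parameters:
  type: gp3                             # EBS volume type (assumed choice)
reclaimPolicy: Delete                   # policy copied onto each provisioned PV
allowVolumeExpansion: true              # lets bound PVCs be resized later
volumeBindingMode: WaitForFirstConsumer # provision only once a Pod is scheduled
```

With WaitForFirstConsumer, the volume is created in the zone where the consuming Pod is scheduled, which avoids zone mismatches for zonal block storage.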
Practical Example: Database Storage
Step 1: Create PVC
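One possible claim for a database, sketched under assumptions: the PVC name "postgres-data" and the StorageClass name "ebs-gp3" are hypothetical. Omit storageClassName to use the cluster's default class.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce            # one node mounts the volume read/write
  storageClassName: ebs-gp3    # assumed class; omit to use the default
  resources:
    requests:
      storage: 10Gi
```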
Step 2: Use PVC in Pod
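A sketch of a Pod that mounts a claim, assuming a PVC named "postgres-data" exists in the same namespace. The image tag, password, and paths are illustrative only. A real deployment would normally use a Secret for the password and a StatefulSet instead of a bare Pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: postgres                 # hypothetical name
spec:
  containers:
    - name: postgres
      image: postgres:16
      env:
        - name: POSTGRES_PASSWORD
          value: example-only    # placeholder; use a Secret in practice
        - name: PGDATA
          # Keep the data in a subdirectory so the volume root stays untouched.
          value: /var/lib/postgresql/data/pgdata
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgres-data # must match the PVC's metadata.name
```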
Access Modes
| Mode | Description | Use Case |
|---|---|---|
| ReadWriteOnce (RWO) | Mounted by single node as R/W | Block storage (AWS EBS, Azure Disk) |
| ReadOnlyMany (ROX) | Mounted by multiple nodes as Read-only | Static content (NFS) |
| ReadWriteMany (RWX) | Mounted by multiple nodes as R/W | Shared filesystems (NFS, EFS) |
Reclaim Policies
What happens to the PV when the PVC is deleted?
- Retain: PV remains (Released state). Data is safe. Admin must manually reclaim.
- Delete: PV and underlying storage (e.g., EBS volume) are deleted.
- Recycle: (Deprecated) Performs a basic scrub (rm -rf on the volume’s contents) and makes the PV available again.
CSI (Container Storage Interface)
CSI is the standard interface between Kubernetes and storage providers.
Popular CSI Drivers
| Provider | Driver | Use Case |
|---|---|---|
| AWS EBS | ebs.csi.aws.com | Block storage on AWS |
| AWS EFS | efs.csi.aws.com | Shared filesystem (RWX) |
| GCP PD | pd.csi.storage.gke.io | Block storage on GCP |
| Azure Disk | disk.csi.azure.com | Block storage on Azure |
| Longhorn | driver.longhorn.io | On-prem distributed storage |
| Ceph | rbd.csi.ceph.com | On-prem enterprise storage |
Installing a CSI Driver (AWS EBS Example)
Volume Snapshots
Take point-in-time snapshots of PVCs for backup or cloning.
VolumeSnapshotClass
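A minimal VolumeSnapshotClass sketch, assuming the snapshot CRDs and snapshot controller are installed and the AWS EBS CSI driver is in use. The class name is hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapclass        # hypothetical name
driver: ebs.csi.aws.com      # must match the CSI driver backing the PVCs
deletionPolicy: Delete       # delete the backing snapshot with the VolumeSnapshot
```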
Create a Snapshot
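A snapshot request could look like the sketch below. It assumes a VolumeSnapshotClass named "ebs-snapclass" and a source PVC named "postgres-data" in the same namespace. Both names are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: postgres-data-snap                   # hypothetical name
spec:
  volumeSnapshotClassName: ebs-snapclass     # assumed class
  source:
    persistentVolumeClaimName: postgres-data # PVC to snapshot
```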
Restore from Snapshot
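Restoring means creating a new PVC whose dataSource points at the snapshot. The sketch assumes a VolumeSnapshot named "postgres-data-snap" in the same namespace and a StorageClass named "ebs-gp3" served by the same CSI driver. All names are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-restored     # hypothetical name
spec:
  storageClassName: ebs-gp3        # assumed class backed by the same driver
  dataSource:
    name: postgres-data-snap       # snapshot to restore from
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi                # at least the size of the snapshotted volume
```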
Volume Expansion
Increase PVC size without data loss (if StorageClass supports it).
Storage Best Practices
Use StorageClass for Dynamic Provisioning
Don’t manually create PVs. Let StorageClass provision them automatically.
Choose the Right Access Mode
- RWO: Most block storage (EBS, Azure Disk)
- RWX: Shared filesystems (EFS, NFS), needed when Pods on different nodes must write to the same volume
Set Proper Reclaim Policy
- Delete: For dev/test (auto-cleanup)
- Retain: For production (prevent accidental data loss)
Backup Strategy
- Use VolumeSnapshots for cloud storage
- Use Velero for cluster-wide backup/restore
- Test your restore process regularly!
Ephemeral Volumes
For temporary storage that doesn’t need to persist.
emptyDir with Memory Backend
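A minimal sketch of a memory-backed emptyDir. The Pod name, image, mount path, and size limit are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-demo             # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: cache
          mountPath: /cache
  volumes:
    - name: cache
      emptyDir:
        medium: Memory         # backed by tmpfs (RAM) instead of node disk
        sizeLimit: 256Mi       # cap on how much the volume may hold
```

Files written to a memory-backed volume count against the container's memory usage, so the size limit is worth setting.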
Generic Ephemeral Volumes
Dynamically provisioned volumes tied to the Pod lifecycle: they are created with the Pod and deleted when it is removed.
Interview Questions & Answers
What is the difference between PV and PVC?
| Aspect | PersistentVolume (PV) | PersistentVolumeClaim (PVC) |
|---|---|---|
| Created by | Admin (or dynamically) | Developer |
| Represents | Actual storage resource | Request for storage |
| Scope | Cluster-scoped | Namespace-scoped |
| Analogy | Hotel room | Reservation |
What happens when you delete a PVC?
Depends on the Reclaim Policy of the PV:
- Delete: PV and underlying storage are deleted
- Retain: PV becomes Released (data preserved, but the PV cannot be bound to a new claim until an admin reclaims it)
PV lifecycle: Available → Bound → Released
How does Dynamic Provisioning work?
- User creates PVC referencing a StorageClass
- The provisioner named in the StorageClass (for CSI drivers, the external-provisioner sidecar) sees the PVC
- Controller calls CSI driver to provision storage
- CSI driver creates actual storage (e.g., AWS EBS volume)
- PV is automatically created and bound to PVC
What is the difference between RWO and RWX?
| Mode | Meaning | Example Storage |
|---|---|---|
| RWO | Single node R/W | AWS EBS, Azure Disk |
| ROX | Multiple nodes read-only | NFS, S3-backed |
| RWX | Multiple nodes R/W | AWS EFS, NFS, CephFS |
How do you migrate data between PVCs?
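Option 1: Pod-based copy
- Run a temporary Pod that mounts both PVCs
- Copy the data from the source mount to the target mount
Option 2: VolumeSnapshot
- Create snapshot of source PVC
- Create new PVC from snapshot
Option 3: Velero
- Backup PVC with Velero
- Restore to new PVC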
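For the Pod-based copy, a minimal sketch is a one-shot Pod that mounts both claims and copies the files across. The claim names "old-pvc" and "new-pvc" are hypothetical and both claims are assumed to exist in the same namespace. The application writing to the source should be stopped first. With RWO volumes, both must be attachable to the node where this Pod runs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pvc-copy                 # hypothetical name
spec:
  restartPolicy: Never           # run the copy once, then stop
  containers:
    - name: copy
      image: busybox:1.36
      # -a preserves ownership, permissions, and timestamps.
      command: ["sh", "-c", "cp -a /source/. /target/"]
      volumeMounts:
        - name: source
          mountPath: /source
          readOnly: true
        - name: target
          mountPath: /target
  volumes:
    - name: source
      persistentVolumeClaim:
        claimName: old-pvc       # assumed source claim
        readOnly: true
    - name: target
      persistentVolumeClaim:
        claimName: new-pvc       # assumed target claim
```

Once the Pod reaches the Succeeded phase, point the application at the new claim and delete the copy Pod.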
What is CSI and why is it important?
Container Storage Interface is a standard API for storage drivers.
Benefits:
- Vendor-neutral storage integration
- Supports any storage system with a CSI driver
- Features: snapshots, cloning, resizing
- Decouples storage from Kubernetes release cycle
Common Pitfalls
Key Takeaways
- Volumes are ephemeral (Pod lifecycle).
- PV/PVC provides durable storage.
- StorageClass automates PV creation (Dynamic Provisioning).
- Access Modes determine how many nodes can mount the volume.
- Use VolumeSnapshots for backup and cloning.
- CSI drivers enable storage vendor integration.
- Always test your backup and restore procedures!