Understanding Kubernetes PersistentVolumes
Introduction
When running applications in Kubernetes, you'll often need to store data that persists even when pods are restarted, rescheduled, or deleted. This is where PersistentVolumes come into play. They are a critical component of Kubernetes storage architecture that allows applications to access and store data independently of the pod lifecycle.
In this guide, we'll explore Kubernetes PersistentVolumes (PVs), understand how they work together with PersistentVolumeClaims (PVCs), and learn how to implement persistent storage in your Kubernetes applications.
What is a PersistentVolume?
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It's a resource in the cluster just like a node is a cluster resource.
Key characteristics of PersistentVolumes:
- They have a lifecycle independent of any individual pod that uses them
- They represent actual storage in your infrastructure, such as a cloud disk, an NFS export, or a directory on a node
- They can be provisioned statically by cluster administrators or dynamically via Storage Classes
PersistentVolumes vs Regular Volumes
Before diving deeper, let's understand how PersistentVolumes differ from regular Kubernetes volumes:
Regular Volumes | PersistentVolumes |
---|---|
Tied to pod lifecycle | Independent of pod lifecycle |
Defined in pod specification | Cluster-level resources |
Data in ephemeral types such as emptyDir is lost when the pod is deleted | Persist after pod deletion |
Limited reusability | Can be reused by different pods |
How PersistentVolumes Work with PersistentVolumeClaims
PersistentVolumes work with another Kubernetes resource called PersistentVolumeClaim (PVC). This two-part system separates the concerns of storage provisioning from storage consumption:
- PersistentVolume (PV): A cluster resource provisioned by administrators or dynamically by the system
- PersistentVolumeClaim (PVC): A request for storage by a user/application
- Binding: When a PVC is created, Kubernetes binds it to a suitable PV based on requirements
PersistentVolume Lifecycle
PersistentVolumes have their own lifecycle that consists of several phases:
- Provisioning: Creating the PV (static or dynamic)
- Binding: Connecting a PVC to a PV
- Using: Pods access storage via the PVC
- Releasing: PVC is deleted, but PV still exists
- Reclaiming: Preparing the PV for reuse (based on reclaim policy)
PersistentVolume Reclaim Policies
When a PVC is deleted, the PV that was bound to it is released from the claim. What happens to the PV next depends on its reclaim policy:
- Retain: The PV is kept as-is with its data. It must be manually reclaimed by an administrator.
- Delete: The PV and its associated storage resource are automatically deleted.
- Recycle: (Deprecated) Performs a basic scrub (`rm -rf /thevolume/*`) before making the volume available again.
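The reclaim policy is set on the PV itself through the `persistentVolumeReclaimPolicy` field. The manifest below is a minimal sketch, assuming a hostPath-backed volume; the name `retained-pv` and the path `/mnt/retained-data` are hypothetical and only serve the illustration.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: retained-pv            # hypothetical name
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  # Keep the PV and its data after the bound PVC is deleted.
  # An administrator must clean up and re-create or edit the PV before reuse.
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/mnt/retained-data" # hypothetical path
```

Manually created PVs default to `Retain` when the field is omitted, while dynamically provisioned PVs inherit the `reclaimPolicy` of their StorageClass, which defaults to `Delete`.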
Creating a Static PersistentVolume
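Let's create a simple PersistentVolume that uses a hostPath (a directory on the node's filesystem) for storage. hostPath is intended for single-node development and testing; on a multi-node cluster, use a network or cloud-backed volume instead: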
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
```
In this example:
- We've named our PV `task-pv-volume`
- We've set its capacity to 10Gi
- We've set the access mode to `ReadWriteOnce` (can be mounted by a single node for reading and writing)
- We've specified a path on the host to store the data
Access Modes for PersistentVolumes
PersistentVolumes support different access modes:
- ReadWriteOnce (RWO): Volume can be mounted as read-write by a single node
- ReadOnlyMany (ROX): Volume can be mounted read-only by many nodes
- ReadWriteMany (RWX): Volume can be mounted as read-write by many nodes
- ReadWriteOncePod (RWOP): Volume can be mounted as read-write by a single pod (Kubernetes v1.22+)
It's important to note that not all storage types support all access modes.
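A claim states the access mode it needs, and it can only bind to a volume that offers that mode. The sketch below requests `ReadWriteMany`, assuming the cluster has a storage class backed by a shared filesystem such as NFS; the class name `shared-files` and the claim name `shared-claim` are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-claim             # hypothetical name
spec:
  storageClassName: shared-files # hypothetical class; must support RWX
  accessModes:
    - ReadWriteMany              # pods on several nodes may mount read-write
  resources:
    requests:
      storage: 2Gi
```

If no available volume or provisioner supports the requested mode, the claim stays in the `Pending` state.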
Creating a PersistentVolumeClaim
To use a PersistentVolume, we need to create a PersistentVolumeClaim:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
```
In this example:
- We've named our PVC `task-pv-claim`
- We've specified that we want to use the `manual` storage class (same as our PV)
- We've requested 3Gi of storage (which is less than the 10Gi our PV provides)
- We've specified the same access mode as our PV
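Besides storage class, capacity, and access mode, a claim can narrow its candidates with a label selector. The sketch below targets the `type: local` label that the PV above carries; the claim name `task-pv-claim-local` is hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim-local  # hypothetical name
spec:
  storageClassName: manual
  # Only PVs carrying this label are considered for binding.
  selector:
    matchLabels:
      type: local
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
```

A PV binds to exactly one PVC at a time, so this claim would remain `Pending` if `task-pv-volume` were already bound to another claim and no other matching PV existed.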
Using a PersistentVolumeClaim in a Pod
Now that we have a PVC, we can use it in a pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: task-pv-pod
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: task-pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
```
In this example:
- We've created a pod that uses our PVC
- We've mounted the PVC to the path `/usr/share/nginx/html` in the container
- The nginx container will now be able to read and write to this path, and the data will persist even if the pod is deleted
Dynamic Provisioning with StorageClasses
So far, we've been working with static provisioning where an administrator creates PVs manually. But Kubernetes also supports dynamic provisioning using StorageClasses:
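```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
# Legacy in-tree provisioner name; on recent clusters EBS volumes are
# served by the AWS EBS CSI driver (ebs.csi.aws.com).
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
allowVolumeExpansion: true
```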
With a StorageClass in place, users can simply create PVCs without worrying about PVs:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
Kubernetes will automatically provision a PV that matches the requirements.
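By default a StorageClass provisions the volume as soon as the PVC is created. Setting `volumeBindingMode: WaitForFirstConsumer` delays provisioning until a pod that uses the claim is scheduled, which helps the volume land in the same zone as the pod. The sketch below assumes the AWS EBS CSI driver is installed in the cluster; the class name `standard-wait` is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-wait          # hypothetical name
provisioner: ebs.csi.aws.com   # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3
reclaimPolicy: Delete
allowVolumeExpansion: true
# Provision only once a pod using the claim has been scheduled.
volumeBindingMode: WaitForFirstConsumer
```

With this mode, a PVC that no pod references yet stays `Pending`, which is expected rather than a sign of a problem.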
Real-World Example: WordPress with MySQL
Let's see a practical example of using PersistentVolumes for a WordPress site with MySQL database:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-pvc
spec:
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: mysql
  labels:
    app: mysql              # selected by the Service below
spec:
  containers:
    - name: mysql
      image: mysql:5.7
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: password   # demo only; use a Secret for real credentials
        - name: MYSQL_DATABASE
          value: wordpress
      ports:
        - containerPort: 3306
          name: mysql
      volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
  volumes:
    - name: mysql-data
      persistentVolumeClaim:
        claimName: mysql-pvc
---
# A pod name is not resolvable through DNS on its own; this Service
# provides the "mysql" hostname that WORDPRESS_DB_HOST refers to.
apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql
  ports:
    - port: 3306
      targetPort: 3306
---
apiVersion: v1
kind: Pod
metadata:
  name: wordpress
spec:
  containers:
    - name: wordpress
      image: wordpress:latest
      env:
        - name: WORDPRESS_DB_HOST
          value: mysql
        - name: WORDPRESS_DB_USER
          value: root       # set explicitly rather than relying on the image default
        - name: WORDPRESS_DB_PASSWORD
          value: password   # demo only; use a Secret for real credentials
      ports:
        - containerPort: 80
          name: wordpress
      volumeMounts:
        - name: wordpress-data
          mountPath: /var/www/html
  volumes:
    - name: wordpress-data
      persistentVolumeClaim:
        claimName: wordpress-pvc
```
In this example:
- We create two PVCs: one for MySQL and one for WordPress
- Both pods use their respective PVCs to store data
- If either pod is deleted, the data will persist in the PVs
Monitoring PersistentVolumes
You can monitor PersistentVolumes and PersistentVolumeClaims using kubectl:
```shell
# List all PersistentVolumes
kubectl get pv
# Output example:
# NAME             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                   STORAGECLASS   REASON   AGE
# task-pv-volume   10Gi       RWO            Retain           Bound    default/task-pv-claim   manual                  1h

# List all PersistentVolumeClaims
kubectl get pvc
# Output example:
# NAME            STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
# task-pv-claim   Bound    task-pv-volume   10Gi       RWO            manual         1h

# Get detailed information about a specific PV
kubectl describe pv task-pv-volume
```
Common Issues and Troubleshooting
PVC Stuck in Pending State
If your PVC is stuck in the "Pending" state, it means Kubernetes can't find a suitable PV to bind it to. Check:
- Storage classes match: Make sure the PVC's `storageClassName` matches an existing StorageClass or, for statically provisioned volumes, the `storageClassName` set on the PV
- Access modes: Ensure the requested access mode is supported by available PVs
- Capacity: Verify that you're not requesting more storage than available PVs provide
Unable to Delete a PV
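If deleting a PV doesn't behave as expected, it might be because:
- PV is still bound to a PVC: The PV stays in the "Terminating" state until the claim is gone, so delete the PVC first
- Reclaim policy is set to "Retain": Deleting the PV object does not remove the underlying storage resource; you may need to delete it manually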
Best Practices for Using PersistentVolumes
- Use StorageClasses for dynamic provisioning to reduce administrative overhead
- Set appropriate reclaim policies based on your needs
- Consider using Helm charts for complex applications that require persistent storage
- Use labels and selectors to help with PV/PVC matching
- Monitor storage usage to avoid running out of space
- Consider using StatefulSets for applications that require stable, persistent storage with predictable names
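To illustrate the StatefulSet recommendation, the sketch below uses `volumeClaimTemplates` so that each replica gets its own PVC. It assumes the `standard` StorageClass from earlier and a headless Service named `web` that already exists; the names `web` and `data` are hypothetical.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                 # hypothetical name
spec:
  serviceName: web          # assumes a headless Service named "web" exists
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  # One PVC is created per replica from this template.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        storageClassName: standard
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
```

The resulting claims are named `data-web-0` and `data-web-1`. When a pod is deleted and recreated, it reattaches to the claim with the same ordinal, so each replica keeps its own data.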
Summary
PersistentVolumes provide a way to manage storage in Kubernetes that's independent of the pod lifecycle. They allow applications to store data persistently, even when pods are deleted or rescheduled. The PV and PVC system separates the concerns of storage provisioning from consumption, making it easier to manage storage resources in a Kubernetes cluster.
Key points to remember:
- PVs are cluster resources that represent physical storage
- PVCs are requests for those resources
- StorageClasses enable dynamic provisioning
- Reclaim policies determine what happens to storage when a PVC is deleted
- PVs support different access modes for different use cases
Exercises
- Create a PersistentVolume and PersistentVolumeClaim manually, then use it in a pod running a simple application that writes to a file.
- Set up dynamic provisioning using a StorageClass on a cloud provider (like AWS, GCP, or Azure).
- Create a StatefulSet that uses PersistentVolumeClaims, and observe how Kubernetes handles storage when pods are deleted and recreated.
- Experiment with different reclaim policies and observe their behavior when PVCs are deleted.