Kubernetes Resource Management
Introduction
Resource management is one of the most critical aspects of running applications on Kubernetes. When you deploy applications in a Kubernetes cluster, you're essentially sharing a pool of compute resources (CPU, memory, storage) among all your workloads. Without proper resource management, some applications might consume too many resources, leaving others resource-starved and potentially causing performance issues or outright failures.
In this guide, we'll explore how Kubernetes manages compute resources, how to configure resource requests and limits for your containers, and best practices to ensure your applications run efficiently and reliably.
Understanding Kubernetes Resources
Kubernetes primarily manages two types of resources:
- CPU - Measured in CPU units, where 1 CPU equals:
  - 1 vCPU/core for cloud providers
  - 1 hyperthread on bare-metal Intel processors
- Memory - Measured in bytes, typically expressed as:
  - Mebibytes (Mi): 1 Mi = 2^20 bytes = 1,048,576 bytes
  - Gibibytes (Gi): 1 Gi = 2^30 bytes = 1,073,741,824 bytes
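A quick shell sanity check confirms the byte counts behind these binary units:

```shell
# Bytes in one mebibyte (Mi) and one gibibyte (Gi)
echo $((2**20))   # 1Mi = 1048576 bytes
echo $((2**30))   # 1Gi = 1073741824 bytes
```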
Resource Requests and Limits
Kubernetes uses two key concepts to manage container resources:
Resource Requests
A resource request specifies the minimum amount of resources that should be reserved for a container. When you specify a request for a container, the Kubernetes scheduler uses this information to decide which node to place the Pod on.
Resource Limits
A resource limit defines the maximum amount of resources that a container can use. If a container attempts to exceed its limit:
- For CPU: the container is throttled (not killed)
- For memory: the container may be terminated (OOMKilled)
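To observe the memory-limit behavior directly, a Pod can be made to allocate more than its limit. This is an illustrative sketch: the `polinux/stress` image and its `stress` arguments are assumptions, not part of this guide's examples; any container that allocates past its limit would behave the same way.

```yaml
# Illustrative: allocates ~250M against a 100Mi limit, so the
# container is terminated with reason OOMKilled.
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo
spec:
  containers:
  - name: stress
    image: polinux/stress   # assumed stress-testing image
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
```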
Configuring Resource Requests and Limits
Resource requests and limits are configured at the container level within a Pod specification:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: resource-demo-container
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
```
In this example:
- The container requests 64 MiB of memory and 0.25 CPU cores
- The container is limited to 128 MiB of memory and 0.5 CPU cores
Note: CPU resources are specified in "millicores". 1000m equals 1 full CPU core.
How to Set Appropriate Values
Setting appropriate resource requests and limits requires a balance:
- Too low requests: Your application might not get the resources it needs
- Too high requests: Resources might be reserved but not used, wasting cluster capacity
- Too low limits: Your application might get throttled or OOMKilled
- Too high limits: You risk having "noisy neighbors" that can consume excessive resources
Steps to determine appropriate values:
- Profile your application to understand its resource usage patterns
- Start with a baseline (e.g., set requests at 50% of expected usage)
- Monitor resource usage and adjust accordingly
- Consider peak-to-average ratios when setting limits
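The "monitor and adjust" loop above can be sketched in shell. The numbers below are hypothetical samples in the format `kubectl top pods --no-headers` produces; on a live cluster you would pipe the real command into the same awk filter:

```shell
# Hypothetical `kubectl top pods --no-headers` output
sample='web-1 12m 78Mi
web-2 10m 76Mi
api-1 120m 180Mi'

# Suggest a CPU request: observed peak plus 25% headroom
echo "$sample" | awk '{ gsub(/m$/, "", $2); if ($2+0 > max) max = $2+0 }
  END { printf "suggested cpu request: %dm\n", max * 1.25 }'
```

This is only a starting heuristic; peak-plus-headroom is too generous for spiky workloads, where a percentile of sustained usage is usually a better request.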
Quality of Service (QoS) Classes
Kubernetes assigns each Pod to one of three QoS classes based on its resource configuration:
- Guaranteed: every container in the Pod has both CPU and memory requests and limits set, with requests equal to limits
- Burstable: at least one container has a request or limit set, but the Pod doesn't meet the criteria for Guaranteed
- BestEffort: no container has any requests or limits set
These classes determine eviction priority when a node comes under resource pressure: BestEffort Pods are evicted first, Burstable Pods next, and Guaranteed Pods last.
Let's look at examples of Pods with different QoS classes:
Guaranteed QoS Example
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: guaranteed-container
    image: nginx
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
```
Burstable QoS Example
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: burstable-container
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
```
BestEffort QoS Example
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod
spec:
  containers:
  - name: besteffort-container
    image: nginx
    # No resources specified
```
Practical Example: Resource Management for a Web Application
Let's create a more real-world example for a simple web application with an API backend and a database:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        ports:
        - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-backend
  template:
    metadata:
      labels:
        app: api-backend
    spec:
      containers:
      - name: api-server
        image: my-api-image:v1
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        ports:
        - containerPort: 8080
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: "database"
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:13
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```
Resource Requirements Analysis:
- Web Frontend: Lightweight, serving static content
  - Low CPU and memory needs (100m CPU, 64Mi memory)
  - Scales horizontally (3 replicas)
- API Backend: Medium workload with business logic
  - Moderate CPU and memory (250m CPU, 256Mi memory)
  - Some headroom for traffic spikes (limits at 2x requests)
- Database: Resource-intensive with data persistence
  - Higher memory requirements (512Mi requested, 1Gi limit)
  - CPU needs depend on query complexity
Monitoring Resource Usage
To effectively manage resources, you need to monitor actual usage. The Kubernetes Metrics Server collects resource metrics from Kubelets and makes them available through the Metrics API.
You can check resource usage with:

```shell
kubectl top pods
```

Example output:

```
NAME                            CPU(cores)   MEMORY(bytes)
web-frontend-75d9546fd6-8z9vk   12m          78Mi
web-frontend-75d9546fd6-qjvtx   10m          76Mi
web-frontend-75d9546fd6-xvd4z   11m          75Mi
api-backend-85d7fd564c-n8p5g    120m         180Mi
api-backend-85d7fd564c-t7pkx    115m         175Mi
database-0                      210m         420Mi
```
Based on this output, you might decide:
- Frontend pods are well within their limits
- API backend has comfortable headroom
- Database usage is approaching the requested amount but still has buffer before hitting limits
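As a small worked example of that last judgment, the database Pod's memory headroom against its 1Gi limit can be computed directly (values taken from the sample output above):

```shell
# database-0 uses 420Mi against a 1Gi (1024Mi) limit
usage_mi=420
limit_mi=1024
pct=$(( usage_mi * 100 / limit_mi ))
echo "database memory utilization: ${pct}% of limit"   # 41% of limit
```

At roughly 41% of the limit (but 82% of the 512Mi request), the database is a candidate for a higher memory request if this usage is sustained.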
Resource Management Best Practices
- Always set resource requests for production workloads
- Don't guess resource values - measure and monitor
- Set memory limits to prevent pods from consuming too much memory
- Be cautious with CPU limits as they can cause throttling
- Aim for Guaranteed QoS for critical applications
- Reserve resources for system components and Kubernetes itself
- Use namespace resource quotas to partition your cluster
- Set default resource requests/limits using LimitRange
- Regularly review and adjust resource configurations
Namespace Resource Quotas
You can limit resources at the namespace level:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-space
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "8"
    limits.memory: 8Gi
    pods: "20"
```
This quota ensures that all Pods in the "team-space" namespace cannot collectively request more than 4 CPU cores and 4 GiB of memory, cannot collectively exceed limits of 8 CPU cores and 8 GiB of memory, and cannot number more than 20.
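For example, once the quota is active, a single Pod requesting 5 CPUs already exceeds the namespace's 4-CPU request budget and is rejected at admission time (a hypothetical spec for illustration):

```yaml
# Rejected: requests.cpu "5" exceeds the team-quota hard cap of "4"
apiVersion: v1
kind: Pod
metadata:
  name: too-greedy
  namespace: team-space
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "5"
        memory: "256Mi"
      limits:
        cpu: "5"
        memory: "256Mi"
```

Note that once a ResourceQuota constrains requests or limits, new Pods in that namespace must specify those values (or inherit them from a LimitRange), or they are rejected outright.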
Default Resource Limits with LimitRange
You can set default resource values for containers in a namespace:
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-space
spec:
  limits:
  - default:
      cpu: "300m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container
```
With this LimitRange, any container created without specified resource requests/limits will get these default values.
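For example, a container spec with no resources block at all, submitted to team-space (an illustrative Pod, not one of the examples above):

```yaml
# No resources specified; the LimitRange defaults are injected at admission
apiVersion: v1
kind: Pod
metadata:
  name: defaulted-pod
  namespace: team-space
spec:
  containers:
  - name: app
    image: nginx
```

After admission, `kubectl get pod defaulted-pod -o yaml` would show requests of 100m CPU / 128Mi memory and limits of 300m CPU / 256Mi memory injected by the LimitRange admission controller, and the Pod would be classified as Burstable rather than BestEffort.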
Common Mistakes to Avoid
- Not setting any resources - leads to resource contention
- Setting requests = limits for CPU - can lead to throttling
- Setting memory limits too low - causes container restarts
- Overestimating resources - wastes cluster capacity
- Using the same values for all applications - different workloads have different needs
Summary
Effective resource management is crucial for running reliable and efficient Kubernetes workloads. By properly configuring resource requests and limits, you ensure that:
- Your applications have the resources they need to run reliably
- The Kubernetes scheduler can make informed placement decisions
- Your cluster resources are used efficiently
- Critical applications are protected during resource constraints
Remember these key points:
- Requests are guaranteed minimums used for scheduling
- Limits are enforced maximums that control resource usage
- QoS classes determine eviction priority during resource pressure
- Always monitor and adjust resource configurations based on actual usage
Additional Exercises
- Resource Profiling Exercise:
  - Deploy a sample application without resource specifications
  - Monitor its resource usage for 24 hours with `kubectl top pods`
  - Based on the observations, set appropriate requests and limits
- QoS Classification Practice:
  - Create three different pod specifications, each targeting a different QoS class
  - Verify the QoS class with `kubectl get pod <pod-name> -o yaml | grep qosClass`
- Resource Quota Challenge:
  - Create a namespace with a specific resource quota
  - Try to deploy workloads that would exceed the quota
  - Modify the workloads to fit within the quota
Additional Resources
- Kubernetes Documentation: Managing Resources for Containers
- Kubernetes Best Practices: Resource Management
- Kubernetes Patterns for Application Resource Management