Kubernetes Requests
Introduction
When deploying applications on a Kubernetes cluster, one of the most important aspects of configuration is specifying the resources your applications need to run properly. Kubernetes provides a mechanism called resource requests that allows you to inform the scheduler about the minimum resources your containers require to function effectively.
Resource requests are part of the broader resource management system in Kubernetes that helps ensure applications are placed on nodes with sufficient resources and receive appropriate allocation during runtime. Understanding and properly configuring resource requests is crucial for maintaining a stable and efficient Kubernetes environment.
What Are Kubernetes Requests?
Resource requests in Kubernetes specify the minimum amount of resources that a container needs to run properly. When you define resource requests in your pod specification, you're essentially telling the Kubernetes scheduler:
"My container needs at least this much CPU and memory to function correctly."
These requests play a critical role in the scheduling decisions that Kubernetes makes when placing pods on nodes in your cluster.
Key Concepts
Before diving into the details of resource requests, let's understand some fundamental concepts:
Resources in Kubernetes
Kubernetes primarily manages two types of resources:
- CPU - Measured in CPU units, where 1 CPU unit equals:
  - 1 vCPU/Core for cloud providers
  - 1 hyperthread on bare-metal Intel processors
- Memory - Typically measured in bytes, with common abbreviations being:
  - Mi (Mebibytes) - 2^20 bytes
  - Gi (Gibibytes) - 2^30 bytes
Requests vs. Limits
Kubernetes resource management has two key components:
- Requests: The minimum resources guaranteed to the container
- Limits: The maximum resources a container can use
This guide focuses primarily on requests, but understanding the relationship between requests and limits is important for comprehensive resource management.
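To see how the two fields sit side by side, here is an illustrative `resources` stanza (the values are arbitrary examples, not recommendations):

```yaml
resources:
  requests:
    memory: "128Mi"   # guaranteed minimum, used for scheduling
    cpu: "250m"
  limits:
    memory: "256Mi"   # hard ceiling; exceeding it can get the container OOM-killed
    cpu: "500m"       # CPU usage above this is throttled
```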
How Resource Requests Work
Scheduling Phase
When you create a pod with resource requests, the Kubernetes scheduler considers these requests when deciding which node to place the pod on. The scheduler will only place the pod on nodes that have enough unallocated resources to satisfy your requests.
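That filtering step can be sketched in a few lines. This is a toy model of the resource-fit check only, not the real scheduler, which also scores candidate nodes on many other criteria; the node names and numbers are invented for the example:

```python
# Toy model of the scheduler's resource filter: a pod fits a node only
# if every requested resource is still free on that node.
def fits(node_free, pod_requests):
    return all(node_free.get(r, 0) >= amount for r, amount in pod_requests.items())

# Free (allocatable minus already-requested) resources per node,
# in millicores and MiB. Hypothetical nodes for illustration.
nodes = {
    "node-a": {"cpu_m": 200, "mem_mi": 512},
    "node-b": {"cpu_m": 1500, "mem_mi": 4096},
}
pod = {"cpu_m": 250, "mem_mi": 128}

candidates = [name for name, free in nodes.items() if fits(free, pod)]
print(candidates)  # only node-b has 250m of CPU free
```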
Running Phase
Once a pod is scheduled on a node:
- The pod's requests are counted against the node's allocatable resources, so other pods can only be scheduled into what remains
- The container runtime (such as containerd) translates the CPU request into a cgroup CPU weight, so the container receives at least its requested share when the node is under contention
- Even if the pod doesn't use all of its requested resources, those resources remain reserved for it
Configuring Resource Requests
You configure resource requests in the pod specification, typically within a Deployment, StatefulSet, or other workload resource.
Basic Syntax
Here's how to specify resource requests in a container spec:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
```
In this example:
- The container requests 128 Mebibytes of memory
- The container requests 250 millicpu (0.25 CPU cores)
CPU Request Units
CPU requests can be specified in various formats:
- `0.5` or `500m` - Half a CPU core
- `1` - One full CPU core
- `250m` - Quarter of a CPU core (250 millicpu)

The `m` suffix stands for "milli," meaning 1/1000th of a CPU core.
Memory Request Units
Memory can be specified using several suffixes:
- `128Mi` - 128 Mebibytes
- `1Gi` - 1 Gibibyte
- `512M` - 512 Megabytes
- `2G` - 2 Gigabytes
Note the difference between powers of 2 (Mi, Gi) and powers of 10 (M, G).
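A small parser makes both conventions concrete. This is a sketch covering only the suffixes shown above; the real Kubernetes API accepts more (e.g. `Ki`, `Ti`, `k`, `n`):

```python
# Minimal parser for the Kubernetes quantity formats discussed above.
def parse_cpu(q):
    """Return CPU in whole cores as a float ('250m' -> 0.25, '0.5' -> 0.5)."""
    if q.endswith("m"):
        return int(q[:-1]) / 1000
    return float(q)

def parse_memory(q):
    """Return bytes; Mi/Gi are powers of 2, M/G are powers of 10."""
    suffixes = {"Mi": 2**20, "Gi": 2**30, "M": 10**6, "G": 10**9}
    for suffix, factor in suffixes.items():  # check "Mi" before "M", "Gi" before "G"
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)

print(parse_cpu("250m"))                             # 0.25
print(parse_memory("512Mi") - parse_memory("512M"))  # Mi vs M differ by ~24.9 MB
```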
Practical Examples
Let's walk through some practical examples to understand how requests work in real-world scenarios.
Example 1: Simple Web Application
For a basic web server with low traffic:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
```
This configuration:
- Requests 64 MiB of memory per container
- Requests 0.1 CPU cores per container
- With 3 replicas, the total request is 192 MiB of memory and 0.3 CPU cores
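The aggregate footprint is simply per-pod requests multiplied by the replica count, which you can verify quickly:

```python
# Total scheduling footprint of the Deployment above:
# per-replica requests times the replica count.
replicas = 3
per_pod = {"cpu_m": 100, "mem_mi": 64}  # 100m CPU, 64Mi memory

total = {resource: amount * replicas for resource, amount in per_pod.items()}
print(total)  # {'cpu_m': 300, 'mem_mi': 192}
```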
Example 2: Database Application
For a database that requires more resources:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-db
spec:
  serviceName: "postgres"
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: password
        ports:
        - containerPort: 5432
          name: postgres
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: "10Gi"
```
This example:
- Requests 1 GiB of memory for the database
- Requests 0.5 CPU cores
- Also includes a storage request of 10 GiB for persistent data
Best Practices for Setting Requests
Setting appropriate resource requests is crucial for efficient cluster operation. Here are some best practices:
1. Measure Actual Resource Usage
Before setting requests, monitor your application's actual resource usage using tools like:
- Kubernetes Metrics Server
- Prometheus
- Resource usage data from `kubectl top pods`
This helps you set realistic request values.
2. Set Requests Based on Typical Workload
Set your requests based on the typical workload of your application, not peak usage. For peak usage, you can use resource limits (which we'll cover in a separate guide).
3. Start Conservative and Adjust
If you're unsure:
- Start with conservative request values
- Monitor application performance
- Adjust values based on observed behavior
4. Consider Application Startup Needs
Some applications require more resources during startup than during normal operation. Make sure your requests accommodate startup resource needs.
5. Be Mindful of Cluster Capacity
Always consider the total capacity of your cluster when setting requests. Overcommitting resources can lead to scheduling problems.
Common Mistakes to Avoid
1. Setting Requests Too Low
If you set requests too low:
- Your application might be scheduled on a node without enough resources
- Your application might experience performance issues or unexpected behavior
2. Setting Requests Too High
If you set requests too high:
- You waste cluster resources
- Fewer pods can be scheduled on your cluster
- You might hit scheduling limitations
3. Ignoring Resource Requests Entirely
Failing to set resource requests:
- Makes the scheduler's job harder
- Can lead to resource contention
- May result in unpredictable application performance
4. Using the Wrong Units
Be careful about the units you use:
- `512M` vs `512Mi` makes a difference
- `0.5` CPU is equivalent to `500m` CPU
Debugging Resource Request Issues
If your pod is stuck in a `Pending` state, it might be due to resource request issues:

1. Check pod status:

```bash
kubectl describe pod [pod-name]
```

2. Look for messages like:

```
0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory.
```

3. Either adjust your resource requests or add more capacity to your cluster.
Resource Requests in Multi-Container Pods
For pods with multiple containers, Kubernetes considers the sum of all container requests:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: app
    image: app:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
  - name: sidecar
    image: sidecar:latest
    resources:
      requests:
        memory: "64Mi"
        cpu: "50m"
```
Total pod requests:
- CPU: 300m (250m + 50m)
- Memory: 320Mi (256Mi + 64Mi)
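The summation the scheduler performs can be expressed directly (ignoring init containers, whose requests are factored in separately; values below are the pod's, in millicores and MiB):

```python
# A pod's effective request is the sum over its containers,
# which is what the scheduler uses when finding a node.
containers = [
    {"name": "app", "cpu_m": 250, "mem_mi": 256},
    {"name": "sidecar", "cpu_m": 50, "mem_mi": 64},
]

pod_request = {
    "cpu_m": sum(c["cpu_m"] for c in containers),
    "mem_mi": sum(c["mem_mi"] for c in containers),
}
print(pod_request)  # {'cpu_m': 300, 'mem_mi': 320}
```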
Resource Requests vs. Namespace Quotas
Kubernetes allows setting resource quotas at the namespace level, which can limit the total resources requested by all pods in a namespace:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
  namespace: development
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
```
This quota ensures all pods in the "development" namespace can't request more than 2 CPU cores and 2 GiB of memory in total.
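The admission check a quota implies can be sketched as follows. This is a simplified model, not the real quota controller; the usage numbers are invented for illustration:

```python
# Sketch of ResourceQuota admission: a new pod is rejected if current
# namespace usage plus its requests would exceed the quota.
quota = {"cpu_m": 2000, "mem_mi": 2048}  # 2 CPU cores, 2Gi of memory
used = {"cpu_m": 1800, "mem_mi": 1024}   # requests of pods already admitted

def admits(quota, used, pod_requests):
    return all(used[r] + pod_requests[r] <= quota[r] for r in quota)

print(admits(quota, used, {"cpu_m": 150, "mem_mi": 512}))  # True
print(admits(quota, used, {"cpu_m": 500, "mem_mi": 512}))  # False: CPU would reach 2300m
```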
Summary
Kubernetes resource requests are a fundamental part of the resource management system that help ensure pods are scheduled on nodes with sufficient resources. By properly configuring resource requests, you:
- Help the scheduler make better decisions
- Ensure your applications have the resources they need
- Improve the stability and efficiency of your Kubernetes cluster
Remember these key points:
- Requests specify the minimum resources your container needs
- They affect scheduling decisions and resource allocation
- Setting appropriate requests requires understanding your application's resource usage
- Always measure and monitor to fine-tune your resource requests
Additional Resources
For further learning:
- Kubernetes Official Documentation on Resource Management
- Kubernetes Resource Quotas
- Kubernetes LimitRange
Exercises
1. Deploy a simple application without resource requests, then add appropriate requests and observe any differences in behavior.
2. Create a namespace with resource quotas and test how it prevents over-provisioning of resources.
3. Use `kubectl top pods` to monitor the actual resource usage of your applications and compare it with your configured requests.
4. Create a multi-container pod with different resource requests for each container and observe how Kubernetes manages resources for the pod.
5. Deliberately set resource requests higher than your node capacity and observe what happens when you try to schedule the pod.