Kubernetes Requests

Introduction

When deploying applications on a Kubernetes cluster, one of the most important aspects of configuration is specifying the resources your applications need to run properly. Kubernetes provides a mechanism called resource requests that allows you to inform the scheduler about the minimum resources your containers require to function effectively.

Resource requests are part of the broader resource management system in Kubernetes that helps ensure applications are placed on nodes with sufficient resources and receive appropriate allocation during runtime. Understanding and properly configuring resource requests is crucial for maintaining a stable and efficient Kubernetes environment.

What Are Kubernetes Requests?

Resource requests in Kubernetes specify the minimum amount of resources that a container needs to run properly. When you define resource requests in your pod specification, you're essentially telling the Kubernetes scheduler:

"My container needs at least this much CPU and memory to function correctly."

These requests play a critical role in the scheduling decisions that Kubernetes makes when placing pods on nodes in your cluster.

Key Concepts

Before diving into the details of resource requests, let's understand some fundamental concepts:

Resources in Kubernetes

Kubernetes primarily manages two types of resources:

  1. CPU - Measured in CPU units, where 1 CPU unit equals:

    • 1 vCPU/Core for cloud providers
    • 1 hyperthread on bare-metal Intel processors
  2. Memory - Typically measured in bytes, with common abbreviations being:

    • Mi (Mebibytes) - 2^20 bytes
    • Gi (Gibibytes) - 2^30 bytes

Requests vs. Limits

Kubernetes resource management has two key components:

  • Requests: The minimum resources guaranteed to the container
  • Limits: The maximum resources a container can use

This guide focuses primarily on requests, but understanding the relationship between requests and limits is important for comprehensive resource management.

How Resource Requests Work

Scheduling Phase

When you create a pod with resource requests, the Kubernetes scheduler considers these requests when deciding which node to place the pod on. The scheduler will only place the pod on nodes that have enough unallocated resources to satisfy your requests.
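This fit check can be sketched in a few lines of Python. Note that this is a toy model with made-up numbers, not the real scheduler, which also weighs taints, affinity rules, and many other predicates:

```python
def fits(node_allocatable, node_requested, pod_request):
    """Return True if the pod's requests fit in the node's remaining capacity.

    All values are plain numbers: CPU in millicores, memory in bytes.
    This models a single scheduler predicate, not the full algorithm.
    """
    for resource, requested in pod_request.items():
        free = node_allocatable[resource] - node_requested.get(resource, 0)
        if requested > free:
            return False
    return True

# A node with 2 CPU cores (2000m) and 4 GiB of memory, whose existing
# pods already request 1500m CPU and 3 GiB of memory.
node_alloc = {"cpu": 2000, "memory": 4 * 2**30}
node_used = {"cpu": 1500, "memory": 3 * 2**30}

print(fits(node_alloc, node_used, {"cpu": 250, "memory": 128 * 2**20}))  # True
print(fits(node_alloc, node_used, {"cpu": 750, "memory": 128 * 2**20}))  # False
```

The key point: the scheduler compares requests against *allocatable minus already-requested* resources, not against actual usage on the node.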

Running Phase

Once a pod is scheduled on a node:

  1. The requested resources are reserved for that pod
  2. The container runtime (such as containerd) is configured to ensure the pod gets at least the requested resources
  3. Even if the pod doesn't use all its requested resources, those resources remain allocated to it
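On Linux with cgroup v1, the CPU portion of this guarantee is implemented as a relative scheduling weight: the kubelet converts the CPU request into `cpu.shares` (cgroup v2 uses `cpu.weight` with a different mapping). A rough, illustrative sketch of the conversion:

```python
SHARES_PER_CPU = 1024   # cgroup v1 cpu.shares corresponding to one full core
MILLI_PER_CPU = 1000

def milli_cpu_to_shares(milli_cpu):
    # Mirrors how the kubelet maps a CPU request to cgroup v1 cpu.shares
    # (simplified; the real code also clamps to a minimum of 2 shares).
    return milli_cpu * SHARES_PER_CPU // MILLI_PER_CPU

print(milli_cpu_to_shares(250))   # 256
print(milli_cpu_to_shares(1000))  # 1024
```

Because shares are relative weights, a container with a 250m request is only guaranteed its proportional slice when the node's CPU is contended; when the node is idle, it can use more.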

Configuring Resource Requests

You configure resource requests in the pod specification, typically within a Deployment, StatefulSet, or other workload resource.

Basic Syntax

Here's how to specify resource requests in a container spec:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"

In this example:

  • The container requests 128 Mebibytes of memory
  • The container requests 250 millicpu (0.25 CPU cores)

CPU Request Units

CPU requests can be specified in various formats:

  • 0.5 or 500m - Half a CPU core
  • 1 - One full CPU core
  • 250m - Quarter of a CPU core (250 millicpu)

The m suffix stands for "milli," meaning 1/1000th of a CPU core.

Memory Request Units

Memory can be specified using several suffixes:

  • 128Mi - 128 Mebibytes
  • 1Gi - 1 Gibibyte
  • 512M - 512 Megabytes
  • 2G - 2 Gigabytes

Note the difference between powers of 2 (Mi, Gi) and powers of 10 (M, G).
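A quick Python check makes the gap between the two families concrete:

```python
# Powers of two (Mi, Gi) vs powers of ten (M, G)
mi = 2**20        # 1 Mi = 1,048,576 bytes
m = 10**6         # 1 M  = 1,000,000 bytes

print(512 * mi)            # 536870912 -> what "512Mi" requests
print(512 * m)             # 512000000 -> what "512M" requests
print(512 * mi - 512 * m)  # 24870912 bytes (~23.7 MiB difference)
```

Requesting `512M` when you meant `512Mi` leaves you almost 24 MiB short, which can matter for memory-sensitive workloads.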

Practical Examples

Let's walk through some practical examples to understand how requests work in real-world scenarios.

Example 1: Simple Web Application

For a basic web server with low traffic:

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"

This configuration:

  • Requests 64 MiB of memory per container
  • Requests 0.1 CPU cores per container
  • With 3 replicas, the total request is 192 MiB of memory and 0.3 CPU cores

Example 2: Database Application

For a database that requires more resources:

yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-db
spec:
  serviceName: "postgres"
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secrets
              key: password
        ports:
        - containerPort: 5432
          name: postgres
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: "10Gi"

This example:

  • Requests 1 GiB of memory for the database
  • Requests 0.5 CPU cores
  • Also includes a storage request of 10 GiB for persistent data

Best Practices for Setting Requests

Setting appropriate resource requests is crucial for efficient cluster operation. Here are some best practices:

1. Measure Actual Resource Usage

Before setting requests, monitor your application's actual resource usage using tools like:

  • Kubernetes Metrics Server
  • Prometheus
  • Resource usage data from kubectl top pods

This helps you set realistic request values.

2. Set Requests Based on Typical Workload

Set your requests based on the typical workload of your application, not peak usage. For peak usage, you can use resource limits (which we'll cover in a separate guide).

3. Start Conservative and Adjust

If you're unsure:

  1. Start with conservative request values
  2. Monitor application performance
  3. Adjust values based on observed behavior

4. Consider Application Startup Needs

Some applications require more resources during startup than during normal operation. Make sure your requests accommodate startup resource needs.

5. Be Mindful of Cluster Capacity

Always consider the total capacity of your cluster when setting requests. Overcommitting resources can lead to scheduling problems.

Common Mistakes to Avoid

1. Setting Requests Too Low

If you set requests too low:

  • The pod may be packed onto a node that cannot sustain its real usage
  • Under contention, the container can be CPU-throttled or OOM-killed, causing performance issues or unexpected restarts

2. Setting Requests Too High

If you set requests too high:

  • You waste cluster resources
  • Fewer pods can be scheduled on your cluster
  • You might hit scheduling limitations

3. Ignoring Resource Requests Entirely

Failing to set resource requests:

  • Makes the scheduler's job harder
  • Can lead to resource contention
  • May result in unpredictable application performance

4. Using the Wrong Units

Be careful about the units you use:

  • 512M vs 512Mi makes a difference
  • 0.5 CPU is equivalent to 500m CPU

Debugging Resource Request Issues

If your pod is stuck in a Pending state, it might be due to resource request issues:

  1. Check pod status:

    bash
    kubectl describe pod [pod-name]
  2. Look for messages like:

    0/3 nodes are available: 3 Insufficient cpu, 3 Insufficient memory.
  3. Either adjust your resource requests or add more capacity to your cluster.

Resource Requests in Multi-Container Pods

For pods with multiple containers, Kubernetes considers the sum of all container requests:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: app
    image: app:latest
    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
  - name: sidecar
    image: sidecar:latest
    resources:
      requests:
        memory: "64Mi"
        cpu: "50m"

Total pod requests:

  • CPU: 300m (250m + 50m)
  • Memory: 320Mi (256Mi + 64Mi)
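The arithmetic can be reproduced with a small, hypothetical helper that parses the simplified quantity strings used in this guide (the real `resource.Quantity` parser in Kubernetes accepts many more forms):

```python
def parse_cpu_millis(q):
    # "250m" -> 250, "1" -> 1000 (simplified; no fractional cores like "0.5")
    return int(q[:-1]) if q.endswith("m") else int(q) * 1000

def parse_mem_bytes(q):
    # Supports only the binary suffixes used in this guide.
    for suffix, factor in (("Gi", 2**30), ("Mi", 2**20), ("Ki", 2**10)):
        if q.endswith(suffix):
            return int(q[: -len(suffix)]) * factor
    return int(q)

containers = [
    {"cpu": "250m", "memory": "256Mi"},  # app
    {"cpu": "50m", "memory": "64Mi"},    # sidecar
]

total_cpu = sum(parse_cpu_millis(c["cpu"]) for c in containers)
total_mem = sum(parse_mem_bytes(c["memory"]) for c in containers)

print(f"{total_cpu}m")            # 300m
print(f"{total_mem // 2**20}Mi")  # 320Mi
```

The scheduler uses these per-pod sums when checking whether a node has room, so a heavy sidecar counts against placement just as much as the main container.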

Resource Requests vs. Namespace Quotas

Kubernetes allows setting resource quotas at the namespace level, which can limit the total resources requested by all pods in a namespace:

yaml
apiVersion: v1
kind: ResourceQuota
metadata:
name: compute-resources
namespace: development
spec:
hard:
requests.cpu: "2"
requests.memory: 2Gi

This quota ensures all pods in the "development" namespace can't request more than 2 CPU cores and 2 GiB of memory in total.
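Conceptually, the admission check behaves like this simplified sketch (the numbers are hypothetical; in a real cluster the ResourceQuota admission controller performs this accounting):

```python
# Quota from the ResourceQuota above: 2 CPU cores, 2 GiB of memory.
QUOTA = {"cpu_millis": 2000, "memory_bytes": 2 * 2**30}

def admits(used, pod):
    """True if the pod's requests fit under the namespace's remaining quota."""
    return all(used[k] + pod[k] <= QUOTA[k] for k in QUOTA)

# Pods already running in the namespace request 1.5 CPU and 1.5 GiB.
used = {"cpu_millis": 1500, "memory_bytes": 1536 * 2**20}

print(admits(used, {"cpu_millis": 400, "memory_bytes": 256 * 2**20}))  # True
print(admits(used, {"cpu_millis": 600, "memory_bytes": 256 * 2**20}))  # False
```

When a pod would push the namespace over its quota, the API server rejects it at creation time rather than leaving it Pending.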

Summary

Kubernetes resource requests are a fundamental part of the resource management system that help ensure pods are scheduled on nodes with sufficient resources. By properly configuring resource requests, you:

  • Help the scheduler make better decisions
  • Ensure your applications have the resources they need
  • Improve the stability and efficiency of your Kubernetes cluster

Remember these key points:

  • Requests specify the minimum resources your container needs
  • They affect scheduling decisions and resource allocation
  • Setting appropriate requests requires understanding your application's resource usage
  • Always measure and monitor to fine-tune your resource requests

Exercises

  1. Deploy a simple application without resource requests, then add appropriate requests and observe any differences in behavior.

  2. Create a namespace with resource quotas and test how it prevents over-provisioning of resources.

  3. Use kubectl top pods to monitor the actual resource usage of your applications and compare it with your configured requests.

  4. Create a multi-container pod with different resource requests for each container and observe how Kubernetes manages resources for the pod.

  5. Deliberately set resource requests higher than your node capacity and observe what happens when you try to schedule the pod.


