Kubernetes Resource Management

Introduction

Resource management is one of the most critical aspects of running applications on Kubernetes. When you deploy applications in a Kubernetes cluster, you're essentially sharing a pool of compute resources (CPU, memory, storage) among all your workloads. Without proper resource management, some applications might consume too many resources, leaving others resource-starved and potentially causing performance issues or outright failures.

In this guide, we'll explore how Kubernetes manages compute resources, how to configure resource requests and limits for your containers, and best practices to ensure your applications run efficiently and reliably.

Understanding Kubernetes Resources

Kubernetes primarily manages two types of resources:

  1. CPU - Measured in CPU units, where 1 CPU equals:

    • 1 vCPU/Core for cloud providers
    • 1 hyperthread on bare-metal Intel processors
  2. Memory - Measured in bytes, typically expressed as:

    • Mebibytes (Mi): 1 Mi = 2^20 bytes = 1,048,576 bytes
    • Gibibytes (Gi): 1 Gi = 2^30 bytes = 1,073,741,824 bytes
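These conversions can be expressed as a small helper. The following Python sketch is illustrative only, not the official Kubernetes quantity parser (which supports many more suffixes such as `k`, `M`, `G`, and `Ti`):

```python
# Minimal sketch of converting Kubernetes resource quantity strings
# into plain numbers. Illustrative helper, not the real parser.

def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity ("250m" or "2") to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # millicores -> cores
    return float(quantity)

def parse_memory(quantity: str) -> int:
    """Convert a memory quantity ("64Mi", "1Gi") to bytes."""
    suffixes = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}
    for suffix, factor in suffixes.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)  # plain byte count

print(parse_cpu("250m"))     # 0.25 cores
print(parse_memory("64Mi"))  # 67108864 bytes
```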

Resource Requests and Limits

Kubernetes uses two key concepts to manage container resources:

Resource Requests

A resource request specifies the minimum amount of resources that should be reserved for a container. When you specify a request for a container, the Kubernetes scheduler uses this information to decide which node to place the Pod on.

Resource Limits

A resource limit defines the maximum amount of resources that a container can use. If a container attempts to use more than its limit:

  • For CPU: the container is throttled, not killed
  • For Memory: the container may be terminated (OOMKilled)

Configuring Resource Requests and Limits

Resource requests and limits are configured at the container level within a Pod specification:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: resource-demo-container
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
```

In this example:

  • The container requests 64 MiB of memory and 0.25 CPU cores
  • The container is limited to 128 MiB of memory and 0.5 CPU cores

Note: CPU resources are specified in "millicores". 1000m equals 1 full CPU core.

How to Set Appropriate Values

Setting appropriate resource requests and limits requires a balance:

  1. Too low requests: Your application might not get the resources it needs
  2. Too high requests: Resources might be reserved but not used, wasting cluster capacity
  3. Too low limits: Your application might get throttled or OOMKilled
  4. Too high limits: You risk having "noisy neighbors" that can consume excessive resources

Steps to determine appropriate values:

  1. Profile your application to understand its resource usage patterns
  2. Start with a baseline (e.g., set requests at 50% of expected usage)
  3. Monitor resource usage and adjust accordingly
  4. Consider peak-to-average ratios when setting limits

Quality of Service (QoS) Classes

Kubernetes assigns each Pod to one of three QoS classes based on its resource configuration:

  1. Guaranteed: Every container has CPU and memory requests and limits, and requests equal limits
  2. Burstable: At least one container has a request or limit, but the Pod doesn't meet the Guaranteed criteria
  3. BestEffort: No container has any requests or limits

These classes influence eviction order when a node comes under resource pressure: BestEffort Pods are evicted first, then Burstable, with Guaranteed Pods evicted last.

Let's look at examples of Pods with different QoS classes:

Guaranteed QoS Example

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
spec:
  containers:
  - name: guaranteed-container
    image: nginx
    resources:
      requests:
        memory: "256Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "500m"
```

Burstable QoS Example

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
spec:
  containers:
  - name: burstable-container
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "256Mi"
        cpu: "500m"
```

BestEffort QoS Example

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-pod
spec:
  containers:
  - name: besteffort-container
    image: nginx
    # No resources specified
```

Practical Example: Resource Management for a Web Application

Let's create a more real-world example for a simple web application with an API backend and a database:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        resources:
          requests:
            memory: "64Mi"
            cpu: "100m"
          limits:
            memory: "128Mi"
            cpu: "200m"
        ports:
        - containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-backend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api-backend
  template:
    metadata:
      labels:
        app: api-backend
    spec:
      containers:
      - name: api-server
        image: my-api-image:v1
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        ports:
        - containerPort: 8080
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: "database"
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:13
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: password
```

Resource Requirements Analysis:

  • Web Frontend: Lightweight, serving static content

    • Low CPU and memory needs (100m CPU, 64Mi memory)
    • Scales horizontally (3 replicas)
  • API Backend: Medium workload with business logic

    • Moderate CPU and memory (250m CPU, 256Mi memory)
    • Some headroom for traffic spikes (limits at 2x requests)
  • Database: Resource-intensive with data persistence

    • Higher memory requirements (512Mi requested, 1Gi limit)
    • CPU needs depend on query complexity

Monitoring Resource Usage

To effectively manage resources, you need to monitor actual usage. The Kubernetes Metrics Server collects resource metrics from Kubelets and makes them available through the Metrics API.

You can check resource usage with:

```shell
kubectl top pods
```

Example output:

```
NAME                            CPU(cores)   MEMORY(bytes)
web-frontend-75d9546fd6-8z9vk   12m          78Mi
web-frontend-75d9546fd6-qjvtx   10m          76Mi
web-frontend-75d9546fd6-xvd4z   11m          75Mi
api-backend-85d7fd564c-n8p5g    120m         180Mi
api-backend-85d7fd564c-t7pkx    115m         175Mi
database-0                      210m         420Mi
```

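One rough way to act on this output is to compare each pod's usage against its configured limit. The following Python sketch parses lines shaped like the output above; the limits table and name-prefix matching are assumptions based on the example manifests in this guide, not a kubectl feature:

```python
# Sketch: parse `kubectl top pods` lines and report how close each
# pod's memory usage is to its workload's configured limit.

def usage_pct(line, limits_mi):
    """Return (pod name, memory usage as % of its workload's limit)."""
    name, _cpu, mem = line.split()
    mem_mi = int(mem.rstrip("Mi"))
    # Match the pod to its workload by name prefix (e.g. "database-0").
    workload = next(w for w in limits_mi if name.startswith(w))
    return name, 100 * mem_mi / limits_mi[workload]

limits_mi = {"web-frontend": 128, "api-backend": 512, "database": 1024}
for line in [
    "web-frontend-75d9546fd6-8z9vk 12m 78Mi",
    "api-backend-85d7fd564c-n8p5g 120m 180Mi",
    "database-0 210m 420Mi",
]:
    pod, pct = usage_pct(line, limits_mi)
    print(f"{pod}: {pct:.0f}% of memory limit")
```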
Based on this output, you might decide:

  • Frontend pods are well within their limits
  • API backend has comfortable headroom
  • Database usage is approaching the requested amount but still has buffer before hitting limits

Resource Management Best Practices

  1. Always set resource requests for production workloads
  2. Don't guess resource values - measure and monitor
  3. Set memory limits to prevent pods from consuming too much memory
  4. Be cautious with CPU limits as they can cause throttling
  5. Aim for Guaranteed QoS for critical applications
  6. Reserve resources for system components and Kubernetes itself
  7. Use namespace resource quotas to partition your cluster
  8. Set default resource requests/limits using LimitRange
  9. Regularly review and adjust resource configurations

Namespace Resource Quotas

You can limit resources at the namespace level:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-space
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 4Gi
    limits.cpu: "8"
    limits.memory: 8Gi
    pods: "20"
```

This quota ensures that all pods in the "team-space" namespace cannot collectively request more than 4 CPU cores and 4 GiB of memory, cannot set combined limits above 8 cores and 8 GiB, and that the namespace holds at most 20 pods.
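For example, a Pod like the following (illustrative manifest) would be rejected at admission, because its 5-core CPU request alone exceeds the namespace's requests.cpu quota of 4:

```yaml
# Illustrative: rejected by the team-quota ResourceQuota, since the
# CPU request (5 cores) exceeds the namespace's 4-core budget.
apiVersion: v1
kind: Pod
metadata:
  name: oversized-pod
  namespace: team-space
spec:
  containers:
  - name: big
    image: nginx
    resources:
      requests:
        cpu: "5"
        memory: "256Mi"
      limits:
        cpu: "5"
        memory: "512Mi"
```

Note that once a quota constrains CPU or memory, every new Pod in the namespace must specify requests and limits for those resources (or receive defaults from a LimitRange), or creation is rejected.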

Default Resource Limits with LimitRange

You can set default resource values for containers in a namespace:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-space
spec:
  limits:
  - default:
      cpu: "300m"
      memory: "256Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container
```

With this LimitRange, any container created without specified resource requests/limits will get these default values.
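For instance, a container created in team-space with no resources block would effectively be admitted with this resources section injected (illustrative, showing only the defaulted fields):

```yaml
# What a bare container effectively receives once the LimitRange
# defaults are applied at admission:
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "300m"
    memory: "256Mi"
```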

Common Mistakes to Avoid

  1. Not setting any resources - leads to resource contention
  2. Setting requests = limits for CPU - can lead to throttling
  3. Setting memory limits too low - causes container restarts
  4. Overestimating resources - wastes cluster capacity
  5. Using the same values for all applications - different workloads have different needs

Summary

Effective resource management is crucial for running reliable and efficient Kubernetes workloads. By properly configuring resource requests and limits, you ensure that:

  • Your applications have the resources they need to run reliably
  • The Kubernetes scheduler can make informed placement decisions
  • Your cluster resources are used efficiently
  • Critical applications are protected during resource constraints

Remember these key points:

  • Requests are guaranteed minimums used for scheduling
  • Limits are enforced maximums that control resource usage
  • QoS classes determine eviction priority during resource pressure
  • Always monitor and adjust resource configurations based on actual usage

Additional Exercises

  1. Resource Profiling Exercise:

    • Deploy a sample application without resource specifications
    • Monitor its resource usage for 24 hours with kubectl top pods
    • Based on the observations, set appropriate requests and limits
  2. QoS Classification Practice:

    • Create three different pod specifications, each targeting a different QoS class
    • Verify the QoS class with kubectl get pod <pod-name> -o yaml | grep qosClass
  3. Resource Quota Challenge:

    • Create a namespace with a specific resource quota
    • Try to deploy workloads that would exceed the quota
    • Modify the workloads to fit within the quota
