Kubernetes High Availability
Introduction
High Availability (HA) is a critical concept in production Kubernetes environments. It refers to the ability of a system to continue operating without interruption when one or more of its components fail. In the context of Kubernetes, high availability means ensuring that your cluster remains operational even if some nodes, control plane components, or other parts of the infrastructure experience failures.
This guide will walk you through the key concepts, architecture patterns, and implementation steps to create highly available Kubernetes clusters that can withstand various types of failures while maintaining your applications' uptime.
Why High Availability Matters
Before diving into the technical details, let's understand why high availability is essential:
- Minimize Downtime: Ensures your applications remain accessible even during infrastructure failures
- Business Continuity: Prevents service interruptions that could impact users or revenue
- SLA Compliance: Helps meet service level agreements that specify uptime requirements
- Disaster Recovery: Provides resilience against various failure scenarios from hardware issues to zone outages
Understanding Kubernetes Cluster Architecture
To implement high availability, we first need to understand the standard Kubernetes architecture:
In a basic setup, you have:
- Control Plane (Master) Components:
  - API Server: The entry point for all REST commands
  - etcd: The cluster's database that stores all cluster state
  - Controller Manager: Runs controller processes
  - Scheduler: Assigns workloads to nodes
- Worker Nodes:
  - kubelet: Ensures containers are running in a Pod
  - kube-proxy: Maintains network rules
  - Container Runtime: Software that runs containers (e.g., containerd)
A non-HA Kubernetes cluster typically has a single control plane node, which creates a single point of failure.
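On a kubeadm-built cluster, the control plane components run as static pods in the kube-system namespace, so you can see exactly these pieces with kubectl (pod names below are illustrative):
kubectl get pods -n kube-system -o wide
# Typical entries: kube-apiserver-<node>, etcd-<node>, kube-controller-manager-<node>,
# kube-scheduler-<node>, plus kube-proxy and the CNI pods running on every node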
High Availability Architecture
A highly available Kubernetes cluster has the following key characteristics:
- Multiple Control Plane Nodes: At least three control plane nodes to provide redundancy
- Distributed etcd: Either stacked with control plane or external as a separate cluster
- Load Balancer: Distributes traffic across the API servers
- Multiple Worker Nodes: Distributed across failure domains
In a typical HA topology, a load balancer sits in front of three or more control plane nodes, each running its own API server, controller manager, scheduler, and (in the stacked model) etcd member, while worker nodes reach the control plane only through the load balancer's address.
HA Control Plane Components
API Server HA
The Kubernetes API server is stateless and can run in an active-active configuration:
- Multiple instances can run simultaneously
- Traffic is distributed via a load balancer
- Each instance advertises its own node IP, while clients reach the group through the shared controlPlaneEndpoint behind the load balancer
Example configuration in a kubeadm init file:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.26.0
controlPlaneEndpoint: "kube-api-lb.example.com:6443"
apiServer:
  extraArgs:
    bind-address: "0.0.0.0"   # listen on all interfaces; each node's advertise address is set per node
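The node-specific advertise address is not set in ClusterConfiguration; kubeadm takes it from the per-node InitConfiguration (or JoinConfiguration). A minimal sketch for the first control plane node, where the IP is a placeholder for that node's own address:
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.0.1.1   # this control plane node's own IP
  bindPort: 6443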
etcd High Availability
etcd is the most critical component for HA as it stores all cluster state. It uses the Raft consensus algorithm and requires an odd number of nodes (typically 3, 5, or 7) to maintain quorum: a cluster of n members needs floor(n/2) + 1 of them available, so 3 members tolerate the loss of 1 node and 5 tolerate the loss of 2.
There are two deployment options:
- Stacked etcd (easier): etcd runs on the same nodes as other control plane components
- External etcd (more robust): etcd runs on dedicated nodes
Stacked etcd Configuration Example
# Values shown are for the first control plane node (cp1); each member uses its own IPs,
# and in practice kubeadm fills most of these etcd settings in automatically
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  local:
    dataDir: "/var/lib/etcd"
    extraArgs:
      initial-cluster: "cp1=https://10.0.1.1:2380,cp2=https://10.0.1.2:2380,cp3=https://10.0.1.3:2380"
      initial-cluster-state: new
      name: "cp1"
      listen-peer-urls: "https://10.0.1.1:2380"
      listen-client-urls: "https://10.0.1.1:2379,https://127.0.0.1:2379"
      advertise-client-urls: "https://10.0.1.1:2379"
      initial-advertise-peer-urls: "https://10.0.1.1:2380"
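Once all three members are up, you can check cluster health from any control plane node with etcdctl (the certificate paths below are the kubeadm defaults):
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list -w table

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health --cluster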
Controller Manager and Scheduler HA
These components run in an active-passive configuration:
- Multiple instances run simultaneously, but only one is active (leader)
- Leadership is determined through a leader election process
- No load balancer needed as they only communicate with the API server
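In clusters that use the default Lease-based leader election, you can see which instance currently holds leadership by inspecting the coordination Leases these components maintain:
kubectl -n kube-system get lease kube-controller-manager kube-scheduler
# The HOLDER column shows which control plane node's instance is currently active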
Implementing HA with kubeadm
Let's go through a step-by-step guide to create an HA cluster using kubeadm, which is the most common approach for beginners:
Prerequisites
- At least 3 machines for control plane nodes
- At least 2 machines for worker nodes
- A load balancer (can be hardware or software like HAProxy/Nginx)
- Proper network connectivity between all nodes
Step 1: Set Up the Load Balancer
Here's a simple example using HAProxy:
frontend kubernetes-frontend
    bind *:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server cp1 10.0.1.1:6443 check fall 3 rise 2
    server cp2 10.0.1.2:6443 check fall 3 rise 2
    server cp3 10.0.1.3:6443 check fall 3 rise 2
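Note that a single HAProxy host is itself a single point of failure. A common mitigation is to run HAProxy on two machines and front them with a floating virtual IP managed by keepalived. The sketch below assumes an interface named eth0, virtual router ID 51, and 10.0.1.100 as the VIP, all of which you would adapt to your environment:
# /etc/keepalived/keepalived.conf on the primary load balancer host
vrrp_instance VI_1 {
    state MASTER          # use BACKUP with a lower priority on the second host
    interface eth0
    virtual_router_id 51
    priority 101
    advert_int 1
    virtual_ipaddress {
        10.0.1.100        # point your controlPlaneEndpoint DNS name at this VIP
    }
}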
Step 2: Initialize the First Control Plane Node
Create a configuration file called kubeadm-config.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.26.0
controlPlaneEndpoint: "kube-lb.example.com:6443"
networking:
  podSubnet: 192.168.0.0/16
Initialize the first control plane:
sudo kubeadm init --config=kubeadm-config.yaml --upload-certs
The output will include a command to join additional control plane nodes and worker nodes.
Step 3: Set Up kubectl on the First Control Plane
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Step 4: Install a Pod Network Add-on
For example, Calico:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
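Before joining more nodes, it helps to wait until the network add-on pods are running and the first control plane node reports Ready (the pod names depend on the CNI you installed):
kubectl get pods -n kube-system --watch   # wait for the CNI pods (e.g. calico-node) to reach Running
kubectl get nodes                         # the first control plane node should now be Ready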
Step 5: Join Additional Control Plane Nodes
Use the command from the output of the kubeadm init command:
sudo kubeadm join kube-lb.example.com:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--control-plane \
--certificate-key <cert-key>
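The token and certificate key are short-lived (the uploaded certificates expire after two hours and the bootstrap token after 24 hours by default), so if they have expired you can regenerate them on the first control plane node:
# Re-upload control plane certificates and print a fresh certificate key
sudo kubeadm init phase upload-certs --upload-certs
# Print a fresh join command (append --control-plane --certificate-key <key> for control plane nodes)
sudo kubeadm token create --print-join-command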
Step 6: Join Worker Nodes
sudo kubeadm join kube-lb.example.com:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash>
Step 7: Verify Your HA Setup
Check the status of your nodes:
kubectl get nodes
Output example:
NAME    STATUS   ROLES           AGE   VERSION
cp1     Ready    control-plane   10m   v1.26.0
cp2     Ready    control-plane   8m    v1.26.0
cp3     Ready    control-plane   6m    v1.26.0
node1   Ready    <none>          4m    v1.26.0
node2   Ready    <none>          3m    v1.26.0
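You can also confirm that each control plane node is running its own API server and etcd member (pod names follow the kubeadm <component>-<node> convention):
kubectl get pods -n kube-system -o wide | grep -E 'kube-apiserver|etcd'
# Expect one kube-apiserver-<node> and one etcd-<node> pod per control plane node, all Running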
etcd Backup and Recovery
An essential part of high availability is having a backup and recovery plan for etcd.
Backing Up etcd
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /backup/etcd-snapshot-$(date +%Y-%m-%d-%H-%M-%S).db
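After taking a snapshot it is worth verifying it, and in practice backups are usually scheduled rather than run by hand. A sketch, assuming a /backup directory and a hypothetical wrapper script /usr/local/bin/etcd-backup.sh that contains the snapshot save command above:
# Verify a snapshot is readable and inspect its size, revision, and key count
ETCDCTL_API=3 etcdctl snapshot status /backup/<snapshot-file>.db -w table

# Root crontab entry (crontab -e): take a snapshot every day at 02:00
0 2 * * * /usr/local/bin/etcd-backup.sh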
Restoring etcd from Backup
# Stop the kubelet; with stacked etcd the etcd member runs as a static pod,
# so stop its container as well (crictl shown here for a containerd runtime)
sudo systemctl stop kubelet
sudo crictl stop $(sudo crictl ps -q --name etcd)
# Restore the snapshot (shown for the first member, cp1; on a multi-member
# cluster each member is restored with its own --name and peer URLs)
ETCDCTL_API=3 etcdctl \
--data-dir=/var/lib/etcd-backup \
--name="cp1" \
--initial-cluster="cp1=https://10.0.1.1:2380,cp2=https://10.0.1.2:2380,cp3=https://10.0.1.3:2380" \
--initial-cluster-token="etcd-cluster-1" \
--initial-advertise-peer-urls="https://10.0.1.1:2380" \
snapshot restore /backup/etcd-snapshot.db
# Update configuration and restart services
sudo mv /var/lib/etcd /var/lib/etcd.old
sudo mv /var/lib/etcd-backup /var/lib/etcd
sudo systemctl start kubelet
Best Practices for Kubernetes HA
- Use Odd Number of Control Plane Nodes: Always use 3, 5, or 7 control plane nodes to maintain proper quorum
- Distribute Across Failure Domains: Spread nodes across availability zones
- Network Redundancy: Implement redundant network paths
- Regular Backups: Schedule automated etcd backups
- Monitoring and Alerting: Implement monitoring for all HA components
- Automated Recovery: Use tools like Cluster API for self-healing capabilities
- Practice DR Scenarios: Regularly test failure scenarios
Practical Example: Testing Failure Scenarios
Let's simulate a control plane node failure to see high availability in action:
- Check the current status:
kubectl get nodes
kubectl get pods -n kube-system
- Simulate a control plane failure (do this in a test environment):
# On one of the control plane nodes
sudo systemctl stop kubelet
sudo systemctl stop containerd   # or docker, depending on your container runtime
- Verify the cluster remains functional:
kubectl get nodes
kubectl create deployment nginx --image=nginx --replicas=3
kubectl get pods
You should observe:
- The failed node shows as NotReady
- Leader election promotes an instance on a surviving node to be the active controller manager and scheduler
- New deployments continue to work
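To confirm that leadership actually moved, you can inspect the controller manager's Lease again and check that holderIdentity no longer points at the failed node:
kubectl -n kube-system get lease kube-controller-manager -o jsonpath='{.spec.holderIdentity}{"\n"}'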
Advanced HA Configurations
Multi-Zone and Multi-Region Clusters
For enhanced availability, you can distribute your cluster across multiple availability zones or even regions. Cloud providers label nodes with topology.kubernetes.io/zone and topology.kubernetes.io/region, and the scheduler can use those labels to spread workloads across failure domains, as sketched below.
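As one illustration, a Deployment can ask the scheduler to spread its replicas evenly across zones. This is a minimal sketch, assuming your nodes carry the standard topology.kubernetes.io/zone label:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule   # refuse to schedule rather than pile into one zone
          labelSelector:
            matchLabels:
              app: web
      containers:
        - name: web
          image: nginx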
Using External etcd Clusters
For maximum resilience, separate etcd from your control plane:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  external:
    endpoints:
      - https://10.0.1.10:2379
      - https://10.0.1.11:2379
      - https://10.0.1.12:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
Summary
Kubernetes High Availability is essential for production deployments where uptime is critical. By implementing redundant control plane nodes, distributed etcd storage, proper load balancing, and following best practices, you can create resilient clusters that continue operating even when components fail.
Remember these key points:
- HA requires at least 3 control plane nodes
- etcd quorum is critical for cluster health
- Load balancing the API server is necessary
- Regular backups and testing are essential
- Distributing components across failure domains enhances reliability
Practice Exercises
- Set up a 3-node HA Kubernetes cluster using kubeadm in a test environment
- Create and restore an etcd backup
- Simulate different failure scenarios (control plane node failure, network partition)
- Design a multi-zone HA architecture for a production environment
- Implement automated etcd backups using cron jobs