Terraform for Kubernetes
Introduction
Kubernetes has become the de facto standard for container orchestration, but setting up and managing Kubernetes clusters can be complex and error-prone when done manually. Terraform, as an Infrastructure as Code (IaC) tool, provides a solution to this challenge by allowing you to define your Kubernetes infrastructure in code, making it reproducible, version-controlled, and automated.
In this guide, we'll explore how to use Terraform to provision and manage Kubernetes clusters and resources. Whether you're working with a cloud provider's managed Kubernetes service or setting up your own clusters, Terraform can streamline your workflow and ensure consistency across environments.
Prerequisites
Before diving into Terraform for Kubernetes, you should have:
- Basic understanding of Terraform concepts (providers, resources, variables)
- Familiarity with Kubernetes fundamentals
- Terraform CLI installed (version 1.0 or later is recommended for the examples in this guide)
- kubectl installed for interacting with clusters
- Access to a cloud provider account (AWS, GCP, Azure) or local Kubernetes environment
Understanding the Terraform Kubernetes Provider
Terraform offers multiple ways to interact with Kubernetes:
- Cloud provider-specific Kubernetes services: Using AWS EKS, Google GKE, or Azure AKS providers
- The Kubernetes provider: Managing resources within an existing cluster
- The Helm provider: Deploying applications using Helm charts
Let's examine each approach in detail.
Provisioning Managed Kubernetes Clusters
AWS EKS Cluster
Here's how to create an Amazon EKS cluster using Terraform:
provider "aws" {
region = "us-west-2"
}
module "eks" {
source = "terraform-aws-modules/eks/aws"
version = "~> 19.0"
cluster_name = "my-eks-cluster"
cluster_version = "1.27"
vpc_id = "vpc-abcde12345"
subnet_ids = ["subnet-abcde12345", "subnet-12345abcde"]
eks_managed_node_groups = {
default = {
min_size = 1
max_size = 3
desired_size = 2
instance_types = ["t3.medium"]
}
}
tags = {
Environment = "development"
Application = "myapp"
}
}
# Output the kubeconfig for connecting to the cluster
output "kubeconfig" {
description = "kubectl configuration"
value = module.eks.kubeconfig
sensitive = true
}
After running `terraform apply`, you'll have a fully functional EKS cluster. To configure kubectl, run `aws eks update-kubeconfig --region us-west-2 --name my-eks-cluster`.
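If you'd rather manage in-cluster resources from the same configuration instead of relying on a kubeconfig file, you can point the Kubernetes provider at the module's outputs. This is a minimal sketch, assuming the v19 module outputs cluster_name, cluster_endpoint, and cluster_certificate_authority_data:

# Obtain a short-lived token for the new cluster
data "aws_eks_cluster_auth" "this" {
  name = module.eks.cluster_name
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.this.token
}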
Google GKE Cluster
For Google Cloud Platform, here's how to create a GKE cluster:
provider "google" {
project = "my-project-id"
region = "us-central1"
}
resource "google_container_cluster" "primary" {
name = "my-gke-cluster"
location = "us-central1"
# We can't create a completely empty cluster, so we create the smallest possible default node pool
# and immediately delete it
remove_default_node_pool = true
initial_node_count = 1
network = "default"
subnetwork = "default"
}
resource "google_container_node_pool" "primary_nodes" {
name = "primary-node-pool"
location = "us-central1"
cluster = google_container_cluster.primary.name
node_count = 2
node_config {
preemptible = true
machine_type = "e2-medium"
oauth_scopes = [
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring",
"https://www.googleapis.com/auth/devstorage.read_only",
]
}
}
# Output the kubeconfig
output "kubeconfig" {
value = "Run: gcloud container clusters get-credentials ${google_container_cluster.primary.name} --region ${google_container_cluster.primary.location}"
}
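Alternatively, the Kubernetes provider can authenticate to the new cluster directly, without a kubeconfig file. A minimal sketch, assuming the cluster resource above and the standard google_client_config data source:

# Access token for the current gcloud credentials
data "google_client_config" "default" {}

provider "kubernetes" {
  host                   = "https://${google_container_cluster.primary.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
}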
Azure AKS Cluster
For Azure, you can provision an AKS cluster like this:
provider "azurerm" {
features {}
}
resource "azurerm_resource_group" "example" {
name = "aks-resource-group"
location = "East US"
}
resource "azurerm_kubernetes_cluster" "example" {
name = "my-aks-cluster"
location = azurerm_resource_group.example.location
resource_group_name = azurerm_resource_group.example.name
dns_prefix = "myakscluster"
default_node_pool {
name = "default"
node_count = 2
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
}
tags = {
Environment = "Production"
}
}
output "kube_config" {
value = azurerm_kubernetes_cluster.example.kube_config_raw
sensitive = true
}
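As with EKS and GKE, the Kubernetes provider can also be pointed at the cluster's exported credentials instead of a kubeconfig file. A minimal sketch, assuming the AKS resource above:

provider "kubernetes" {
  host                   = azurerm_kubernetes_cluster.example.kube_config[0].host
  client_certificate     = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].client_certificate)
  client_key             = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].client_key)
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.example.kube_config[0].cluster_ca_certificate)
}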
Managing Kubernetes Resources with Terraform
Once you have a cluster, you can use the Kubernetes provider to manage resources within it:
provider "kubernetes" {
config_path = "~/.kube/config" # Path to your kubeconfig file
}
resource "kubernetes_namespace" "example" {
metadata {
name = "my-application"
}
}
resource "kubernetes_deployment" "example" {
metadata {
name = "web-app"
namespace = kubernetes_namespace.example.metadata[0].name
labels = {
app = "web-app"
}
}
spec {
replicas = 2
selector {
match_labels = {
app = "web-app"
}
}
template {
metadata {
labels = {
app = "web-app"
}
}
spec {
container {
image = "nginx:1.21"
name = "web"
port {
container_port = 80
}
resources {
limits = {
cpu = "0.5"
memory = "512Mi"
}
requests = {
cpu = "0.25"
memory = "256Mi"
}
}
}
}
}
}
}
resource "kubernetes_service" "example" {
metadata {
name = "web-app-service"
namespace = kubernetes_namespace.example.metadata[0].name
}
spec {
selector = {
app = kubernetes_deployment.example.spec[0].template[0].metadata[0].labels.app
}
port {
port = 80
target_port = 80
}
type = "ClusterIP"
}
}
Deploying Applications with Helm Provider
Helm charts are a popular way to package Kubernetes applications. Terraform can deploy them through the Helm provider:
provider "helm" {
kubernetes {
config_path = "~/.kube/config"
}
}
resource "helm_release" "prometheus" {
name = "prometheus"
repository = "https://prometheus-community.github.io/helm-charts"
chart = "prometheus"
namespace = "monitoring"
create_namespace = true
set {
name = "server.persistentVolume.enabled"
value = "false"
}
set {
name = "alertmanager.persistentVolume.enabled"
value = "false"
}
}
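For more than a handful of overrides, the release's values argument is usually easier to maintain than repeated set blocks. Here is the same release rewritten as a sketch that passes the two overrides as an inline values document (the value names assume the upstream prometheus chart):

resource "helm_release" "prometheus" {
  name             = "prometheus"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "prometheus"
  namespace        = "monitoring"
  create_namespace = true

  # Equivalent to the set blocks above, expressed as a single values document
  values = [
    yamlencode({
      server = {
        persistentVolume = { enabled = false }
      }
      alertmanager = {
        persistentVolume = { enabled = false }
      }
    })
  ]
}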
Creating a Complete Environment
Let's pull everything together to create a complete example that:
- Provisions a GKE cluster
- Deploys a Namespace and a ConfigMap
- Deploys a web application via Helm
# providers.tf
provider "google" {
  project = var.project_id
  region  = var.region
}

# Access token used by the Kubernetes and Helm providers
data "google_client_config" "default" {}

provider "kubernetes" {
  host                   = "https://${module.gke.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.gke.ca_certificate)
}

provider "helm" {
  kubernetes {
    host                   = "https://${module.gke.endpoint}"
    token                  = data.google_client_config.default.access_token
    cluster_ca_certificate = base64decode(module.gke.ca_certificate)
  }
}

# GKE cluster
module "gke" {
  source     = "terraform-google-modules/kubernetes-engine/google"
  project_id = var.project_id
  name       = "my-app-cluster"
  region     = var.region
  network    = "default"
  subnetwork = "default"

  # Names of secondary IP ranges on the subnetwork for pods and services
  ip_range_pods     = ""
  ip_range_services = ""

  remove_default_node_pool = true

  node_pools = [
    {
      name         = "default-node-pool"
      machine_type = "e2-medium"
      min_count    = 1
      max_count    = 3
      disk_size_gb = 30
      autoscaling  = true
      auto_repair  = true
      auto_upgrade = true
    },
  ]
}

# Application namespace
resource "kubernetes_namespace" "app" {
  metadata {
    name = "my-application"
  }

  depends_on = [module.gke]
}

# ConfigMap for application configuration
resource "kubernetes_config_map" "app_config" {
  metadata {
    name      = "app-config"
    namespace = kubernetes_namespace.app.metadata[0].name
  }

  data = {
    "config.json" = jsonencode({
      database = {
        host = "db.example.com"
        port = 5432
      }
      features = {
        enableAuth = true
      }
    })
  }
}

# Deploy web application with Helm
resource "helm_release" "web_app" {
  name       = "web-app"
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "nginx"
  namespace  = kubernetes_namespace.app.metadata[0].name

  set {
    name  = "replicaCount"
    value = "2"
  }

  set {
    name  = "service.type"
    value = "LoadBalancer"
  }
}
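The providers and module above reference var.project_id and var.region, which aren't shown. A minimal variables.tf to make the example self-contained could look like this (the default region is only an illustration):

# variables.tf
variable "project_id" {
  description = "GCP project to deploy into"
  type        = string
}

variable "region" {
  description = "GCP region for the cluster"
  type        = string
  default     = "us-central1"
}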
Advanced Techniques
Using Terraform to Manage Multiple Environments
One of Terraform's strengths is managing multiple environments (dev, staging, production) with minimal code duplication:
# variables.tf
variable "environment" {
  description = "Environment (dev, staging, prod)"
  type        = string
}

# environments/dev.tfvars
environment        = "dev"
cluster_node_count = 1
enable_autoscaling = false

# environments/prod.tfvars
environment        = "prod"
cluster_node_count = 3
enable_autoscaling = true

# main.tf
locals {
  namespace = "my-app-${var.environment}"
}

resource "kubernetes_namespace" "app" {
  metadata {
    name = local.namespace
  }
}

# Use with: terraform apply -var-file=environments/dev.tfvars
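The tfvars files also set cluster_node_count and enable_autoscaling, so those variables need declarations as well. A minimal sketch of what they might look like (feed them into the node pool settings of whichever cluster module you use, for example the GKE module shown earlier):

variable "cluster_node_count" {
  description = "Number of nodes per environment"
  type        = number
}

variable "enable_autoscaling" {
  description = "Whether node pool autoscaling is enabled"
  type        = bool
}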
Kubernetes Custom Resource Definitions (CRDs)
Terraform can also manage custom resources:
resource "kubernetes_manifest" "prometheus_servicemonitor" {
manifest = {
apiVersion = "monitoring.coreos.com/v1"
kind = "ServiceMonitor"
metadata = {
name = "web-app-monitor"
namespace = kubernetes_namespace.app.metadata[0].name
}
spec = {
selector = {
matchLabels = {
app = "web-app"
}
}
endpoints = [{
port = "http"
interval = "15s"
}]
}
}
}
Dynamic Blocks for Pod Specifications
For complex pod configurations, dynamic blocks can be helpful:
resource "kubernetes_deployment" "web" {
# ... other deployment config ...
spec {
template {
spec {
dynamic "volume" {
for_each = var.config_maps
content {
name = "${volume.key}-config"
config_map {
name = volume.value
}
}
}
container {
# ... container config ...
dynamic "volume_mount" {
for_each = var.config_maps
content {
name = "${volume_mount.key}-config"
mount_path = "/etc/config/${volume_mount.key}"
read_only = true
}
}
}
}
}
}
}
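The example assumes a config_maps variable that maps a logical name to an existing ConfigMap. A hypothetical declaration and sample value:

variable "config_maps" {
  description = "Map of logical names to ConfigMap names to mount"
  type        = map(string)
  default = {
    # Mounts the "app-config" ConfigMap at /etc/config/app
    app = "app-config"
  }
}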
Workflow Diagram
A typical Terraform and Kubernetes workflow looks like this: write or update your configuration, run terraform init and terraform plan to preview changes, apply them with terraform apply, then verify the cluster and its workloads with kubectl before iterating.
Best Practices
1. Modularize Your Terraform Code
Break your infrastructure into logical modules:
module "k8s_cluster" {
source = "./modules/kubernetes-cluster"
# ...
}
module "k8s_monitoring" {
source = "./modules/kubernetes-monitoring"
# ...
depends_on = [module.k8s_cluster]
}
module "k8s_application" {
source = "./modules/kubernetes-application"
# ...
depends_on = [module.k8s_cluster]
}
2. Use State Management
For team environments, use remote state storage:
terraform {
  backend "gcs" {
    bucket = "my-terraform-state"
    prefix = "terraform/state/kubernetes"
  }
}
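On AWS, the equivalent is the S3 backend. A minimal sketch, assuming an existing state bucket and a DynamoDB table for state locking:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "terraform/state/kubernetes.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-locks"
  }
}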
3. Pin Versions
Always pin provider and module versions:
terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.21.1"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "2.10.1"
    }
  }
}
4. Use Variables and Outputs
Parameterize your configurations:
variable "cluster_name" {
description = "Name of the Kubernetes cluster"
type = string
default = "my-cluster"
}
output "cluster_endpoint" {
description = "Endpoint for the Kubernetes API server"
value = module.gke.endpoint
}
5. Use Data Sources for Dynamic Configuration
data "kubernetes_service" "ingress_nginx" {
metadata {
name = "ingress-nginx-controller"
namespace = "ingress-nginx"
}
depends_on = [helm_release.ingress_nginx]
}
output "load_balancer_ip" {
value = data.kubernetes_service.ingress_nginx.status.0.load_balancer.0.ingress.0.ip
}
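The data source above depends on an ingress_nginx release that isn't shown. A hypothetical definition it could refer to, using the upstream ingress-nginx chart (the service name "ingress-nginx-controller" is the chart's default for a release named "ingress-nginx"):

resource "helm_release" "ingress_nginx" {
  name             = "ingress-nginx"
  repository       = "https://kubernetes.github.io/ingress-nginx"
  chart            = "ingress-nginx"
  namespace        = "ingress-nginx"
  create_namespace = true
}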
Common Issues and Solutions
Issue: Configuration Drift
Solution: Use `terraform plan` regularly to detect drift between your configuration and the live cluster, and `terraform apply` to reconcile it. For fields that are intentionally changed outside Terraform, such as replica counts managed by an autoscaler, exclude them with `ignore_changes`, as sketched below.
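A minimal sketch, assuming the web-app Deployment from earlier has its replica count managed by a HorizontalPodAutoscaler rather than by Terraform:

resource "kubernetes_deployment" "example" {
  # ... metadata and spec as shown earlier ...

  lifecycle {
    # Don't report externally managed replica counts as drift
    ignore_changes = [spec[0].replicas]
  }
}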