Terraform Azure AKS
Introduction
Azure Kubernetes Service (AKS) is a managed Kubernetes offering that simplifies deploying, managing, and scaling containerized applications on Azure. Terraform allows you to define your AKS infrastructure as code, making it reproducible, version-controlled, and easier to manage across environments.
In this guide, we'll explore how to:
- Deploy an AKS cluster using Terraform
- Configure key AKS features through Terraform
- Implement best practices for production-ready clusters
- Manage your AKS infrastructure through its lifecycle
Prerequisites
Before we begin, ensure you have:
- An Azure account with an active subscription
- Terraform installed (version 1.0.0+)
- Azure CLI installed and configured
- kubectl installed (for interacting with your cluster)
Project Structure
Let's start by creating a well-organized project structure:
terraform-aks-project/
├── main.tf           # Main Terraform configuration
├── variables.tf      # Input variables
├── outputs.tf        # Output values
├── providers.tf      # Provider configuration
└── terraform.tfvars  # Variable values (not in version control)
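Because terraform.tfvars typically carries environment-specific (and sometimes sensitive) values, it should be excluded from version control. A minimal .gitignore sketch for this layout:

```
# .gitignore
terraform.tfvars
.terraform/
*.tfstate
*.tfstate.backup
```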
Provider Configuration
First, let's set up our providers.tf file:
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}
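By default, Terraform keeps its state in a local file. For team use, a remote backend is worth configuring from the start. A sketch using an Azure Storage backend (the resource group, storage account, and container names below are placeholders you would create beforehand):

```hcl
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"        # placeholder
    storage_account_name = "tfstatestorage123" # placeholder; must be globally unique
    container_name       = "tfstate"
    key                  = "aks/terraform.tfstate"
  }
}
```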
Variables
Next, create a variables.tf file to define the inputs:
variable "resource_group_name" {
  description = "Name of the resource group"
  type        = string
}

variable "location" {
  description = "Azure region for resources"
  type        = string
  default     = "East US"
}

variable "cluster_name" {
  description = "Name of the AKS cluster"
  type        = string
}

variable "kubernetes_version" {
  description = "Kubernetes version (check what your region offers with 'az aks get-versions --location <region>')"
  type        = string
  default     = "1.27.3"
}

variable "node_count" {
  description = "Number of nodes in the default node pool"
  type        = number
  default     = 2
}

variable "vm_size" {
  description = "VM size for the nodes"
  type        = string
  default     = "Standard_DS2_v2"
}

variable "tags" {
  description = "Tags to apply to resources"
  type        = map(string)
  default     = {}
}
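A matching terraform.tfvars might look like this (the values are illustrative; only variables without defaults are required):

```hcl
resource_group_name = "my-aks-resource-group"
cluster_name        = "my-aks-cluster"
location            = "East US"
node_count          = 2

tags = {
  environment = "dev"
  project     = "aks-demo"
}
```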
Deploying a Basic AKS Cluster
Now, let's create a basic AKS cluster in our main.tf:
# Create a resource group
resource "azurerm_resource_group" "aks_rg" {
  name     = var.resource_group_name
  location = var.location
  tags     = var.tags
}

# Create AKS cluster
resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster_name
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  dns_prefix          = var.cluster_name
  kubernetes_version  = var.kubernetes_version

  default_node_pool {
    name            = "default"
    node_count      = var.node_count
    vm_size         = var.vm_size
    os_disk_size_gb = 30
  }

  identity {
    type = "SystemAssigned"
  }

  tags = var.tags
}
Outputs
Next, let's define the outputs we want in outputs.tf:
output "kube_config" {
  value     = azurerm_kubernetes_cluster.aks.kube_config_raw
  sensitive = true
}

output "cluster_name" {
  value = azurerm_kubernetes_cluster.aks.name
}

output "resource_group_name" {
  value = azurerm_resource_group.aks_rg.name
}

output "host" {
  value     = azurerm_kubernetes_cluster.aks.kube_config[0].host
  sensitive = true
}
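If you also want Terraform to manage resources inside the cluster, these same attributes can feed a kubernetes provider directly. A sketch (assumes you have added the hashicorp/kubernetes provider to required_providers):

```hcl
provider "kubernetes" {
  host                   = azurerm_kubernetes_cluster.aks.kube_config[0].host
  client_certificate     = base64decode(azurerm_kubernetes_cluster.aks.kube_config[0].client_certificate)
  client_key             = base64decode(azurerm_kubernetes_cluster.aks.kube_config[0].client_key)
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.aks.kube_config[0].cluster_ca_certificate)
}
```

This keeps the kubeconfig out of local files entirely, since the credentials flow from the cluster resource to the provider within the same configuration.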
Advanced Configuration
Let's enhance our AKS cluster with more advanced features:
Multiple Node Pools
Node pools allow you to have different types of VMs in your cluster. Let's add a second node pool:
# Add a second node pool
resource "azurerm_kubernetes_cluster_node_pool" "cpu_intensive" {
  name                  = "cpuintensive"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.aks.id
  vm_size               = "Standard_D4s_v3"
  node_count            = 1

  # Labels to be applied to the nodes
  node_labels = {
    "workload_type" = "cpu-intensive"
  }

  # Taints to control pod scheduling
  node_taints = ["dedicated=cpu-intensive:NoSchedule"]

  tags = var.tags
}
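Pods will only land on the tainted pool if they tolerate the taint and, ideally, also select the label. A sketch using the kubernetes provider (assumes that provider is configured against this cluster; the deployment name and image are placeholders):

```hcl
resource "kubernetes_deployment" "batch_worker" {
  metadata {
    name = "batch-worker" # placeholder name
  }

  spec {
    replicas = 1

    selector {
      match_labels = { app = "batch-worker" }
    }

    template {
      metadata {
        labels = { app = "batch-worker" }
      }

      spec {
        # Steer the pod onto the labeled pool...
        node_selector = { workload_type = "cpu-intensive" }

        # ...and allow it past the taint
        toleration {
          key      = "dedicated"
          operator = "Equal"
          value    = "cpu-intensive"
          effect   = "NoSchedule"
        }

        container {
          name  = "worker"
          image = "busybox:1.36" # placeholder image
          args  = ["sleep", "3600"]
        }
      }
    }
  }
}
```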
Network Configuration
For production environments, you should customize the networking:
# Create a virtual network
resource "azurerm_virtual_network" "aks_vnet" {
  name                = "${var.cluster_name}-vnet"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  address_space       = ["10.0.0.0/16"]
  tags                = var.tags
}

# Create a subnet for the AKS cluster
resource "azurerm_subnet" "aks_subnet" {
  name                 = "${var.cluster_name}-subnet"
  resource_group_name  = azurerm_resource_group.aks_rg.name
  virtual_network_name = azurerm_virtual_network.aks_vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}
# Add the network configuration to the existing azurerm_kubernetes_cluster
# resource (do not declare the resource a second time)
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  network_profile {
    network_plugin    = "azure"
    dns_service_ip    = "10.0.2.10"     # must fall within service_cidr
    service_cidr      = "10.0.2.0/24"   # must not overlap any subnet in the VNet
    load_balancer_sku = "standard"
    # Note: docker_bridge_cidr is deprecated in azurerm 3.x and removed in 4.0
  }

  default_node_pool {
    # ... existing configuration ...
    vnet_subnet_id = azurerm_subnet.aks_subnet.id
  }
}
RBAC and Azure AD Integration
For improved security, let's integrate with Azure Active Directory:
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  role_based_access_control_enabled = true

  azure_active_directory_role_based_access_control {
    managed                = true
    admin_group_object_ids = ["xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"] # Replace with your AAD admin group object ID
  }
}
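Rather than hard-coding the group's object ID, you can look it up by display name with the hashicorp/azuread provider. A sketch (the group name is a placeholder, and the azuread provider must be added to required_providers):

```hcl
data "azuread_group" "aks_admins" {
  display_name     = "AKS-Admins" # placeholder group name
  security_enabled = true
}

resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  azure_active_directory_role_based_access_control {
    managed                = true
    admin_group_object_ids = [data.azuread_group.aks_admins.object_id]
  }
}
```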
Deploying and Managing Your AKS Cluster
Let's walk through the steps to deploy and manage your AKS cluster:
Deployment Workflow
- Initialize your Terraform working directory:
terraform init
- Review the execution plan:
terraform plan -out=tfplan
- Apply the configuration:
terraform apply tfplan
On success, the output ends with a summary similar to the following (the exact resource count depends on how much of the configuration you've added):
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Outputs:
cluster_name = "my-aks-cluster"
resource_group_name = "my-aks-resource-group"
- Configure kubectl to access your cluster:
az aks get-credentials --resource-group my-aks-resource-group --name my-aks-cluster
- Verify the connection:
kubectl get nodes
You should see output similar to:
NAME                                   STATUS   ROLES   AGE   VERSION
aks-default-26668310-vmss000000        Ready    agent   5m    v1.27.3
aks-default-26668310-vmss000001        Ready    agent   5m    v1.27.3
aks-cpuintensive-12345678-vmss000000   Ready    agent   4m    v1.27.3
Infrastructure Lifecycle Management
Updating the Cluster
To update your cluster (e.g., change node count or Kubernetes version):
- Update the variables or configuration in your Terraform files
- Run the plan and apply commands:
terraform plan -out=tfplan
terraform apply tfplan
Destroying the Cluster
When you're done, you can destroy the resources:
terraform destroy
Production Best Practices
When deploying AKS for production, consider these best practices:
- High Availability: Spread the nodes in each pool across availability zones:
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  default_node_pool {
    # ... existing configuration ...
    zones = ["1", "2", "3"] # "availability_zones" was renamed to "zones" in azurerm 3.x
  }
}
- Monitoring: Enable monitoring with Azure Monitor:
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  oms_agent {
    log_analytics_workspace_id = azurerm_log_analytics_workspace.aks.id
  }
}

resource "azurerm_log_analytics_workspace" "aks" {
  name                = "${var.cluster_name}-logs"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
  tags                = var.tags
}
- Auto-scaling: Configure cluster autoscaler:
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  default_node_pool {
    # ... existing configuration ...
    enable_auto_scaling = true
    min_count           = 1
    max_count           = 5
  }
}
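With the autoscaler enabled, the live node count will drift away from the node_count in your configuration, and subsequent plans may try to reset it. A common workaround is to tell Terraform to ignore that attribute:

```hcl
resource "azurerm_kubernetes_cluster" "aks" {
  # ... existing configuration ...

  lifecycle {
    ignore_changes = [default_node_pool[0].node_count]
  }
}
```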
Complete Example
Here's a complete example incorporating all the features we've discussed:
# providers.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}
# variables.tf
# ... (as defined earlier) ...
# main.tf
resource "azurerm_resource_group" "aks_rg" {
  name     = var.resource_group_name
  location = var.location
  tags     = var.tags
}

resource "azurerm_virtual_network" "aks_vnet" {
  name                = "${var.cluster_name}-vnet"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  address_space       = ["10.0.0.0/16"]
  tags                = var.tags
}

resource "azurerm_subnet" "aks_subnet" {
  name                 = "${var.cluster_name}-subnet"
  resource_group_name  = azurerm_resource_group.aks_rg.name
  virtual_network_name = azurerm_virtual_network.aks_vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_log_analytics_workspace" "aks" {
  name                = "${var.cluster_name}-logs"
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
  tags                = var.tags
}
resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster_name
  location            = azurerm_resource_group.aks_rg.location
  resource_group_name = azurerm_resource_group.aks_rg.name
  dns_prefix          = var.cluster_name
  kubernetes_version  = var.kubernetes_version

  default_node_pool {
    name                = "default"
    node_count          = var.node_count
    vm_size             = var.vm_size
    os_disk_size_gb     = 30
    vnet_subnet_id      = azurerm_subnet.aks_subnet.id
    zones               = ["1", "2", "3"]
    enable_auto_scaling = true
    min_count           = 1
    max_count           = 5
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin    = "azure"
    dns_service_ip    = "10.0.2.10"
    service_cidr      = "10.0.2.0/24"
    load_balancer_sku = "standard"
  }

  role_based_access_control_enabled = true

  azure_active_directory_role_based_access_control {
    managed            = true
    azure_rbac_enabled = true
  }

  oms_agent {
    log_analytics_workspace_id = azurerm_log_analytics_workspace.aks.id
  }

  tags = var.tags
}
resource "azurerm_kubernetes_cluster_node_pool" "cpu_intensive" {
  name                  = "cpupool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.aks.id
  vm_size               = "Standard_D4s_v3"
  node_count            = 1
  zones                 = ["1", "2", "3"]
  enable_auto_scaling   = true
  min_count             = 1
  max_count             = 3
  vnet_subnet_id        = azurerm_subnet.aks_subnet.id

  node_labels = {
    "workload_type" = "cpu-intensive"
  }

  node_taints = ["dedicated=cpu-intensive:NoSchedule"]

  tags = var.tags
}
# outputs.tf
# ... (as defined earlier) ...
Example Usage Scenario
Let's work through a realistic scenario: deploying a microservices application on our AKS cluster.
- Create