Kubernetes A/B Testing
Introduction
A/B testing (also known as split testing) is a methodology for comparing two versions of an application or webpage against each other to determine which one performs better. In the context of Kubernetes, A/B testing allows you to route different portions of traffic to different versions of your application, enabling you to gather real user feedback while minimizing risk.
Unlike canary deployments, which typically route a small percentage of traffic to a new version and gradually increase it, A/B testing focuses on comparing specific variations to make data-driven decisions about features, UI changes, or performance improvements.
Prerequisites
Before diving into A/B testing with Kubernetes, you should have:
- A working Kubernetes cluster
- The kubectl command-line tool configured
- Basic understanding of Kubernetes Deployments and Services
- Familiarity with Kubernetes resource YAML files
How A/B Testing Works in Kubernetes
In Kubernetes, A/B testing is typically implemented through:
- Multiple Deployments: Different versions of your application running simultaneously
- Service with Selectors: A Service that routes traffic to both versions
- Traffic Splitting: Rules that determine how traffic is distributed
The basic architecture looks like this: a single Service (or service-mesh routing rule) sits in front of the two Deployments, one per version, and splits incoming requests between the version A and version B Pods.
Implementing A/B Testing in Kubernetes
There are several ways to implement A/B testing in Kubernetes. We'll explore the most common approaches:
1. Using Service with Labels and Selectors
This is the simplest approach, using Kubernetes' built-in features.
Step 1: Create two deployments for different versions
# version-a.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-v1
  labels:
    app: myapp
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: v1
  template:
    metadata:
      labels:
        app: myapp
        version: v1
    spec:
      containers:
      - name: myapp
        image: myapp:v1
        ports:
        - containerPort: 8080
---
# version-b.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-v2
  labels:
    app: myapp
    version: v2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: v2
  template:
    metadata:
      labels:
        app: myapp
        version: v2
    spec:
      containers:
      - name: myapp
        image: myapp:v2
        ports:
        - containerPort: 8080
Step 2: Create a service that targets both deployments
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp  # Only selecting on app, not version
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
This Service routes traffic to all Pods carrying the label app: myapp, which includes both versions.
Step 3: Deploy and test
kubectl apply -f version-a.yaml
kubectl apply -f version-b.yaml
kubectl apply -f service.yaml
With this setup, traffic is distributed across the two versions roughly in proportion to the number of Pods behind the Service, so with three replicas each the split is close to 50/50. To adjust the ratio, change the replica count of each Deployment, as shown below.
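For example, a rough 75/25 split can be approximated by scaling the two Deployments, and you can confirm that the Service is backed by Pods from both versions:

# Approximate a 75/25 split by replica count (9 v1 Pods vs 3 v2 Pods)
kubectl scale deployment myapp-v1 --replicas=9
kubectl scale deployment myapp-v2 --replicas=3

# Verify that the Service endpoints include Pods from both Deployments
kubectl get endpoints myapp-service
kubectl get pods -l app=myapp --show-labels

Keep in mind that kube-proxy balances traffic per connection, so the observed split only approximates the replica ratio.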
2. Using Istio for Advanced Traffic Splitting
For more precise control over traffic distribution, Istio (a service mesh) provides powerful features.
Step 1: Install Istio on your cluster
Follow the Istio installation guide.
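A minimal sketch of that installation, assuming istioctl is available locally and your app runs in the default namespace:

# Install Istio with the demo profile (suitable for evaluation, not production)
istioctl install --set profile=demo -y

# Enable automatic Envoy sidecar injection for the application namespace
kubectl label namespace default istio-injection=enabled

# Restart existing Pods so they are recreated with the sidecar
kubectl rollout restart deployment myapp-v1 myapp-v2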
Step 2: Create deployments as before
Create the same deployments as in the previous example.
Step 3: Create an Istio VirtualService for traffic splitting
# virtual-service.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp-virtual-service
spec:
  hosts:
  - myapp-service
  http:
  - route:
    - destination:
        host: myapp-service
        subset: v1
      weight: 75
    - destination:
        host: myapp-service
        subset: v2
      weight: 25
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: myapp-destination-rule
spec:
  host: myapp-service
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
This configuration routes 75% of traffic to version 1 and 25% to version 2.
Step 4: Apply the configuration
kubectl apply -f virtual-service.yaml
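After applying it, you can confirm the routing resources exist and let Istio validate the configuration:

# Confirm the routing resources were created
kubectl get virtualservice myapp-virtual-service
kubectl get destinationrule myapp-destination-rule

# Ask Istio to check the namespace for configuration problems
istioctl analyze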
3. Using Argo Rollouts
Argo Rollouts provides Kubernetes-native deployment capabilities with progressive delivery features.
Step 1: Install Argo Rollouts
kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
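Optionally, install the kubectl-argo-rollouts plugin, which makes it easier to watch and control rollouts from the command line (the Linux amd64 binary is shown; pick the release asset that matches your platform):

# Download and install the kubectl-argo-rollouts plugin (Linux amd64 shown)
curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x kubectl-argo-rollouts-linux-amd64
sudo mv kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts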
Step 2: Create a Rollout resource
# rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: myapp-rollout
spec:
  replicas: 6
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myapp:v1
        ports:
        - containerPort: 8080
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {duration: 1h}
      - setWeight: 40
      - pause: {duration: 1h}
      - setWeight: 60
      - pause: {duration: 1h}
      - setWeight: 80
      - pause: {duration: 1h}
This configuration gradually shifts traffic from the old version to the new version over time, allowing for comparison.
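If you installed the kubectl plugin, you can watch the rollout step through its weights and then promote or abort it based on what the metrics show:

# Watch the rollout progress through its canary steps
kubectl argo rollouts get rollout myapp-rollout --watch

# Skip the remaining pauses and promote the new version
kubectl argo rollouts promote myapp-rollout

# Roll back to the stable version if the new one underperforms
kubectl argo rollouts abort myapp-rollout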
Collecting and Analyzing Metrics
To make informed decisions from your A/B test, you need to collect relevant metrics:
1. Add a metrics collection solution
Popular options include:
- Prometheus for gathering performance metrics
- Fluentd for logging
- Jaeger for tracing
2. Deploy Prometheus and Grafana
# prometheus.yaml (simplified)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v2.30.3
        ports:
        - containerPort: 9090
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  selector:
    app: prometheus
  ports:
  - port: 9090
  type: ClusterIP
3. Configure services to expose metrics
Add Prometheus scrape annotations to the Pod template of each Deployment so the metrics endpoints can be discovered:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
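These annotations only have an effect if your Prometheus scrape configuration honors them. A common pattern, sketched below rather than a complete prometheus.yml, is a kubernetes_sd_configs job that keeps annotated Pods and copies each Pod's version label onto the scraped metrics so the two variants can be compared:

# prometheus.yml scrape config snippet (sketch) that honors the Pod annotations
scrape_configs:
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Only scrape Pods annotated with prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
  # Use the annotated metrics path if one is set
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  # Carry the Pod's version label into the metrics for per-version comparison
  - source_labels: [__meta_kubernetes_pod_label_version]
    action: replace
    target_label: version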
4. Create custom dashboards to compare versions
For Grafana, create dashboards that show key metrics for each version side by side (example queries are sketched after this list):
- Response time
- Error rate
- Conversion rate
- User engagement
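Assuming your application exports request counters and latency histograms labeled by version (the metric names below are illustrative), the panels can be driven by PromQL queries such as:

# Error rate per version
sum(rate(http_requests_total{status=~"5.."}[5m])) by (version)
  / sum(rate(http_requests_total[5m])) by (version)

# 95th percentile response time per version
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, version))

# Request throughput per version
sum(rate(http_requests_total[5m])) by (version)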
Real-world Example: A/B Testing a New User Interface
Let's walk through a complete example of A/B testing a new user interface for a web application.
Scenario
Your team has developed a new UI for your web application and wants to test it with real users to see if it increases the conversion rate.
Step 1: Prepare two versions of the application
# original-ui.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-original
  labels:
    app: webapp
    version: original
spec:
  replicas: 4
  selector:
    matchLabels:
      app: webapp
      version: original
  template:
    metadata:
      labels:
        app: webapp
        version: original
    spec:
      containers:
      - name: webapp
        image: mycompany/webapp:original
        ports:
        - containerPort: 80
---
# new-ui.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp-new
  labels:
    app: webapp
    version: new
spec:
  replicas: 4
  selector:
    matchLabels:
      app: webapp
      version: new
  template:
    metadata:
      labels:
        app: webapp
        version: new
    spec:
      containers:
      - name: webapp
        image: mycompany/webapp:new-ui
        ports:
        - containerPort: 80
Step 2: Create a service for traffic distribution
# webapp-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: webapp-service
spec:
  selector:
    app: webapp
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
Step 3: Implement traffic splitting with Istio
# webapp-split.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: webapp-virtual-service
spec:
  hosts:
  - webapp-service
  http:
  - route:
    - destination:
        host: webapp-service
        subset: original
      weight: 50
    - destination:
        host: webapp-service
        subset: new
      weight: 50
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: webapp-destination-rule
spec:
  host: webapp-service
  subsets:
  - name: original
    labels:
      version: original
  - name: new
    labels:
      version: new
Step 4: Deploy analytics tools and tracking
# analytics.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: analytics-collector
spec:
  replicas: 1
  selector:
    matchLabels:
      app: analytics
  template:
    metadata:
      labels:
        app: analytics
    spec:
      containers:
      - name: analytics
        image: mycompany/analytics:1.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: analytics-service
spec:
  selector:
    app: analytics
  ports:
  - port: 80
    targetPort: 8080
  type: ClusterIP
Step 5: Configure client-side code to report metrics
// Include in your application frontend
function reportConversion(version) {
  fetch('/api/analytics/conversion', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      version: version,
      timestamp: new Date().toISOString(),
      userAgent: navigator.userAgent,
      // Additional data
    }),
  });
}

// Call when user completes desired action
document.getElementById('signup-button').addEventListener('click', () => {
  reportConversion(document.body.getAttribute('data-version'));
});
Step 6: Run the test for a statistically significant period
Typically, A/B tests should run for at least 1-2 weeks to account for daily and weekly fluctuations.
Step 7: Analyze results and make decisions
After collecting sufficient data, analyze the metrics to determine which version performed better. Based on the results, you can:
- Roll out the winning version to all users
- Refine further and run additional tests
- Revert to the original if the new version doesn't show improvements
Best Practices for Kubernetes A/B Testing
1. Use consistent metrics
Define clear KPIs (Key Performance Indicators) before starting the test:
- Conversion rates
- Engagement metrics
- Performance metrics (load time, response time)
- Error rates
2. Ensure proper randomization
Users should be randomly assigned to either version A or B to avoid bias.
3. Run tests for sufficient time
Short test periods may yield unreliable results due to:
- Daily traffic patterns
- Weekly usage patterns
- External factors (marketing campaigns, etc.)
4. Control for external variables
Try to isolate the impact of your changes by:
- Running tests during a stable period
- Avoiding major holidays or special events
- Not making other significant changes during the test
5. Use feature flags for client-side A/B testing
For frontend changes, consider using feature flags in addition to routing:
// Example of a feature flag implementation
if (userFeatureFlags.newUI === true) {
  // Show new UI components
} else {
  // Show original UI components
}
6. Set up proper monitoring and alerting
Ensure you can detect if either version experiences issues during the test:
# alert-rule.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ab-test-alerts
spec:
  groups:
  - name: ab-testing
    rules:
    - alert: VersionErrorRateHigh
      expr: sum(rate(http_requests_total{status=~"5.."}[5m])) by (version) / sum(rate(http_requests_total[5m])) by (version) > 0.05
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: "High error rate detected in version {{ $labels.version }}"
        description: "Error rate for version {{ $labels.version }} is above 5% for the last 5 minutes"
Common Challenges and Solutions
Challenge 1: Session persistence
Problem: Users might be confused if they see different versions on different visits.
Solution: Use session-based routing or cookies to maintain version consistency:
# istio-session-affinity.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: webapp-virtual-service
spec:
  hosts:
  - webapp-service
  http:
  - match:
    - headers:
        cookie:
          regex: ".*version=original.*"
    route:
    - destination:
        host: webapp-service
        subset: original
  - match:
    - headers:
        cookie:
          regex: ".*version=new.*"
    route:
    - destination:
        host: webapp-service
        subset: new
  - route:
    - destination:
        host: webapp-service
        subset: original
      weight: 50
    - destination:
        host: webapp-service
        subset: new
      weight: 50
Challenge 2: Statistical significance
Problem: Determining when you have enough data to make a decision.
Solution: Use statistical analysis tools and ensure sample sizes are large enough:
# Example statistical analysis
# Note: proportions_ztest lives in statsmodels, not scipy
from statsmodels.stats.proportion import proportions_ztest

# Conversion data
version_a_conversions = 120
version_a_total = 1000
version_b_conversions = 150
version_b_total = 1000

# Calculate conversion rates
rate_a = version_a_conversions / version_a_total
rate_b = version_b_conversions / version_b_total

# Perform a two-proportion z-test
z_score, p_value = proportions_ztest(
    [version_a_conversions, version_b_conversions],
    [version_a_total, version_b_total]
)

print(f"Conversion rate A: {rate_a*100:.2f}%")
print(f"Conversion rate B: {rate_b*100:.2f}%")
print(f"p-value: {p_value:.4f}")

if p_value < 0.05:
    print("Result is statistically significant")
else:
    print("Result is not statistically significant yet")
Challenge 3: Complex deployments
Problem: Managing multiple versions can become complex.
Solution: Use GitOps tools like Argo CD or Flux to manage deployment configurations:
# argocd-application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ab-test-webapp
spec:
  project: default
  source:
    repoURL: https://github.com/mycompany/webapp
    targetRevision: HEAD
    path: k8s/ab-test
  destination:
    server: https://kubernetes.default.svc
    namespace: webapp
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
Combining A/B Testing with Other Deployment Strategies
A/B testing can be combined with other advanced deployment strategies for more sophisticated testing:
A/B Testing + Canary Deployments
- Use canary deployments to gradually roll out a new version to a small percentage of users
- Once stable, use A/B testing to compare specific metrics between versions
# combined-strategy.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: webapp-rollout
spec:
  replicas: 10
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: mycompany/webapp:v2
  strategy:
    canary:
      steps:
      - setWeight: 10
      - pause: {duration: 1h}
      - setWeight: 50            # A/B testing phase
      - pause: {duration: 72h}   # Collect metrics for comparison
      - setWeight: 100           # Full rollout if successful
A/B Testing + Feature Flags
Use Kubernetes for routing traffic at the service level, and feature flags for more granular testing:
# config-map.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
data:
  flags.json: |
    {
      "newCheckout": {
        "enabled": true,
        "audience": {
          "percentage": 50,
          "include": ["beta-users"]
        }
      },
      "newNavbar": {
        "enabled": true,
        "audience": {
          "percentage": 25
        }
      }
    }
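For the flags to reach the application, the ConfigMap must be made available to the Pods. A minimal sketch mounts it as a file that the webapp reads at startup (the mount path is an assumption):

# Excerpt from the webapp Deployment: mount the feature-flags ConfigMap as a file
spec:
  containers:
  - name: webapp
    image: mycompany/webapp:v2
    volumeMounts:
    - name: feature-flags
      mountPath: /etc/config   # app reads /etc/config/flags.json (assumed path)
      readOnly: true
  volumes:
  - name: feature-flags
    configMap:
      name: feature-flags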
Summary
A/B testing in Kubernetes provides a powerful way to compare different versions of your application with real users. By leveraging Kubernetes' routing capabilities along with more advanced tools like Istio, you can implement sophisticated testing strategies to make data-driven decisions.
Key points to remember:
- Kubernetes native resources can provide basic A/B testing capabilities
- Service meshes like Istio offer more precise traffic control
- Proper metrics collection and analysis are essential
- Consider user experience during the test
- Run tests for statistically significant periods
- Combine with other deployment strategies for more sophisticated workflows
Exercises
- Set up a basic A/B test using only Kubernetes native resources (Deployments and Services).
- Implement an A/B test using Istio, with a 70/30 traffic split.
- Create a monitoring dashboard that compares key metrics between two versions.
- Write a script that determines if your A/B test results are statistically significant.
- Design an A/B testing strategy for a mobile application backend that preserves session affinity.