
Kong Monitoring Dashboard

Introduction

Monitoring is a critical aspect of managing API gateways in production environments. Kong, a popular open-source API gateway, generates valuable metrics that can help you understand its performance, identify bottlenecks, and troubleshoot issues. In this tutorial, we'll explore how to set up a comprehensive monitoring dashboard for Kong API Gateway, allowing you to visualize important metrics and ensure your API infrastructure is running smoothly.

A monitoring dashboard provides a centralized view of your Kong instance's health and performance, making it easier to:

  • Track API request rates and response times
  • Monitor error rates and status codes
  • Identify performance bottlenecks
  • Set up alerts for potential issues
  • Make data-driven decisions about scaling and optimization

Prerequisites

Before setting up your Kong monitoring dashboard, ensure you have the following:

  • A running Kong instance (either Community or Enterprise edition)
  • Basic familiarity with Kong configuration
  • Docker and Docker Compose (for the example setup)
  • Basic understanding of monitoring concepts

Monitoring Architecture

Kong can be monitored using various tools, but a common setup involves:

  1. Prometheus: To collect and store metrics from Kong
  2. Grafana: To visualize the metrics in customizable dashboards

The pipeline is simple: Kong exposes metrics through its Prometheus plugin, Prometheus scrapes and stores them on a schedule, and Grafana queries Prometheus to render the dashboards.

Setting Up Prometheus for Kong

Step 1: Enable the Prometheus Plugin in Kong

First, we need to enable the Prometheus plugin in Kong to expose metrics. You can do this using Kong's admin API:

bash
curl -X POST http://localhost:8001/plugins/ \
  --data "name=prometheus" \
  --data "config.status_code_metrics=true" \
  --data "config.latency_metrics=true" \
  --data "config.upstream_health_metrics=true" \
  --data "config.bandwidth_metrics=true"

This command enables the Prometheus plugin with several metric types:

  • Status code metrics (counts of different HTTP response codes)
  • Latency metrics (request and response processing times)
  • Upstream health metrics (health of backend services)
  • Bandwidth metrics (request and response sizes)
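Once the plugin is enabled, Kong serves these metrics in Prometheus's plain-text exposition format at the /metrics endpoint. A minimal sketch of what that output looks like and how it can be parsed (the sample lines below are illustrative; real metric names and labels vary by Kong version):

```python
# Minimal sketch: parse a few lines of Prometheus exposition-format text,
# the format Kong's /metrics endpoint returns. The sample is illustrative.
import re

SAMPLE = """\
# HELP kong_http_requests_total HTTP requests per service/route in Kong
# TYPE kong_http_requests_total counter
kong_http_requests_total{service="product-catalog",code="200"} 1200
kong_http_requests_total{service="product-catalog",code="500"} 12
"""

LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+([0-9.eE+-]+)$')

def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue
        m = LINE_RE.match(line)
        if not m:
            continue
        name, raw_labels, value = m.groups()
        labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))
        samples.append((name, labels, float(value)))
    return samples

samples = parse_metrics(SAMPLE)
print(sum(v for _, _, v in samples))  # 1212.0
```

In practice Prometheus does this parsing for you; the sketch is only meant to demystify what travels over the wire when Prometheus scrapes Kong.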

Step 2: Set Up Prometheus

Create a prometheus.yml configuration file:

yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'kong'
    metrics_path: /metrics
    static_configs:
      - targets: ['kong:8001']

This configuration tells Prometheus to scrape Kong's /metrics endpoint on the admin API (port 8001) every 15 seconds. Note that the kong:8001 target only resolves if the Kong container is reachable on the same Docker network as Prometheus; adjust the target to match your environment.

Setting Up Grafana

Step 1: Create a Docker Compose File

Create a docker-compose.yml file for setting up Prometheus and Grafana:

yaml
version: '3'

services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - monitoring

  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-storage:/var/lib/grafana
    networks:
      - monitoring
    depends_on:
      - prometheus

networks:
  monitoring:

volumes:
  grafana-storage:

Step 2: Start the Monitoring Stack

Run the following command to start Prometheus and Grafana:

bash
docker-compose up -d

Step 3: Configure Grafana

  1. Access Grafana at http://localhost:3000 (default credentials: admin/admin)
  2. Add Prometheus as a data source:
    • Go to Configuration > Data Sources > Add data source
    • Select Prometheus
    • Set the URL to http://prometheus:9090
    • Click "Save & Test"
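If you prefer configuration files over clicking through the UI, Grafana can also provision the data source at startup. A sketch of a provisioning file (the filename and mount path are assumptions; Grafana reads files placed under /etc/grafana/provisioning/datasources/ inside the container):

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```

You could mount it from the docker-compose volumes section, e.g. ./datasource.yml:/etc/grafana/provisioning/datasources/datasource.yml.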

Creating a Kong Dashboard in Grafana

Now that we have our monitoring stack set up, let's create a comprehensive dashboard for Kong metrics.

Step 1: Import a Pre-built Dashboard

Grafana has a marketplace with pre-built dashboards. You can import a Kong dashboard:

  1. Go to Dashboards > Import
  2. Enter dashboard ID 7424 (a popular Kong dashboard)
  3. Select your Prometheus data source
  4. Click "Import"

Alternatively, you can create your own custom dashboard with the following panels. Note that exact metric names depend on your Kong version (for example, older releases expose kong_http_status, while Kong 3.x exposes kong_http_requests_total with a code label), so check your /metrics output and adjust the queries accordingly:

Step 2: Create Custom Dashboard Panels

Request Rate Panel

json
{
  "title": "Request Rate",
  "type": "graph",
  "datasource": "Prometheus",
  "targets": [
    {
      "expr": "sum(rate(kong_http_requests_total[1m]))",
      "legendFormat": "Requests/sec"
    }
  ]
}

Latency Panel

json
{
  "title": "Request Latency",
  "type": "graph",
  "datasource": "Prometheus",
  "targets": [
    {
      "expr": "histogram_quantile(0.95, sum(rate(kong_latency_bucket[1m])) by (le))",
      "legendFormat": "P95 Latency"
    },
    {
      "expr": "histogram_quantile(0.50, sum(rate(kong_latency_bucket[1m])) by (le))",
      "legendFormat": "P50 Latency"
    }
  ]
}

HTTP Status Codes Panel

json
{
  "title": "HTTP Status Codes",
  "type": "graph",
  "datasource": "Prometheus",
  "targets": [
    {
      "expr": "sum(rate(kong_http_status{code=~\"2..\"}[1m]))",
      "legendFormat": "2xx"
    },
    {
      "expr": "sum(rate(kong_http_status{code=~\"4..\"}[1m]))",
      "legendFormat": "4xx"
    },
    {
      "expr": "sum(rate(kong_http_status{code=~\"5..\"}[1m]))",
      "legendFormat": "5xx"
    }
  ]
}

Key Metrics to Monitor

When monitoring Kong, pay attention to these key metrics:

1. Request Volume

Track the total number of requests processed by Kong:

sum(rate(kong_http_requests_total[1m]))

This metric helps you understand traffic patterns and plan for capacity.
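The rate() function used here estimates a counter's per-second increase over the given window. A rough sketch of the underlying arithmetic, with made-up sample values (the real function also handles counter resets and extrapolates across the window):

```python
# Sketch of what PromQL's rate() computes: the per-second increase of a
# monotonically increasing counter between two samples. Values are made up.
def per_second_rate(sample_start, sample_end, window_seconds):
    """Per-second rate of a counter over the window."""
    return (sample_end - sample_start) / window_seconds

# kong_http_requests_total read 60 seconds apart
rps = per_second_rate(10_000, 10_600, 60)
print(rps)  # 10.0
```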

2. Error Rates

Monitor error rates by HTTP status code:

sum(rate(kong_http_status{code=~"5.."}[1m])) / sum(rate(kong_http_requests_total[1m]))

A sudden increase in error rates could indicate issues with your backend services.

3. Latency

Track request latency at different percentiles:

histogram_quantile(0.95, sum(rate(kong_latency_bucket[1m])) by (le))

High latency could indicate performance issues that need attention.
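The histogram_quantile() function estimates a percentile by interpolating linearly inside cumulative latency buckets. A simplified Python sketch of that calculation (bucket bounds and counts are made up, and the real implementation also handles counter resets and the +Inf bucket):

```python
# Simplified sketch of PromQL's histogram_quantile(): linear interpolation
# inside cumulative buckets. Bucket bounds/counts below are made up.
def histogram_quantile(q, buckets):
    """buckets: sorted list of (upper_bound, cumulative_count)."""
    total = buckets[-1][1]
    rank = q * total
    lower_bound, count_below = 0.0, 0.0
    for upper_bound, cum in buckets:
        if cum >= rank:
            in_bucket = cum - count_below
            # Interpolate between the bucket's bounds
            return lower_bound + (upper_bound - lower_bound) * (rank - count_below) / in_bucket
        lower_bound, count_below = upper_bound, cum
    return buckets[-1][0]

# Latency buckets in milliseconds, with cumulative request counts
buckets = [(25, 100), (50, 180), (100, 196), (250, 200)]
print(histogram_quantile(0.95, buckets))  # 81.25
```

This is also why histogram percentiles are estimates: the true P95 could be anywhere inside the 50-100 ms bucket, and finer bucket boundaries give tighter estimates.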

4. Upstream Health

If you enabled config.upstream_health_metrics, monitor the health-check state of upstream targets:

kong_upstream_target_health

This gauge reports each target's health-check state, helping you spot failing backends before they cause user-facing errors.

Setting Up Alerts

Proactive monitoring involves setting up alerts for critical conditions. Here's how to configure alerts in Grafana:

  1. Create a new alert rule:
    • Go to the panel where you want to add an alert
    • Click Edit > Alert
    • Define the condition (e.g., when error rate > 5% for 5 minutes)
    • Set notification channels (email, Slack, etc.)

Example alert conditions:

  • High error rate: sum(rate(kong_http_status{code=~"5.."}[5m])) / sum(rate(kong_http_requests_total[5m])) > 0.05
  • High latency: histogram_quantile(0.95, sum(rate(kong_latency_bucket[5m])) by (le)) > 500 (Kong's latency buckets are in milliseconds, so this fires above 500 ms)
  • Low request rate: sum(rate(kong_http_requests_total[5m])) < 10
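The "for 5 minutes" clause in an alert rule matters: the condition has to hold continuously across evaluations, not just for a single spike. A rough sketch of that logic, using made-up per-minute error-ratio samples:

```python
# Rough analogue of an alert rule like "error rate > 5% for 5 minutes":
# the condition must hold for the last N consecutive samples.
# Sample values below are made up.
def sustained(values, threshold, points):
    """True if the last `points` samples all exceed `threshold`."""
    return len(values) >= points and all(v > threshold for v in values[-points:])

error_ratios = [0.02, 0.06, 0.07, 0.08, 0.09, 0.06]  # one sample per minute
print(sustained(error_ratios, 0.05, 5))  # True
```

Requiring the condition to be sustained is what keeps a single noisy scrape from paging anyone at 3 a.m.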

Real-world Example: Monitoring an E-commerce API Gateway

Let's consider a practical example of monitoring Kong for an e-commerce platform:

Scenario

An e-commerce site uses Kong to manage traffic to various microservices:

  • Product catalog service
  • Shopping cart service
  • Payment processing service
  • User authentication service

Monitoring Approach

  1. Service-specific dashboards: Create separate dashboard sections for each service to track service-specific metrics.

  2. Business-critical path monitoring: Set up dedicated monitoring for the checkout process, as it's critical for revenue.

  3. Traffic patterns by time: Monitor request patterns during peak shopping hours vs. off-hours.

Implementation Example

Enable the Prometheus plugin on an individual service; Kong automatically labels the exported metrics with the service name, so no extra configuration is needed to tell services apart:

bash
curl -X POST http://localhost:8001/services/product-catalog/plugins \
  --data "name=prometheus" \
  --data "config.status_code_metrics=true" \
  --data "config.latency_metrics=true"

Then, create Grafana queries to filter by service:

sum(rate(kong_http_requests_total{service="product-catalog"}[1m]))
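Such queries can also be run outside Grafana through Prometheus's HTTP API (GET /api/v1/query), which is handy for scripted checks. A sketch that only builds the request URL (the Prometheus address is an assumption; uncomment the last lines to actually execute it against your instance):

```python
# Sketch: issue the same PromQL query via Prometheus's HTTP API.
# Only the URL is built here, so the snippet runs without a live server.
from urllib.parse import urlencode
# from urllib.request import urlopen
# import json

PROMQL = 'sum(rate(kong_http_requests_total{service="product-catalog"}[1m]))'
url = "http://localhost:9090/api/v1/query?" + urlencode({"query": PROMQL})
print(url)
# resp = json.load(urlopen(url))   # executes the query
# print(resp["data"]["result"])    # instant-vector result samples
```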

Troubleshooting Common Issues

Problem: High Latency

Possible causes:

  • Backend service performance issues
  • Network problems
  • Kong configuration issues (e.g., too many plugins)

Diagnosis steps:

  1. Check if latency is specific to certain routes or services
  2. Compare Kong proxy latency vs. upstream latency
  3. Review Kong error logs
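Step 2 of the diagnosis reduces to simple arithmetic: the overhead Kong itself adds is roughly the total request latency minus the upstream (backend) latency. A sketch with made-up numbers:

```python
# Rough diagnosis helper: if upstream latency dominates, the backend is
# slow; if the gap is large, look at Kong itself (plugins, DNS, config).
# Values are made up, in milliseconds.
def kong_overhead_ms(request_latency_ms, upstream_latency_ms):
    return request_latency_ms - upstream_latency_ms

print(kong_overhead_ms(request_latency_ms=480, upstream_latency_ms=450))  # 30
```

Here the backend accounts for almost all of the 480 ms, so tuning Kong would gain little; the fix belongs in the upstream service.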

Problem: High Error Rates

Possible causes:

  • Backend service failures
  • Rate limiting kicking in
  • Authentication issues

Diagnosis steps:

  1. Check specific HTTP status codes (401, 429, 502, etc.)
  2. Look for patterns in error distribution across services
  3. Correlate with any recent changes

Best Practices for Kong Monitoring

  1. Monitor at multiple levels:

    • Kong gateway metrics
    • Underlying infrastructure (CPU, memory, network)
    • Backend services health
  2. Set appropriate retention periods:

    • High-resolution metrics: 24 hours
    • Aggregated metrics: 30-90 days
  3. Create role-specific dashboards:

    • Operations team: Focus on health and performance
    • Development team: Focus on service-specific metrics
    • Management: Focus on SLA compliance and trends
  4. Automate responses to common issues:

    • Set up auto-scaling based on request volume
    • Configure automatic service failover
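The first automation point can be sketched as a simple sizing rule (the per-node capacity and minimum node count below are illustrative assumptions, not Kong defaults):

```python
# Sketch of a request-volume-based scaling decision. The capacity figure
# and floor are assumptions for illustration; measure your own nodes.
import math

def desired_nodes(requests_per_second, capacity_per_node=500, min_nodes=2):
    """Size the Kong tier so each node stays under its assumed capacity."""
    return max(min_nodes, math.ceil(requests_per_second / capacity_per_node))

print(desired_nodes(1800))  # 4
```

A real autoscaler (e.g. a Kubernetes HPA driven by the Prometheus request-rate metric) applies the same idea, plus smoothing to avoid flapping.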

Advanced Topics

Kong Vitals (Enterprise Feature)

Kong Enterprise includes a built-in monitoring solution called Kong Vitals, which provides:

  • Pre-built dashboards in the Kong Manager UI
  • Health metrics for nodes, services, and routes
  • Anomaly detection capabilities

Distributed Tracing

For more detailed request tracking, integrate Kong with distributed tracing systems:

bash
curl -X POST http://localhost:8001/plugins/ \
  --data "name=zipkin" \
  --data "config.http_endpoint=http://zipkin:9411/api/v2/spans" \
  --data "config.sample_ratio=1"

This allows you to trace requests as they flow through various services.

Summary

In this tutorial, we've learned how to:

  1. Set up Prometheus and Grafana for monitoring Kong API Gateway
  2. Configure the Prometheus plugin in Kong to expose metrics
  3. Create custom dashboards to visualize key performance indicators
  4. Set up alerts for proactive monitoring
  5. Troubleshoot common issues using monitoring data

Effective monitoring is essential for running Kong in production environments. By implementing a comprehensive monitoring dashboard, you can ensure optimal performance, quickly identify issues, and make data-driven decisions about your API infrastructure.

Exercises

  1. Set up a monitoring dashboard for your local Kong instance following the steps in this tutorial.
  2. Create a custom Grafana dashboard panel to monitor the rate of specific HTTP status codes (e.g., 429 Too Many Requests).
  3. Configure alerts for high latency and error rates.
  4. Simulate a backend service failure and observe how it appears in your monitoring dashboard.
  5. Add service-specific monitoring for at least two different Kong services.

