Kong OpenTelemetry

Introduction

OpenTelemetry is an open-source observability framework that helps developers instrument, generate, collect, and export telemetry data (metrics, logs, and traces) to help analyze software performance and behavior. When integrated with Kong API Gateway, OpenTelemetry provides powerful insights into API traffic, performance bottlenecks, and potential issues.

In this guide, we'll explore how to set up and configure Kong API Gateway with OpenTelemetry, allowing you to monitor and troubleshoot your API infrastructure effectively.

What is OpenTelemetry?

OpenTelemetry (often abbreviated as OTel) is a Cloud Native Computing Foundation (CNCF) project that provides a collection of tools, APIs, and SDKs to instrument, generate, collect, and export telemetry data for analysis. The three main types of telemetry data are:

  1. Traces: Records of requests as they flow through your system
  2. Metrics: Numerical measurements collected at regular intervals
  3. Logs: Timestamped records of discrete events

Prerequisites

Before we start, make sure you have:

  • Kong Gateway installed (version 3.0 or later, where the OpenTelemetry plugin is available)
  • Basic understanding of Kong configuration
  • Access to an OpenTelemetry backend (Jaeger, Zipkin, Prometheus, etc.)
  • Kong Admin API access

Installing the Kong OpenTelemetry Plugin

Kong integrates with OpenTelemetry through a plugin. Let's set it up:

1. Enable the OpenTelemetry Plugin

The OpenTelemetry plugin ships bundled with Kong Gateway 3.0 and later, in both the open-source and Enterprise editions, so no separate installation is normally needed. Just make sure it is included in your enabled plugin list:

bash
# Make sure the plugin is in the enabled list (bundled already includes it)
$ export KONG_PLUGINS=bundled,opentelemetry

# In recent 3.x releases, request tracing must also be switched on
$ export KONG_TRACING_INSTRUMENTATIONS=all

# Restart or reload Kong to pick up the change
$ kong reload
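
To verify that the plugin is available on your node, query the Admin API root endpoint, which lists the plugins the server knows about (jq is used here only for readability, and the exact field layout varies slightly between Kong versions):

bash
# Check that opentelemetry shows up among the plugins available on this node
$ curl -s http://localhost:8001/ | jq '.plugins.available_on_server.opentelemetry'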

2. Configure the Plugin

You can enable the plugin globally or for specific services/routes:

bash
# Enable globally. config.headers adds extra headers (such as auth tokens) to
# the exporter's requests; the plugin sets the OTLP content type itself.
$ curl -X POST http://localhost:8001/plugins/ \
--data "name=opentelemetry" \
--data "config.endpoint=http://otel-collector:4318/v1/traces" \
--data "config.headers.X-Auth-Token=secret-token"

Or add it to a specific service:

bash
# Enable for a specific service
$ curl -X POST http://localhost:8001/services/your-service-name/plugins \
--data "name=opentelemetry" \
--data "config.endpoint=http://otel-collector:4318/v1/traces" \
--data "config.headers.X-Auth-Token=secret-token"

Configuration Options

The OpenTelemetry plugin offers several configuration options:

| Parameter | Description | Default |
| --- | --- | --- |
| endpoint | URL of your OpenTelemetry collector | Required |
| headers | Additional HTTP headers to add to the OpenTelemetry exporter requests | {} |
| batch_span_count | Maximum number of spans to batch before sending | 200 |
| batch_flush_delay | Maximum delay in seconds before sending a batch | 3 |
| sampler | Trace sampling strategy (always_on, always_off, traceidratio) | always_on |
| sampling_rate | Trace sampling rate when using the traceidratio sampler | 1.0 |
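
As a sketch of how these options combine (using the parameter names from the table above; exact names can vary between plugin versions), the following enables head sampling at 25% with larger, less frequent batches:

bash
# Sample a quarter of all traces and export them in bigger batches
$ curl -X POST http://localhost:8001/plugins/ \
--data "name=opentelemetry" \
--data "config.endpoint=http://otel-collector:4318/v1/traces" \
--data "config.sampler=traceidratio" \
--data "config.sampling_rate=0.25" \
--data "config.batch_span_count=500" \
--data "config.batch_flush_delay=5"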

Declarative Configuration (kong.yml)

If you prefer declarative configuration, here's how to enable the OpenTelemetry plugin in your kong.yml file:

yaml
_format_version: "2.1"
_transform: true

services:
  - name: example-service
    url: http://example.com
    plugins:
      - name: opentelemetry
        config:
          endpoint: http://otel-collector:4318/v1/traces
          batch_span_count: 100
          batch_flush_delay: 2
          sampler: traceidratio
          sampling_rate: 0.5
          headers:
            X-Auth-Token: secret-token  # extra exporter headers, e.g. for auth
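
You can validate the file and then run Kong in DB-less mode against it (paths here are illustrative):

bash
# Validate the declarative configuration file
$ kong config parse kong.yml

# Run Kong in DB-less mode using this file
$ export KONG_DATABASE=off
$ export KONG_DECLARATIVE_CONFIG=/path/to/kong.yml
$ kong restart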

Collecting and Viewing Telemetry Data

Once the plugin is configured, Kong will automatically generate spans for each request that passes through the gateway. Here's what happens:

  1. A request comes to Kong
  2. Kong processes the request and forwards it to the upstream service
  3. The OpenTelemetry plugin creates spans with details about the request
  4. Spans are batched and sent to your OpenTelemetry collector
  5. The collector processes and forwards the data to your observability backend (a minimal collector configuration is sketched below)
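
For reference, here is a minimal OpenTelemetry Collector configuration that accepts OTLP over HTTP from Kong on port 4318 and forwards traces to Jaeger. It is a sketch assuming a recent Collector release and a Jaeger instance with OTLP ingestion enabled; hostnames are illustrative:

bash
# Write a minimal collector configuration file
$ cat > otel-collector-config.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch: {}

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]
EOF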

Example: Viewing Traces in Jaeger

If you're using Jaeger as your tracing backend, you can access the Jaeger UI (typically at http://localhost:16686) to view and analyze traces:

  1. Select "kong" from the service dropdown
  2. Set your desired time range
  3. Click "Find Traces"
  4. Explore the trace data, including:
    • Request duration
    • HTTP status codes
    • Request paths
    • Error information
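
Jaeger also exposes the HTTP API that backs its UI. It is internal and may change between releases, but it is handy for quick scripted checks; a sketch:

bash
# Fetch the IDs of recent traces for the "kong" service from Jaeger's query API
$ curl -s "http://localhost:16686/api/traces?service=kong&limit=5" | jq '.data[].traceID'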

Understanding Span Data

The Kong OpenTelemetry plugin generates the following span attributes for each request:

  • http.method: The HTTP method used (GET, POST, etc.)
  • http.route: The Kong route that matched the request
  • http.status_code: The HTTP response status code
  • http.url: The full URL of the request
  • kong.consumer: The Kong consumer (if authenticated)
  • kong.service: The Kong service that processed the request
  • error: True if there was an error processing the request

Advanced Configuration: Customizing Spans

For more detailed telemetry, you can attach custom attributes to the spans Kong generates. The sketch below does this from a small custom plugin and assumes Kong's tracing PDK (kong.tracing, available in Kong 3.x):

lua
-- Sketch of a custom plugin that adds business attributes to the active span.
-- Assumes Kong 3.x, where the tracing PDK (kong.tracing) exposes the span
-- created by Kong's tracing instrumentation.
local CustomHandler = {
  PRIORITY = 10,    -- lower than the bundled opentelemetry plugin, so it runs after it
  VERSION = "1.0.0",
}

function CustomHandler:access(conf)
  local span = kong.tracing.active_span()
  if span then
    -- Attach a custom attribute taken from a request header
    local customer_id = kong.request.get_header("X-Customer-ID")
    if customer_id then
      span:set_attribute("business.customer_id", customer_id)
    end
  end
end

return CustomHandler
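
To load such a plugin, package it under a name of your choosing (the name and paths below are hypothetical) and add it to the enabled plugin list:

bash
# Make the plugin's Lua modules visible to Kong (path is illustrative)
$ export KONG_LUA_PACKAGE_PATH="/usr/local/custom/?.lua;;"

# Enable it alongside the bundled plugins ("custom-otel-attrs" is a made-up name)
$ export KONG_PLUGINS=bundled,opentelemetry,custom-otel-attrs
$ kong restart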

Practical Example: Monitoring API Latency

Let's walk through a real-world example of using Kong OpenTelemetry to monitor API latency:

  1. Set up Kong with OpenTelemetry:
bash
$ curl -X POST http://localhost:8001/plugins/ \
--data "name=opentelemetry" \
--data "config.endpoint=http://otel-collector:4318/v1/traces"
  2. Create a test service and route:
bash
$ curl -X POST http://localhost:8001/services/ \
--data "name=example-api" \
--data "url=https://httpbin.org"

$ curl -X POST http://localhost:8001/services/example-api/routes \
--data "paths[]=/test"
  3. Generate some test traffic:
bash
$ for i in {1..10}; do curl -s http://localhost:8000/test/get > /dev/null; done
  4. Analyze the results in your tracing backend:

In your tracing UI (like Jaeger), you'll see spans for each request showing:

  • Total request duration
  • Time spent in each Kong plugin
  • Upstream service latency
  • Any errors encountered

Example trace output from Jaeger:

Trace: 7f0c1de76ad2d481
├─ kong.request [192ms]
│ ├─ kong.rewrite [2ms]
│ ├─ kong.access [15ms]
│ │ └─ opentelemetry.access [5ms]
│ ├─ kong.upstream [170ms]
│ │ └─ httpbin.org request [168ms]
│ └─ kong.response [5ms]

Troubleshooting Common Issues

1. Spans not appearing in your backend

If you're not seeing spans in your observability backend:

  • Check that your OpenTelemetry collector endpoint is correct
  • Verify the collector is running and accessible from Kong
  • Examine Kong's error logs for any connection issues (see the example after this list)
  • Test collector connectivity: curl -v http://your-collector:4318/v1/traces
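
For example, to watch the error log for export failures (the path below is the default for a standard install; adjust it to your environment):

bash
# Follow Kong's error log and filter for OpenTelemetry-related messages
$ tail -f /usr/local/kong/logs/error.log | grep -i opentelemetry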

2. Performance impact

If you notice performance degradation after enabling OpenTelemetry:

  • Increase batch_span_count to reduce export frequency
  • Use sampling to reduce the number of traces (set sampler to traceidratio and sampling_rate to less than 1.0)
  • Ensure your collector has sufficient resources

Best Practices

  1. Start with sampling: In production environments, start with a lower sampling rate and increase as needed.
  2. Focus on critical paths: Apply the plugin to your most important services first.
  3. Use context propagation: Ensure your upstream services also use OpenTelemetry for end-to-end visibility (a propagation example follows this list).
  4. Monitor the collector: Your OpenTelemetry collector should be monitored to avoid becoming a bottleneck.
  5. Regularly review your traces: Use the trace data to identify and resolve performance issues proactively.
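
For instance, recent plugin versions expose a header_type option that selects the propagation header format; treat the exact option name as an assumption to check against your plugin version. W3C trace context is the most interoperable choice:

bash
# Propagate W3C traceparent headers to upstream services
# (config.header_type is assumed here; check your plugin version's schema)
$ curl -X PATCH http://localhost:8001/plugins/{opentelemetry-plugin-id} \
--data "config.header_type=w3c"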

Summary

Kong's OpenTelemetry integration provides powerful observability capabilities for your API infrastructure. By following this guide, you've learned how to:

  • Set up the OpenTelemetry plugin in Kong
  • Configure various sampling and batching options
  • View and analyze trace data
  • Troubleshoot common issues
  • Apply best practices for production deployments

With these tools, you can gain deeper insights into your API performance, quickly identify bottlenecks, and ensure the reliability of your services.

Exercises

  1. Set up Kong with OpenTelemetry and connect it to Jaeger
  2. Create a simple API with multiple services and observe how requests flow through them
  3. Experiment with different sampling rates and observe the impact on trace collection
  4. Implement a custom plugin that adds business-specific attributes to spans
  5. Use trace data to identify and fix a performance bottleneck in your API

