Loki Metrics
Introduction
Loki, like any system, needs to be monitored to ensure it's performing optimally. Metrics provide insights into Loki's health, performance, and resource usage. In this guide, we'll explore the various metrics exposed by Loki, how to collect them, visualize them in Grafana, and set up meaningful alerts to proactively identify issues before they impact your system.
Understanding Loki Metrics
Loki exposes a variety of metrics in Prometheus format, making it easy to integrate with Prometheus for collection and Grafana for visualization. These metrics are crucial for understanding how Loki is performing, identifying bottlenecks, and ensuring the reliability of your log management system.
Types of Loki Metrics
Loki metrics can be broadly categorized into several groups:
- Component Metrics: Specific to individual Loki components (Distributor, Ingester, Querier, etc.)
- Request Metrics: Related to HTTP request processing
- Operational Metrics: Covering memory usage, CPU, goroutines, etc.
- Storage Metrics: Tracking storage operations and performance
- Query Performance Metrics: Measuring query execution times and resource usage
Collecting Loki Metrics
Loki exposes metrics via an HTTP endpoint (by default at /metrics). Here's how to configure Prometheus to scrape these metrics:
scrape_configs:
  - job_name: loki
    static_configs:
      - targets: ['loki:3100']
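Before pointing Prometheus at the endpoint, it can help to confirm that /metrics answers and to see what it returns. The following is a minimal sketch, not an official client: the URL `http://loki:3100/metrics` simply reuses the target from the config above, the function names are made up for illustration, and the parser handles only the common `name{labels} value` form of the Prometheus text format.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition into {series: value}.

    Simplified: skips comment lines and ignores optional trailing timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # HELP / # TYPE comments
        # The series name ends at the closing brace when labels are present,
        # otherwise at the first space.
        if "}" in line:
            end = line.rindex("}") + 1
        else:
            end = line.index(" ")
        series = line[:end]
        value = line[end:].split()[0]
        samples[series] = float(value)
    return samples


def fetch_metrics(url="http://loki:3100/metrics"):
    """Fetch and parse the endpoint. The default URL is an assumption."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics()` from a host that can reach Loki should return a dictionary keyed by series name, which is a quick way to see which metric names your Loki version actually exposes.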
When using Loki in microservices mode, you'll need to scrape metrics from each component:
scrape_configs:
  - job_name: loki-distributor
    static_configs:
      - targets: ['loki-distributor:3100']
  - job_name: loki-ingester
    static_configs:
      - targets: ['loki-ingester:3100']
  - job_name: loki-querier
    static_configs:
      - targets: ['loki-querier:3100']
  # Add more components as needed
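As the number of components grows, the per-component jobs become repetitive. One option is to generate them. The sketch below assumes every component is reachable at `loki-<component>` on the same port, which matches the hostnames used above but may not match your deployment; `scrape_config_yaml` is a hypothetical helper name.

```python
def scrape_config_yaml(components, port=3100):
    """Render a scrape_configs section with one job per Loki component.

    Assumes each component is reachable as loki-<component>:<port>.
    """
    lines = ["scrape_configs:"]
    for name in components:
        lines.append(f"  - job_name: loki-{name}")
        lines.append("    static_configs:")
        lines.append(f"      - targets: ['loki-{name}:{port}']")
    return "\n".join(lines) + "\n"


print(scrape_config_yaml(["distributor", "ingester", "querier"]))
```

In dynamic environments a service discovery mechanism is usually a better fit than static targets; the generator is only a convenience for small, fixed setups.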
Key Loki Metrics to Monitor
Ingestion Metrics
These metrics help you understand how efficiently Loki is ingesting logs:
- loki_distributor_bytes_received_total: Total bytes received per tenant
- loki_distributor_lines_received_total: Total lines received per tenant
- loki_ingester_chunks_created_total: Number of chunks created in the ingester
- loki_ingester_chunks_stored_total: Total chunks stored in the ingester
Example Dashboard Query
sum(rate(loki_distributor_lines_received_total[5m])) by (tenant)
This shows the per-second rate of log lines ingested for each tenant, averaged over a 5-minute window.
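To make the arithmetic behind `rate()` concrete, here is a simplified sketch of how a per-second rate can be derived from raw counter samples. It is an approximation for illustration only: Prometheus's own `rate()` additionally extrapolates to the edges of the window, which this version omits. The function name and sample values are hypothetical.

```python
def simple_rate(samples):
    """Approximate the per-second increase of a counter.

    samples: list of (timestamp_seconds, value) pairs, oldest first.
    A drop in value is treated as a counter reset (for example a restart).
    """
    if len(samples) < 2:
        return None  # not enough data points in the window
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # After a reset the counter starts again from zero, so the whole
        # current value counts as increase.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical scrapes of one tenant's line counter, one per minute.
window = [(0, 100), (60, 400), (120, 700), (180, 1000), (240, 1300)]
lines_per_second = simple_rate(window)
```

With these samples the counter grows by 1200 lines over 240 seconds, so the result is 5 lines per second.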
Query Performance Metrics
These metrics help you understand query performance:
- loki_querier_request_duration_seconds: Time spent processing query requests
- loki_querier_query_seconds: Time spent executing queries
- loki_querier_chunk_fetch_duration_seconds: Time spent fetching chunks
Example Dashboard Query
histogram_quantile(0.99, sum(rate(loki_querier_request_duration_seconds_bucket[5m])) by (le, method))
This shows the 99th percentile query latency, broken down by request method.
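Because the percentile is estimated from histogram buckets rather than from individual request timings, its accuracy depends on where the bucket boundaries sit. The sketch below illustrates the linear interpolation idea behind `histogram_quantile`. It is a simplified reimplementation for positive bucket bounds (such as latencies), not Prometheus's code, and the bucket values are invented.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs, including
    (math.inf, total). Assumes all finite bounds are positive.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0 or not 0 <= q <= 1:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for i, (upper_bound, count) in enumerate(buckets):
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[i - 1][0] if i > 0 else math.nan
            in_bucket = count - lower_count
            if in_bucket == 0:
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical latency buckets in seconds: (le, cumulative request count).
latency = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p95 = histogram_quantile(0.95, latency)
```

Here the 95th request falls in the 0.5 to 1.0 second bucket, so the estimate is interpolated between those two bounds. A result pinned to the highest finite bound usually means the slowest requests landed in the +Inf bucket and the real latency is higher.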
Storage Metrics
Storage metrics help you understand how Loki interacts with its backend storage:
- loki_chunk_store_index_entries_per_chunk: Number of index entries per chunk
- loki_chunk_operations_total: Total number of chunk operations by operation type
- loki_chunk_store_chunk_downloads_total: Total number of chunk downloads
Resource Usage Metrics
These metrics help you understand resource consumption:
- go_memstats_alloc_bytes: Heap bytes currently allocated and still in use
- process_cpu_seconds_total: Total user and system CPU time spent, in seconds
- process_resident_memory_bytes: Resident memory size in bytes
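Once Prometheus is scraping these metrics, they can also be read programmatically through its HTTP API, for example in a capacity report. The sketch below uses the instant query endpoint `/api/v1/query`. The base URL `http://prometheus:9090`, the `job=~"loki.*"` selector, and the helper names are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_instant_vector(body):
    """Return {job: value} from an instant-query JSON response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    return {
        item["metric"].get("job", ""): float(item["value"][1])
        for item in payload["data"]["result"]
    }


def resident_memory_by_job(base_url="http://prometheus:9090"):
    """Query resident memory per job. The base URL is an assumption."""
    # Summing by job keeps one series per job even with several instances.
    promql = 'sum by (job) (process_resident_memory_bytes{job=~"loki.*"})'
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=5) as resp:
        return parse_instant_vector(resp.read().decode("utf-8"))
```

The parsing step is kept separate from the network call so it can be checked against a saved response without a running Prometheus.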
Creating a Loki Metrics Dashboard
Let's create a simple Grafana dashboard to monitor key Loki metrics:
Sample Dashboard JSON
Here's a starter configuration for a Loki metrics dashboard (simplified):
{
  "title": "Loki Monitoring Dashboard",
  "panels": [
    {
      "title": "Log Lines Ingested",
      "type": "graph",
      "targets": [
        {
          "expr": "sum(rate(loki_distributor_lines_received_total[5m])) by (tenant)",
          "legendFormat": "{{tenant}}"
        }
      ]
    },
    {
      "title": "Query Latency (99th Percentile)",
      "type": "graph",
      "targets": [
        {
          "expr": "histogram_quantile(0.99, sum(rate(loki_querier_request_duration_seconds_bucket[5m])) by (le, method))",
          "legendFormat": "{{method}}"
        }
      ]
    },
    {
      "title": "Memory Usage",
      "type": "graph",
      "targets": [
        {
          "expr": "process_resident_memory_bytes{job=~\"loki.*\"}",
          "legendFormat": "{{job}}"
        }
      ]
    }
  ]
}
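Hand-editing dashboard JSON gets tedious as panels are added. One approach is to generate the same structure from a short list of panel definitions. The sketch below reproduces the simplified dashboard above; `make_panel` and `make_dashboard` are hypothetical helpers, and the output keeps the simplified schema used in this guide rather than a complete Grafana dashboard model.

```python
import json


def make_panel(title, expr, legend):
    """Build one panel in the simplified schema used in this guide."""
    return {
        "title": title,
        "type": "graph",
        "targets": [{"expr": expr, "legendFormat": legend}],
    }


def make_dashboard(title, panels):
    """Build a dashboard from (title, expr, legend) tuples."""
    return {"title": title, "panels": [make_panel(*p) for p in panels]}


PANELS = [
    ("Log Lines Ingested",
     "sum(rate(loki_distributor_lines_received_total[5m])) by (tenant)",
     "{{tenant}}"),
    ("Query Latency (99th Percentile)",
     "histogram_quantile(0.99, sum(rate(loki_querier_request_duration_seconds_bucket[5m])) by (le, method))",
     "{{method}}"),
    ("Memory Usage",
     'process_resident_memory_bytes{job=~"loki.*"}',
     "{{job}}"),
]

dashboard_json = json.dumps(make_dashboard("Loki Monitoring Dashboard", PANELS), indent=2)
```

Adding a panel then means appending one tuple to `PANELS`, and `json.dumps` takes care of escaping the quotes inside the PromQL expressions.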
Setting Up Alerts for Loki Metrics
Alerting is crucial for proactive monitoring. Here are some essential alerts you should consider: