Metadata Handling in Grafana Loki
Introduction
Metadata in Grafana Loki refers to additional contextual information attached to log entries beyond their core content. While labels help categorize and filter logs, metadata provides enriched information about each log entry. Effectively handling metadata allows you to create more powerful queries, derive deeper insights, and build more comprehensive dashboards.
In this guide, we'll explore how Grafana Loki manages metadata, how to extract and query it, and best practices for metadata handling in production environments.
Understanding Metadata in Loki
What is Metadata?
In Loki, metadata consists of:
- Structured data embedded within log entries
- Extracted fields derived from log parsing
- System-generated information like timestamps and source identifiers
Unlike labels, which are indexed and used for filtering, metadata is typically stored with the log content and extracted at query time.
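For example (using LogQL syntax covered in the next section), a query against a hypothetical checkout service might filter on the indexed app label in the stream selector, and then on a user_id field extracted from the log body at query time:
{app="checkout"} | json | user_id="u-123"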
Extracting Metadata with LogQL
Loki's query language (LogQL) provides powerful tools for extracting and working with metadata.
Basic Metadata Extraction
The simplest way to extract metadata is using the | pipe operator with a parser expression:
{app="myapp"} | json
This extracts all JSON fields from matching log lines, making them available as metadata.
For specific fields:
{app="myapp"} | json field="user_id", field2="response_time"
Example: Extracting HTTP Status Codes
Consider these JSON logs:
{"timestamp": "2023-10-15T14:22:10Z", "level": "info", "message": "Request processed", "status": 200, "path": "/api/users", "duration_ms": 45}
{"timestamp": "2023-10-15T14:22:11Z", "level": "error", "message": "Request failed", "status": 500, "path": "/api/orders", "duration_ms": 132}
To extract status codes:
{app="web-server"} | json | status >= 400
Output:
{app="web-server"} status=500 level="error" message="Request failed" path="/api/orders" duration_ms=132
Advanced Pattern Extraction
For non-JSON logs, you can use the pattern parser:
{app="myapp"} | pattern `<pattern>`
For example, to extract information from a standard log format:
{app="backend"} | pattern `[<timestamp>] <level>: <message> (user=<user_id>)`
Transforming and Manipulating Metadata
Once extracted, you can transform metadata for analysis:
Mathematical Operations
{app="web-server"} | json | duration_ms > 100 | duration_sec = duration_ms / 1000
This creates a new duration_sec
field by converting milliseconds to seconds.
String Manipulations
{app="auth-service"} | json | user_email = lower(email) | user_domain = label_format("{{.user_email}}", "{{.user_email | regexp `@(.+)$` `$1`}}")
Practical Examples
Example 1: Analyzing API Performance
Let's analyze API endpoint performance using metadata:
{app="api-gateway"}
| json
| path != ""
| unwrap duration_ms
| by(path)
| quantile=0.95
This query:
- Filters for API gateway logs
- Extracts JSON fields
- Ensures path is not empty
- Unwraps the duration_ms field so its values can be used as numeric samples
- Calculates the 95th percentile response time over a 5-minute window
- Groups the result by path
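A closely related sketch, reusing the same assumed labels and fields, computes the average latency per path instead of a percentile:
avg_over_time(
  {app="api-gateway"} | json | path != "" | unwrap duration_ms [5m]
) by (path)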
Example 2: Error Rate by Service and Version
sum by(service, version) (
  rate({env="production"} | json | level="error" [5m])
)
/
sum by(service, version) (
  rate({env="production"} | json [5m])
)
This calculates the error rate per service and version by dividing the error count by the total log count.
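To express the rate as a percentage for a single service (checkout is a hypothetical value of an extracted service field), you can narrow the filter and multiply by 100:
100 * (
  sum(rate({env="production"} | json | service="checkout" | level="error" [5m]))
  /
  sum(rate({env="production"} | json | service="checkout" [5m]))
)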
Example 3: User Activity Tracking
For a login service with logs like:
{"timestamp": "2023-10-15T10:45:12Z", "level": "info", "msg": "User login", "user_id": "u-123", "success": true, "location": {"country": "US", "city": "San Francisco"}}
We can track login attempts by location:
{app="login-service"}
| json
| success="true"
| line_format "{{.user_id}} logged in from {{.location_city}}, {{.location_country}}"
Note that the json parser flattens nested objects into labels joined with underscores, so location.city becomes location_city.
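Building on the same assumed fields, a metric version of this query can count successful logins per country over the past hour:
sum by(location_country) (
  count_over_time({app="login-service"} | json | success="true" [1h])
)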
Metadata Handling Best Practices
Performance Considerations
- Extract only what you need: Extracting all fields from high-volume logs can impact performance (see the sketch after this list)
- Pre-extract common fields: Use log processors to extract common fields before they reach Loki
- Use labels for low-cardinality data: Reserve indexed labels for low-cardinality values you filter on frequently, such as app or environment
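As a sketch of the first point, you can limit extraction to just the fields a query uses instead of parsing every JSON key (field names follow the earlier web-server example):
{app="web-server"} | json status="status", duration_ms="duration_ms" | status >= 500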
Structuring Your Logs for Optimal Metadata Handling
- Consistent format: Use consistent JSON structures or log formats (see the example after this list)
- Avoid deeply nested structures: Flatten when possible for easier extraction
- Include contextual information: Add service version, environment, and other relevant metadata
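For instance, a flat, consistently structured log line (field names here are purely illustrative) might look like:
{"timestamp": "2023-10-15T14:22:10Z", "level": "info", "service": "checkout", "version": "1.4.2", "env": "production", "msg": "Order created", "order_id": "o-789", "duration_ms": 87}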
Avoiding Common Pitfalls
- Over-extraction: Extracting too many fields adds processing overhead
- High cardinality in labels: Keep high-cardinality data as metadata, not labels
- Complex regex patterns: Complex patterns are CPU-intensive and can slow queries
Integration with Other Grafana Components
Extracted metadata can be used with other Grafana components:
- Alerting: Create alerts based on metadata patterns
  sum(rate({app="payment-service"} | json | status >= 500 [5m])) > 10
- Dashboards: Build dashboards with metadata-based visualizations
  sum by(path) (rate({app="api"} | json | status >= 400 and status < 500 [5m]))
- Derived metrics: Create metrics from log metadata for long-term storage in Prometheus
Summary
Effective metadata handling is crucial for getting the most value from your logs in Grafana Loki. By properly extracting, transforming, and analyzing metadata, you can gain deeper insights into your applications and infrastructure.
Key points to remember:
- Metadata provides context and detail beyond basic log content
- LogQL offers powerful extraction capabilities for JSON and pattern-based logs
- Transformation functions let you derive new insights from existing metadata
- Balance extraction needs with performance considerations
- Use metadata strategically alongside labels for optimal performance
Further Exercises
- Extract metadata from logs in different formats (JSON, key-value pairs, XML)
- Build a dashboard showing API performance based on extracted duration metadata
- Create an alert that triggers when error rates exceed a threshold for a specific user segment
- Implement a log preprocessing pipeline that extracts common metadata before ingestion