Tenant Configuration
Introduction
In a multi-tenant Grafana Loki deployment, proper tenant configuration is essential for maintaining separation between different users or organizations sharing the same Loki infrastructure. Tenant configuration allows administrators to define resource limits, retention policies, and access controls on a per-tenant basis, ensuring fair resource allocation and data isolation.
This guide will walk you through the process of configuring tenants in Loki, helping you understand the key parameters and best practices for managing a multi-tenant environment.
Understanding Tenant IDs in Loki
In Loki, each tenant is identified by a unique identifier known as a tenant ID. This ID is used throughout Loki's internal systems to segregate data and apply tenant-specific configurations.
The X-Scope-OrgID Header
Loki uses the X-Scope-OrgID
HTTP header to identify tenants in API requests. Every request made to Loki must include this header to specify which tenant's data is being accessed or modified.
For example, when pushing logs to Loki:
curl -H "Content-Type: application/json" \
-H "X-Scope-OrgID: tenant-123" \
-X POST \
-d '{
"streams": [
{
"stream": {
"job": "app",
"level": "info"
},
"values": [
[ "1630000000000000000", "log message for tenant-123" ]
]
}
]
}' \
http://localhost:3100/loki/api/v1/push
In this example, the log data is associated with the tenant identified by tenant-123
.
Tenant Configuration in Loki's Runtime Configuration
Tenant-specific configurations are defined in Loki's runtime configuration file, typically specified with the -runtime-config.file
flag when starting Loki.
Here's an example of a runtime configuration file with tenant-specific settings:
# runtime-config.yaml
overrides:
tenant-123:
ingestion_rate_mb: 10
ingestion_burst_size_mb: 20
max_streams_per_user: 10000
max_chunks_per_query: 1000000
retention_period: 744h
tenant-456:
ingestion_rate_mb: 5
ingestion_burst_size_mb: 10
max_streams_per_user: 5000
max_chunks_per_query: 500000
retention_period: 168h
Key Tenant Configuration Parameters
Let's explore the main parameters you can configure for each tenant:
Ingestion Limits
ingestion_rate_mb: 10
ingestion_burst_size_mb: 20
max_line_size_bytes: 256000
max_line_size: 256000 # Deprecated in favor of max_line_size_bytes
max_streams_per_user: 10000
max_global_streams_per_user: 10000
max_chunks_per_user: 1000000
- ingestion_rate_mb: Maximum ingestion rate in MB per second
- ingestion_burst_size_mb: Maximum burst size for ingestion in MB
- max_line_size_bytes: Maximum size of a single log line in bytes
- max_streams_per_user: Maximum number of active streams per tenant
- max_global_streams_per_user: Maximum number of streams across all indices per tenant
- max_chunks_per_user: Maximum number of chunks that can exist in a store for a tenant
Query Limits
max_chunks_per_query: 1000000
max_query_series: 500
max_query_lookback: 720h
max_query_length: 24h
max_query_parallelism: 32
query_timeout: 1m
query_split_duration: 30m
- max_chunks_per_query: Maximum number of chunks that can be fetched by a single query
- max_query_series: Maximum number of series a query can return
- max_query_lookback: Maximum duration into the past a query can look
- max_query_length: Maximum time range for a query
- max_query_parallelism: Maximum number of parallel query workers
- query_timeout: Maximum duration for a query to execute
- query_split_duration: Split queries by time interval for improved performance
Retention Settings
retention_period: 744h
- retention_period: How long log data is kept before being deleted (e.g., 744h = 31 days)
Cardinality Limits
cardinality_limit: 100000
max_label_name_length: 1024
max_label_value_length: 2048
max_label_names_per_series: 30
- cardinality_limit: Maximum number of active series per tenant
- max_label_name_length: Maximum length for label names
- max_label_value_length: Maximum length for label values
- max_label_names_per_series: Maximum number of label names per series
Default Tenant Configuration
When no tenant-specific overrides are defined, Loki applies default limits specified in the limits_config
section of the main configuration:
limits_config:
ingestion_rate_mb: 4
ingestion_burst_size_mb: 6
max_streams_per_user: 10000
max_chunks_per_query: 1000000
retention_period: 744h
Dynamic Tenant Configuration
Loki supports dynamic runtime configuration, allowing you to update tenant settings without restarting the service. You can reload the runtime configuration by sending a POST
request to the /-/reload
endpoint:
curl -X POST http://localhost:3100/-/reload
Practical Example: Multi-Environment Tenant Configuration
Let's look at a practical example where we configure Loki for multiple environments within an organization:
# runtime-config.yaml
overrides:
# Production environment with higher limits
prod:
ingestion_rate_mb: 20
ingestion_burst_size_mb: 30
max_streams_per_user: 50000
max_chunks_per_query: 2000000
retention_period: 2160h # 90 days
max_query_length: 72h
max_query_parallelism: 32
query_timeout: 5m
# Staging environment with moderate limits
staging:
ingestion_rate_mb: 10
ingestion_burst_size_mb: 15
max_streams_per_user: 20000
max_chunks_per_query: 1000000
retention_period: 720h # 30 days
max_query_length: 48h
max_query_parallelism: 16
query_timeout: 2m
# Development environment with lower limits
dev:
ingestion_rate_mb: 5
ingestion_burst_size_mb: 8
max_streams_per_user: 10000
max_chunks_per_query: 500000
retention_period: 168h # 7 days
max_query_length: 24h
max_query_parallelism: 8
query_timeout: 1m
To visualize how these tenants are isolated:
Configuring Promtail for Multi-Tenancy
When sending logs from Promtail (Loki's log collection agent) to a multi-tenant Loki setup, you need to configure Promtail to use the appropriate tenant ID:
# promtail-config.yaml
clients:
- url: http://loki:3100/loki/api/v1/push
tenant_id: prod
basic_auth:
username: loki
password: secret
For more complex scenarios, you can use relabeling to dynamically set the tenant ID based on labels:
clients:
- url: http://loki:3100/loki/api/v1/push
tenant_id: ${NAMESPACE}
scrape_configs:
- job_name: kubernetes-pods
kubernetes_sd_configs:
- role: pod
relabel_configs:
- source_labels: [__meta_kubernetes_namespace]
target_label: NAMESPACE
In this example, the Kubernetes namespace is used as the tenant ID, allowing for separate tenants per namespace.
Best Practices for Tenant Configuration
-
Start with Conservative Limits: Begin with conservative resource limits and gradually increase them as needed based on actual usage patterns.
-
Align Tenants with Organizational Structure: Map tenants to your organizational structure—for example, by department, team, or environment (prod, staging, dev).
-
Monitor Tenant Resource Usage: Implement monitoring to track each tenant's resource consumption, helping identify when limits need adjustment.
-
Use Federation for Large-Scale Deployments: For very large deployments, consider running separate Loki instances for high-traffic tenants and using federation for querying across them.
-
Document Tenant Configurations: Maintain clear documentation of tenant configurations, including why specific limits were chosen.
-
Consider Automation: For environments with many tenants, use automation to generate and update tenant configurations based on templates.
-
Implement Regular Reviews: Regularly review tenant configurations to ensure they remain appropriate as usage patterns evolve.
Troubleshooting Tenant Issues
Rate Limiting Errors
If users encounter 429 Too Many Requests
errors, it indicates they've exceeded their ingestion rate limits:
Error: server returned HTTP status 429 Too Many Requests: Ingestion rate limit (10.00MB/sec) exceeded
Solutions:
- Increase the
ingestion_rate_mb
andingestion_burst_size_mb
values for the affected tenant - Implement backpressure handling in your log shipping configuration
- Consider log sampling for high-volume sources
Query Performance Issues
If queries are timing out or performing poorly:
Solutions:
- Adjust
max_chunks_per_query
ormax_query_series
limits - Increase
query_timeout
for complex queries - Use more specific label selectors to reduce the scope of queries
- Split large time ranges into smaller queries
Summary
Proper tenant configuration is crucial for managing a multi-tenant Loki deployment. By understanding and applying tenant-specific limits for ingestion, querying, and retention, you can ensure fair resource allocation and data isolation between different users or organizations sharing the same Loki infrastructure.
Key takeaways:
- Each tenant has a unique ID specified via the
X-Scope-OrgID
header - Tenant configurations are defined in Loki's runtime configuration file
- Important parameters include ingestion limits, query limits, and retention periods
- Dynamic configuration allows updating tenant settings without service restarts
- Tenant configurations should align with your organizational structure and usage patterns
Additional Resources and Exercises
Exercises
-
Set up a Loki instance with three different tenants, each with distinct resource limits and retention periods.
-
Configure Promtail to send logs to Loki using different tenant IDs based on the source application.
-
Create a monitoring dashboard to track resource usage per tenant, helping identify when limits need adjustment.
-
Implement a script that generates tenant configurations based on a template and a list of tenant properties.
Further Reading
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)