Storage Retention Policies
Introduction
Storage retention policies in Grafana Loki define how long log data is kept before being automatically deleted. Implementing effective retention policies is essential for:
- Managing storage costs
- Complying with data governance requirements
- Optimizing query performance
- Ensuring relevant data is available when needed
In this guide, we'll explore how Loki handles log retention, how to configure different retention strategies, and best practices for implementing retention policies that balance performance, compliance, and cost.
Understanding Log Retention in Loki
Loki stores log data in chunks, which are collections of compressed log entries, alongside an index that points to them. When retention is enabled, the compactor marks chunks that have outlived their retention period and deletes them after a configurable delay.
Retention Types
Retention in Loki can be approached along these lines:
- Time-based retention: Delete data older than a specified time period
- Stream-based retention: Apply different retention periods to different log streams
- Label-based retention: Select the streams a rule applies to by their labels
- Storage-based retention: Not built in; Loki deletes by age, so storage is bounded indirectly through retention periods and ingestion limits
Configuring Basic Retention Policies
The simplest way to configure retention in Loki is through time-based policies in your Loki configuration file.
limits_config:
  retention_period: 744h # Retain logs for 31 days
This configuration removes any chunks older than the specified period. The value is a duration string given in hours (h) or days (d), for example 744h. The period only takes effect when the compactor runs with retention_enabled: true.
Retention in Multi-tenant Environments
For multi-tenant setups, you can set different retention periods for different tenants:
limits_config:
  retention_period: 744h # Default retention period
  per_tenant_override_config: /etc/loki/retention-overrides.yaml

# Contents of /etc/loki/retention-overrides.yaml
overrides:
  customer1:
    retention_period: 1440h # 60 days for customer1
  customer2:
    retention_period: 168h # 7 days for customer2
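Per-tenant overrides can also carry their own stream-level rules. The following is a minimal sketch, assuming the overrides file referenced by per_tenant_override_config and the retention_stream rule form (selector, priority, period) from Loki's retention documentation. The tenant IDs are the ones used above, and the namespace label is a hypothetical example of how the streams might be labeled.

```yaml
# /etc/loki/retention-overrides.yaml (path taken from the example above)
overrides:
  customer1:
    retention_period: 1440h          # tenant-wide default: 60 days
    retention_stream:
      - selector: '{namespace="development"}'   # hypothetical label
        priority: 1
        period: 168h                 # this tenant's dev logs: 7 days
  customer2:
    retention_period: 168h           # 7 days, no per-stream exceptions
```

Streams of customer1 that do not match the selector keep the tenant-wide 60 days.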
Stream Selector-Based Retention
Loki supports more granular retention through stream selectors. The compactor applies retention, while the per-stream rules are declared under retention_stream in limits_config:
compactor:
  retention_enabled: true
  retention_delete_worker_count: 150
  retention_delete_delay: 2h
  working_directory: /loki/compactor
  shared_store: s3
  compaction_interval: 10m
limits_config:
  retention_stream:
    - selector: '{namespace="production"}'
      priority: 1
      period: 744h # 31 days
    - selector: '{namespace="development"}'
      priority: 1
      period: 168h # 7 days
    - selector: '{env="staging"}'
      priority: 1
      period: 336h # 14 days
This configuration sets different retention periods based on the stream selectors:
- Production logs are kept for 31 days
- Development logs are kept for 7 days
- Staging logs are kept for 14 days
Visualizing Retention Policies
[Diagram: retention policy workflow]
Advanced Retention Strategies
Implementing Storage-Based Retention
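Loki's compactor deletes data by age only; it has no setting that drops the oldest data once a size limit is reached. To keep storage growth bounded, pair time-based retention with ingestion limits:
limits_config:
  retention_period: 744h # Time-based retention
  ingestion_rate_mb: 4 # Per-tenant ingestion rate limit, in MB per second
  ingestion_burst_size_mb: 6 # Per-tenant burst allowance, in MB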
Implementing Compliance-Focused Retention
For regulatory compliance, certain logs might need longer retention:
compactor:
  retention_enabled: true
limits_config:
  retention_period: 744h # 31 days for everything that is not audit data
  retention_stream:
    - selector: '{category="audit"}'
      priority: 1
      period: 8760h # 1 year
Real-World Examples
Scenario 1: Cost-Optimized Retention
A cloud-native application might implement tiered retention to optimize costs:
compactor:
  retention_enabled: true
limits_config:
  retention_stream:
    - selector: '{level="debug"}'
      priority: 1
      period: 24h # 1 day only
    - selector: '{level="info"}'
      priority: 1
      period: 168h # 1 week
    - selector: '{level="warn"}'
      priority: 1
      period: 744h # 31 days
    - selector: '{level="error"}'
      priority: 1
      period: 2160h # 90 days
In this example:
- Debug logs are kept for only 24 hours
- Info logs are kept for a week
- Warning logs are retained for a month
- Error logs are saved for 90 days
Scenario 2: Regulatory Compliance
A financial institution might configure:
compactor:
  retention_enabled: true
limits_config:
  retention_stream:
    - selector: '{domain=~"payment|account|transaction"}'
      priority: 1
      period: 17520h # 2 years
    - selector: '{domain=~"ui|frontend"}'
      priority: 1
      period: 744h # 31 days
Implementation Best Practices
- Start conservative: Begin with longer retention periods and adjust downward
- Monitor storage growth: Track your storage usage over time to fine-tune policies
- Use log levels effectively: Configure applications to use appropriate log levels
- Create logical log streams: Group related logs to apply coherent retention policies
- Document your retention policies: Create clear documentation for compliance audits
Monitoring Your Retention Policies
Use Grafana to monitor the effectiveness of your retention policies:
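rate({job="loki-compactor"} |= "deleted" [5m])
This query shows the per-second rate of compactor log lines containing "deleted", which serves as a rough indicator of deletion activity. It assumes the compactor's own logs are shipped to Loki under the job label loki-compactor.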
Common Pitfalls
- Retention without compaction: Ensure the compactor is properly configured and running
- Overlapping rules: If multiple rules match a stream, the rule with the highest priority applies, not the one with the longest period
- High cardinality: Too many unique label combinations can lead to excessive storage
- No default rule: Streams that match no selector fall back to the global retention_period, so set it deliberately as your catch-all
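The overlap and fallback pitfalls are easier to see in a small configuration. This sketch assumes the retention_stream rule form from Loki's retention documentation, in which each rule carries a priority and the highest-priority matching rule is used. The namespace and level labels are hypothetical.

```yaml
limits_config:
  retention_period: 744h             # fallback for streams that match no rule
  retention_stream:
    - selector: '{namespace="production"}'
      priority: 1
      period: 2160h                  # production logs: 90 days
    - selector: '{namespace="production", level="debug"}'
      priority: 2                    # outranks the rule above when both match
      period: 24h                    # production debug logs: 1 day
```

Under these assumptions, a production debug stream matches both rules and is kept for 24 hours because the second rule has the higher priority, even though the first rule's period is longer. A stream from any other namespace matches neither rule and is kept for the 744h default.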
Practical Exercise: Implementing Tiered Retention
Try implementing a tiered retention policy for a sample application:
- Configure retention for different environments (dev, staging, prod)
- Set up log level-based retention (debug, info, warn, error)
- Create a special retention rule for security-related logs
- Monitor the storage usage before and after implementation
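One possible starting point for the exercise is sketched below. It assumes streams carry env, level, and category labels (all hypothetical names) and uses the retention_stream rule form from Loki's retention documentation. The periods and priorities are illustrative choices, not recommendations.

```yaml
compactor:
  retention_enabled: true            # storage and working_directory settings omitted
limits_config:
  retention_period: 744h             # default, also covers prod: 31 days
  retention_stream:
    - selector: '{env="dev"}'
      priority: 1
      period: 168h                   # 7 days
    - selector: '{env="staging"}'
      priority: 1
      period: 336h                   # 14 days
    - selector: '{level="debug"}'
      priority: 2                    # debug logs expire early in every environment
      period: 24h
    - selector: '{category="security"}'
      priority: 3                    # security logs outrank all other rules
      period: 8760h                  # 1 year
```

Giving the security rule the highest priority means a security stream keeps its one-year period even when it also matches an environment or level rule.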
Summary
Storage retention policies are crucial for managing the lifecycle of your log data in Grafana Loki. By properly configuring retention rules based on time, tenants, and stream selectors, you can:
- Control storage costs
- Meet compliance requirements
- Maintain system performance
- Keep relevant data accessible for analysis
Remember that the best retention policy balances your operational needs, compliance requirements, and resource constraints.
Next Steps
- Learn about configuring storage backends for Loki
- Explore advanced compaction strategies
- Understand how to implement log lifecycle management across your organization