Index Storage in Grafana Loki

Introduction

In Grafana Loki's architecture, one of the most critical components is the index storage system. While log data (chunks) contains the actual content of your logs, the index is what makes those logs quickly searchable and retrievable. Think of index storage as the "table of contents" for your logging system - without it, finding specific logs would require scanning through all data sequentially, which would be prohibitively slow at scale.

This guide will help you understand how Loki's index storage works, how to configure it, and how to optimize it for your specific use case.

What is Index Storage?

Index storage in Loki is a database that maintains the relationship between your log labels (metadata) and the actual chunks of log data. It maps the combination of labels and timestamps to where the actual log data is stored.

The index contains critical information such as:

- Label sets for each log stream
- Time ranges for chunks
- References to where actual log chunks are stored
- Series information
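
To make the mapping concrete, here is a minimal sketch of the idea in Python. It is a toy model, not Loki's actual data structure: the ChunkRef and ToyIndex names, the object keys, and the timestamps are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Pointer to one chunk in object storage (hypothetical layout)."""
    object_key: str  # where the chunk lives, e.g. an object-store key
    start: int       # first timestamp covered (Unix seconds)
    end: int         # last timestamp covered (Unix seconds)


class ToyIndex:
    """Maps a stream's label set to the chunks that hold its log lines."""

    def __init__(self):
        # frozenset of (label, value) pairs -> list of ChunkRef
        self._streams = {}

    def add(self, labels, ref):
        self._streams.setdefault(frozenset(labels.items()), []).append(ref)

    def lookup(self, matchers, start, end):
        """Return keys of chunks whose stream has every requested label
        and whose time range overlaps [start, end]."""
        wanted = set(matchers.items())
        return sorted(
            ref.object_key
            for stream, refs in self._streams.items()
            if wanted <= stream
            for ref in refs
            if ref.start <= end and ref.end >= start
        )


index = ToyIndex()
index.add({"app": "api", "env": "prod"}, ChunkRef("chunks/a1", 0, 599))
index.add({"app": "api", "env": "prod"}, ChunkRef("chunks/a2", 600, 1199))
index.add({"app": "web", "env": "prod"}, ChunkRef("chunks/w1", 0, 599))

print(index.lookup({"app": "api"}, 500, 700))  # ['chunks/a1', 'chunks/a2']
```

The query only touches the index: it returns references to two chunks without reading any log content, which is the work the real index saves at scale.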

Index Storage Options

Loki supports multiple backend options for storing index data:

1. Local Storage

For testing, development, or small deployments, you can use the filesystem-based local storage for your index:

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    cache_ttl: 24h
    shared_store: filesystem
  # filesystem is a sibling of boltdb_shipper, not a child of it
  filesystem:
    directory: /loki/index_shared

This configuration:

- Stores active indexes in /loki/index
- Maintains a cache in /loki/index_cache with a 24-hour TTL
- Uses the local filesystem for shared storage at /loki/index_shared

2. Cloud Storage Options

For production deployments, you'll typically use a cloud storage provider:

AWS S3

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    cache_ttl: 24h
    shared_store: s3
  aws:
    s3: s3://region/bucket-name
    s3forcepathstyle: true

Google Cloud Storage

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    cache_ttl: 24h
    shared_store: gcs
  gcs:
    bucket_name: loki-index-bucket

Azure Blob Storage

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    cache_ttl: 24h
    shared_store: azure
  azure:
    # Azure storage account names allow only lowercase letters and digits
    account_name: lokistorage
    account_key: <account_key>
    container_name: loki-index

BoltDB Shipper - Loki's Default Index Store

BoltDB Shipper is a widely used index store in Loki. It uses BoltDB (a simple, fast key-value store) for active index files and periodically uploads those files to the configured shared storage. Newer Loki releases recommend the TSDB index store for new deployments, so check which one your version favors before committing to either.

How BoltDB Shipper Works

Write Path:
- Ingesters write index entries to local BoltDB files in active_index_directory
- The shipper periodically uploads new and updated files to the configured shared store (the Loki documentation describes a scan roughly once a minute; a new index file is started about every 15 minutes)
- Because index writes land in local files first, ingestion does not wait on object storage

Read Path:
- Queriers download index files from shared storage
- Files are cached in cache_location to improve read performance
- Cached files are removed once they exceed the configured cache_ttl
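
The read-path caching can be sketched as a small TTL cache. This is an illustration under one stated assumption, namely that the TTL is measured from the last time a file was used; the IndexFileCache class and the file names are hypothetical, and Loki's real cache is implemented differently.

```python
class IndexFileCache:
    """Tracks downloaded index files and evicts those idle longer than ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._last_used = {}  # file name -> time of last access (seconds)

    def touch(self, name, now):
        """Record a download or a query hit for an index file."""
        self._last_used[name] = now

    def evict_expired(self, now):
        """Drop files not used within the TTL and return their names."""
        expired = sorted(
            name for name, used in self._last_used.items()
            if now - used > self.ttl_seconds
        )
        for name in expired:
            del self._last_used[name]
        return expired

    def files(self):
        return sorted(self._last_used)


cache = IndexFileCache(ttl_seconds=24 * 3600)   # mirrors cache_ttl: 24h
cache.touch("loki_index_18993", now=0)
cache.touch("loki_index_18994", now=0)
cache.touch("loki_index_18994", now=20 * 3600)  # queried again 20 hours later

evicted = cache.evict_expired(now=30 * 3600)
print(evicted)        # ['loki_index_18993']
print(cache.files())  # ['loki_index_18994']
```

A longer cache_ttl keeps more files on local disk and avoids repeat downloads; a shorter one trades disk space for more traffic to shared storage.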

Configuring Index Storage

Let's look at a complete configuration example with explanations:

storage_config:
  # BoltDB Shipper configuration
  boltdb_shipper:
    # Directory to store active index files (not yet uploaded)
    active_index_directory: /loki/index
    
    # Directory to cache downloaded index files
    cache_location: /loki/index_cache
    
    # How long to keep index files in cache
    cache_ttl: 24h
    
    # How often downloaded index files are resynced with shared storage
    resync_interval: 5m
    
    # The storage backend to use for shared index files
    shared_store: s3
  
  # Configuration for the chosen storage backend
  aws:
    s3: s3://us-east-1/loki-index
    region: us-east-1
    # ${VAR} references are only expanded when Loki starts with -config.expand-env=true
    access_key_id: ${AWS_ACCESS_KEY_ID}
    secret_access_key: ${AWS_SECRET_ACCESS_KEY}

Index Storage Performance Optimization

Optimizing your index storage is crucial for Loki's performance. Here are some key considerations:

1. Cardinality Management

High cardinality (too many unique label combinations) is the most common performance issue with Loki indexes:

# In limits_config section
limits_config:
  max_label_name_length: 1024
  max_label_value_length: 2048
  max_label_names_per_series: 30
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  cardinality_limit: 100000
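
A rough way to see why one extra label can hurt is to multiply the number of distinct values per label, which gives an upper bound on the number of streams the index must track. The sketch below uses made-up label names and counts.

```python
from math import prod


def max_streams(label_value_counts):
    """Upper bound on streams: the product of distinct values per label."""
    return prod(label_value_counts.values())


# Hypothetical label sets: label name -> number of distinct values
bounded = {"app": 20, "env": 3, "level": 5}
with_user_id = {**bounded, "user_id": 10_000}

print(max_streams(bounded))       # 300
print(max_streams(with_user_id))  # 3000000
```

Real deployments rarely hit the full product because not every combination occurs, but the bound shows how a single unbounded label such as a user ID dominates everything else.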

2. Cache Tuning

Properly sized caches improve query performance:

# In chunk_store_config section
chunk_store_config:
  chunk_cache_config:
    # In-process cache; size is given in megabytes
    embedded_cache:
      enabled: true
      max_size_mb: 1024
  write_dedupe_cache_config:
    embedded_cache:
      enabled: true
      max_size_mb: 512

3. Compaction

Index compaction helps reduce storage size and improve query performance:

compactor:
  working_directory: /loki/compactor
  shared_store: s3
  compaction_interval: 2h
  retention_enabled: true

Practical Example: Setting Up Index Storage in a Distributed Environment

Let's walk through a practical example of setting up Loki with appropriate index storage for a medium-sized production environment:

First, we'll define our storage configuration in a file called loki-config.yaml:

storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/index_cache
    cache_ttl: 48h
    shared_store: s3
  aws:
    s3: s3://us-west-2/loki-index-storage
    region: us-west-2
    sse_encryption: true

schema_config:
  configs:
    - from: 2022-01-01
      store: boltdb-shipper
      object_store: s3
      schema: v12
      index:
        prefix: loki_index_
        period: 24h

chunk_store_config:
  chunk_cache_config:
    # In-process cache; size is given in megabytes
    embedded_cache:
      enabled: true
      max_size_mb: 2048
  write_dedupe_cache_config:
    embedded_cache:
      enabled: true
      max_size_mb: 1024

compactor:
  working_directory: /loki/compactor
  shared_store: s3
  compaction_interval: 3h
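
The index prefix and period in schema_config decide how index data is split into tables. As a sketch, assuming tables are named by appending the number of whole periods since the Unix epoch to the prefix (worth confirming against the object names in your own bucket), the table for a given timestamp can be computed like this. The index_table_name helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def index_table_name(prefix, timestamp, period=timedelta(hours=24)):
    """Name of the index table covering `timestamp`:
    prefix + number of whole periods since the Unix epoch."""
    period_number = int(timestamp.timestamp()) // int(period.total_seconds())
    return f"{prefix}{period_number}"


start = datetime(2022, 1, 1, tzinfo=timezone.utc)
print(index_table_name("loki_index_", start))                        # loki_index_18993
print(index_table_name("loki_index_", start + timedelta(hours=25)))  # loki_index_18994
```

With a 24h period each day gets its own table, which is what lets the compactor and retention work one day at a time.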

Then, we can deploy this using Docker. The image tag is pinned to a 2.x release because Loki 3.0 removed the shared_store option used in this configuration:

docker run -d --name loki \
  -v "$(pwd)/loki-config.yaml:/etc/loki/config.yaml" \
  -v loki-data:/loki \
  -p 3100:3100 \
  -e AWS_ACCESS_KEY_ID=your_access_key \
  -e AWS_SECRET_ACCESS_KEY=your_secret_key \
  grafana/loki:2.9.4 \
  -config.file=/etc/loki/config.yaml

Monitoring Index Storage

To ensure your index storage is functioning properly, monitor these key metrics:

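Index Gateway Metrics:
- loki_boltdb_shipper_request_duration_seconds - Time taken for index operations
- loki_boltdb_shipper_request_errors_total - Count of index operation errors

Query Performance:
- loki_query_frontend_query_seconds - Time taken for queries
- loki_index_entries_per_chunk - Index density

Metric names differ between Loki versions, so confirm each one against your instance's /metrics endpoint before building dashboards or alerts on it.

You can chart these metrics on a Grafana dashboard and alert on upload failures with a Prometheus rule such as: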

# Example Prometheus rules for index monitoring
groups:
- name: loki-index-alerts
  rules:
  - alert: LokiIndexUploadFailures
    expr: rate(loki_boltdb_shipper_upload_failures_total[5m]) > 0
    for: 15m
    labels:
      severity: warning
    annotations:
      description: "Loki is failing to upload index files to storage"
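
Before relying on any metric name, it helps to list what your Loki instance actually exposes. The sketch below filters Prometheus text-format output by prefix; the metric_names helper and the sample metrics are hypothetical, and real input would be the body returned by Loki's /metrics endpoint.

```python
def metric_names(exposition_text, prefix):
    """Return the distinct metric names starting with `prefix`
    found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.startswith(prefix):
            names.add(name)
    return sorted(names)


# Hypothetical sample, not real Loki output.
sample = """\
# HELP example_shipper_uploads_total Hypothetical counter.
# TYPE example_shipper_uploads_total counter
example_shipper_uploads_total{status="success"} 42
example_shipper_uploads_total{status="failure"} 1
example_requests_total 7
"""

print(metric_names(sample, "example_shipper_"))  # ['example_shipper_uploads_total']
```

Running it against the real endpoint with the prefix loki_boltdb_shipper_ shows which of the names above exist in your version.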

Common Issues and Troubleshooting

Here are some common issues related to index storage and how to address them:

1. High Cardinality Problems

Symptom: Slow queries, high memory usage, index size growing rapidly

Solutions:

- Audit your label usage and eliminate unnecessary high-cardinality labels
- Avoid dynamic labels for high-cardinality values such as user or request IDs; keep those values in the log line and filter on them at query time
- Configure cardinality limits in Loki

2. Index Upload Failures

Symptom: Errors in logs like failed to upload index with key...

Solutions:

- Check storage permissions
- Verify network connectivity
- Ensure sufficient disk space for temporary files
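
The disk space check is easy to script. This sketch uses Python's standard shutil.disk_usage; the 10% threshold and the helper names are arbitrary choices for illustration, not Loki recommendations.

```python
import shutil


def free_fraction(path):
    """Fraction of the filesystem holding `path` that is still free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def has_headroom(path, min_free_fraction=0.10):
    """True when at least `min_free_fraction` of the disk is free."""
    return free_fraction(path) >= min_free_fraction


# Point this at the index directory on a real host, e.g. /loki/index.
print(f"{free_fraction('.'):.1%} free, headroom ok: {has_headroom('.')}")
```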

3. Slow Queries

Symptom: Queries take a long time to complete

Solutions:

- Increase cache sizes
- Add more querier instances
- Optimize your label queries to be more specific
- Enable compaction to optimize index files

Summary

Index storage is a vital component of Loki's architecture that enables efficient log retrieval. In this guide, we've covered:

- What index storage is and why it's important
- Available storage backend options (local, S3, GCS, Azure)
- How BoltDB Shipper works
- Configuration options and best practices
- Performance optimization techniques
- Practical deployment examples
- Monitoring and troubleshooting

By properly configuring and optimizing your index storage, you can ensure that Loki performs well even as your logging volume grows.

Additional Resources

Practice Exercises

- Set up a local Loki instance with filesystem-based index storage and monitor its performance.
- Configure Loki with S3 as the index storage backend and test uploading and retrieving logs.
- Analyze the impact of different cache TTL values on query performance by running benchmarks.
- Create a Grafana dashboard to monitor the key index storage metrics mentioned in this guide.
