Bytes Over Time Function
Introduction
The `bytes_over_time` function is a powerful tool within LogQL (Loki's query language) that allows you to analyze the volume of log data over specified time windows. This function helps you understand data ingestion patterns, identify abnormal spikes in log volume, and plan for storage capacity.
In this guide, we'll explore how `bytes_over_time` works, how to use it effectively, and practical applications in monitoring and troubleshooting scenarios.
What is `bytes_over_time`?
The `bytes_over_time` function measures the uncompressed size of log lines (in bytes) within a specified time range. It belongs to the family of range aggregation functions in LogQL, which operate over a selected time window for each log stream.
Basic Syntax
bytes_over_time({log selector} [time range])
Where:
- `{log selector}` is a LogQL stream selector that filters the logs you want to analyze
- `[time range]` is the duration over which to calculate the byte count (e.g., `[1h]`, `[24h]`)
How It Works
When you execute a `bytes_over_time` query, Loki:
- Filters log streams based on your log selector
- For each stream, calculates the total uncompressed size of log entries within the specified time window
- Returns a time series showing the byte count over time
This function is particularly useful for:
- Tracking log volume trends
- Identifying unexpected increases in logging activity
- Capacity planning for log storage
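As a minimal illustration (the `app` value here is just a placeholder), running the function without any aggregation returns one series per matching log stream:
bytes_over_time({app="frontend"}[5m])
Each resulting series carries the full label set of its stream; wrapping the query in `sum()`, as in the examples below, collapses them into a single total.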
Basic Usage Examples
Example 1: Total log volume by application
sum(bytes_over_time({app="frontend"}[1h]))
This query calculates the total uncompressed bytes of logs from the "frontend" application over each 1-hour window.
Example output (illustrative; Loki returns raw byte counts, shown here in human-readable form):
{app="frontend"} 2.34MB @1630000000
{app="frontend"} 2.12MB @1630003600
{app="frontend"} 3.45MB @1630007200
Example 2: Comparing log volume across services
bytes_over_time({environment="production", app=~"auth|api|database"}[5m])
This query returns the log volume for multiple services (auth, api, and database) in the production environment over 5-minute intervals.
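Because this returns one series per underlying stream, a useful refinement (same selector, just aggregated) collapses the result to one series per service:
sum by (app) (bytes_over_time({environment="production", app=~"auth|api|database"}[5m]))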
Common Use Cases
1. Detecting Log Volume Anomalies
Let's say you want to detect unusual spikes in logging activity that might indicate a problem:
bytes_over_time({app="payment-service"}[30m]) > 10*1024*1024
This alert would trigger if the payment service generates more than 10MB of logs in a 30-minute window, which might indicate an issue with the service.
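As a finer-grained sketch (assuming your streams carry a `pod` label, which is typical for Kubernetes setups but not guaranteed), you could alert per replica instead of per service:
sum by (pod) (bytes_over_time({app="payment-service"}[30m])) > 10 * 1024 * 1024
This fires separately for each pod that crosses the threshold, pointing you straight at the noisy replica.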
2. Tracking Data Volume by Log Level
To understand which log levels are contributing most to your storage costs:
sum by (level) (bytes_over_time({app="backend"} | pattern `<_> level=<level> <_>` [1h]))
This query breaks down log volume by severity level, helping you identify if excessive DEBUG or INFO logs are filling your storage.
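Building on that, a sketch for isolating a single level's contribution (this assumes the same hypothetical log format, where the pattern parser can extract a `level` field):
sum(bytes_over_time({app="backend"} | pattern `<_> level=<level> <_>` | level="debug" [1h]))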
3. Capacity Planning
For capacity planning purposes, you can analyze long-term trends:
sum(bytes_over_time({namespace="default"}[1d])) by (app)
This query shows daily log volume for each application in the default namespace, helping you plan storage requirements.
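For a rough monthly projection (a back-of-the-envelope sketch that assumes daily volume stays roughly constant), you can scale the result with LogQL's arithmetic operators:
sum by (app) (bytes_over_time({namespace="default"}[1d])) * 30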
Combining with Other Functions
The `bytes_over_time` function becomes even more powerful when combined with other LogQL functions.
Example: Rate of Log Volume
LogQL doesn't support PromQL-style subqueries, so you can't wrap a metric expression in `rate()`. Instead, LogQL provides the dedicated `bytes_rate` function:
sum(bytes_rate({app="web-server"}[5m]))
This query calculates the per-second rate of log bytes from the web server over each 5-minute window (effectively `bytes_over_time` divided by the window duration). Graphing it makes gradually increasing logging behavior easy to spot.
Example: Comparing to Historical Patterns
sum(bytes_over_time({app="database"}[1h]))
/
sum(bytes_over_time({app="database"}[1h] offset 7d))
This query compares current log volume to the same period one week ago, helping identify seasonal patterns or unexpected changes.
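The same ratio can double as an alert condition. As a sketch (the 2x threshold is an arbitrary example; tune it to your traffic):
sum(bytes_over_time({app="database"}[1h])) / sum(bytes_over_time({app="database"}[1h] offset 7d)) > 2
This returns a result only when the current hour's volume is more than double that of the same hour last week.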
Best Practices
When using `bytes_over_time`, keep these tips in mind:
- Choose appropriate time windows: windows that are too small may not reveal meaningful patterns, while windows that are too large can mask short-term spikes.
- Use labels effectively: aggregate and group by meaningful labels to get actionable insights.
- Be aware of cardinality: high-cardinality labels can explode the number of time series, affecting performance.
- Consider data retention policies: make sure your queries don't exceed your configured retention period.
- Compare with other metrics: for a complete picture, correlate log volume with other metrics like request count or error rates (see the sketch after this list).
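As an example of that last tip, here's a sketch that correlates byte volume with line count to estimate the average log line size per application (both functions are standard LogQL range aggregations; the `namespace` value is a placeholder):
sum by (app) (bytes_over_time({namespace="default"}[1h])) / sum by (app) (count_over_time({namespace="default"}[1h]))
A rising average line size with a flat line count suggests individual entries are growing (for example, larger payloads being logged), which is a different problem from a service simply logging more often.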
Troubleshooting with `bytes_over_time`
Scenario: Debugging Sudden Storage Increases
Imagine your Loki storage costs have unexpectedly increased. You can use `bytes_over_time` to investigate:
sum by (namespace, app) (bytes_over_time({namespace=~".+"}[1h]))
This query breaks down log volume by namespace and application, helping you identify which components are generating excessive logs.
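Once you've spotted the heavy hitters, a follow-up sketch compares each namespace against the same hour one week earlier to confirm which ones actually grew:
sum by (namespace) (bytes_over_time({namespace=~".+"}[1h])) / sum by (namespace) (bytes_over_time({namespace=~".+"}[1h] offset 7d))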
Scenario: Identifying Chatty Containers
To find containers that are generating more logs than expected:
topk(5, sum by (container) (bytes_over_time({pod=~".+"}[1h])))
This returns the top 5 containers by log volume, which can help you target log verbosity configuration.
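From there, a hypothetical drill-down (the container name is a placeholder, and a `pod` label is assumed) shows whether a chatty container's volume is spread across replicas or concentrated in a single pod:
sum by (pod) (bytes_over_time({container="nginx"}[1h]))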
Summary
The `bytes_over_time` function is an essential tool for understanding log data volume patterns in Grafana Loki. By measuring the uncompressed size of logs over specific time windows, it helps you:
- Track log volume trends
- Identify services generating excessive logs
- Plan for storage capacity
- Detect anomalies in logging behavior
Mastering this function allows you to better manage your logging infrastructure and gain insights into application behavior through log volume metrics.
Practice Exercises
1. Write a LogQL query to find which of your applications generated the most log data in the past 24 hours.
2. Create a Grafana dashboard panel that shows the top 3 namespaces by log volume over time.
3. Write an alert expression that triggers when any service increases its log volume by more than 200% compared to its average from the previous week.
4. Create a query that compares the log volume distribution across different environments (dev, staging, production).