Bytes Over Time Function
Introduction
The `bytes_over_time` function is a powerful tool within LogQL (Loki's query language) that allows you to analyze the volume of log data over specified time windows. This function helps you understand data ingestion patterns, identify abnormal spikes in log volume, and plan for storage capacity.
In this guide, we'll explore how `bytes_over_time` works, how to use it effectively, and practical applications in monitoring and troubleshooting scenarios.
What is `bytes_over_time`?
The `bytes_over_time` function measures the uncompressed size of log lines (in bytes) within a specified time range. It belongs to the family of range aggregation functions in LogQL, which operate over a selected time window for each log stream.
Basic Syntax
bytes_over_time({log selector} [time range])
Where:
- `{log selector}` is a LogQL stream selector that filters the logs you want to analyze
- `[time range]` is the duration over which to calculate the byte count (e.g., `[1h]`, `[24h]`)
How It Works
When you execute a `bytes_over_time` query, Loki:
- Filters log streams based on your log selector
- For each stream, calculates the total uncompressed size of log entries within the specified time window
- Returns a time series showing the byte count over time
This function is particularly useful for:
- Tracking log volume trends
- Identifying unexpected increases in logging activity
- Capacity planning for log storage
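As a minimal illustration (the `app` value here is just a placeholder), running the function without any aggregation returns one series per matching log stream:
bytes_over_time({app="frontend"}[5m])
Each resulting series carries the full label set of its stream; wrapping the query in `sum()`, as in the examples below, collapses them into a single total.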
Basic Usage Examples
Example 1: Total log volume by application
sum(bytes_over_time({app="frontend"}[1h]))
This query calculates the total uncompressed bytes of logs from the "frontend" application over each 1-hour window.
Example output (illustrative; Loki returns raw byte counts, shown here in human-readable form):
{app="frontend"} 2.34MB @1630000000
{app="frontend"} 2.12MB @1630003600
{app="frontend"} 3.45MB @1630007200
Example 2: Comparing log volume across services
bytes_over_time({environment="production", app=~"auth|api|database"}[5m])
This query returns the log volume for multiple services (auth, api, and database) in the production environment over 5-minute intervals.
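Because this returns one series per underlying stream, a useful refinement (same selector, just aggregated) collapses the result to one series per service:
sum by (app) (bytes_over_time({environment="production", app=~"auth|api|database"}[5m]))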
Common Use Cases
1. Detecting Log Volume Anomalies
Let's say you want to detect unusual spikes in logging activity that might indicate a problem:
bytes_over_time({app="payment-service"}[30m]) > 10*1024*1024
This alert would trigger if the payment service generates more than 10MB of logs in a 30-minute window, which might indicate an issue with the service.
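As a finer-grained sketch (assuming your streams carry a `pod` label, which is typical for Kubernetes setups but not guaranteed), you could alert per replica instead of per service:
sum by (pod) (bytes_over_time({app="payment-service"}[30m])) > 10 * 1024 * 1024
This fires separately for each pod that crosses the threshold, pointing you straight at the noisy replica.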
2. Tracking Data Volume by Log Level
To understand which log levels are contributing most to your storage costs:
sum by (level) (bytes_over_time({app="backend"} | pattern `<_> level=<level> <_>` [1h]))
This query breaks down log volume by severity level, helping you identify if excessive DEBUG or INFO logs are filling your storage.
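Building on that, a sketch for isolating a single level's contribution (this assumes the same hypothetical log format, where the pattern parser can extract a `level` field):
sum(bytes_over_time({app="backend"} | pattern `<_> level=<level> <_>` | level="debug" [1h]))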
3. Capacity Planning
For capacity planning purposes, you can analyze long-term trends:
sum(bytes_over_time({namespace="default"}[1d])) by (app)
This query shows daily log volume for each application in the default namespace, helping you plan storage requirements.
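For a rough monthly projection (a back-of-the-envelope sketch that assumes daily volume stays roughly constant), you can scale the result with LogQL's arithmetic operators:
sum by (app) (bytes_over_time({namespace="default"}[1d])) * 30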
Combining with Other Functions
The `bytes_over_time` function becomes even more powerful when combined with other LogQL functions.
Example: Rate of Log Volume
LogQL doesn't support PromQL-style subqueries, so you can't wrap a metric expression in `rate()`. Instead, LogQL provides the dedicated `bytes_rate` function:
sum(bytes_rate({app="web-server"}[5m]))
This query calculates the per-second rate of log bytes from the web server over each 5-minute window (effectively `bytes_over_time` divided by the window duration). Graphing it makes gradually increasing logging behavior easy to spot.
Example: Comparing to Historical Patterns
sum(bytes_over_time({app="database"}[1h]))
/
sum(bytes_over_time({app="database"}[1h] offset 7d))
This query compares current log volume to the same period one week ago, helping identify seasonal patterns or unexpected changes.
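The same ratio can double as an alert condition. As a sketch (the 2x threshold is an arbitrary example; tune it to your traffic):
sum(bytes_over_time({app="database"}[1h])) / sum(bytes_over_time({app="database"}[1h] offset 7d)) > 2
This returns a result only when the current hour's volume is more than double that of the same hour last week.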
Best Practices
When using `bytes_over_time`, keep these tips in mind:
- Choose appropriate time windows: windows that are too small may not reveal meaningful patterns, while windows that are too large can mask short-term spikes.
- Use labels effectively: aggregate and group by meaningful labels to get actionable insights.
- Be aware of cardinality: high-cardinality labels can explode the number of time series, affecting performance.
- Consider data retention policies: make sure your queries don't exceed your configured retention period.
- Compare with other metrics: for a complete picture, correlate log volume with other metrics like request count or error rates (see the sketch after this list).
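As an example of that last tip, here's a sketch that correlates byte volume with line count to estimate the average log line size per application (both functions are standard LogQL range aggregations; the `namespace` value is a placeholder):
sum by (app) (bytes_over_time({namespace="default"}[1h])) / sum by (app) (count_over_time({namespace="default"}[1h]))
A rising average line size with a flat line count suggests individual entries are growing (for example, larger payloads being logged), which is a different problem from a service simply logging more often.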
Troubleshooting with `bytes_over_time`
Scenario: Debugging Sudden Storage Increases
Imagine your Loki storage costs have unexpectedly increased. You can use `bytes_over_time` to investigate:
sum by (namespace, app) (bytes_over_time({namespace=~".+"}[1h]))
This query breaks down log volume by namespace and application, helping you identify which components are generating excessive logs.
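Once you've spotted the heavy hitters, a follow-up sketch compares each namespace against the same hour one week earlier to confirm which ones actually grew:
sum by (namespace) (bytes_over_time({namespace=~".+"}[1h])) / sum by (namespace) (bytes_over_time({namespace=~".+"}[1h] offset 7d))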
Scenario: Identifying Chatty Containers
To find containers that are generating more logs than expected:
topk(5, sum by (container) (bytes_over_time({pod=~".+"}[1h])))
This returns the top 5 containers by log volume, which can help you target log verbosity configuration.
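From there, a hypothetical drill-down (the container name is a placeholder, and a `pod` label is assumed) shows whether a chatty container's volume is spread across replicas or concentrated in a single pod:
sum by (pod) (bytes_over_time({container="nginx"}[1h]))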
Summary
The `bytes_over_time` function is an essential tool for understanding log data volume patterns in Grafana Loki. By measuring the uncompressed size of logs over specific time windows, it helps you:
- Track log volume trends
- Identify services generating excessive logs
- Plan for storage capacity
- Detect anomalies in logging behavior
Mastering this function allows you to better manage your logging infrastructure and gain insights into application behavior through log volume metrics.
Practice Exercises
1. Write a LogQL query to find which of your applications generated the most log data in the past 24 hours.
2. Create a Grafana dashboard panel that shows the top 3 namespaces by log volume over time.
3. Write an alert expression that triggers when any service increases its log volume by more than 200% compared to its average from the previous week.
4. Create a query that compares the log volume distribution across different environments (dev, staging, production).