Count Function
Introduction
Counting log entries is one of the most fundamental and frequently used metric operations in LogQL (Loki Query Language). The count_over_time() function counts the log entries that match a log stream selector and optional filter expressions within a time range, while the count() aggregation operator counts the series in a result. These simple yet powerful operations form the foundation for monitoring and alerting in Grafana Loki.
In this guide, we'll explore how counting works in LogQL, its syntax, and practical applications to help you make the most of your log data.
Basic Syntax
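The basic syntax for counting log entries in LogQL is:
count_over_time(<log stream selector> | <filter expressions> [<range>])
Where:
- <log stream selector> is a set of label matchers in curly braces, such as {app="frontend"}, that selects the log streams you want to analyze
- <filter expressions> (optional) are pipeline stages that filter the selected log lines
- [<range>] is the time range to count over, written as a duration in square brackets, such as [5m]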
How Count Works
Counting with count_over_time() operates by:
- Selecting log streams based on your stream selector
- Applying any filter expressions to narrow down the matching logs
- Counting the total number of log entries that match these criteria
- Returning the count as a time series
Unlike more complex aggregation functions, count_over_time() performs a straightforward calculation: it simply tallies the matching log entries in each time range.
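In Loki's LogQL, the function that tallies entries is count_over_time(), and it returns one series per unique label set rather than a single number. The sketch below shows three separate queries that differ only in how that result is aggregated. It assumes JSON-formatted logs with a level field, as in the examples in this guide.

```logql
# One series per unique label set: the error count over the last 5 minutes.
count_over_time({app="frontend"} | json | level="error" [5m])

# One series in total: the per-series counts added together.
sum(count_over_time({app="frontend"} | json | level="error" [5m]))

# The number of series with at least one matching entry, not the number of entries.
count(count_over_time({app="frontend"} | json | level="error" [5m]))
```

The last query is where the count() aggregation operator fits: it counts series in a result, so it answers "how many label sets produced errors" rather than "how many errors occurred".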
Basic Examples
Count All Logs from an Application
count_over_time({app="frontend"}[5m])
This query counts all log entries from the last 5 minutes in streams with the label app equal to "frontend".
Count Error Logs
count_over_time({app="frontend"} | json | level="error" [1m])
This query:
- Selects logs from the "frontend" application
- Parses them as JSON
- Filters for entries where the "level" field equals "error"
- Counts the matching entries
Range Vectors and Aggregation
You can combine counts with different range vectors and aggregation operators to create more advanced metrics.
Count with Range Vector
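count_over_time({app="frontend"} | json | level="error" [5m])
This returns the count of error logs over 5-minute windows. The range goes inside the function call, after the stream selector and any filter expressions.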
Count with Rate
rate({app="frontend"} | json | level="error" [5m])
This calculates the per-second rate of error entries over 5-minute windows. rate() takes the log query and range directly, so it is not wrapped around a count.
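In Loki, rate() over a log range is effectively the entry count for the window divided by the window length in seconds. The sketch below puts the two forms side by side as separate queries; the value 300 is simply 5 minutes expressed in seconds.

```logql
# Per-second rate of error entries over the last 5 minutes.
rate({app="frontend"} | json | level="error" [5m])

# Roughly the same value computed by hand: entries in the window divided by 300 seconds.
count_over_time({app="frontend"} | json | level="error" [5m]) / 300
```

Using rate() keeps values comparable when you later change the window, since the result is always expressed per second.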
Practical Applications
Monitoring Error Rates
One of the most common uses of counting is monitoring error rates in applications:
rate({app="payment-service"} | json | status>=500 [1m])
This query gives you the per-second rate of HTTP 5xx errors in your payment service, which could be used to trigger alerts.
Traffic Analysis
You can analyze traffic patterns using counts:
sum by (route) (rate({app="api-gateway"} | json | route=~"/api/.*" [5m]))
This groups API requests by route and gives you the request rate for each endpoint.
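When there are many routes, it can help to keep only the busiest ones. The following is a minimal sketch using the topk() aggregation operator; the limit of 5 is an arbitrary choice for illustration, and the route field is assumed to come from the JSON payload as in the query above.

```logql
# The five routes with the highest request rate over the last 5 minutes.
topk(5, sum by (route) (rate({app="api-gateway"} | json | route=~"/api/.*" [5m])))
```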
Detecting Unusual Activity
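sum(count_over_time({app="auth-service"} | json | action="failed_login" [5m])) > 10
This expression returns a result only when there are more than 10 failed login attempts in a 5-minute window, which makes it usable as an alert condition. The sum() adds the per-stream counts together so that the threshold applies to the total.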
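A total across all users can hide a single account being targeted. The sketch below applies the same threshold per user instead, using count_over_time() and sum by (...). It assumes the JSON payload carries a field named username, which is a hypothetical name chosen for illustration and not something the example logs are known to contain.

```logql
# More than 10 failed logins for the same user within 5 minutes.
# "username" is a hypothetical JSON field; substitute the field your logs actually use.
sum by (username) (count_over_time({app="auth-service"} | json | action="failed_login" [5m])) > 10
```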
Visualizing Count Data
Count metrics are commonly visualized in:
- Time series graphs showing count trends over time
- Bar charts comparing counts across different services or components
- Gauges displaying current count values against thresholds
Count vs. Other Metrics Functions
While count_over_time() simply tallies log entries, LogQL offers other metrics functions and operators for different needs:
Function | Purpose | Example |
---|---|---|
count_over_time() | Count matching log entries in a range | count_over_time({app="web"}[5m]) |
rate() | Calculate the per-second rate of entries | rate({app="web"}[5m]) |
sum() | Total values across streams | sum(count_over_time({app="web"}[5m])) |
avg() | Average values across streams | avg(count_over_time({app="web"}[5m])) |
count() | Count the series in a result | count(count_over_time({app="web"}[5m])) |
Best Practices
- Be Specific: Narrow your log stream selection to improve performance.
  count_over_time({app="payment", service="transactions"} | level="error" [5m])
- Use Appropriate Time Windows: Choose range vectors that match your monitoring needs.
  - Short windows (1m-5m) for real-time monitoring
  - Longer windows (1h-24h) for trend analysis
- Combine with Labels: Use labels to create more insightful metrics.
  sum by (status_code) (count_over_time({app="api"} | json | status_code=~"5.." [5m]))
- Consider Rate Instead of Raw Counts: For most alerting, rate() over a range such as [5m] is more useful than a raw count_over_time() value, because it is normalized per second and stays comparable when the window changes.
Common Issues and Troubleshooting
No Data Returned
If your count query returns no data:
- Verify your log stream selector matches existing streams
- Check that the time range in Grafana includes the period when logs were generated
- Ensure any JSON or regex parsing in your filter expressions works correctly
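One way to work through these checks is to drop the count and run the inner expression as a plain log query, adding one stage at a time. The sketch below shows two separate queries; it relies on the __error__ label that Loki attaches to lines a parser could not process, and assumes the frontend selector from the earlier examples.

```logql
# Step 1: run the selector alone as a log query to confirm that streams exist.
{app="frontend"}

# Step 2: add the parser and list only the lines it failed to parse.
{app="frontend"} | json | __error__ != ""
```

If the first query returns lines but the second returns most of them, the logs are probably not in the format the parser expects.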
Performance Concerns
Count operations on large log volumes can be resource-intensive. To optimize:
- Add more specific label filters to your log stream selector
- Use shorter time windows for high-volume logs
- Consider pre-filtering logs with line filters before counting
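A line filter is a plain text match on the raw log line, so placing it ahead of the parser lets Loki discard most lines before doing the more expensive JSON parsing. The following is a minimal sketch, assuming error entries contain the literal text "error" somewhere in the raw line.

```logql
# The |= line filter runs first; only lines that contain "error" are parsed as JSON.
count_over_time({app="frontend"} |= "error" | json | level="error" [5m])
```

The label filter after the parser is still needed, since the text "error" could appear in a line whose level is something else.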
Summary
Counting with count_over_time() is a cornerstone of metrics in LogQL, providing a simple way to quantify log entries matching specific criteria. By counting logs, you can:
- Monitor error rates and system health
- Track user activity and business metrics
- Set up alerts for unusual patterns
- Analyze trends over time
While simple in concept, counting becomes powerful when combined with other LogQL features like range vectors, aggregation operators such as sum() and count(), and label manipulation.
Exercises
- Write a LogQL query to count all error logs from a service named "user-service" over 10-minute windows.
- Create a query that compares the count of "login" vs "logout" events from an authentication service.
- Develop a query that counts HTTP requests by status code category (2xx, 3xx, 4xx, 5xx).
- Write an alert expression that triggers when the error rate exceeds 5% of total logs for any service.
Happy logging and counting!