Count Function

Introduction

Counting log entries is one of the most fundamental and frequently used metric operations in LogQL (Loki Query Language). The count_over_time() function counts the log entries that match a log stream selector and optional filter expressions within a time range, and the count() aggregation operator counts the series that such a query returns. Together they form the foundation for monitoring and alerting in Grafana Loki.

In this guide, we'll explore how counting works in LogQL, its syntax, and practical applications to help you make the most of your log data.

Basic Syntax

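The basic syntax for counting log entries in LogQL, and for counting the resulting series, is:

logql
count_over_time(<log stream selector> [| <filter expressions>] [<range>])
count [by (<labels>)] (<vector expression>)

Where:

<log stream selector> is a set of label matchers in curly braces, such as {app="frontend"}, that selects the log streams you want to analyze
<filter expressions> (optional) are line filters, parsers, and label filters that narrow the selected log entries
<range> is the time window to count over, such as 5m, written in literal square brackets
<vector expression> is any expression that returns an instant vector, typically a count_over_time() query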

How Count Works

A count query operates by:

Selecting log streams based on your stream selector
Applying any filter expressions to narrow down the matching logs
Counting the log entries that match these criteria within each range window
Returning the counts as time series, one per stream or unique label set

Unlike more complex aggregation functions, count_over_time() performs a straightforward calculation: it simply tallies matching log entries. Wrapping the result in count() then tells you how many series, and therefore how many distinct streams or label sets, produced entries.
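The difference between counting entries and counting series is easiest to see side by side. The following is a minimal sketch, assuming streams labeled app="frontend" exist; each commented query is meant to be run on its own.

```logql
# One series per matching stream: entries per stream over the last 5 minutes
count_over_time({app="frontend"}[5m])

# A single series: entries summed across all matching streams
sum(count_over_time({app="frontend"}[5m]))

# A single series: the number of streams that logged anything in the window
count(count_over_time({app="frontend"}[5m]))
```

If three frontend streams logged 10, 20, and 30 entries in the window, the first query returns three series, the second returns 60, and the third returns 3.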

Basic Examples

Count All Logs from an Application

logql
count_over_time({app="frontend"}[5m])

This query counts the log entries from the last 5 minutes in each stream whose app label equals "frontend".

Count Error Logs

logql
count_over_time({app="frontend"} | json | level="error" [5m])

This query:

Selects logs from the "frontend" application
Parses them as JSON
Filters for entries where the "level" field equals "error"
Counts the matching entries over 5-minute windows
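Not every application logs JSON. As a hedged sketch, the same count can be written for other formats, assuming either logfmt-style lines that carry a level key or plain-text lines that contain the word "error":

```logql
# logfmt lines such as: level=error msg="timeout"
count_over_time({app="frontend"} | logfmt | level="error" [5m])

# Unstructured lines: match the raw text instead of a parsed field
count_over_time({app="frontend"} |= "error" [5m])
```

The line-filter form is less precise, because it also matches lines that merely mention "error", but it needs no parser.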

Range Vectors and Aggregation

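You can combine count_over_time() with different ranges and aggregation operators to create more advanced metrics.

Count with Range Vector

logql
sum(count_over_time({app="frontend"} | json | level="error" [5m]))

The [5m] range selector sets the window, and sum() adds up the per-series results, so this returns the total count of error logs over 5-minute windows.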
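Aggregation operators can also group counts by a label instead of collapsing them into one value. A small sketch, assuming the JSON logs expose a level field:

```logql
# One series per log level, each holding the entry count for the last 5 minutes
sum by (level) (count_over_time({app="frontend"} | json [5m]))
```

This gives a quick breakdown of how many info, warn, and error entries the application produced in each window.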

Count with Rate

logql
rate({app="frontend"} | json | level="error" [5m])

This calculates the per-second rate of error entries, averaged over 5-minute windows. Note that rate() takes the log range directly rather than wrapping a count.
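For log ranges, rate() is the entry count divided by the length of the range in seconds. The two queries below should therefore return the same values for a 5-minute (300-second) range; this is a sketch to illustrate the relationship, not a recommendation to write rates by hand.

```logql
# Per-second rate of error entries over 5 minutes
rate({app="frontend"} | json | level="error" [5m])

# The same value computed from the raw count
count_over_time({app="frontend"} | json | level="error" [5m]) / 300
```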

Practical Applications

Monitoring Error Rates

One of the most common uses of counting log entries is monitoring error rates in applications:

logql
sum(rate({app="payment-service"} | json | status >= 500 [1m]))

This query gives you the per-second rate of HTTP 5xx responses logged by your payment service, assuming its JSON logs carry a numeric status field. You can use the result to trigger alerts.
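An absolute error rate is hard to judge without knowing the traffic volume, so it is often expressed as a share of all requests. A minimal sketch, assuming every log line of the payment service represents one request and carries a numeric status field:

```logql
# Fraction of requests that ended in a 5xx response over the last minute
sum(rate({app="payment-service"} | json | status >= 500 [1m]))
/
sum(rate({app="payment-service"}[1m]))
```

A result of 0.02 would mean that 2% of logged requests failed. When no 5xx entries exist in the window, the numerator is empty and the query returns no data rather than 0.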

Traffic Analysis

You can analyze traffic patterns using counts:

logql
sum by (route) (rate({app="api-gateway"} | json | route=~"/api/.*" [5m]))

This groups API requests by the route field extracted from the JSON logs and gives you the per-second request rate for each endpoint.
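When a gateway serves many routes, a per-route breakdown can become crowded. One option, sketched here under the same assumption that the JSON logs contain a route field, is to keep only the busiest endpoints with topk():

```logql
# The five API routes with the highest request rate over the last 5 minutes
topk(5, sum by (route) (rate({app="api-gateway"} | json | route=~"/api/.*" [5m])))
```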

Detecting Unusual Activity

logql
sum(count_over_time({app="auth-service"} | json | action="failed_login" [5m])) > 10

This expression returns a result only when there are more than 10 failed login attempts in a 5-minute window, so it can serve as an alert condition.
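A service-wide threshold cannot distinguish many users mistyping a password once from one account being attacked. As a hedged variation, assuming the JSON logs include a username field (a hypothetical name, not taken from the original example), the count can be grouped per user:

```logql
# Returns one series for each user with more than 10 failed logins in 5 minutes
sum by (username) (
  count_over_time({app="auth-service"} | json | action="failed_login" [5m])
) > 10
```

Grouping by a field with many distinct values produces many series, so this is best kept to short windows.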

Visualizing Count Data

Count metrics are commonly visualized in:

Time series graphs showing count trends over time
Bar charts comparing counts across different services or components
Gauges displaying current count values against thresholds

Count vs. Other Metrics Functions

While count_over_time() simply tallies log entries, LogQL offers other metrics functions and aggregation operators for different needs:

Function	Purpose	Example
`count_over_time()`	Count matching log entries per stream in a range	`count_over_time({app="web"}[5m])`
`rate()`	Calculate per-second entry rate	`rate({app="web"}[5m])`
`count()`	Count the series in a vector	`count(count_over_time({app="web"}[5m]))`
`sum()`	Total values across streams	`sum(count_over_time({app="web"}[5m]))`
`avg()`	Average values across streams	`avg(count_over_time({app="web"}[5m]))`

Best Practices

Be Specific: Narrow your log stream selection to improve performance.
```logql
count_over_time({app="payment", service="transactions"} | json | level="error" [5m])
```
Use Appropriate Time Windows: Choose ranges that match your monitoring needs.
- Short windows (1m-5m) for real-time monitoring
- Longer windows (1h-24h) for trend analysis
Combine with Labels: Use labels to create more insightful metrics.
```logql
sum by (status_code) (count_over_time({app="api"} | json | status_code=~"5.." [5m]))
```
Consider Rate Instead of Raw Counts: For most alerting, rate() is more useful than a raw count_over_time(), because a per-second rate stays comparable when you change the window length.

Common Issues and Troubleshooting

No Data Returned

If your count query returns no data:

Verify your log stream selector matches existing streams
Check that the time range in Grafana includes the period when logs were generated
Ensure any JSON or regex parsing in your filter expressions works correctly
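To check the parsing step, it can help to run a plain log query first. Loki attaches an __error__ label to lines that a parser could not process, so the sketch below, assuming a frontend application that is expected to log JSON, lists only the problem lines:

```logql
# Show lines that failed JSON parsing
{app="frontend"} | json | __error__ != ""

# Count only the lines that parsed cleanly
count_over_time({app="frontend"} | json | __error__="" | level="error" [5m])
```

If the first query returns many lines, the logs are probably not in the format the parser expects.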

Performance Concerns

Count queries over large log volumes can be resource-intensive, because Loki has to read every entry in the selected streams for the whole range. To optimize:

Add more specific label filters to your log stream selector
Use shorter time windows for high-volume logs
Consider pre-filtering logs with line filters before counting
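Line filters are cheap compared with parsers, so placing one before the parser lets Loki discard most lines before doing any JSON work. A minimal sketch, assuming error entries contain the text "error" somewhere in the raw line:

```logql
# The line filter runs first; only lines containing "error" are parsed as JSON
sum(count_over_time({app="frontend"} |= "error" | json | level="error" [5m]))
```

The label filter after the parser is still needed, since the line filter alone would also match non-error lines that happen to mention the word.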

Summary

Counting with count_over_time() and count() is a cornerstone of metrics in LogQL, providing a simple way to quantify log entries matching specific criteria. By counting logs, you can:

Monitor error rates and system health
Track user activity and business metrics
Set up alerts for unusual patterns
Analyze trends over time

While simple in concept, counting becomes powerful when combined with other LogQL features like range selectors, aggregation operators, and label manipulation.

Exercises

Write a LogQL query to count all error logs from a service named "user-service" over 10-minute windows.
Create a query that compares the count of "login" vs "logout" events from an authentication service.
Develop a query that counts HTTP requests by status code category (2xx, 3xx, 4xx, 5xx).
Write an alert expression that triggers when the error rate exceeds 5% of total logs for any service.

Happy logging and counting!
