
Count Function

Introduction

Counting log entries is one of the most fundamental and frequently used metric operations in LogQL (Loki Query Language). LogQL has no count() function that takes a log query directly: log entries are counted with the count_over_time() range aggregation, which counts the entries matching a log stream selector and optional pipeline within a time range. The count() aggregation operator is a separate tool that counts the series in a metric query's result. Together, these form the foundation for monitoring and alerting in Grafana Loki.

In this guide, we'll explore how counting works in LogQL, its syntax, and practical applications to help you make the most of your log data.

Basic Syntax

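The basic syntax for counting log entries in LogQL is:

logql
count_over_time(<log stream selector> <pipeline> [<range>])

Where:

  • <log stream selector> selects the log streams you want to analyze by label and is written in braces, for example {app="frontend"}
  • <pipeline> (optional) is a chain of line filters, parsers, and label filters, each introduced by a pipe, that narrows the selected log entries
  • <range> is the required time window, written in literal square brackets after the pipeline, for example [5m]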

How Count Works

Counting with count_over_time() operates by:

  1. Selecting log streams based on your stream selector
  2. Applying any pipeline stages to narrow down the matching logs
  3. Counting the log entries within the range that match these criteria, separately for each unique set of labels
  4. Returning the counts as time series

Unlike more complex aggregation functions, count_over_time() performs a straightforward calculation: it simply tallies matching log entries. The count() operator, by contrast, counts how many series a metric query returns, not how many log entries it matched.
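To make the difference between counting entries and counting series concrete, here is a small sketch of two separate queries. It assumes streams labeled app="frontend" and an arbitrary 5-minute range; both are illustrative choices.

```logql
# Query 1: one series per matching stream; each value is the number of
# entries that stream logged in the trailing 5 minutes.
count_over_time({app="frontend"}[5m])

# Query 2 (run separately): a single series whose value is the number of
# series returned by Query 1, not the number of log entries.
count(count_over_time({app="frontend"}[5m]))
```

To get the total number of entries across all matching streams, sum(count_over_time(...)) is usually the better fit than count().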

Basic Examples

Count All Logs from an Application

logql
count_over_time({app="frontend"}[5m])

This query counts the log entries from streams with the label app equal to "frontend" over the preceding 5 minutes at each evaluation step.

Count Error Logs

logql
count_over_time({app="frontend"} | json | level="error" [5m])

This query:

  1. Selects logs from the "frontend" application
  2. Parses them as JSON
  3. Filters for entries where the "level" field equals "error"
  4. Counts the matching entries in each 5-minute window

Range Vectors and Aggregation

You can combine count_over_time() with different range selectors, related functions such as rate(), and aggregation operators to create more advanced metrics.

Count with Range Vector

logql
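count_over_time({app="frontend"} | json | level="error" [5m])

This returns the count of error logs over 5-minute windows. The range selector goes inside the parentheses, after the pipeline; at each evaluation step, the query counts the entries from the preceding 5 minutes.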

Count with Rate

logql
rate({app="frontend"} | json | level="error" [5m])

This calculates the rate of errors per second over 5-minute windows. rate() takes the log query directly rather than wrapping a count.
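As a sketch of how a per-second rate relates to a raw count, the two queries below should return the same values when run separately over the same data. The app="frontend" label and the 5-minute range are illustrative; 300 is simply the number of seconds in 5 minutes.

```logql
# Per-second error rate over the trailing 5 minutes.
rate({app="frontend"} | json | level="error" [5m])

# Run separately: the same value computed by hand, as the 5-minute count
# divided by the 300 seconds in the range.
count_over_time({app="frontend"} | json | level="error" [5m]) / 300
```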

Practical Applications

Monitoring Error Rates

One of the most common uses of counting is monitoring error rates in applications:

logql
rate({app="payment-service"} | json | status >= 500 [1m])

This query gives you the rate of HTTP 5xx errors per second in your payment service, which could be used to trigger alerts.

Traffic Analysis

You can analyze traffic patterns using counts:

logql
sum by (route) (rate({app="api-gateway"} | json | route=~"/api/.*" [5m]))

This groups API requests by route and gives you the request rate for each endpoint.
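When there are many routes, a per-route breakdown can be hard to read. One option, sketched here with the same assumed app label and route field, is to wrap the aggregation in topk() so that only the busiest endpoints are returned. The value 5 is an arbitrary choice.

```logql
# The five routes with the highest request rate over the trailing 5 minutes.
topk(5, sum by (route) (rate({app="api-gateway"} | json | route=~"/api/.*" [5m])))
```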

Detecting Unusual Activity

logql
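sum(count_over_time({app="auth-service"} | json | action="failed_login" [5m])) > 10

This creates an alert condition that triggers when there are more than 10 failed login attempts across all matching streams in a 5-minute window. sum() adds the per-stream counts together before the comparison.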
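A total across all users can hide a single account under attack. A possible variant, assuming each JSON entry carries a username field (a hypothetical name, not something this guide's logs are known to contain), applies the threshold per user instead:

```logql
# Assumes a "username" field in the JSON payload (hypothetical).
# Returns one series per user with more than 10 failed logins in 5 minutes.
sum by (username) (
  count_over_time({app="auth-service"} | json | action="failed_login" [5m])
) > 10
```

Grouping by a field with many distinct values produces many series, so this is best kept to selective queries and short ranges.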

Visualizing Count Data

Count metrics are commonly visualized in:

  • Time series graphs showing count trends over time
  • Bar charts comparing counts across different services or components
  • Gauges displaying current count values against thresholds

In Grafana, a panel of this kind uses a Loki data source with a metric query, such as a count_over_time() or rate() expression, as its query.

Count vs. Other Metrics Functions

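While count_over_time() simply tallies log entries, LogQL offers other metrics functions and operators for different needs:

| Function | Purpose | Example |
| --- | --- | --- |
| count_over_time() | Count matching log entries in a range | count_over_time({app="web"}[5m]) |
| rate() | Calculate per-second rate of entries | rate({app="web"}[5m]) |
| count() | Count the series in a result | count(count_over_time({app="web"}[5m])) |
| sum() | Total values across streams | sum(count_over_time({app="web"}[5m])) |
| avg() | Average values across streams | avg(count_over_time({app="web"}[5m])) |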

Best Practices

  1. Be Specific: Narrow your log stream selection to improve performance.

    logql
    count_over_time({app="payment", service="transactions"} | json | level="error" [5m])
  2. Use Appropriate Time Windows: Choose range vectors that match your monitoring needs.

    • Short windows (1m-5m) for real-time monitoring
    • Longer windows (1h-24h) for trend analysis
  3. Combine with Labels: Use labels to create more insightful metrics.

    logql
    sum by (status_code) (count_over_time({app="api"} | json | status_code=~"5.." [5m]))
  4. Consider Rate Instead of Raw Counts: For most alerting, rate() over a log query is more useful than a raw count_over_time(), because its per-second values stay comparable when you change the range.

Common Issues and Troubleshooting

No Data Returned

If your count_over_time() query returns no data (when nothing matches, Loki returns an empty result rather than a count of 0):

  • Verify your log stream selector matches existing streams
  • Check that the time range in Grafana includes the period when logs were generated
  • Ensure any JSON or regex parsing in your filter expressions works correctly
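One way to work through these checks is to build the query up in stages. The sketch below assumes an app="frontend" label and JSON-formatted lines; each query is run separately.

```logql
# Step 1 (log query): confirm the selector alone returns entries.
{app="frontend"}

# Step 2 (run separately): count entries the JSON parser could not parse.
# Loki sets the __error__ label on entries where a parser stage fails.
count_over_time({app="frontend"} | json | __error__ != "" [5m])
```

If the first query is empty, the problem is in the selector or the time range. If the second returns a non-zero count, some lines are not valid JSON and will not carry the extracted fields your label filters expect.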

Performance Concerns

Count operations on large log volumes can be resource-intensive. To optimize:

  • Add more specific label filters to your log stream selector
  • Use shorter time windows for high-volume logs
  • Consider pre-filtering logs with line filters before counting
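As an illustration of pre-filtering, a line filter placed before the parser lets Loki discard non-matching lines before doing the more expensive JSON parsing. This sketch assumes that every entry of interest contains the literal text "error" somewhere in the raw line, which may not hold for your log format.

```logql
# The line filter |= "error" runs first and discards lines that cannot
# match, so the JSON parser only processes the remaining lines.
count_over_time({app="frontend"} |= "error" | json | level="error" [5m])
```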

Summary

Counting with count_over_time() is a cornerstone of metrics in LogQL, providing a simple way to quantify log entries matching specific criteria. By counting logs, you can:

  • Monitor error rates and system health
  • Track user activity and business metrics
  • Set up alerts for unusual patterns
  • Analyze trends over time

While simple in concept, counting becomes powerful when combined with other LogQL features like range selectors, rate(), aggregation operators such as sum() and count(), and label manipulation.

Exercises

  1. Write a LogQL query to count all error logs from a service named "user-service" over 10-minute windows.

  2. Create a query that compares the count of "login" vs "logout" events from an authentication service.

  3. Develop a query that counts HTTP requests by status code category (2xx, 3xx, 4xx, 5xx).

  4. Write an alert expression that triggers when the error rate exceeds 5% of total logs for any service.

Happy logging and counting!


