Filter Expressions in LogQL
Introduction
Filter expressions are a fundamental component of LogQL (Loki Query Language) that allow you to narrow down and retrieve specific log entries from your log data. When working with Grafana Loki, efficient filtering is essential for managing the potentially vast amounts of log data and finding exactly what you need.
In this guide, we'll explore how filter expressions work in LogQL, how to create them, and how to leverage them to build powerful log queries.
What are Filter Expressions?
Filter expressions in LogQL are conditions that determine which log entries should be included in your query results. They act as criteria that each log entry must satisfy to be returned in the query response.
A basic LogQL query follows this pattern:
{<log stream selector>} [<filter expression>]
Where:
- {<log stream selector>} identifies the streams to query
- [<filter expression>] (optional) filters the logs from these streams
Basic Filter Expressions
String Matching
The simplest filter expressions match log lines containing specific text:
{app="frontend"} |= "error"
This query returns all log lines from the frontend application that contain the string "error".
LogQL supports several string matching operators:
Operator | Description | Example |
---|---|---|
`\|=` | Log line contains string | `{app="frontend"} \|= "error"` |
`!=` | Log line does not contain string | `{app="frontend"} != "debug"` |
`\|~` | Log line matches regular expression | `{app="frontend"} \|~ "error.*timeout"` |
`!~` | Log line does not match regular expression | `{app="frontend"} !~ "2[0-9]{2}"` |
Let's see examples of each:
# Match log lines containing "error" in the "frontend" service
{app="frontend"} |= "error"
# Match log lines that don't contain "successful" in the "payment" service
{app="payment"} != "successful"
# Match log lines containing errors with status codes (using regex)
{app="api"} |~ "error.*status: [45][0-9]{2}"
# Exclude all 2xx status codes from results
{app="api"} !~ "status: 2[0-9]{2}"
Case Sensitivity
By default, string matching in LogQL is case-sensitive. To perform case-insensitive matching, use the (?i) flag with the regex operators:
{app="frontend"} |~ "(?i)error"
This query matches "error", "Error", "ERROR", etc.
Chaining Filter Expressions
You can combine multiple filter expressions to create more specific queries:
{app="frontend"} |= "error" != "timeout" |~ "user_id: [0-9]+"
This query:
- Selects logs from the frontend application
- Includes only lines containing "error"
- Excludes lines containing "timeout"
- Further filters to include only lines with a user ID (matches the regex pattern)
Filters are applied in sequence, with each filter operating on the results of the previous one, creating a pipeline of filtering operations.
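Matching Alternatives with OR
Line filters cannot be grouped with parentheses, and a chain of line filters is always an implicit AND. To match one of several alternatives, use regex alternation inside a single |~ filter:
{app="frontend"} |= "error" |~ "authentication|authorization"
Newer Loki releases also accept the or keyword between the strings of a single line filter (check the documentation for your version):
{app="frontend"} |= "error" |= "authentication" or "authorization"
Both queries find logs containing "error" AND either "authentication" OR "authorization". Parentheses are available in label filter expressions (the comparisons that typically follow a parser such as json), not in line filters.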
Advanced Filter Expressions
Parsing and Filtering JSON
Many applications output logs in JSON format. LogQL can parse and filter based on JSON fields:
{app="api"} | json | status_code >= 400
This query:
- Selects logs from the api application
- Parses the log lines as JSON
- Filters for entries where status_code is greater than or equal to 400
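To make the JSON case more concrete, here is a small sketch. The sample log line and the method and duration_ms fields are assumptions made up for illustration; only status_code comes from the example above.

```logql
# Hypothetical log line emitted by the api service:
#   {"method":"GET","path":"/orders","status_code":503,"duration_ms":812}

# Keep lines that mention status_code, parse them as JSON, then compare
# the extracted fields. Conditions can be joined with "and" / "or".
{app="api"} |= "status_code" | json | status_code >= 500 and method = "GET"
```

After the json stage, every top-level JSON key is available as a label, so numeric fields such as status_code can be compared with >, >=, <, and <=, while string fields use = and !=.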
Regular Expression Named Captures
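You can extract parts of a log line into labels with the regexp parser and named capture groups, then filter on the extracted labels. The |~ line filter only matches lines; it does not extract anything, so the extraction must use regexp:
{app="frontend"} | regexp "user_id=(?P<user_id>[0-9]+)" | user_id = "12345"
This query:
- Selects logs from the frontend application
- Uses the regexp parser to extract the user_id label
- Filters for logs where user_id equals "12345"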
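When lines are already in key=value form, the logfmt parser can produce the same user_id label without a hand-written regex. The sketch below assumes a hypothetical log line; the msg and items fields are invented for illustration.

```logql
# Hypothetical log line in key=value form:
#   level=info msg="cart updated" user_id=12345 items=3

# The line filter cheaply discards most lines; it would also let
# user_id=123456 through, so the label filter makes the match exact.
{app="frontend"} |= "user_id=12345" | logfmt | user_id = "12345"
```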
Real-World Examples
Troubleshooting HTTP Errors
When investigating HTTP errors in a web application:
{app="web", env="production"} |= "status code" |~ "status code: [45][0-9]{2}"
| regexp "status code: (?P<status>[0-9]+)"
| status >= 500
This query finds all production HTTP 5xx server errors by:
- Selecting logs from the production web application
- Finding lines containing "status code"
- Further filtering for status codes starting with 4 or 5
- Extracting the actual status code
- Filtering for codes 500 and above (server errors)
Analyzing Authentication Failures
To identify authentication issues:
sum by (user_agent) (
  count_over_time(
    {app="auth-service"} |= "login" |= "failed" != "rate limit"
      | json
      | user_agent =~ ".*Mobile.*"
    [1h]
  )
) > 10
This query:
- Selects logs from the authentication service
- Finds failed logins (excluding rate limiting events)
- Parses the JSON log
- Filters for mobile user agents (regex label matchers are fully anchored, hence the leading .*)
- Wraps the log query in count_over_time to count matching lines over a 1-hour window, summed per user agent
- Keeps only the user agents with more than 10 failures
Monitoring Application Performance
To analyze slow database operations:
{app="database"} |= "query" |= "execution time"
| regexp "execution time: (?P<exec_time>[0-9.]+)ms"
| exec_time > 100
This query:
- Selects database logs
- Finds query execution time entries
- Extracts the execution time in milliseconds
- Filters for slow queries (>100ms)
Log queries return lines in timestamp order. LogQL has no sort_by stage, so ranking results by execution time requires a metric query (for example, topk over an unwrapped value).
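To rank by the extracted execution time, one option is a metric query that unwraps exec_time and takes the maximum per group. This is a sketch, not a drop-in query: it assumes each database stream carries an instance label, which is hypothetical and should be replaced with whatever label identifies the thing you want to rank.

```logql
# Assumption: streams have an `instance` label (hypothetical).
# Shows the 5 instances with the slowest single query in the last hour.
topk(5,
  max_over_time(
    {app="database"} |= "execution time"
      | regexp "execution time: (?P<exec_time>[0-9.]+)ms"
      | unwrap exec_time
    [1h]
  ) by (instance)
)
```

The unwrap stage turns the extracted label into the sample value, so the result is a set of numeric series rather than log lines.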
Filter Optimization Tips
Optimizing your filter expressions is crucial for performance:
- Start with strict log stream selectors: The more specific your log stream selection ({app="frontend"}), the fewer logs Loki needs to process.
- Apply the most selective filters first: Put filters that exclude the most logs at the beginning of your query.
- Use line filters before extractions: Apply the |=, !=, |~, and !~ operators before using the json, logfmt, or regexp parsers.
- Use appropriate regex patterns: Prefer plain string filters (|=, !=) where they suffice, and keep regular expressions simple and specific so they match efficiently.
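The ordering advice can be illustrated with two versions of the same search. The env and level names are assumptions for illustration, and the second query only returns the same lines if the raw log text contains the literal string "error" whenever level is "error".

```logql
# Less efficient: every line from the stream is parsed as JSON first.
{app="api"} | json | level = "error"

# Usually cheaper: narrow the streams, drop non-matching lines with a
# line filter, and only then parse what is left.
{app="api", env="production"} |= "error" | json | level = "error"
```

These are two separate queries shown side by side, not one query to run as a unit.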
Summary
Filter expressions are powerful tools in LogQL that enable you to precisely target the log data you need. By mastering string matching, regex patterns, and extraction techniques, you can build sophisticated queries to troubleshoot issues, monitor applications, and gain insights from your logs.
Remember these key points:
- Start with log stream selectors to narrow down the data source
- Use string matching operators (|=, !=, |~, !~) for basic filtering
- Chain multiple filters to refine your results
- Extract structured data using parsers like json, logfmt, or regexp
- Apply filters on extracted fields for precise targeting
Additional Resources and Exercises
Exercises
- Write a filter expression to find all logs containing "error" but not "timeout" from a service called "payment-processor".
- Create a query that extracts and filters on the status code from logs matching this format: INFO Request completed with status=200 in 45ms.
- Write a query to find the top 10 slowest API calls in the last hour, assuming logs contain the pattern processing time: 123ms.
Further Learning
- Explore the Grafana Loki documentation for the latest LogQL features
- Practice creating dashboards that utilize your LogQL queries
- Learn about metric queries in LogQL to extract numerical data from logs