# Pattern Parsing in LogQL

## Introduction
Pattern parsing is a powerful feature in LogQL that allows you to extract structured data from unstructured log messages. When working with logs, you'll often encounter semi-structured text that contains valuable information embedded within strings. Pattern parsing provides a way to identify, extract, and work with this data without having to pre-process your logs.
In this guide, we'll explore how LogQL's pattern parsing capabilities work, various parsing methods available, and how to use them effectively to query and analyze your log data.
## Understanding Pattern Parsing
Pattern parsing transforms unstructured log lines into structured data by extracting fields based on patterns. This enables you to:
- Filter logs based on the content of extracted fields
- Create metrics from extracted values
- Group and aggregate logs using extracted labels
- Perform operations on extracted numeric values
LogQL supports multiple parsing methods to extract data from logs:
- Regular expressions with named capture groups
- JSON parsing
- Logfmt parsing
- Pattern parsing with custom formats
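Each of these is applied as a stage in the query pipeline. As a quick orientation, here is one minimal sketch per method (the `{app="myapp"}` stream and the field names are hypothetical):

```logql
{app="myapp"} | regexp `user=(?P<user>\w+)`   # named capture group
{app="myapp"} | json                          # JSON fields become labels
{app="myapp"} | logfmt                        # key=value pairs become labels
{app="myapp"} | pattern `<ip> - <status> <_>` # positional extraction
```

Each form is covered in detail below.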
## Basic Pattern Extraction

The most common way to extract patterns in LogQL is by using the `|` pipe operator followed by a parser expression.

### Using Regular Expressions
Regular expressions provide a flexible way to match patterns in your logs:
```logql
{app="myapp"} |= "ERROR" | regexp `error: (?P<error_message>.*)`
```
In this example:

- We first filter the stream `app="myapp"` to lines that contain the string "ERROR"
- We then extract the error message using a regular expression with a named capture group, `error_message`
The captured value becomes available as a label that you can use in your queries:
```logql
{app="myapp"} |= "ERROR"
| regexp `error: (?P<error_message>.*)`
| label_format severity="error"
```
### JSON Parsing

For logs in JSON format, LogQL provides a dedicated JSON parser:

```logql
{app="payment-service"} | json
```
This extracts all top-level fields from JSON log lines. You can also extract specific fields:
```logql
{app="payment-service"}
| json transaction_id="transaction.id", amount="transaction.amount", currency="transaction.currency"
```
The extracted values are added as labels to your log entries.
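For instance, given a hypothetical log line like the one below, the query above would add the labels `transaction_id="tx-789"`, `amount="49.99"`, and `currency="USD"`:

```
{"transaction": {"id": "tx-789", "amount": 49.99, "currency": "USD"}}
```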
## Advanced Pattern Parsing

### Nested Extraction
You can chain multiple extraction operations to parse complex logs:
```logql
{app="myapp"}
| json
| line_format "{{.message}}"
| logfmt
```

This first extracts the JSON fields, rewrites the log line to just the extracted `message` field, then parses that line with the logfmt parser. Parsers always operate on the current log line, so `line_format` is the bridge that lets you re-parse an extracted field.
Custom Pattern Formats
LogQL supports custom pattern formats using the pattern
parser:
```logql
{app="auth-service"}
| pattern `<time> <_> <_> <level> <_> [<trace_id>] <message>`
```
This extracts structured fields based on their position in the log line. The `<_>` placeholder marks fields to skip.
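As an illustration, a hypothetical line like the following would match the pattern above, yielding `level="INFO"`, `trace_id="trace-abc123"`, and `message="Authentication succeeded"`:

```
2023-06-15T12:34:56Z host1 app INFO worker [trace-abc123] Authentication succeeded
```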
### Unpack Parsing

The `unpack` parser handles log lines produced by Promtail's `pack` stage, which wraps the original line together with extra labels in a JSON object:

```logql
{app="api-gateway"} | unpack
```

This promotes the embedded keys to labels and restores the original log line from the `_entry` field.
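For instance, given a hypothetical packed entry like the following, `unpack` would add a `user="alice"` label and replace the log line with the `_entry` value:

```
{"_entry": "GET /api/v1/orders 200", "user": "alice"}
```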
## Practical Examples

### Example 1: Parsing Application Logs
Let's say we have logs in this format:
```
2023-06-15T12:34:56Z INFO [request-123] User login successful: user_id=456 source_ip=203.0.113.42
```
We can extract structured data with:
```logql
{app="auth-service"} |= "User login"
| regexp `(?P<timestamp>\S+) (?P<level>\S+) \[(?P<request_id>[^\]]+)\] (?P<message>User login .*): user_id=(?P<user_id>\d+) source_ip=(?P<source_ip>\S+)`
```
This extracts:

- `timestamp`
- `level`
- `request_id`
- `message`
- `user_id`
- `source_ip`
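Once extracted, these fields can drive metric queries. A sketch that reuses the `source_ip` capture to rank the most active login sources over the last hour:

```logql
topk(5,
  sum by (source_ip) (
    count_over_time(
      {app="auth-service"} |= "User login"
      | regexp `source_ip=(?P<source_ip>\S+)`
      [1h]
    )
  )
)
```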
### Example 2: Parsing Error Logs and Creating Metrics
We can parse error logs and create metrics from them:
```logql
sum by (error_type) (
  count_over_time(
    {app="payment-service"} |= "ERROR"
    | json error_type="error.type", error_code="error.code"
    [5m]
  )
)
```
This creates a count of errors grouped by error type over a 5-minute window.
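The same pipeline yields a per-second error rate if you swap `count_over_time` for `rate`:

```logql
sum by (error_type) (
  rate(
    {app="payment-service"} |= "ERROR"
    | json error_type="error.type"
    [5m]
  )
)
```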
### Example 3: Combining Multiple Parser Types
For complex logs, you may need to combine parser types:
```logql
{app="orders"}
| json
| line_format "{{.message}}"
| logfmt
```

This extracts the JSON fields, rewrites the log line to just the `message` field, and parses that with logfmt. A numeric field extracted this way, such as a request duration, can then be converted and aggregated with `unwrap` in a metric query.
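Building on this, assuming the logfmt step produces a `duration` label with values like `150ms`, a metric query can unwrap and aggregate it; for example, the 99th-percentile duration:

```logql
quantile_over_time(0.99,
  {app="orders"}
  | json
  | line_format "{{.message}}"
  | logfmt
  | unwrap duration(duration)
  [5m]
) by (app)
```

Here the `duration()` conversion function turns the human-readable value into seconds before aggregation.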
## Performance Considerations
Pattern parsing operations can be computationally expensive, especially on large volumes of logs. To optimize performance:
- Filter logs as much as possible before applying parsers
- Use the most specific parser for your log format
- Extract only the fields you need
- Consider parsing logs at ingestion time (for example, with Promtail pipeline stages) rather than at query time
## Advanced Techniques

### Using Extracted Fields in Filters
Once you've extracted fields using pattern parsing, you can filter on them:
```logql
{app="web-server"}
| json method="req.method", path="req.path", status="resp.status"
| status=~"5.." and method="POST"
```
This finds all 500-level errors for POST requests.
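Label filters are not limited to string matchers: values that parse as numbers, durations, or byte sizes can be compared directly. A sketch, assuming the service also logs a `latency` field with values like `300ms`:

```logql
{app="web-server"}
| json status="resp.status", latency="resp.latency"
| status >= 500 and latency > 250ms
```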
### Creating Dynamic Labels
You can transform extracted fields into new labels:
```logql
{app="myapp"}
| json
| label_format api_path=`{{ regexReplaceAll "^/api/v1" .path "" }}`
```

This strips a leading "/api/v1" from the extracted `path` value and stores the result in a new `api_path` label.
### Formatting Output

The `line_format` directive lets you create custom log lines from extracted fields:
```logql
{app="payment-service"}
| json
| line_format "{{.timestamp}} [{{.transaction_id}}] Amount: {{.amount}} {{.currency}}"
```
## Troubleshooting Pattern Parsing
If your pattern parsing isn't working as expected:
- Test with smaller datasets: limit your query to a small time range
- Debug with `line_format`: use `line_format "{{.}}"` to see all extracted fields
- Check your regex: validate your regular expressions with a testing tool
- Examine raw logs: compare your patterns against raw log samples
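Parse failures are also surfaced in the results themselves: when a parser cannot process a line, LogQL attaches an `__error__` label rather than dropping the entry. Filtering on it is a quick way to find the lines your pattern fails to match:

```logql
{app="myapp"} | json | __error__ != ""
```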
## Summary
Pattern parsing in LogQL provides powerful capabilities for extracting structured data from logs. With the various parsing methods available (regex, JSON, logfmt, and custom patterns), you can transform unstructured logs into structured data for analysis and visualization.
By mastering pattern parsing, you'll be able to:
- Extract valuable information from logs
- Create meaningful metrics from log data
- Filter and aggregate logs based on their content
- Build insightful dashboards
## Additional Resources
- Practice extracting fields from different log formats
- Experiment with different regex patterns
- Try combining multiple parsers in a single query
- Build a dashboard using extracted fields
## Exercises
- Write a LogQL query to extract HTTP status codes, methods, and response times from web server logs
- Create a query that finds the 10 slowest API requests using extracted duration fields
- Parse logs containing JSON within string fields
- Build a dashboard panel showing error rates by component using pattern parsing