Data Source-Managed Alerts
Introduction
Data source-managed alerts (sometimes called "native alerts") are a powerful feature within Grafana's alerting ecosystem that allows certain data sources to handle alert evaluation directly. Unlike Grafana-managed alerts where the Grafana server evaluates alert conditions, data source-managed alerts offload this responsibility to the data source itself, which can improve performance and enable specialized alerting capabilities.
In this tutorial, you'll learn:
- What data source-managed alerts are and how they differ from Grafana-managed alerts
- Which data sources support managed alerts
- How to create and configure data source-managed alerts
- Best practices for implementing these alerts in your monitoring environment
Understanding Data Source-Managed Alerts
What Are Data Source-Managed Alerts?
Data source-managed alerts are alert rules that are created and executed within the data source itself rather than being evaluated by Grafana. This means the data source:
- Stores the alert rule definitions
- Evaluates the alert conditions
- Determines the alert state
- Generates notifications when alerts fire
Grafana simply provides the interface to create, view, and manage these alerts, while the actual processing happens within the data source.
Supported Data Sources
As of this writing, the primary data sources that support managed alerts are:
- Prometheus - For alerting on metrics data
- Loki - For alerting on log data
- Mimir - For scaled Prometheus alerting
- Alertmanager - For routing, grouping, and silencing the notifications these alerts generate (it manages notifications rather than evaluating rules)
Each data source implements alerting in its own way, with features and limitations specific to its architecture.
Comparing with Grafana-Managed Alerts
Feature | Data Source-Managed Alerts | Grafana-Managed Alerts |
---|---|---|
Evaluation location | Within the data source | Within Grafana |
Performance impact on Grafana | Low | Higher |
Behavior during Grafana outages | Alerts continue to function | Alert evaluation stops |
Configuration complexity | Varies by data source | Unified across all data sources |
Multi-data source alerts | Not supported | Supported |
Alerting expressions | Data source specific | Grafana expressions |
Creating Data Source-Managed Alerts
Let's walk through the process of creating a data source-managed alert using Prometheus as an example.
Prerequisites
- A functioning Grafana instance (version 8.0+)
- A configured Prometheus data source
- Appropriate permissions to create and manage alerts
Step-by-Step Guide
1. Navigate to Alerting
Go to the Grafana menu and select "Alerting" to open the alerting UI.
2. Create a New Alert Rule
Click on "New alert rule" to start creating a new alert. You'll see different options for creating Grafana-managed alerts or data source-managed alerts.
3. Select Prometheus Rule Type
Select "Prometheus" under the "Data source rule" section. This indicates you want to create a data source-managed rule that will be stored and evaluated in Prometheus.
4. Define Your Alert Rule
```yaml
alert: HighCPUUsage
expr: avg by(instance) (cpu_usage_percent) > 80
for: 5m
labels:
  severity: critical
  category: system
annotations:
  summary: "High CPU usage on {{ $labels.instance }}"
  description: "CPU usage has been above 80% for more than 5 minutes on {{ $labels.instance }}"
```
In this example, we're defining an alert that triggers when CPU usage exceeds 80% for 5 minutes.
5. Configure Rule Settings
Fill in the following fields:
- Rule name: A descriptive name for your alert rule
- Namespace: The Prometheus namespace where this rule will be stored
- Group: The rule group for organizing related rules
- For: How long the condition must be true before firing (e.g., 5m)
6. Save the Rule
Click "Save" to create your data source-managed alert. Prometheus will now store and evaluate this rule.
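Once saved, the rule lives inside the data source as part of a rule group under the namespace you chose. A minimal sketch of how the stored group might look (the group name `cpu-alerts` is illustrative):

```yaml
# Illustrative: the saved rule as it appears inside the data source's rule group
groups:
  - name: cpu-alerts            # the rule group selected in step 5
    rules:
      - alert: HighCPUUsage
        expr: avg by(instance) (cpu_usage_percent) > 80
        for: 5m
        labels:
          severity: critical
          category: system
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
```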
Working with Prometheus-Managed Alerts
Alert Rule Syntax
Prometheus alert rules consist of:
- `alert`: The name of the alert
- `expr`: The PromQL expression that defines the alert condition
- `for`: Duration the condition must be true before firing
- `labels`: Key-value pairs for categorizing alerts (used for routing)
- `annotations`: Additional information about the alert (displayed in notifications)
Example PromQL Alert Expressions
Example 1: Service Availability
```yaml
alert: ServiceDown
expr: up{job="api-service"} == 0
for: 2m
labels:
  severity: critical
annotations:
  summary: "Service {{ $labels.job }} is down"
  description: "The service {{ $labels.job }} has been down for more than 2 minutes."
```
Example 2: Error Rate
```yaml
alert: HighErrorRate
expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
for: 10m
labels:
  severity: warning
annotations:
  summary: High error rate detected
  description: Error rate is above 5% for the last 10 minutes
```
Alert States
Data source-managed alerts can have the following states:
- Normal (called inactive in Prometheus): The alert condition is not currently met
- Pending: The alert condition is true but hasn't been true for the required duration
- Firing: The alert condition is true and has been true for the required duration
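In Prometheus you can check which rules are currently pending or firing, because active alerts are exported as the synthetic `ALERTS` series. A quick query sketch, using the rule defined earlier:

```promql
# All active (pending or firing) instances of the rule
ALERTS{alertname="HighCPUUsage"}

# Only instances that have satisfied the "for" duration and are firing
ALERTS{alertname="HighCPUUsage", alertstate="firing"}
```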
Working with Loki-Managed Alerts
Loki, Grafana's log aggregation system, also supports data source-managed alerts. These are particularly useful for alerting on log patterns or log volumes.
Example Loki Alert
```yaml
alert: TooManyErrorLogs
expr: sum(count_over_time({app="api"} |= "ERROR" [5m])) > 100
for: 5m
labels:
  severity: warning
annotations:
  summary: High number of error logs
  description: More than 100 error logs detected in the last 5 minutes
```
This alert fires when more than 100 error logs are detected in a 5-minute window.
Data Flow for Data Source-Managed Alerts
In outline, Grafana writes the rule to the data source through its ruler API, the data source evaluates the expression on its own schedule, and firing alerts are forwarded to an Alertmanager, which groups and routes the notifications.
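A simplified, schematic sketch of that flow:

```text
Grafana UI              Data source (Prometheus / Loki / Mimir)        Alertmanager
    |                                    |                                  |
    |-- create / edit rule ------------->|                                  |
    |                                    |-- evaluate expr each interval    |
    |                                    |-- pending -> firing              |
    |                                    |-- send firing alerts ----------->|
    |<-- read rule state ----------------|                                  |
    |                                    |                                  |-- group, route,
    |                                    |                                  |   notify
```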
Best Practices
When to Use Data Source-Managed Alerts
- When you need alerts to continue functioning even if Grafana is down
- For high-volume metrics where offloading evaluation improves performance
- When you want to leverage data source-specific alerting capabilities
When to Use Grafana-Managed Alerts
- When you need to alert on data from multiple data sources
- When you need advanced Grafana expressions
- For simpler setups where you prefer a unified alerting interface
Performance Considerations
- Data source-managed alerts reduce load on Grafana servers
- However, they can increase load on your data sources
- Consider resource allocation for your data sources if implementing many alert rules
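One concrete lever for controlling that load is the evaluation interval of each rule group: expensive rules can be grouped together and evaluated less frequently. A minimal sketch (the group name, alert name, and interval are illustrative):

```yaml
groups:
  - name: expensive-aggregations
    interval: 5m            # evaluate this group every 5 minutes instead of the global default
    rules:
      - alert: HighRequestVolume
        expr: sum by(service) (rate(http_requests_total[10m])) > 1000
        for: 15m
```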
Organizational Tips
- Use consistent naming conventions for alert rules
- Group related alerts together
- Use labels effectively to categorize and route alerts (a routing sketch follows this list)
- Document alert thresholds and rationale
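To make label-based routing concrete, here is a minimal, illustrative Alertmanager routing snippet that pages on `severity: critical` and sends warnings to chat; the receiver names are assumptions for the example, not part of this tutorial's setup:

```yaml
route:
  receiver: default-notifications        # fallback receiver (illustrative name)
  routes:
    - matchers:
        - severity = "critical"
      receiver: oncall-pager             # illustrative receiver name
    - matchers:
        - severity = "warning"
      receiver: team-chat                # illustrative receiver name
```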
Troubleshooting
Common Issues
Alert Rules Not Saving
Ensure that Grafana can actually write rules to your data source. Creating and editing data source-managed rules from Grafana requires the data source to expose a ruler API (as Mimir, Loki, and Cortex do); a plain Prometheus server loads rules from files listed under its `rule_files` configuration, and those rules are read-only in the Grafana UI.
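For a plain Prometheus server, the rules have to be loaded from files on disk. A minimal `prometheus.yml` fragment, with an illustrative path:

```yaml
# prometheus.yml (fragment) -- the path is illustrative
rule_files:
  - /etc/prometheus/rules/*.yml    # Prometheus loads every rule-group file matching this glob
```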
Alerts Not Firing
- Verify the alert expression directly in the data source query interface
- Check for syntax errors in your alert expression (a quick `promtool` check is sketched after this list)
- Ensure the alert condition is actually being met
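If your rules live in files for a plain Prometheus server, `promtool` can validate them before Prometheus loads them; the file path here is illustrative:

```bash
# Validate rule syntax and report any errors
promtool check rules /etc/prometheus/rules/api-alerts.yml
```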
Alert Notifications Not Being Received
- Check that your data source's Alertmanager is correctly configured (see the checks after this list)
- Verify notification channel settings
- Check for any network issues between components
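Two quick Alertmanager-side checks, assuming `amtool` is installed and that the config path and URL shown are illustrative:

```bash
# Validate the Alertmanager configuration file
amtool check-config /etc/alertmanager/alertmanager.yml

# List the alerts a running Alertmanager currently knows about
amtool alert query --alertmanager.url=http://localhost:9093
```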
Real-World Examples
Scenario 1: Infrastructure Monitoring
A common use case is monitoring server infrastructure. Here's how you might set up data source-managed alerts:
- CPU Usage Alert (the idle-time counter must be rated and converted to a busy percentage before comparing):

  ```yaml
  alert: HighCPUUsage
  expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
  for: 5m
  ```
- Memory Alert:

  ```yaml
  alert: HighMemoryUsage
  expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90
  for: 5m
  ```
- Disk Space Alert:

  ```yaml
  alert: LowDiskSpace
  expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 < 10
  for: 10m
  ```
Scenario 2: Application Monitoring
For monitoring a web application:
- High Latency Alert:

  ```yaml
  alert: HighLatency
  expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service)) > 2
  for: 5m
  ```
- Error Rate Alert:

  ```yaml
  alert: ErrorSpike
  expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
  for: 5m
  ```
Summary
Data source-managed alerts are a powerful feature in Grafana's alerting ecosystem that allows alert rules to be stored and evaluated directly by supported data sources like Prometheus and Loki. This approach offers performance benefits and continued alerting functionality even during Grafana outages.
Key points to remember:
- Data source-managed alerts are evaluated by the data source, not by Grafana
- They're ideal for high-volume metrics and critical alerts that must function even if Grafana is down
- Each data source has its own alert syntax and capabilities
- You can create these alerts through the Grafana UI but they're stored in the data source
Further Learning
To deepen your understanding of data source-managed alerts, consider:
- Exploring the specific alerting documentation for your data sources
- Studying PromQL or LogQL to write more effective alert expressions
- Learning about alert routing and silencing in Alertmanager
- Practicing with different alert scenarios to understand evaluation behavior
Exercises
- Create a Prometheus-managed alert that fires when a service is down for more than 1 minute.
- Create a Loki-managed alert that detects when a specific error pattern appears in your logs.
- Set up an alert that monitors the rate of increase of a specific metric and alerts on sudden changes.
- Configure different notification channels for different severity levels of alerts.