Notification Policies
Introduction
Notification policies are a crucial component of Grafana's alerting system. They define how, when, and where alert notifications are sent when an alert rule fires. Think of notification policies as the traffic controllers of your alerting system - they determine the routing rules for your alerts, ensuring that the right people or systems are notified about specific alerts at the right time.
In this guide, we'll explore how notification policies work in Grafana, how to configure them effectively, and how they fit into the overall alerting workflow.
Understanding Notification Policies
What is a Notification Policy?
A notification policy is a set of rules that determines:
- Which contact points receive notifications for specific alerts
- When those notifications are sent (including grouping, timing, and muting)
- How alerts are grouped together in notifications
Notification policies work together with alert rules and contact points to form the complete alerting workflow: alert rules fire and attach labels to alert instances, notification policies match on those labels to decide routing, grouping, and timing, and contact points deliver the resulting notifications.
The Notification Policy Tree
Grafana organizes notification policies in a hierarchical tree structure. At the top is the root policy, which serves as the default policy. You can then create nested policies with more specific matching criteria.
When an alert fires, Grafana starts at the root policy and works down the tree, checking each nested policy's matchers against the alert's labels. The alert is handled by the deepest (most specific) matching policy; if no nested policy matches, it falls back to the root policy. By default, evaluation stops at the first matching branch unless a policy is configured to continue matching its sibling policies.
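As a quick illustration (the contact point and label names here are made up), consider how two alerts would be routed through a small tree:
# Root (default) policy
- contact_point: general-alerts
  # Nested policy for the database team
  - matchers: [team = 'database']
    contact_point: database-team
# An alert labeled team='database' matches the nested policy and is sent to database-team;
# an alert labeled team='web' matches no nested policy and falls back to the root policy (general-alerts).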
Configuring Notification Policies
To access notification policies in Grafana:
- Navigate to the Alerting section in the Grafana sidebar
- Select "Notification policies"
Creating a Root Notification Policy
The root policy is created automatically and serves as the fallback for all alerts. You can configure it with these settings:
- Default contact point: The default destination for all alerts
- Group by: How alerts are grouped in notifications
- Timing options: When and how frequently notifications are sent
Example root policy configuration:
default_contact_point: "email-team"
group_by: ['alertname', 'grafana_folder']
group_wait: 30s
group_interval: 5m
repeat_interval: 4h
Creating Nested Notification Policies
To create more specific routing rules, you can add nested policies:
- In the Notification policies page, click "Add nested policy"
- Define the matching criteria (labels that alerts must match)
- Configure the policy settings
- Add the policy to the tree
Here's an example of a nested policy structure (nested policies inherit settings such as the contact point, grouping, and timing options from their parent unless they explicitly override them):
# Root policy
- contact_point: general-alerts
  group_by: ['alertname', 'grafana_folder']
  # Nested policy for database alerts
  - matchers: [severity = 'critical', service = 'database']
    contact_point: database-team
    group_by: ['instance']
    # Further nested policy for specific DB cluster
    - matchers: [cluster = 'production-main']
      contact_point: database-oncall
      mute_timings: ['business-hours']
      group_wait: 10s
Important Configuration Options
Matchers
Matchers determine which alerts a policy applies to. They work on the labels attached to your alerts:
matchers:
  - severity = 'critical'
  - service =~ 'database.*'    # Regex matching
  - environment != 'testing'
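  - region !~ 'eu-.*'          # Negative regex matching ('region' is just an example label)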
Group By
The "group by" option controls how alerts are grouped in notifications:
# Group alerts by name and environment
group_by: ['alertname', 'environment']
# Group all matching alerts together
group_by: []
# Preserve all labels, essentially no grouping
group_by: ['...']
Timing Options
These settings control notification timing:
- Group wait: How long to wait before sending the first notification for a new group of alerts
- Group interval: Minimum time to wait before sending a notification about changes to a group that has already been notified
- Repeat interval: How long to wait before re-sending a notification when the group has not changed
# Example timing settings
group_wait: 30s # Wait 30s after first alert
group_interval: 5m # Send updates every 5 minutes
repeat_interval: 4h # Repeat notifications every 4 hours
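To see how these timers interact, here is an illustrative timeline for a single alert group using the settings above:
# t=0        first alert in a new group fires
# t=30s      first notification is sent (after group_wait)
# t=2m       a second alert joins the same group
# t=5m30s    an updated notification is sent (group_interval after the last send)
# every ~4h  if the alerts stay firing and nothing changes, the notification is re-sent (repeat_interval)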
Mute Timings
You can configure periods when notifications are suppressed:
mute_timings: ['weekends', 'maintenance-window']
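The mute timings referenced here (such as 'weekends' or 'business-hours') are defined separately in Grafana's Alerting UI. As a rough sketch, a 'business-hours' interval in the Alertmanager-style time interval format could look like the following; treat the exact field names as illustrative and check the documentation for your Grafana version:
name: business-hours
time_intervals:
  - weekdays: ['monday:friday']
    times:
      - start_time: '09:00'
        end_time: '17:00'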
Practical Examples
Example 1: Different Teams for Different Services
In this example, we'll route alerts to different teams based on the service that's affected:
# Root policy
- contact_point: general-ops
  group_by: ['alertname', 'severity']
  # Database team
  - matchers: [service = 'database']
    contact_point: database-team
    group_by: ['instance']
  # Frontend team
  - matchers: [service = 'frontend']
    contact_point: frontend-team
    group_by: ['cluster']
  # Infrastructure team - critical alerts
  - matchers: [category = 'infrastructure', severity = 'critical']
    contact_point: infra-oncall
    group_wait: 0s    # No delay for critical infra alerts
Example 2: Different Notification Policies for Working Hours
This example shows how to route alerts differently during working hours vs. after hours:
# Root policy
- contact_point: email-general
  group_by: ['alertname']
  # During business hours - use Slack
  - matchers: [severity = 'warning']
    contact_point: slack-support
    mute_timings: ['after-hours']
  # After hours - page on-call for critical only
  - matchers: [severity = 'critical']
    contact_point: pagerduty-oncall
    mute_timings: ['business-hours']
Working with Notification Policies via API
You can manage notification policies programmatically using Grafana's API:
# Get all notification policies
curl -X GET -H "Authorization: Bearer $GRAFANA_API_KEY" \
https://your-grafana-instance/api/v1/provisioning/policies
# Update notification policies
curl -X PUT -H "Authorization: Bearer $GRAFANA_API_KEY" \
-H "Content-Type: application/json" \
-d @policies.json \
https://your-grafana-instance/api/v1/provisioning/policies
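The PUT endpoint replaces the entire policy tree, so policies.json contains the full tree in the provisioning API's own format, which uses receiver, routes, and object_matchers rather than the shorthand used earlier in this guide. A minimal sketch (verify the field names against the API reference for your Grafana version) might look like:
{
  "receiver": "email-team",
  "group_by": ["alertname", "grafana_folder"],
  "routes": [
    {
      "receiver": "database-team",
      "object_matchers": [["service", "=", "database"]],
      "group_by": ["instance"]
    }
  ]
}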
Best Practices
Policy Organization
- Start simple: Begin with a basic policy structure and add complexity as needed
- Use consistent labels: Create a standardized set of labels for your alerts
- Think hierarchically: Structure policies from most specific to most general
Performance Considerations
- Avoid over-splitting groups: Grouping by too many labels creates many small groups, which can lead to notification storms
- Set reasonable timing: Configure appropriate intervals to avoid notification fatigue
- Use mute timings: Suppress non-critical alerts during maintenance or off-hours
Labels and Matching
- Use descriptive labels: Make labels like severity, environment, and service consistent
- Leverage regex matching: Use regex matchers (=~) for flexible matching
- Combine matchers: Use multiple matchers to create precise routing rules
Troubleshooting
Common Issues
- Alerts not being routed: Check your label matchers and ensure alerts have the expected labels
- Too many notifications: Review your grouping and timing settings
- Missing notifications: Verify that contact points are configured correctly
Debugging Tips
Use Grafana's built-in tools to troubleshoot notification policies:
- Check the Alert instances view to see which labels your alerts have
- Review the state history to see how alerts were processed
- Test your notification policies with the alert testing feature
# Example test alert with labels to match policies
labels:
  severity: critical
  service: database
  instance: db-prod-01
Summary
Notification policies are a powerful feature in Grafana Alerting that give you fine-grained control over how, when, and where alert notifications are sent. They allow you to:
- Route different alerts to different teams or channels
- Control the timing and frequency of notifications
- Group related alerts together
- Silence notifications during specific periods
By effectively configuring notification policies, you can ensure the right people are notified about the right issues at the right time, reducing alert fatigue and improving your team's ability to respond to problems quickly.
Additional Resources
- Grafana Official Documentation on Notification Policies
- Alert Grouping Best Practices
- Contact Point Configuration Guide
Exercises
- Create a notification policy structure for a hypothetical application with frontend, backend, and database components
- Configure different notification timing settings for warnings vs. critical alerts
- Set up a policy that only notifies during business hours for non-critical alerts
- Create a policy that routes alerts to different teams based on which part of your infrastructure is affected
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)