Alertmanager Integrations
Introduction
Alertmanager is a critical component in the Prometheus ecosystem that handles alerts sent by client applications such as the Prometheus server. While Prometheus is responsible for generating alerts based on metric conditions, Alertmanager takes care of deduplicating, grouping, and routing them to the right receiver integration. This separation of concerns allows Prometheus to focus on monitoring and detection while Alertmanager handles notification delivery.
In this guide, we'll explore various Alertmanager integrations that enable you to receive notifications through different channels and services. By the end, you'll understand how to configure Alertmanager to send alerts to your team through email, Slack, PagerDuty, and other popular notification systems.
Understanding Alertmanager's Role
Before diving into integrations, let's clarify Alertmanager's position in the alerting workflow: Prometheus evaluates alerting rules against your metrics and pushes any firing alerts to Alertmanager, which then decides who gets notified, how, and when (sketched in the configuration example after the list below).
Alertmanager adds significant value by:
- Deduplicating similar alerts to reduce notification noise
- Grouping related alerts to provide context
- Routing alerts to appropriate receivers based on labels
- Silencing alerts during maintenance windows
- Inhibiting lower-priority alerts while a related higher-priority alert is firing
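To make that position concrete, here is a minimal sketch of the Prometheus side of the workflow: an alerting rule that fires when a scrape target disappears, plus the alerting block that tells Prometheus where to push firing alerts. The file names and the InstanceDown rule are illustrative, not part of an existing setup:

# prometheus.yml (illustrative) - Prometheus evaluates rules and pushes firing alerts to Alertmanager
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']

rule_files:
  - 'alert-rules.yml'

# alert-rules.yml (illustrative) - the rule that produces the alerts Alertmanager routes
groups:
  - name: example
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Instance {{ $labels.instance }} has been down for more than 5 minutes.'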
Now, let's look at how to set up various integrations.
Core Concepts of Alertmanager Configuration
Alertmanager's configuration is defined in a YAML file, typically named alertmanager.yml. This file consists of several main sections:
- global: Default parameters like SMTP settings and the Slack API URL
- route: How alerts are routed to receivers
- receivers: Configuration for notification integrations
- inhibit_rules: Rules for suppressing notifications
- templates: Custom notification templates
Here's a basic structure example:
global:
  # Global settings here

route:
  receiver: 'default-receiver'
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    # Specific routing rules here

receivers:
  - name: 'default-receiver'
    # Integration configurations here

inhibit_rules:
  # Inhibition rules here
Email Integration
Email is one of the most common notification methods. Here's how to configure the email receiver:
Configuration Example
global:
  smtp_smarthost: 'smtp.example.org:587'
  smtp_from: '[email protected]'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: '[email protected]'
        send_resolved: true
        html: '{{ template "email.default.html" . }}'
        headers:
          subject: '{{ template "email.default.subject" . }}'
How It Works
- Alertmanager uses the SMTP settings defined in the global section
- When an alert is routed to the email receiver, it formats the email using the specified templates
- The email is sent to all addresses in the to field
- If send_resolved: true is set, you'll also receive a notification when an alert is resolved
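These SMTP options don't have to be global: most of them can also be set per email_config, which is useful when one receiver needs a different relay or credentials. A minimal sketch, with a hypothetical internal relay and placeholder address:

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: '[email protected]'                        # placeholder address
        smarthost: 'smtp-relay.internal.example:587'  # hypothetical relay, overrides global smtp_smarthost
        auth_username: 'alertmanager'
        auth_password: 'password'
        send_resolved: true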
Slack Integration
Slack is a popular team communication tool, and Alertmanager offers built-in support for Slack notifications.
Configuration Example
global:
  slack_api_url: 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#alerts'
        send_resolved: true
        icon_url: 'https://avatars3.githubusercontent.com/u/3380462'
        title: '{{ template "slack.default.title" . }}'
        text: '{{ template "slack.default.text" . }}'
        actions:
          - type: button
            text: 'Runbook 📗'
            url: '{{ (index .Alerts 0).Annotations.runbook }}'
Creating a Slack Webhook
To integrate with Slack, you need to:
- Create a Slack App in your workspace
- Enable Incoming Webhooks
- Create a webhook for a specific channel
- Use the webhook URL in your Alertmanager configuration
The process looks like:
- Go to https://api.slack.com/apps
- Click "Create New App" and select "From scratch"
- Name your app and select your workspace
- Navigate to "Incoming Webhooks" and enable them
- Click "Add New Webhook to Workspace"
- Select the channel where alerts should be posted
- Copy the webhook URL for your Alertmanager configuration
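If different receivers post to different Slack apps or workspaces, the webhook URL can also be set per slack_config with api_url, or read from a file with api_url_file so the secret stays out of the main config. A sketch with placeholder URLs and paths:

receivers:
  - name: 'frontend-slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/T00000000/B11111111/XXXXXXXXXXXXXXXXXXXXXXXX'  # placeholder
        channel: '#frontend-alerts'
  - name: 'sre-slack'
    slack_configs:
      - api_url_file: '/etc/alertmanager/secrets/sre-slack-webhook'  # placeholder path
        channel: '#sre-alerts'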
Output Example
When properly configured, you'll receive Slack messages that look like:
[FIRING:2] InstanceDown
Instance 10.0.0.1:9100 has been down for more than 5 minutes.
Instance 10.0.0.2:9100 has been down for more than 5 minutes.
PagerDuty Integration
PagerDuty is ideal for critical alerts that require immediate attention, especially for on-call rotations.
Configuration Example
receivers:
  - name: 'pagerduty-notifications'
    pagerduty_configs:
      - service_key: '1234567890abcdef1234567890abcdef'
        send_resolved: true
        description: '{{ template "pagerduty.default.description" . }}'
        details:
          firing: '{{ template "pagerduty.default.instances" .Alerts.Firing }}'
          resolved: '{{ template "pagerduty.default.instances" .Alerts.Resolved }}'
Setting Up PagerDuty
To integrate with PagerDuty:
- Create a service in PagerDuty
- Get the integration key (service key)
- Use this key in your Alertmanager configuration
The steps are:
- Log in to PagerDuty
- Navigate to "Services" and create a new service
- Select "Prometheus" as the integration type
- Copy the Integration Key for your Alertmanager configuration
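Note that service_key corresponds to a PagerDuty integration of type "Prometheus" (Events API v1). If you create an Events API v2 integration instead, use routing_key with that integration's key, for example:

receivers:
  - name: 'pagerduty-notifications'
    pagerduty_configs:
      - routing_key: 'abcdef0123456789abcdef0123456789'  # placeholder Events API v2 integration key
        send_resolved: true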
OpsGenie Integration
OpsGenie is another popular incident management platform that works well with Alertmanager.
Configuration Example
receivers:
  - name: 'opsgenie-notifications'
    opsgenie_configs:
      - api_key: '1234567890abcdef1234567890abcdef'
        message: '{{ template "opsgenie.default.message" . }}'
        description: '{{ template "opsgenie.default.description" . }}'
        source: 'Prometheus Alertmanager'
        responders:
          - name: 'Team-DB'
            type: 'team'
          - name: 'SRE'
            type: 'team'
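Two other useful opsgenie_configs fields are tags and priority; both accept templates, so you can, for example, map a severity label from your own alert rules (an assumption about your labels) onto OpsGenie's P1-P5 scale:

receivers:
  - name: 'opsgenie-notifications'
    opsgenie_configs:
      - api_key: '1234567890abcdef1234567890abcdef'
        tags: 'prometheus,{{ .CommonLabels.service }}'
        priority: '{{ if eq .CommonLabels.severity "critical" }}P1{{ else }}P3{{ end }}'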
Webhook Integration
Webhooks are extremely flexible as they allow you to integrate with any system that can receive HTTP requests.
Configuration Example
receivers:
  - name: 'webhook-notifications'
    webhook_configs:
      - url: 'http://example.com/webhook'
        send_resolved: true
Example Custom Webhook Implementation
Here's a simple example of a Python Flask application that can receive alerts from Alertmanager:
from flask import Flask, request, jsonify
import logging

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

@app.route('/webhook', methods=['POST'])
def webhook():
    data = request.json

    # Log the full notification payload sent by Alertmanager
    logging.info(f"Received alert: {data}")

    # Process each alert in the notification group
    for alert in data['alerts']:
        status = alert['status']
        labels = alert['labels']
        annotations = alert['annotations']

        logging.info(f"Status: {status}")
        logging.info(f"Labels: {labels}")
        logging.info(f"Annotations: {annotations}")

        # Here you could take additional actions based on the alert,
        # for example creating tickets in your ticketing system

    return jsonify({"status": "success"}), 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
Microsoft Teams Integration
Microsoft Teams doesn't have built-in support in Alertmanager, but you can use the webhook integration with a proxy service.
Using prometheus-msteams
One approach is to use the prometheus-msteams proxy:
- Deploy the prometheus-msteams proxy
- Configure it with your Teams webhook URL
- Configure Alertmanager to send to this proxy
receivers:
  - name: 'teams-notifications'
    webhook_configs:
      - url: 'http://prometheus-msteams:2000/alertmanager'
        send_resolved: true
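On the proxy side, prometheus-msteams needs to be told which Teams incoming-webhook URL to forward to; with its connectors-style configuration, each connector name becomes a request path that Alertmanager can target. The sketch below assumes that format and uses a placeholder URL, so verify the exact keys against the version you deploy:

# config.yml for prometheus-msteams (format assumed - check the project's documentation)
connectors:
  - alertmanager: 'https://outlook.office.com/webhook/xxxxxxxx/IncomingWebhook/yyyyyyyy/zzzzzzzz'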
Routing Configuration
The power of Alertmanager comes from its ability to route different alerts to different receivers based on labels.
Advanced Routing Example
route:
  receiver: 'default-email'
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-notifications'
      continue: true
    - match_re:
        service: ^(database|cache)$
      receiver: 'database-team'
    - match:
        team: frontend
      receiver: 'frontend-team'

receivers:
  - name: 'default-email'
    email_configs:
      - to: '[email protected]'
  - name: 'pagerduty-notifications'
    pagerduty_configs:
      - service_key: '1234567890abcdef1234567890abcdef'
  - name: 'database-team'
    slack_configs:
      - channel: '#db-alerts'
  - name: 'frontend-team'
    slack_configs:
      - channel: '#frontend-alerts'
How Routing Works
- All alerts first go through the root route
- Alerts are then matched against each child route
- When a match is found, the alert is sent to that route's receiver
- If continue: true is set, routing continues to evaluate the remaining sibling routes
- This allows critical alerts to go to both PagerDuty and team-specific channels
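Alertmanager 0.22 and later also support a matchers syntax that supersedes match and match_re (regular expressions are fully anchored, just as with match_re). The first two child routes above could equivalently be written as:

routes:
  - matchers:
      - severity = "critical"
    receiver: 'pagerduty-notifications'
    continue: true
  - matchers:
      - service =~ "database|cache"
    receiver: 'database-team'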
Creating Custom Notification Templates
You can customize alert notifications by creating your own templates:
- Create a template file (e.g., custom-templates.tmpl)
- Reference it in the Alertmanager configuration
- Use the custom templates in your receiver configuration
Example Template File (custom-templates.tmpl)
{{ define "slack.custom.title" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }}
{{ end }}
{{ define "slack.custom.text" }}
{{ range .Alerts }}
*Alert:* {{ .Annotations.summary }}
*Description:* {{ .Annotations.description }}
*Severity:* {{ .Labels.severity }}
*Started:* {{ .StartsAt | since }}
{{ if .Annotations.runbook }}*Runbook:* {{ .Annotations.runbook }}{{ end }}
{{ end }}
{{ end }}
Using Custom Templates
# templates is a top-level key (a sibling of global, route, and receivers)
templates:
  - '/etc/alertmanager/custom-templates.tmpl'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#alerts'
        title: '{{ template "slack.custom.title" . }}'
        text: '{{ template "slack.custom.text" . }}'
Practical Example: Complete Setup
Let's put it all together with a real-world example that covers multiple integrations:
global:
  smtp_smarthost: 'smtp.gmail.com:587'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'password'
  slack_api_url: 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX'
  resolve_timeout: 5m

templates:
  - '/etc/alertmanager/templates/*.tmpl'

route:
  receiver: 'team-emails'
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: 'sre-team'
      continue: true
    - match:
        service: database
      receiver: 'database-team'
    - match:
        service: frontend
      receiver: 'frontend-team'

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'cluster', 'service']

receivers:
  - name: 'team-emails'
    email_configs:
      - to: '[email protected]'
        send_resolved: true
  - name: 'sre-team'
    pagerduty_configs:
      - service_key: '1234567890abcdef1234567890abcdef'
        send_resolved: true
    slack_configs:
      - channel: '#sre-alerts'
        send_resolved: true
        title: '{{ template "slack.default.title" . }}'
        text: '{{ template "slack.default.text" . }}'
  - name: 'database-team'
    slack_configs:
      - channel: '#db-alerts'
        send_resolved: true
  - name: 'frontend-team'
    slack_configs:
      - channel: '#frontend-alerts'
        send_resolved: true
Testing Alertmanager Integrations
Before deploying to production, it's essential to test your integrations:
Using the Alertmanager API
You can test by sending alerts directly to Alertmanager's alerts endpoint. The endpoint expects a JSON array of alerts (labels, optional annotations, and timestamps); grouping and receiver selection are then handled by your routing configuration:
curl -H "Content-Type: application/json" -d '[
  {
    "labels": {
      "alertname": "TestAlert",
      "service": "test-service",
      "severity": "critical",
      "instance": "test:9090"
    },
    "annotations": {
      "summary": "Test alert from the Alertmanager API",
      "description": "This is a test alert sent via the Alertmanager API"
    },
    "startsAt": "2023-04-15T12:00:00Z"
  }
]' http://localhost:9093/api/v2/alerts
Using amtool
The amtool command-line utility is useful for testing:
# Check configuration
amtool check-config /etc/alertmanager/alertmanager.yml
# Send a test alert
amtool alert add alertname=TestAlert severity=critical service=test instance=test:9090 --annotation=summary="Test alert from amtool" --annotation=description="This is a test alert sent via amtool"
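Both commands assume amtool knows where Alertmanager is running; you can pass --alertmanager.url=http://localhost:9093 on every invocation, or set it once in amtool's configuration file. A minimal sketch (the per-user path is the commonly used location; /etc/amtool/config.yml also works):

# ~/.config/amtool/config.yml
alertmanager.url: 'http://localhost:9093'
# Require a comment when creating silences
comment_required: true
# Default output format (simple, extended, or json)
output: extended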
Troubleshooting Common Issues
Notifications Not Being Received
If your alerts aren't reaching your notification channels:
- Check Alertmanager Logs: Look for error messages related to receivers:
  docker logs alertmanager
  # or
  kubectl logs -l app=alertmanager
- Verify Route Configuration: Ensure your matching labels are correct
- Test Direct API Calls: Send test alerts directly to the API to bypass Prometheus
Email Configuration Issues
Common email problems and solutions:
- SMTP Authentication: Verify credentials and consider using an app password for Gmail
- TLS Settings: Some SMTP servers require explicit TLS configuration
- Firewall Rules: Ensure your Alertmanager can reach the SMTP server
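On the TLS point, Alertmanager requires STARTTLS by default, which fails against relays that don't offer it (for example, a plain internal relay on port 25). In that specific case you can relax the requirement with smtp_require_tls; the host below is a placeholder:

global:
  smtp_smarthost: 'smtp-relay.internal.example:25'  # placeholder internal relay
  smtp_require_tls: false   # default is true; disable only for relays without STARTTLS support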
Slack Notification Problems
If Slack notifications aren't working:
- Check Webhook URL: Verify it's correctly copied and still valid
- Rate Limiting: Slack has API rate limits that might affect high-volume alerts
- Missing Permissions: Ensure the Slack app has the necessary permissions
Advanced Topics
High Availability Setup
For production environments, run Alertmanager in a clustered high-availability mode:
alertmanager --cluster.peer=alertmanager1:9094 --cluster.peer=alertmanager2:9094 --cluster.peer=alertmanager3:9094
This ensures alert delivery even if some Alertmanager instances fail. Configure Prometheus with the addresses of every instance rather than putting a load balancer in front of the cluster; the instances gossip with each other and deduplicate the notifications.
Silencing Alerts
You can temporarily silence alerts during maintenance:
# Using amtool
amtool silence add --comment="Maintenance window" --duration=2h service=database
# Through the API
curl -H "Content-Type: application/json" -d '{
  "matchers": [
    {
      "name": "service",
      "value": "database",
      "isRegex": false
    }
  ],
  "startsAt": "2023-04-15T14:00:00Z",
  "endsAt": "2023-04-15T16:00:00Z",
  "comment": "Database maintenance",
  "createdBy": "admin"
}' http://localhost:9093/api/v2/silences
Summary
Alertmanager integrations are a powerful way to ensure your team receives timely notifications about issues in your systems. We've covered:
- The role of Alertmanager in the Prometheus ecosystem
- Common notification integrations (Email, Slack, PagerDuty, etc.)
- How to configure routing based on alert labels
- Custom templates for notification formatting
- Testing and troubleshooting your integrations
By properly configuring these integrations, you can build a robust alerting system that notifies the right people at the right time, reducing response times and improving system reliability.
Exercises
- Set up a basic Prometheus and Alertmanager environment using Docker Compose
- Configure Alertmanager to send notifications to Slack
- Create a custom notification template that includes graphs from Grafana
- Implement a webhook receiver that logs alerts to a file
- Configure different routing rules for different severity levels and services
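As a starting point for the first exercise, a minimal Docker Compose sketch might look like the following; the image tags, ports, and mounted file names are assumptions you can adapt:

# docker-compose.yml - minimal Prometheus + Alertmanager stack (illustrative)
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert-rules.yml:/etc/prometheus/alert-rules.yml
    ports:
      - '9090:9090'
  alertmanager:
    image: prom/alertmanager
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - '9093:9093'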