Prometheus and Grafana Integration
Introduction
Prometheus is a powerful monitoring and alerting system that collects and stores metrics as time-series data. While Prometheus includes a basic built-in expression browser for visualizing metrics, it's often beneficial to pair it with Grafana, a dedicated visualization tool, to create more comprehensive and customizable dashboards.
This guide shows how to integrate Prometheus with Grafana so you can build interactive dashboards on top of your metrics. The integration pairs Prometheus's data collection and querying with Grafana's visualization features.
What You'll Learn
- What Grafana is and why it complements Prometheus
- How to install and configure Grafana
- How to connect Prometheus as a data source in Grafana
- How to create your first Grafana dashboard with Prometheus metrics
- Best practices for dashboard organization and visualization
Prerequisites
Before starting, ensure you have:
- A working Prometheus installation (as covered in previous sections)
- Basic understanding of Prometheus metrics and PromQL
- A system where you can install Grafana (same system as Prometheus is fine)
Understanding Grafana
Grafana is an open-source visualization and analytics platform that allows you to query, visualize, and alert on your metrics regardless of where they're stored. It provides a flexible and powerful interface for creating dashboards with various visualization options like graphs, tables, heatmaps, and more.
Why Use Grafana with Prometheus?
While Prometheus has its own web UI for basic visualization, Grafana offers several advantages:
- Rich visualization options: Beyond simple graphs, Grafana supports various panel types and visualization methods
- Dashboard organization: Create, save, and organize multiple dashboards
- Data source flexibility: Connect to multiple data sources (not just Prometheus)
- Annotations: Mark events on graphs to correlate metrics with deployments or incidents
- Templates and variables: Create dynamic, reusable dashboards
- Alerting: Configure alerts based on metric thresholds
- User management: Control access to dashboards and features
Installing Grafana
Let's start by installing Grafana. The process varies depending on your operating system:
Linux (Debian/Ubuntu)
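# Add the GPG key to a dedicated keyring (apt-key is deprecated on current Debian/Ubuntu releases)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
# Add the repository, pinned to that key
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
# Update and install
sudo apt-get update
sudo apt-get install -y grafana
# Start Grafana server now and enable it at boot
sudo systemctl start grafana-server
sudo systemctl enable grafana-server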
Using Docker
If you prefer using Docker, you can run Grafana as a container:
docker run -d -p 3000:3000 --name grafana grafana/grafana
After installation, Grafana should be accessible at http://localhost:3000. The default login credentials are:
- Username: admin
- Password: admin
You'll be prompted to change the password on first login.
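Before logging in, it can help to confirm that the server is actually up. The sketch below polls Grafana's /api/health endpoint using only the Python standard library. It assumes the local URL from the steps above; the function names are illustrative and not part of any Grafana client.

```python
import json
import urllib.request

GRAFANA_URL = "http://localhost:3000"  # assumption: local install from the steps above


def health_url(base_url: str) -> str:
    """Return the URL of Grafana's health endpoint for a given base URL."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(body: str) -> bool:
    """Interpret the JSON body returned by /api/health."""
    return json.loads(body).get("database") == "ok"


def check_grafana(base_url: str = GRAFANA_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; return False if Grafana is unreachable."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors, ValueError covers bad JSON
        return False
```

Calling check_grafana() returns True once the server answers and reports a working database, which is a reasonable signal that the login page will load.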
Connecting Prometheus to Grafana
Now that Grafana is installed, let's connect Prometheus as a data source:
- Log in to Grafana
- Navigate to Configuration → Data Sources (or go to http://localhost:3000/datasources)
- Click "Add data source"
- Select "Prometheus" from the list of data sources
- Configure the data source:
  - Name: "Prometheus" (or any name you prefer)
  - URL: The URL of your Prometheus server (e.g., http://localhost:9090)
  - Access: Server (default) or Browser depending on your setup
- Click "Save & Test" to verify the connection
If the connection is successful, you'll see a green "Data source is working" message.
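The same settings can also be submitted through Grafana's HTTP API instead of the UI, which is handy for scripted setups. The following is a minimal sketch, assuming the default admin/admin credentials and the local URLs used above. The helper names are hypothetical, and the request is only built here, not sent.

```python
import base64
import json
import urllib.request


def datasource_payload(name: str = "Prometheus",
                       url: str = "http://localhost:9090") -> dict:
    """Body for POST /api/datasources, mirroring the form fields above."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",  # "proxy" is the API value for Server access
    }


def build_request(grafana_url: str, user: str, password: str,
                  payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP request with basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the result of build_request(...) to urllib.request.urlopen would perform the call. In a real setup you would use the password you chose at first login, or an API token, rather than the defaults.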
Creating Your First Dashboard
Now that we've connected Prometheus to Grafana, let's create a simple dashboard to monitor system metrics:
- Click on "+" in the side menu and select "Dashboard"
- Click "Add new panel"
- In the query editor:
  - Select "Prometheus" as the data source
  - Enter a PromQL query, for example:
    rate(node_cpu_seconds_total{mode="user"}[1m])
- Adjust visualization settings in the panel options
- Click "Apply" to add the panel to your dashboard
- Click the save icon (disk) to save your dashboard with a name, e.g., "System Monitoring"
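Behind the UI, a dashboard is a JSON document. The sketch below builds a small one in Python with a single time-series panel for the query used above, in the shape that Grafana's POST /api/dashboards/db endpoint accepts. Treat the field selection as an assumption based on the dashboard JSON model; comparing it with the "JSON Model" of a dashboard saved from your own Grafana version is the safest check.

```python
import json


def timeseries_panel(title: str, expr: str, panel_id: int = 1) -> dict:
    """One time-series panel running a single PromQL query.

    No data source is set, so the panel falls back to the default one.
    """
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "targets": [{"refId": "A", "expr": expr}],
        "gridPos": {"x": 0, "y": 0, "w": 12, "h": 8},
    }


def dashboard_payload(title: str, panels: list) -> dict:
    """Body for POST /api/dashboards/db; id None asks for a new dashboard."""
    return {
        "dashboard": {"id": None, "title": title, "panels": panels},
        "overwrite": False,
    }


payload = dashboard_payload(
    "System Monitoring",
    [timeseries_panel("CPU (user)",
                      'rate(node_cpu_seconds_total{mode="user"}[1m])')],
)
print(json.dumps(payload, indent=2))
```

Keeping dashboards as generated JSON like this makes them easy to review and version alongside the rest of your configuration.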
Basic PromQL Queries for Your Dashboard
Here are some useful PromQL queries to get started:
CPU Usage by Core:
sum by (instance, cpu) (rate(node_cpu_seconds_total{mode!="idle"}[1m]))
Memory Usage:
node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Buffers_bytes - node_memory_Cached_bytes
Disk Space Usage:
100 - ((node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100)
Network Traffic:
rate(node_network_receive_bytes_total[5m])
rate(node_network_transmit_bytes_total[5m])
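When a graph panel runs one of these queries, the data ultimately comes from Prometheus's range-query endpoint, /api/v1/query_range. The sketch below shows roughly what such a request and its response look like, so you can test a query outside Grafana. The server URL and function names are assumptions for illustration.

```python
import json
import urllib.parse


def range_query_url(prometheus_url: str, promql: str,
                    start: int, end: int, step: str = "15s") -> str:
    """URL for /api/v1/query_range covering [start, end] (Unix seconds)."""
    params = urllib.parse.urlencode(
        {"query": promql, "start": start, "end": end, "step": step}
    )
    return prometheus_url.rstrip("/") + "/api/v1/query_range?" + params


def parse_matrix(body: str) -> dict:
    """Map each series' sorted label pairs to its (timestamp, value) points."""
    doc = json.loads(body)
    if doc.get("status") != "success":
        raise ValueError(doc.get("error", "query failed"))
    return {
        tuple(sorted(series["metric"].items())):
            [(float(t), float(v)) for t, v in series["values"]]
        for series in doc["data"]["result"]
    }
```

Fetching the URL with urllib.request.urlopen and passing the body to parse_matrix gives the same points a Grafana panel would plot for that time range.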
Advanced Grafana Features
Once you're comfortable with basic dashboards, you can explore Grafana's more advanced features:
Dashboard Variables
Variables allow you to create more dynamic dashboards. For example, you could create a variable for selecting different servers or services:
- On your dashboard, click the gear icon to open dashboard settings
- Select "Variables" and click "Add variable"
- Configure a new variable:
  - Name: instance
  - Type: Query
  - Data source: Prometheus
  - Query: label_values(node_exporter_build_info, instance)
- Click "Update" and "Save"
Now you can use this variable in your queries like:
rate(node_cpu_seconds_total{instance="$instance", mode="user"}[1m])
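To make the mechanics concrete, here is a small Python model of the two steps involved: collecting the distinct values of a label (roughly what label_values does) and substituting the selected value into a query. This is an illustration of the idea, not Grafana's implementation, and the instance names are made up.

```python
import re


def label_values(series: list, label: str) -> list:
    """Distinct values of one label across a list of series label sets."""
    return sorted({labels[label] for labels in series if label in labels})


def interpolate(query: str, variables: dict) -> str:
    """Replace $name placeholders with the currently selected values.

    Placeholders without a matching variable are left untouched.
    """
    return re.sub(
        r"\$(\w+)",
        lambda m: variables.get(m.group(1), m.group(0)),
        query,
    )


# Hypothetical label sets, as Prometheus might return for the variable query
series = [
    {"__name__": "node_exporter_build_info", "instance": "web-1:9100"},
    {"__name__": "node_exporter_build_info", "instance": "db-1:9100"},
]
options = label_values(series, "instance")
query = interpolate(
    'rate(node_cpu_seconds_total{instance="$instance", mode="user"}[1m])',
    {"instance": options[0]},
)
print(options)
print(query)
```

The first print shows the values offered in the dashboard dropdown, and the second shows the query Prometheus would receive after a value is selected.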
Templating with Repeated Panels
You can create repeated panels for each value of a variable:
- Create a dashboard variable as shown above
- Create a panel with a query using that variable
- In the panel settings, go to the "General" tab
- Enable "Repeat options" and select your variable
This will create a separate panel for each value of your variable.
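In the dashboard JSON, a repeated panel is a single panel carrying a "repeat" field that names the variable. The sketch below imitates the expansion in Python to show the effect: one copy per selected value, with the placeholder filled in. It is a simplified model under that assumption, and the instance names are hypothetical.

```python
import copy


def expand_repeated_panel(panel: dict, values: list) -> list:
    """Illustrative only: return one copy of the panel per variable value."""
    placeholder = f"${panel['repeat']}"
    copies = []
    for value in values:
        clone = copy.deepcopy(panel)  # keep the template panel unchanged
        clone["title"] = panel["title"].replace(placeholder, value)
        for target in clone["targets"]:
            target["expr"] = target["expr"].replace(placeholder, value)
        copies.append(clone)
    return copies


template = {
    "title": "Load on $instance",
    "type": "timeseries",
    "repeat": "instance",        # variable to repeat over
    "repeatDirection": "h",      # lay the copies out horizontally
    "targets": [{"refId": "A", "expr": 'node_load1{instance="$instance"}'}],
}

for p in expand_repeated_panel(template, ["web-1:9100", "db-1:9100"]):
    print(p["title"], "->", p["targets"][0]["expr"])
```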
Annotations
Annotations allow you to mark significant events on your graphs:
- In dashboard settings, go to "Annotations"
- Click "Add Annotation Query"
- Configure an annotation query, for example using Prometheus Alertmanager events
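Annotations can also be pushed from scripts, for example by a deployment pipeline, through Grafana's POST /api/annotations endpoint. The sketch below only builds the request body. The text and tag values are hypothetical, and authentication and sending are left out.

```python
import time
from typing import Optional


def deployment_annotation(text: str, tags: list,
                          when: Optional[float] = None) -> dict:
    """Body for POST /api/annotations.

    Grafana expects the timestamp in epoch milliseconds; `when` is in
    epoch seconds and defaults to the current time.
    """
    ts = time.time() if when is None else when
    return {"time": int(ts * 1000), "tags": tags, "text": text}


# Hypothetical release marker
body = deployment_annotation("Deployed release 1.2.3", ["deployment"],
                             when=1700000000.0)
print(body)
```

Without a dashboard or panel reference in the body, the annotation is stored at the organization level, and dashboards can display it by filtering on its tags.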
Creating a Comprehensive Monitoring Dashboard
Let's put everything together to create a more comprehensive dashboard:
Step 1: Create a New Dashboard
- Click "+" → "Dashboard"
- Add dashboard variables for flexibility:
  - instance for selecting different servers
  - job for selecting different exporters
Step 2: Add System Overview Panels
Add panels for key system metrics:
System Load Panel:
- Query:
node_load1{instance="$instance"}
- Panel title: "System Load (1m)"
- Visualization: Stat or Graph
Memory Usage Panel:
- Query:
100 * (1 - ((node_memory_MemFree_bytes{instance="$instance"} + node_memory_Cached_bytes{instance="$instance"} + node_memory_Buffers_bytes{instance="$instance"}) / node_memory_MemTotal_bytes{instance="$instance"}))
- Panel title: "Memory Usage (%)"
- Visualization: Gauge
Disk Space Panel:
- Query:
100 - ((node_filesystem_avail_bytes{mountpoint="/", instance="$instance"} / node_filesystem_size_bytes{mountpoint="/", instance="$instance"}) * 100)
- Panel title: "Disk Usage (%)"
- Visualization: Gauge
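To sanity-check the gauge values, it helps to run the same arithmetic by hand. The sketch below mirrors the memory and disk formulas in plain Python; the sample numbers are invented for illustration.

```python
GIB = 1024 ** 3


def memory_used_percent(total: float, free: float,
                        cached: float, buffers: float) -> float:
    """Same arithmetic as the Memory Usage panel query."""
    return 100 * (1 - (free + cached + buffers) / total)


def disk_used_percent(avail: float, size: float) -> float:
    """Same arithmetic as the Disk Space panel query."""
    return 100 - (avail / size) * 100


# Hypothetical host: 16 GiB RAM with 2 GiB free, 5 GiB cached, 1 GiB buffers
print(memory_used_percent(16 * GIB, 2 * GIB, 5 * GIB, 1 * GIB))  # 50.0

# Hypothetical 120 GiB root filesystem with 30 GiB available
print(disk_used_percent(30 * GIB, 120 * GIB))  # 75.0
```

Note that cached and buffered memory counts as available here, so the gauge reports memory held by applications rather than everything the kernel is currently using.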
Step 3: Add CPU Panels
CPU Usage Overview:
- Query:
sum by(mode)(rate(node_cpu_seconds_total{instance="$instance"}[1m]))
- Panel title: "CPU Usage by Mode"
- Visualization: Stacked Graph
CPU Usage per Core:
- Query:
sum by (cpu)(rate(node_cpu_seconds_total{instance="$instance", mode!="idle"}[1m])) * 100
- Panel title: "CPU Usage per Core (%)"
- Visualization: Graph or Heatmap
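The arithmetic behind a per-core busy percentage is worth seeing once. Each mode's counter for a core grows by at most one second per second, so the non-idle mode rates of one core add up to the fraction of time it was busy. The sketch below uses a deliberately simplified rate (Prometheus's rate() also handles counter resets and extrapolates to the window edges), with invented sample values.

```python
def simple_rate(samples: list) -> float:
    """Per-second increase between the first and last (timestamp, value)."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def core_busy_percent(mode_samples: dict) -> float:
    """Sum the non-idle mode rates for one core and scale to percent."""
    busy = sum(
        simple_rate(samples)
        for mode, samples in mode_samples.items()
        if mode != "idle"
    )
    return busy * 100


# Hypothetical counters for one core over a 60-second window
core0 = {
    "user":   [(0, 100.0), (60, 115.0)],   # 0.25 s of CPU time per second
    "system": [(0, 40.0), (60, 47.5)],     # 0.125 s per second
    "idle":   [(0, 1000.0), (60, 1037.5)], # ignored by the filter
}
print(core_busy_percent(core0))
```

With these samples the core spent 37.5% of the window in non-idle modes.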
Step 4: Add Network Panels
Network Traffic:
- Query 1:
rate(node_network_receive_bytes_total{instance="$instance", device!="lo"}[5m])
- Query 2:
rate(node_network_transmit_bytes_total{instance="$instance", device!="lo"}[5m])
- Panel title: "Network Traffic"
- Visualization: Graph (with units set to bytes/sec or data rate)
Network Errors:
- Query:
sum(rate(node_network_transmit_errs_total{instance="$instance"}[5m]) + rate(node_network_receive_errs_total{instance="$instance"}[5m]))
- Panel title: "Network Errors"
- Visualization: Graph
Step 5: Organize and Save
- Arrange panels in a logical order
- Group related panels in rows
- Add text panels for documentation if needed
- Save your dashboard with a descriptive name
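Panel placement is stored in each panel's gridPos field, measured on a grid that is 24 columns wide. If you generate dashboards from code, a small helper can arrange panels consistently. The sketch below assumes equally sized panels; the function name and defaults are illustrative.

```python
def layout_panels(panels: list, per_row: int = 2, height: int = 8) -> list:
    """Assign gridPos so panels fill rows left to right on the 24-column grid."""
    width = 24 // per_row
    for index, panel in enumerate(panels):
        row, col = divmod(index, per_row)
        panel["gridPos"] = {
            "x": col * width,
            "y": row * height,
            "w": width,
            "h": height,
        }
    return panels


panels = layout_panels([
    {"title": "System Load (1m)"},
    {"title": "Memory Usage (%)"},
    {"title": "Disk Usage (%)"},
])
for p in panels:
    print(p["title"], p["gridPos"])
```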
Example: Application Monitoring Dashboard
Beyond system metrics, you can create dashboards for monitoring your applications. Here's an example for a web service:
HTTP Request Rate:
sum(rate(http_requests_total[5m])) by (route)
HTTP Error Rate:
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))
Response Time:
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
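The response-time query estimates a percentile from cumulative histogram buckets by linear interpolation within the bucket that contains the target rank. The Python sketch below reproduces that idea in simplified form, assuming sorted buckets that end with a +Inf bucket. It is meant to build intuition and does not cover every edge case that Prometheus handles; the bucket counts are invented.

```python
import math


def histogram_quantile(q: float, buckets: list) -> float:
    """Estimate a quantile from cumulative (upper_bound, count) buckets."""
    if not 0 < q <= 1:
        raise ValueError("this sketch only supports 0 < q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the highest finite bound
                return lower_bound
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical request-duration buckets (seconds, cumulative counts)
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, buckets))
```

Here the 95th request out of 100 lands halfway through the 0.5 to 1.0 second bucket, so the estimate is about 0.75 seconds. The result is only as precise as the bucket boundaries, which is why bucket choice matters for latency dashboards.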
Best Practices for Grafana Dashboards
To create effective dashboards, consider these best practices:
- Start with a purpose: Define what questions your dashboard should answer
- Use consistent naming: Apply consistent naming for panels and dashboards
- Group related metrics: Organize panels in logical groups
- Use appropriate visualizations: Choose the right visualization for each metric
- Set appropriate time ranges: Configure default time ranges that make sense
- Add documentation: Use text panels to explain complex metrics
- Use variables: Make dashboards reusable with template variables
- Set thresholds: Add thresholds to visually indicate when values are problematic
- Keep it simple: Avoid cluttering dashboards with too many panels
- Test with different data: Ensure dashboards work with various data scenarios
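As an example of the thresholds practice, the sketch below builds the fieldConfig section of a percentage panel with green, orange, and red steps. The 75 and 90 cut-offs are arbitrary assumptions; pick values that match your own alerting policy.

```python
def percent_field_config(warn: float = 75, crit: float = 90) -> dict:
    """fieldConfig for a 0-100% panel with warning and critical thresholds."""
    return {
        "defaults": {
            "unit": "percent",
            "min": 0,
            "max": 100,
            "thresholds": {
                "mode": "absolute",
                "steps": [
                    {"color": "green", "value": None},   # base step
                    {"color": "orange", "value": warn},
                    {"color": "red", "value": crit},
                ],
            },
        },
        "overrides": [],
    }


gauge = {
    "type": "gauge",
    "title": "Disk Usage (%)",
    "fieldConfig": percent_field_config(),
}
print(gauge["fieldConfig"]["defaults"]["thresholds"]["steps"])
```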
Sharing and Exporting Dashboards
Grafana allows you to share and export dashboards in several ways:
Sharing a Dashboard Link:
- Click the share icon in the dashboard
- Copy the direct link or generate a snapshot
Exporting a Dashboard:
- Go to dashboard settings
- Select "JSON Model"
- Copy the JSON or save it to a file
Importing a Dashboard:
- Click "+" → "Import"
- Paste the JSON or upload the file
- Configure data sources and variables
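If you move dashboards between Grafana instances with scripts rather than the UI, the exported JSON usually needs a small cleanup first, because numeric ids are specific to the instance that created them. The sketch below prepares a body for POST /api/dashboards/db. It accepts either the raw JSON Model or a response from GET /api/dashboards/uid/<uid>; the function name is hypothetical.

```python
import copy


def prepare_for_import(exported: dict, overwrite: bool = False) -> dict:
    """Wrap an exported dashboard so another Grafana instance can create it."""
    # API responses nest the model under "dashboard"; the JSON Model does not
    model = copy.deepcopy(exported.get("dashboard", exported))
    model["id"] = None  # let the target instance assign its own id
    return {"dashboard": model, "overwrite": overwrite}


exported = {
    "dashboard": {"id": 42, "uid": "abc123", "title": "System Monitoring",
                  "panels": []},
    "meta": {"slug": "system-monitoring"},
}
print(prepare_for_import(exported))
```

The uid is kept so that links to the dashboard stay stable across instances. Set overwrite to True only when replacing an existing dashboard with the same uid is intended.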
Common Troubleshooting Tips
No Data Shown in Panels:
- Verify Prometheus data source connection
- Check PromQL syntax for errors
- Ensure time range is appropriate
- Verify metrics exist in Prometheus
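Several of these checks can be done by sending the panel's query straight to Prometheus at /api/v1/query and reading the response. The sketch below classifies a response body into the cases listed above. The messages and function name are illustrative.

```python
import json


def diagnose(body: str) -> str:
    """Classify a Prometheus /api/v1/query response body."""
    doc = json.loads(body)
    if doc.get("status") == "error":
        # Typically a PromQL syntax problem (errorType "bad_data")
        return f"query rejected ({doc.get('errorType')}): {doc.get('error')}"
    if not doc["data"]["result"]:
        return "query is valid but matched no series: check metric names, labels, and time range"
    return f"{len(doc['data']['result'])} series returned"


# Hand-written sample responses for illustration
print(diagnose('{"status":"error","errorType":"bad_data","error":"parse error"}'))
print(diagnose('{"status":"success","data":{"resultType":"vector","result":[]}}'))
```

If Prometheus returns series here but the panel stays empty, the problem is more likely in the Grafana data source settings or the dashboard's time range than in the query.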
Dashboard Loads Slowly:
- Simplify complex queries
- Increase query intervals
- Reduce the number of panels
- Check Prometheus performance
Error: "Template variables could not be initialized":
- Check variable query syntax
- Ensure Prometheus has the queried data
- Verify access permissions
Summary
In this guide, we've explored how to integrate Prometheus with Grafana to create powerful visualizations of your metrics. We've covered:
- Installing and configuring Grafana
- Connecting Prometheus as a data source
- Creating basic and advanced dashboards
- Using Grafana's features like variables and annotations
- Best practices for effective visualization
This integration combines Prometheus's powerful data collection with Grafana's flexible visualization capabilities, giving you comprehensive insights into your systems and applications.
Next Steps
Now that you've set up Grafana with Prometheus, consider:
- Importing pre-built dashboards from the Grafana community
- Exploring Grafana Alerting to get notified when metrics cross thresholds
- Setting up additional exporters to collect more metrics
- Creating dashboards specific to your applications
Exercises
- Create a dashboard that monitors both system and application metrics
- Set up a dashboard with variables for selecting different environments (production, staging, etc.)
- Create a dashboard that visualizes Prometheus's own metrics
- Configure annotations to mark deployment events on your graphs
- Export your dashboard to JSON and share it with a colleague