Google Cloud Monitoring
Introduction
Google Cloud Monitoring (formerly known as Stackdriver Monitoring) is a powerful monitoring service that provides visibility into the performance, uptime, and overall health of cloud-powered applications. As part of Google Cloud's operations suite, it collects metrics, events, and metadata from Google Cloud Platform, Amazon Web Services, and various application components.
In this guide, we'll explore how to set up Google Cloud Monitoring as a data source in Grafana, allowing you to visualize and analyze your cloud infrastructure metrics through Grafana's intuitive dashboards.
Prerequisites
Before integrating Google Cloud Monitoring with Grafana, ensure you have:
- A Google Cloud Platform account with appropriate permissions
- The Cloud Monitoring API enabled in your GCP project
- A Grafana instance (version 8.0 or higher recommended)
- Basic understanding of Grafana's interface
Setting Up the Google Cloud Monitoring Data Source
Step 1: Configure Authentication
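First, you need to set up authentication. Google Cloud Monitoring in Grafana supports two authentication methods:
- Google JWT File Authentication (works wherever Grafana runs; recommended for production)
- GCE Default Service Account (available only when Grafana itself runs on a Google Compute Engine VM, where it uses the VM's default service account)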
Using Google JWT File Authentication
- Create a service account in the Google Cloud Console with the following roles:
  - Monitoring Viewer
  - Monitoring Query User
- Create a JSON key for this service account and download the key file. This JSON file is what Grafana's "Google JWT File" option expects.
- Store this file securely where your Grafana instance can access it.
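Before handing the key file to Grafana, it can help to confirm that it is a complete service account key. The following is a minimal sketch, assuming the standard field names found in a downloaded service account JSON key; the helper names (`missing_key_fields`, `check_key_file`) and the sample values are hypothetical and only for illustration.

```python
import json
from pathlib import Path

# Fields normally present in a downloaded service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def missing_key_fields(key: dict) -> list[str]:
    """Return the required fields that are absent or empty in a parsed key."""
    problems = [name for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append("type (expected 'service_account')")
    return problems


def check_key_file(path: str) -> list[str]:
    """Load a key file from disk and report any missing fields."""
    return missing_key_fields(json.loads(Path(path).read_text(encoding="utf-8")))


# Example with an in-memory placeholder key (not a real credential).
sample = {
    "type": "service_account",
    "project_id": "your-gcp-project-id",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "grafana@your-gcp-project-id.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
}
print(missing_key_fields(sample))
```

An empty list means the file has the shape Grafana needs; it does not prove that the key is still valid or that the roles are assigned.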
Step 2: Add Google Cloud Monitoring as a Data Source
- Navigate to Configuration > Data Sources in your Grafana instance.
- Click on Add data source.
- Search for and select Google Cloud Monitoring.
- Configure the following settings:
  - Name: Google Cloud Monitoring
  - Default: Toggle on if you want this to be your default data source
  - GCP Authentication: JWT File
  - Service Account Key File: /path/to/your/key.json
  - Project ID: your-gcp-project-id
- Click Save & Test to verify the connection.
If everything is configured correctly, you should see a "Data source is working" message.
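If you prefer to keep this configuration in code rather than clicking through the UI, the same settings can be expressed as a data source definition. The sketch below only builds the settings as a Python dictionary. It assumes the field names used in Grafana's provisioning examples for this data source (`authenticationType`, `defaultProject`, `clientEmail`, `tokenUri`, `privateKey`) and the plugin type `stackdriver`; check them against the documentation for your Grafana version. The function name and the placeholder key are hypothetical.

```python
import json


def build_datasource_payload(key: dict, name: str = "Google Cloud Monitoring") -> dict:
    """Map a parsed service account key onto Grafana data source settings."""
    return {
        "name": name,
        "type": "stackdriver",  # assumed plugin id for Google Cloud Monitoring
        "access": "proxy",
        "isDefault": False,
        "jsonData": {
            "authenticationType": "jwt",
            "defaultProject": key["project_id"],
            "clientEmail": key["client_email"],
            "tokenUri": key["token_uri"],
        },
        # The private key goes into secureJsonData so it is not stored in plain settings.
        "secureJsonData": {"privateKey": key["private_key"]},
    }


# Placeholder key values, not real credentials.
example_key = {
    "project_id": "your-gcp-project-id",
    "client_email": "grafana@your-gcp-project-id.iam.gserviceaccount.com",
    "token_uri": "https://oauth2.googleapis.com/token",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
}
payload = build_datasource_payload(example_key)

# Print a redacted copy so the private key never lands in logs.
redacted = {**payload, "secureJsonData": {"privateKey": "<redacted>"}}
print(json.dumps(redacted, indent=2))
```

Keeping the private key separate from the other settings mirrors how the UI treats it: visible fields in one place, the secret in another.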
Querying Google Cloud Monitoring
Google Cloud Monitoring in Grafana offers four query types:
- Metrics - For time series data
- Service Level Objectives (SLOs) - For tracking service performance against targets
- Log-based metrics - For metrics derived from log entries
- MQL - Using Monitoring Query Language for advanced queries
Using Metrics Query Mode
The Metrics mode provides a UI-driven approach to build queries:
- Select a GCP service from the Service dropdown
- Choose a Metric from the available options
- Apply filters, aggregations, and groupings as needed
Here's an example query configuration for VM instance CPU usage:
- Service: Compute Engine
- Metric: Instance/CPU/Utilization
- Group By: instance_name
- Aggregation: mean
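To see what the example configuration above amounts to, it helps to write it out in the terms of the Cloud Monitoring API: a filter on the metric type plus an aggregation with an alignment period, a per-series aligner, a cross-series reducer, and group-by fields. This is a rough sketch rather than the exact request Grafana sends; the function name, the 60-second period, and the label path `metric.label.instance_name` are assumptions for illustration.

```python
import json


def build_time_series_request(metric_type, group_by, aligner="ALIGN_MEAN",
                              reducer="REDUCE_MEAN", period_seconds=60):
    """Describe a metrics query as a filter plus an aggregation block."""
    return {
        "filter": f'metric.type = "{metric_type}"',
        "aggregation": {
            "alignmentPeriod": f"{period_seconds}s",
            "perSeriesAligner": aligner,      # how each series is smoothed over time
            "crossSeriesReducer": reducer,    # how series are combined
            "groupByFields": list(group_by),  # labels kept after combining
        },
    }


request = build_time_series_request(
    "compute.googleapis.com/instance/cpu/utilization",
    group_by=["metric.label.instance_name"],  # assumed label path for instance_name
)
print(json.dumps(request, indent=2))
```

Reading the UI this way makes it clearer why the Group By and Aggregation fields belong together: the reducer decides how series are merged, and the group-by labels decide which series stay separate.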
Using the Monitoring Query Language (MQL)
For more advanced queries, you can use MQL, which is Google's specialized query language for Cloud Monitoring:
fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter resource.zone =~ 'us-east1.*'
| group_by 1m, [value_utilization_mean: mean(value.utilization)]
| every 1m
This query fetches CPU utilization for instances in zones of the us-east1 region, averages each instance's utilization over one-minute windows, and outputs one point per minute.
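If the windowing step is unfamiliar, the following sketch reproduces the idea on a handful of samples: values are bucketed into one-minute windows and each bucket is averaged. It is a simplified illustration, not MQL's actual alignment logic (window boundaries here are simply floored to the minute), and the function name and sample values are made up.

```python
from collections import defaultdict


def windowed_mean(samples, window_seconds=60):
    """Average (unix_seconds, value) samples per fixed window.

    Returns a dict mapping each window's start time to the mean of its values.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        window_start = timestamp - timestamp % window_seconds
        buckets[window_start].append(value)
    return {start: sum(values) / len(values) for start, values in sorted(buckets.items())}


# Four samples, 30 seconds apart, for a single instance.
samples = [(0, 0.25), (30, 0.75), (60, 0.5), (90, 1.0)]
print(windowed_mean(samples))
```

Each instance keeps its own series in the MQL query above, so a calculation like this would run once per instance.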
Creating Dashboards
Once your data source is configured, you can create dashboards to visualize your GCP infrastructure metrics.
Example: VM Instances Dashboard
Here's how to create a simple dashboard for monitoring VM instances:
- Create a new dashboard in Grafana
- Add a new panel
- Select Google Cloud Monitoring as the data source
- Configure the query:
  - Service: Compute Engine
  - Metric: Instance/CPU/Utilization
  - Group By: instance_name
  - Aggregation: mean
- Set the visualization type to "Time series"
- Add additional panels for metrics like:
  - Instance/Disk/Read Bytes
  - Instance/Disk/Write Bytes
  - Instance/Network/Received Bytes
  - Instance/Network/Sent Bytes
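The panel layout from these steps can also be sketched as a dashboard JSON model, which is useful if you later want to version dashboards alongside your code. The sketch below assumes Grafana's 24-column grid and the `timeseries` panel type, and it deliberately leaves `targets` empty because the exact query fields for this data source vary between Grafana versions. The function name and dashboard title are hypothetical.

```python
import json

PANEL_WIDTH = 12   # half of the 24-column dashboard grid
PANEL_HEIGHT = 8


def build_dashboard(title, panel_titles):
    """Lay out one time series panel per title, two panels per row."""
    panels = []
    for index, panel_title in enumerate(panel_titles):
        panels.append({
            "id": index + 1,
            "title": panel_title,
            "type": "timeseries",
            "gridPos": {
                "x": (index % 2) * PANEL_WIDTH,
                "y": (index // 2) * PANEL_HEIGHT,
                "w": PANEL_WIDTH,
                "h": PANEL_HEIGHT,
            },
            "targets": [],  # queries intentionally omitted; add them in the panel editor
        })
    return {"title": title, "panels": panels}


dashboard = build_dashboard("VM Instances", [
    "Instance/CPU/Utilization",
    "Instance/Disk/Read Bytes",
    "Instance/Disk/Write Bytes",
    "Instance/Network/Received Bytes",
    "Instance/Network/Sent Bytes",
])
print(json.dumps(dashboard, indent=2))
```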
Using Variables for Dynamic Dashboards
Grafana variables enhance your dashboards by making them dynamic and reusable:
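- Go to dashboard settings and select "Variables"
- Add a new variable:
  - Name: project
  - Type: Query
  - Data source: Google Cloud Monitoring
  - Query Type: Projects
- Add another variable:
  - Name: zone
  - Type: Query
  - Data source: Google Cloud Monitoring
  - Query Type: Label Values, using a Compute Engine metric (for example CPU utilization) and the zone resource label as the label key
  - Include "All" option: Yes

The variable editor for this data source uses a Query Type dropdown instead of free-text query functions.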
Now you can filter your dashboard by project and zone using dropdown selectors.
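To make the effect of a multi-value variable concrete, here is a small sketch of how a set of selected zones could be turned into an MQL filter line. This is not Grafana's own interpolation code, just an illustration of the idea; the helper `zone_filter` is hypothetical, and treating an empty selection as "no filter" is an assumption about how you might handle the "All" option.

```python
import re

# Zone names consist of lowercase letters, digits, and hyphens.
ZONE_NAME = re.compile(r"^[a-z0-9-]+$")


def zone_filter(selected_zones):
    """Build an MQL filter line that matches any of the selected zones."""
    if not selected_zones:
        return ""  # nothing selected: leave the query unfiltered
    for zone in selected_zones:
        if not ZONE_NAME.match(zone):
            raise ValueError(f"unexpected zone name: {zone!r}")
    return "| filter resource.zone =~ '" + "|".join(selected_zones) + "'"


print(zone_filter(["us-east1-b", "us-east1-c"]))
```

Validating the values before building the string keeps unexpected characters out of the query text.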
Monitoring Google Kubernetes Engine (GKE)
Google Cloud Monitoring is particularly useful for monitoring GKE clusters:
- Create a new dashboard for Kubernetes monitoring
- Add panels for key metrics:
Cluster CPU Utilization
- Service: Kubernetes Engine
- Metric: Container/CPU/Utilization
- Group By: cluster_name
- Aggregation: mean
Node Memory Usage
- Service: Kubernetes Engine
- Metric: Container/Memory/Used Bytes
- Group By: node_name
- Aggregation: sum
Pod Count
- Service: Kubernetes Engine
- Metric: Pod/Count
- Group By: cluster_name
- Aggregation: mean
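The choice between sum and mean in these panels matters: memory in bytes adds up across containers, while utilization is a ratio and is usually averaged. The sketch below shows the difference on made-up data points; the function name, labels, and values are illustrative only and are not pulled from a real cluster.

```python
from collections import defaultdict
from statistics import mean


def reduce_by_label(points, label, reducer):
    """Group (labels, value) points by one label and combine each group."""
    groups = defaultdict(list)
    for labels, value in points:
        groups[labels[label]].append(value)
    return {key: reducer(values) for key, values in groups.items()}


# Memory in bytes per container: summing gives the total per node.
memory_points = [
    ({"node_name": "node-a"}, 100),
    ({"node_name": "node-a"}, 50),
    ({"node_name": "node-b"}, 70),
]
print(reduce_by_label(memory_points, "node_name", sum))

# Utilization as a fraction per container: averaging gives a cluster-level ratio.
cpu_points = [
    ({"cluster_name": "prod"}, 0.5),
    ({"cluster_name": "prod"}, 0.25),
    ({"cluster_name": "dev"}, 0.75),
]
print(reduce_by_label(cpu_points, "cluster_name", mean))
```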
Advanced Techniques
Using Alerting with Google Cloud Monitoring
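Grafana allows you to set up alerts based on Google Cloud Monitoring metrics. The steps below use the panel-based alert editor found in Grafana 8 (legacy alerting); newer Grafana versions express the same condition as an alert rule under Alerting: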
- Edit a panel with a Google Cloud Monitoring query
- Navigate to the "Alert" tab
- Set conditions, for example:
  - Evaluate: every 1m for 5m
  - Condition: WHEN last() OF query(A, 5m, now) IS ABOVE 0.8
- Add notifications and save the alert
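The example condition can be read as: every minute, take the last value of query A, and fire once it has stayed above 0.8 for five minutes. CPU utilization is reported as a fraction, so 0.8 corresponds to 80%. The sketch below simulates that behavior under a simplified model of the "for" clause (the state names and the exact moment of the pending-to-alerting transition are assumptions); the function name is hypothetical.

```python
def evaluate_alert(last_values, threshold=0.8, for_evaluations=5):
    """Return the alert state after each evaluation.

    last_values holds the last() result of the query at each 1-minute evaluation.
    A breach starts as "pending" and becomes "alerting" once it has persisted
    for more than for_evaluations consecutive evaluations.
    """
    states = []
    consecutive_breaches = 0
    for value in last_values:
        if value > threshold:
            consecutive_breaches += 1
            if consecutive_breaches > for_evaluations:
                states.append("alerting")
            else:
                states.append("pending")
        else:
            consecutive_breaches = 0  # any healthy evaluation resets the timer
            states.append("ok")
    return states


history = [0.5, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.7]
print(evaluate_alert(history))
```

The pending phase is what keeps short CPU spikes from sending notifications.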
Combining Multiple Data Sources
You can create powerful dashboards by combining Google Cloud Monitoring with other data sources:
- Use Google Cloud Monitoring for infrastructure metrics
- Combine with Prometheus for application metrics
- Add Loki for logs analysis
This gives you a complete view of your system's performance from infrastructure to application.
Troubleshooting Common Issues
Missing Metrics
If you're not seeing expected metrics:
- Verify your service account has the correct permissions
- Check that the metric exists in Google Cloud Monitoring Console
- Ensure the time range in Grafana includes when the metrics were collected
Authentication Errors
If you're experiencing authentication issues:
- Verify the key file path is correct and the file is readable by Grafana
- Check that the service account still exists and that its key has not been deleted, disabled, or expired
- Ensure the required APIs are enabled in your GCP project
Query Performance
For slow-performing queries:
- Reduce the time range when possible
- Apply more specific filters
- Use pre-aggregated metrics when available
- Consider using MQL for more efficient queries
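A quick back-of-the-envelope calculation shows why the first two tips help: the number of points a panel has to fetch grows with the time range and with the number of series that pass the filters. The sketch below is simple arithmetic, not a measurement of real query cost; the function name and the example numbers are made up.

```python
def estimated_points(time_range_seconds, alignment_period_seconds, series_count):
    """Rough count of data points returned: points per series times series."""
    points_per_series = time_range_seconds // alignment_period_seconds
    return points_per_series * series_count


# 6 hours at 1-minute resolution across 200 instances.
wide = estimated_points(6 * 3600, 60, 200)
# 1 hour at 1-minute resolution across 20 filtered instances.
narrow = estimated_points(3600, 60, 20)
print(wide, narrow)
```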
Summary
Google Cloud Monitoring is a powerful data source for Grafana that enables you to visualize and analyze metrics from your Google Cloud Platform resources. By following this guide, you've learned how to:
- Set up Google Cloud Monitoring as a data source in Grafana
- Create queries using both the Metrics query mode and MQL
- Build dynamic dashboards for monitoring GCP resources
- Apply advanced techniques for alerts and multi-source dashboards
- Troubleshoot common issues
With these skills, you can create comprehensive monitoring solutions that provide valuable insights into your cloud infrastructure's performance and health.
Exercises
- Set up a Google Cloud Monitoring data source in your Grafana instance.
- Create a dashboard to monitor CPU, memory, and disk usage for your VM instances.
- Use MQL to create an advanced query that shows the top 5 instances by CPU usage.
- Create a dashboard variable that allows you to filter by GCP zone.
- Set up an alert that notifies you when instance CPU utilization exceeds 80% for more than 5 minutes.