
Google Cloud Monitoring

Introduction

Google Cloud Monitoring (formerly known as Stackdriver Monitoring) is a powerful monitoring service that provides visibility into the performance, uptime, and overall health of cloud-powered applications. As part of Google Cloud's operations suite, it collects metrics, events, and metadata from Google Cloud Platform, Amazon Web Services, and various application components.

In this guide, we'll explore how to set up Google Cloud Monitoring as a data source in Grafana, allowing you to visualize and analyze your cloud infrastructure metrics through Grafana's intuitive dashboards.

Prerequisites

Before integrating Google Cloud Monitoring with Grafana, ensure you have:

  • A Google Cloud Platform account with appropriate permissions
  • The Cloud Monitoring API enabled in your GCP project
  • A Grafana instance (version 8.0 or higher recommended)
  • A basic understanding of Grafana's interface

Setting Up the Google Cloud Monitoring Data Source

Step 1: Configure Authentication

First, you need to set up authentication. Google Cloud Monitoring in Grafana supports two authentication methods:

  1. Google JWT File Authentication (recommended for production)
  2. GCE Default Service Account (available when Grafana itself runs on a Google Compute Engine VM; simpler for development)

Using Google JWT File Authentication

  1. Create a service account in the Google Cloud Console and grant it the Monitoring Viewer role (roles/monitoring.viewer), which gives read-only access to monitoring data.

  2. Create and download a JSON key file for this service account. Grafana's JWT File option expects this JSON key.

  3. Store this file securely where your Grafana instance can access it.
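
Before pointing Grafana at the key file, it can help to confirm that the file is readable and looks like a service account key. The following is a minimal sketch, not part of Grafana or any Google library: the function name check_key_file and the list of required fields are assumptions based on the usual layout of a downloaded service account JSON key.

```python
import json
from pathlib import Path

# Fields normally present in a downloaded service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def check_key_file(path):
    """Return a list of problems found in the key file (empty if none)."""
    key_path = Path(path)
    if not key_path.is_file():
        return [f"{key_path} does not exist or is not a regular file"]
    try:
        key = json.loads(key_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read {key_path} as JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r}")
    return problems


if __name__ == "__main__":
    # Placeholder path from this guide; replace with the location of your key file.
    for problem in check_key_file("/path/to/your/key.json"):
        print(problem)
```

Run it as the same OS user that runs Grafana, so that file permission problems show up here rather than in the data source test.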

Step 2: Add Google Cloud Monitoring as a Data Source

  1. Navigate to Configuration > Data Sources in your Grafana instance (Connections > Data sources in newer Grafana versions).

  2. Click on Add data source.

  3. Search for and select Google Cloud Monitoring.

  4. Configure the following settings:

Name: Google Cloud Monitoring
Default: Toggle on if you want this to be your default data source
GCP Authentication: JWT File
Service Account Key File: /path/to/your/key.json
Project ID: your-gcp-project-id
  5. Click Save & Test to verify the connection.

If everything is configured correctly, you should see a "Data source is working" message.
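
The same settings can also be expressed as a data source definition, for example as the JSON body for Grafana's data source HTTP API or as the basis for a provisioning file. The sketch below only builds that definition from an already parsed key; it does not contact Grafana. The helper name build_datasource_payload is hypothetical, and the jsonData and secureJsonData field names follow Grafana's provisioning examples for this data source as best understood here, so check them against the documentation for your Grafana version.

```python
import json


def build_datasource_payload(key, name="Google Cloud Monitoring"):
    """Build a data source definition from a parsed service account key (dict)."""
    return {
        "name": name,
        # Plugin id Grafana uses for this data source (kept from the Stackdriver name).
        "type": "stackdriver",
        "access": "proxy",
        "jsonData": {
            "authenticationType": "jwt",
            "defaultProject": key["project_id"],
            "clientEmail": key["client_email"],
            "tokenUri": key["token_uri"],
        },
        # The private key is a secret, so it goes in secureJsonData.
        "secureJsonData": {"privateKey": key["private_key"]},
    }


if __name__ == "__main__":
    # Placeholder values standing in for the contents of a real key file.
    example_key = {
        "project_id": "your-gcp-project-id",
        "client_email": "grafana@your-gcp-project-id.iam.gserviceaccount.com",
        "token_uri": "https://oauth2.googleapis.com/token",
        "private_key": "<private key PEM>",
    }
    print(json.dumps(build_datasource_payload(example_key), indent=2))
```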

Querying Google Cloud Monitoring

Google Cloud Monitoring in Grafana offers four query types:

  1. Metrics - For time series data
  2. Service Level Objectives (SLOs) - For tracking service performance against targets
  3. Log-based metrics - For metrics derived from log entries
  4. MQL - Using Monitoring Query Language for advanced queries

Using Metrics Query Mode

The Metrics mode provides a UI-driven approach to build queries:

  1. Select a GCP service from the Service dropdown
  2. Choose a Metric from the available options
  3. Apply filters, aggregations, and groupings as needed

Here's an example query configuration for VM instance CPU usage:

  • Service: Compute Engine
  • Metric: Instance/CPU/Utilization
  • Group By: instance_name
  • Aggregation: mean
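
To see what such a configuration amounts to outside the UI, the selections above can be written as query parameters for the Cloud Monitoring API's projects.timeSeries.list method. This is a rough sketch under assumptions: the metric type compute.googleapis.com/instance/cpu/utilization, the ALIGN_MEAN and REDUCE_MEAN settings, and the group-by field metric.labels.instance_name are one plausible mapping of the display names used in this guide, and build_list_params is a hypothetical helper. Grafana may build its request differently.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def build_list_params(metric_type, group_by, end, minutes=60, period_s=60):
    """Build timeSeries.list query parameters for a mean-aggregated metric.

    `end` must be a timezone-aware datetime.
    """
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamp in UTC
    return [
        ("filter", f'metric.type = "{metric_type}"'),
        ("interval.startTime", start_utc.strftime(fmt)),
        ("interval.endTime", end_utc.strftime(fmt)),
        ("aggregation.alignmentPeriod", f"{period_s}s"),
        ("aggregation.perSeriesAligner", "ALIGN_MEAN"),
        ("aggregation.crossSeriesReducer", "REDUCE_MEAN"),
        ("aggregation.groupByFields", group_by),
    ]


if __name__ == "__main__":
    params = build_list_params(
        "compute.googleapis.com/instance/cpu/utilization",
        "metric.labels.instance_name",  # assumed API name for the instance_name label
        end=datetime.now(timezone.utc),
    )
    # Query string to append to .../v3/projects/<project-id>/timeSeries
    print(urlencode(params))
```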

Using the Monitoring Query Language (MQL)

For more advanced queries, you can use MQL, which is Google's specialized query language for Cloud Monitoring:

fetch gce_instance :: compute.googleapis.com/instance/cpu/utilization
| filter resource.zone =~ 'us-east1.*'
| group_by 1m, [value_utilization_mean: mean(value.utilization)]
| every 1m

This query fetches CPU utilization for instances in zones of the us-east1 region, averages each time series over 1-minute windows, and emits one data point per minute.
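
If you want to reuse a query like this with different zones or windows, generating the MQL text from a small function is one option. The sketch below is an illustration only: build_mql and build_query_body are hypothetical helpers, the monitored resource type gce_instance is spelled out explicitly as an assumption about the intended table, and the request body shape ({"query": ...}) is the one used by the Cloud Monitoring API's timeSeries:query method.

```python
import json


def build_mql(zone_regex, window="1m"):
    """Return an MQL query for mean CPU utilization in zones matching zone_regex."""
    if "'" in zone_regex:
        # The regex is embedded in a single-quoted MQL string literal.
        raise ValueError("zone_regex must not contain single quotes")
    return "\n".join([
        "fetch gce_instance :: compute.googleapis.com/instance/cpu/utilization",
        f"| filter resource.zone =~ '{zone_regex}'",
        f"| group_by {window}, [value_utilization_mean: mean(value.utilization)]",
        f"| every {window}",
    ])


def build_query_body(mql):
    """Wrap an MQL string in the JSON body expected by timeSeries:query."""
    return json.dumps({"query": mql})


if __name__ == "__main__":
    print(build_mql("us-east1.*"))
```

In Grafana itself, the generated text would simply be pasted into the MQL query editor.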

Creating Dashboards

Once your data source is configured, you can create dashboards to visualize your GCP infrastructure metrics.

Example: VM Instances Dashboard

Here's how to create a simple dashboard for monitoring VM instances:

  1. Create a new dashboard in Grafana
  2. Add a new panel
  3. Select Google Cloud Monitoring as the data source
  4. Configure the query:
    • Service: Compute Engine
    • Metric: Instance/CPU/Utilization
    • Group By: instance_name
    • Aggregation: mean
  5. Set the visualization type to "Time series"
  6. Add additional panels for metrics like:
    • Instance/Disk/Read Bytes
    • Instance/Disk/Write Bytes
    • Instance/Network/Received Bytes
    • Instance/Network/Sent Bytes
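
For dashboards with several similar panels, the panel layout can also be generated rather than clicked together. The following is a minimal sketch that produces a dashboard JSON skeleton with one time series panel per metric, arranged in two columns on Grafana's 24-column grid. The function build_dashboard is hypothetical, and the targets are deliberately left empty because the query schema of the Cloud Monitoring data source is not covered in this guide; add the queries in the panel editor after import.

```python
import json

GRID_WIDTH = 24  # Grafana's dashboard grid is 24 columns wide


def build_dashboard(title, panel_titles, columns=2, panel_height=8):
    """Return a dashboard skeleton with one time series panel per title."""
    width = GRID_WIDTH // columns
    panels = []
    for index, panel_title in enumerate(panel_titles):
        row, col = divmod(index, columns)
        panels.append({
            "id": index + 1,
            "title": panel_title,
            "type": "timeseries",
            "gridPos": {"x": col * width, "y": row * panel_height,
                        "w": width, "h": panel_height},
            "targets": [],  # left empty: configure the query in the panel editor
        })
    return {"title": title, "panels": panels}


if __name__ == "__main__":
    dashboard = build_dashboard("VM Instances", [
        "Instance/CPU/Utilization",
        "Instance/Disk/Read Bytes",
        "Instance/Disk/Write Bytes",
        "Instance/Network/Received Bytes",
        "Instance/Network/Sent Bytes",
    ])
    print(json.dumps(dashboard, indent=2))
```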

Using Variables for Dynamic Dashboards

Grafana variables enhance your dashboards by making them dynamic and reusable:

  1. Go to dashboard settings and select "Variables"
  2. Add a new variable:
    • Name: project
    • Type: Query
    • Data source: Google Cloud Monitoring
    • Query Type: Projects
  3. Add another variable:
    • Name: zone
    • Type: Query
    • Data source: Google Cloud Monitoring
    • Query Type: Label Values, using a Compute Engine metric and the zone label key
    • Include "All" option: Yes

Now you can filter your dashboard by project and zone using dropdown selectors.
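
Conceptually, Grafana substitutes the selected variable values into each query before sending it. The sketch below is a simplified model of that substitution, not Grafana's actual implementation: the function interpolate is hypothetical, it handles only the $name and ${name} forms, and it renders a multi-value selection as a regex alternation, which is an assumption about how such values would be used in a regex filter.

```python
import re

# Matches ${name} or $name.
_VAR = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def interpolate(query, variables):
    """Replace $name / ${name} in query with values from variables."""
    def replace(match):
        name = match.group(1) or match.group(2)
        if name not in variables:
            return match.group(0)  # leave unknown variables untouched
        value = variables[name]
        if isinstance(value, (list, tuple)):
            # Multi-value selection: join the values as-is into (a|b|c).
            return "(" + "|".join(value) + ")"
        return str(value)
    return _VAR.sub(replace, query)


if __name__ == "__main__":
    print(interpolate(
        "| filter resource.zone =~ '$zone'",
        {"zone": ["us-east1-b", "us-east1-c"]},
    ))
```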

Monitoring Google Kubernetes Engine (GKE)

Google Cloud Monitoring is particularly useful for monitoring GKE clusters:

  1. Create a new dashboard for Kubernetes monitoring
  2. Add panels for key metrics:

Cluster CPU Utilization

  • Service: Kubernetes Engine
  • Metric: Container/CPU/Utilization
  • Group By: cluster_name
  • Aggregation: mean

Node Memory Usage

  • Service: Kubernetes Engine
  • Metric: Container/Memory/Used Bytes
  • Group By: node_name
  • Aggregation: sum

Pod Count

  • Service: Kubernetes Engine
  • Metric: Pod/Count
  • Group By: cluster_name
  • Aggregation: mean

Advanced Techniques

Using Alerting with Google Cloud Monitoring

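Grafana allows you to set up alerts based on Google Cloud Monitoring metrics. The steps below follow the legacy panel-based alerting UI; in Grafana versions that use unified alerting, create an equivalent rule under Alerting > Alert rules with the same query and threshold: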

  1. Edit a panel with a Google Cloud Monitoring query
  2. Navigate to the "Alert" tab
  3. Set conditions, for example:
    • Evaluate: every 1m for 5m
    • Condition: WHEN last() OF query(A, 5m, now) IS ABOVE 0.8
  4. Add notifications and save the alert
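
To make the "every 1m for 5m" setting concrete, the sketch below models how such a rule moves between states: the condition is checked once per interval, a breach first puts the rule in a pending state, and it only starts alerting once the breach has lasted for the full "for" duration. This is a simplified model for illustration, not Grafana's evaluation engine; evaluate_alert and its parameters are hypothetical.

```python
def evaluate_alert(values, threshold=0.8, interval_s=60, for_s=300):
    """Return the rule state after each evaluation: 'ok', 'pending' or 'alerting'.

    values holds the last() value seen at each evaluation; None means no data
    and is treated like a non-breaching value in this sketch.
    """
    states = []
    breach_started = None  # time in seconds of the first evaluation in the current breach
    for i, value in enumerate(values):
        now = i * interval_s
        if value is None or value <= threshold:
            breach_started = None
            states.append("ok")
            continue
        if breach_started is None:
            breach_started = now
        # Fire only after the condition has held for the whole 'for' duration.
        states.append("alerting" if now - breach_started >= for_s else "pending")
    return states


if __name__ == "__main__":
    samples = [0.5, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.7]
    for minute, state in enumerate(evaluate_alert(samples)):
        print(minute, state)
```

With one evaluation per minute, a single dip below the threshold resets the pending timer, which is why short CPU spikes do not trigger a notification.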

Combining Multiple Data Sources

You can create powerful dashboards by combining Google Cloud Monitoring with other data sources:

  • Use Google Cloud Monitoring for infrastructure metrics
  • Combine with Prometheus for application metrics
  • Add Loki for logs analysis

This gives you a complete view of your system's performance from infrastructure to application.

Troubleshooting Common Issues

Missing Metrics

If you're not seeing expected metrics:

  1. Verify your service account has the correct permissions
  2. Check that the metric exists in Google Cloud Monitoring Console
  3. Ensure the time range in Grafana includes when the metrics were collected

Authentication Errors

If you're experiencing authentication issues:

  1. Verify the JWT file path is correct and accessible to Grafana
  2. Check that the service account is still enabled and that its key has not been deleted, disabled, or expired
  3. Ensure the required APIs are enabled in your GCP project

Query Performance

For slow-performing queries:

  1. Reduce the time range when possible
  2. Apply more specific filters
  3. Use pre-aggregated metrics when available
  4. Consider using MQL for more efficient queries
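
One way to reason about the first two points is to estimate how many data points a panel requests: roughly the time range divided by the aggregation period, per time series. The sketch below picks the smallest period from a fixed list that keeps the point count per series under a limit. It is an illustration only; the candidate periods, the 500-point limit, and the name pick_alignment_period are assumptions, not values taken from Grafana or Cloud Monitoring.

```python
import math

# Candidate aggregation periods in seconds: 1m, 5m, 15m, 1h, 6h, 1d.
CANDIDATE_PERIODS_S = (60, 300, 900, 3600, 21600, 86400)


def pick_alignment_period(range_s, max_points=500):
    """Return the smallest candidate period that yields at most max_points per series."""
    for period in CANDIDATE_PERIODS_S:
        if math.ceil(range_s / period) <= max_points:
            return period
    # Very long ranges: fall back to the coarsest period.
    return CANDIDATE_PERIODS_S[-1]


if __name__ == "__main__":
    for label, seconds in (("1 hour", 3600), ("6 hours", 21600), ("7 days", 604800)):
        print(label, pick_alignment_period(seconds))
```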

Summary

Google Cloud Monitoring is a powerful data source for Grafana that enables you to visualize and analyze metrics from your Google Cloud Platform resources. By following this guide, you've learned how to:

  • Set up Google Cloud Monitoring as a data source in Grafana
  • Create queries using both the Metrics query mode and MQL
  • Build dynamic dashboards for monitoring GCP resources
  • Apply advanced techniques for alerts and multi-source dashboards
  • Troubleshoot common issues

With these skills, you can create comprehensive monitoring solutions that provide valuable insights into your cloud infrastructure's performance and health.


Exercises

  1. Set up a Google Cloud Monitoring data source in your Grafana instance.
  2. Create a dashboard to monitor CPU, memory, and disk usage for your VM instances.
  3. Use MQL to create an advanced query that shows the top 5 instances by CPU usage.
  4. Create a dashboard variable that allows you to filter by GCP zone.
  5. Set up an alert that notifies you when instance CPU utilization exceeds 80% for more than 5 minutes.

