# ServiceMonitor Custom Resources

## Introduction
When running Prometheus in a Kubernetes environment, discovering and monitoring services dynamically becomes crucial, as containers are ephemeral and can move between nodes. The Prometheus Operator introduces the `ServiceMonitor` custom resource, which enables declarative service discovery and automatic scrape configuration for Prometheus.

A `ServiceMonitor` custom resource defines how a set of services should be monitored by Prometheus. It allows you to:
- Define which services Prometheus should scrape for metrics
- Specify the endpoints, paths, and protocols for scraping
- Configure scraping intervals and timeouts
- Set relabeling configurations for customizing metrics
In this guide, we'll explore how ServiceMonitor resources work, how to create them, and how they integrate with Prometheus in a Kubernetes environment.
## Prerequisites
Before diving into ServiceMonitors, ensure you have:
- A running Kubernetes cluster
- Prometheus Operator installed (typically via the kube-prometheus stack)
- Basic knowledge of Kubernetes services and Prometheus metrics
## Understanding ServiceMonitor Resources

### The ServiceMonitor Custom Resource Definition
The ServiceMonitor CRD extends the Kubernetes API, allowing you to define monitoring configurations as native Kubernetes objects. The Prometheus Operator watches for these objects and automatically configures Prometheus servers to scrape the services that match the specifications.
The relationship between the components is straightforward: you declare `ServiceMonitor` objects, the operator translates them into Prometheus scrape configurations, and the Prometheus servers it manages discover and scrape the matching service endpoints.
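One side of this relationship worth knowing about: the `Prometheus` custom resource itself declares which ServiceMonitors it picks up, via label selectors. A sketch (the `team` and `monitoring` label names below are illustrative, not required):

```yaml
# Hypothetical Prometheus custom resource: the operator configures this
# Prometheus instance to load every ServiceMonitor labeled team: frontend
# that lives in a namespace labeled monitoring: enabled.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring: enabled
```

If a ServiceMonitor you created never shows up as a scrape job, its labels failing to match the Prometheus resource's `serviceMonitorSelector` is a common cause.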
### Key Components of a ServiceMonitor
A ServiceMonitor resource includes several important fields:
- `selector`: Defines which services this monitor should target
- `namespaceSelector`: Specifies which namespaces to look in for matching services
- `endpoints`: Defines the endpoints from which metrics should be scraped
- `sampleLimit`: Optional limit on the number of samples accepted per scrape
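A minimal sketch tying these fields together (the `demo-monitor` name and `app: demo` label are placeholders); `sampleLimit` is the optional cap described above:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: demo-monitor       # placeholder name
  namespace: monitoring
spec:
  selector:                # which services to target
    matchLabels:
      app: demo
  namespaceSelector:       # where to look for them
    matchNames:
      - default
  sampleLimit: 10000       # optional: reject scrapes exceeding 10k samples
  endpoints:               # how to scrape them
    - port: metrics
```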
## Creating Your First ServiceMonitor
Let's create a basic ServiceMonitor to scrape metrics from a service named `example-app` in the `default` namespace:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    matchNames:
      - default
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics
```
In this example:
- The ServiceMonitor looks for services with the label `app: example-app`
- It only checks in the `default` namespace
- It scrapes the endpoint available on the `metrics` port
- It uses a 15-second scraping interval
- It expects metrics to be available at the `/metrics` path
### Corresponding Service Definition
For the ServiceMonitor to work, you would need a Kubernetes service with matching labels:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: default
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
      protocol: TCP
```
Notice that the service has the `app: example-app` label that matches our ServiceMonitor selector, and it includes a port named `metrics`, which is referenced in the ServiceMonitor's endpoints.
## Advanced ServiceMonitor Configurations

### Monitoring Services Across Multiple Namespaces
If you want to monitor services across multiple specific namespaces:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: cross-namespace-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      role: api
  namespaceSelector:
    matchNames:
      - production
      - staging
      - testing
  endpoints:
    - port: http-metrics
      interval: 30s
```
Or to monitor services across all namespaces:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: all-namespaces-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      role: api
  namespaceSelector:
    any: true
  endpoints:
    - port: http-metrics
```
### Configuring TLS and Authentication
For endpoints that require TLS or authentication:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: secure-app-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: secure-app
  endpoints:
    - port: https
      scheme: https
      tlsConfig:
        caFile: /etc/prometheus/certs/ca.crt
        certFile: /etc/prometheus/certs/client.crt
        keyFile: /etc/prometheus/certs/client.key
        insecureSkipVerify: false
      basicAuth:
        username:
          name: basic-auth
          key: username
        password:
          name: basic-auth
          key: password
```
This configuration uses TLS certificates and basic authentication stored in Kubernetes secrets to access a secure endpoint.
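The `basic-auth` secret referenced above must exist in the same namespace as the ServiceMonitor. A sketch of what it might look like (the credentials are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: basic-auth
  namespace: monitoring
type: Opaque
stringData:             # stringData takes plain values; Kubernetes base64-encodes them on write
  username: prometheus  # placeholder credentials
  password: change-me
```

Note that `tlsConfig` can also reference certificates stored in secrets or config maps rather than files mounted into the Prometheus pod; check the Prometheus Operator API documentation for the variant that fits your setup.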
### Relabeling Configurations
Relabeling allows you to modify the metadata of scraped metrics:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: relabel-example
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: metrics
      relabelings:
        - sourceLabels: [__meta_kubernetes_pod_name]
          targetLabel: pod_name
        - sourceLabels: [__meta_kubernetes_namespace]
          targetLabel: namespace
      metricRelabelings:
        - sourceLabels: [__name__]
          regex: 'container_memory_usage_bytes'
          action: drop
```
This example:
- Adds a `pod_name` label containing the pod name
- Adds a `namespace` label with the namespace name
- Drops any metrics with the name `container_memory_usage_bytes`
## Verifying Your ServiceMonitor Configuration
After creating a ServiceMonitor, you can verify it's working correctly:

- **Check if the ServiceMonitor is recognized:**

  ```bash
  kubectl get servicemonitors -n monitoring
  ```

- **Verify the Prometheus configuration:** Access your Prometheus UI (typically through a port-forward or ingress) and check the "Status" > "Configuration" page. You should see a scrape job generated from your ServiceMonitor.

- **Check if targets are being scraped:** In the Prometheus UI, go to "Status" > "Targets". Look for your service in the list of targets and verify it shows as "UP".
## Practical Example: Monitoring a Web Application
Let's walk through a complete example of setting up monitoring for a web application:
### 1. Deploy a Sample Application
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-webapp
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-webapp
  template:
    metadata:
      labels:
        app: example-webapp
    spec:
      containers:
        - name: webapp
          image: example/webapp:v1
          ports:
            - name: web
              containerPort: 8080
            - name: metrics
              containerPort: 8081
---
apiVersion: v1
kind: Service
metadata:
  name: example-webapp
  namespace: default
  labels:
    app: example-webapp
spec:
  selector:
    app: example-webapp
  ports:
    - name: web
      port: 80
      targetPort: web
    - name: metrics
      port: 8081
      targetPort: metrics
```
### 2. Create a ServiceMonitor for the Application
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-webapp-monitor
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-webapp
  namespaceSelector:
    matchNames:
      - default
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
```
### 3. View the Results
After deploying both resources:
- **Check that the ServiceMonitor is created:**

  ```bash
  kubectl get servicemonitor example-webapp-monitor -n monitoring
  ```

- **After a minute or two, check the Prometheus targets to ensure your application is being scraped:**

  ```bash
  kubectl port-forward svc/prometheus-operated 9090:9090 -n monitoring
  ```

  Then open `http://localhost:9090/targets` in your browser and look for your targets.

- **Query some metrics from your application:**

  ```promql
  http_requests_total{service="example-webapp"}
  ```
## Troubleshooting ServiceMonitors

### Common Issues
- **ServiceMonitor not selecting any services:**
  - Verify that labels match between the ServiceMonitor selector and the Service
  - Check that the namespaces match or are included in the namespaceSelector

- **Targets show as "DOWN" in Prometheus:**
  - Ensure the port name exists in your service definition
  - Verify the application is exposing metrics on the specified path
  - Check if any network policies are blocking Prometheus

- **Authentication fails:**
  - Verify that secrets referenced in the ServiceMonitor exist and contain the correct data
  - Check certificate expiration dates
Debugging Commands
# Check ServiceMonitor details
kubectl describe servicemonitor <name> -n <namespace>
# Verify endpoints on the service
kubectl get endpoints <service-name> -n <service-namespace>
# Check if Prometheus can reach the service
kubectl exec -it <prometheus-pod> -n <prometheus-namespace> -- wget -O- --timeout=2 http://<service>.<namespace>.svc:8080/metrics
## Integration with PrometheusRules
ServiceMonitors work well with PrometheusRules, which define alerting and recording rules:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-webapp-alerts
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
    - name: example-webapp
      rules:
        - alert: HighErrorRate
          expr: sum(rate(http_requests_total{job="example-webapp",status=~"5.."}[5m])) / sum(rate(http_requests_total{job="example-webapp"}[5m])) > 0.1
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "High error rate in example-webapp"
            description: "Error rate is above 10% for more than 10 minutes (current value: {{ $value }})"
```
This rule creates an alert when the error rate for our example-webapp exceeds 10% for more than 10 minutes.
## Best Practices

- **Use consistent labeling:**
  - Follow a standardized labeling convention for your services
  - Consider using Helm charts to ensure consistency

- **Organize by namespace:**
  - Group ServiceMonitors in a monitoring namespace
  - Use namespaceSelector thoughtfully to avoid unnecessary scraping

- **Set appropriate intervals:**
  - Balance between freshness of data and load on Prometheus
  - Consider the cardinality of metrics and storage requirements

- **Leverage relabeling:**
  - Add useful metadata to metrics using relabeling
  - Drop unnecessary metrics to reduce storage requirements

- **Monitor your monitors:**
  - Set up alerts for when ServiceMonitors aren't working
  - Create documentation for your monitoring setup
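For the last point, one lightweight approach is an alert that fires when an expected scrape target disappears entirely, which is the usual symptom of a broken ServiceMonitor. A sketch, assuming the `example-webapp` job from earlier (the rule name and labels are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: meta-monitoring
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
    - name: meta-monitoring
      rules:
        - alert: TargetMissing
          # absent() returns 1 when no `up` series exists for the job at all,
          # i.e. the ServiceMonitor stopped producing any targets
          expr: absent(up{job="example-webapp"})
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "No scrape targets found for example-webapp"
```

This complements the `up == 0` case, which only fires while a discovered target is failing; `absent()` catches the quieter failure where discovery itself stops working.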
## Summary
ServiceMonitor custom resources provide a powerful, Kubernetes-native way to configure service discovery for Prometheus. By defining which services should be monitored declaratively, you can:
- Automate the discovery and monitoring of services
- Apply consistent monitoring across your applications
- Integrate monitoring into your GitOps workflow
- Scale your monitoring as your application grows
They form an essential part of the Prometheus Operator ecosystem, working alongside other custom resources like Prometheus, Alertmanager, and PrometheusRule to create a comprehensive monitoring solution.
## Additional Resources
- Prometheus Operator Documentation
- Kubernetes Service Discovery in Prometheus
- RelabelConfig Documentation
## Exercises

- Create a ServiceMonitor for a simple application that exposes metrics on a custom path like `/custom-metrics`.
- Configure a ServiceMonitor that selects services based on multiple label criteria.
- Set up a ServiceMonitor that monitors services across all namespaces that have a specific label.
- Create a ServiceMonitor with relabeling configuration that adds pod and node information to metrics.
- Implement a ServiceMonitor that uses TLS for a secure connection to your metrics endpoint.