PromQL Data Types
Introduction
Prometheus Query Language (PromQL) is a powerful functional query language that lets you select and aggregate time series data stored in Prometheus. To effectively work with PromQL, it's essential to understand its underlying data types, as they determine how your queries are processed and what operations can be performed.
PromQL has four fundamental data types:
- Instant Vector
- Range Vector
- Scalar
- String
Each data type serves a specific purpose in the query language and is suitable for different scenarios. In this guide, we'll explore each of these data types in detail with examples to help you understand when and how to use them.
Instant Vector
An instant vector is the most common data type in PromQL. It represents a set of time series, each containing a single sample, with all samples sharing the same timestamp (the query evaluation time, which defaults to the current instant).
Syntax and Usage
http_requests_total
This simple query returns an instant vector containing all time series with the metric name http_requests_total.
Key Characteristics
- Represents multiple time series with a single sample per series
- Includes both the sample value and a set of labels for each time series
- Most PromQL functions and operators work with instant vectors
Example
Let's say we want to see the current values of HTTP requests across different endpoints:
http_requests_total
Output:
http_requests_total{instance="server1:9090", job="api-server", endpoint="/users"} => 1234
http_requests_total{instance="server1:9090", job="api-server", endpoint="/products"} => 532
http_requests_total{instance="server2:9090", job="api-server", endpoint="/users"} => 901
Filtering Instant Vectors
You can use label matchers to filter instant vectors:
http_requests_total{job="api-server", endpoint="/users"}
Output:
http_requests_total{instance="server1:9090", job="api-server", endpoint="/users"} => 1234
http_requests_total{instance="server2:9090", job="api-server", endpoint="/users"} => 901
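Equality is only one of the four matcher operators PromQL offers. As a rough sketch using the labels from the sample output (each expression is a separate query):

```promql
# Not equal: every endpoint except /products
http_requests_total{job="api-server", endpoint!="/products"}

# Regex match (the pattern is fully anchored): either endpoint
http_requests_total{endpoint=~"/users|/products"}

# Negative regex match: every instance except those on server2
http_requests_total{instance!~"server2:.*"}
```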
Range Vector
A range vector is a collection of time series data points over a specified time interval. Unlike instant vectors, which include a single sample per time series, range vectors include a range of samples for each time series.
Syntax and Usage
Range vectors are created by appending a time range in square brackets, such as [5m], to an instant vector selector:
http_requests_total[5m]
This returns the samples of http_requests_total recorded over the last 5 minutes.
Key Characteristics
- Contains a range of samples for each time series
- Includes timestamps for each sample
- Used for calculating rates, averages, or analyzing trends over time
- Limited set of functions can work directly with range vectors
Example
To see HTTP requests over the last 5 minutes:
http_requests_total[5m]
Output:
http_requests_total{instance="server1:9090", job="api-server", endpoint="/users"} =>
[(t=1600443600, v=1180), (t=1600443660, v=1203), (t=1600443720, v=1219), (t=1600443780, v=1234)]
http_requests_total{instance="server1:9090", job="api-server", endpoint="/products"} =>
[(t=1600443600, v=500), (t=1600443660, v=510), (t=1600443720, v=525), (t=1600443780, v=532)]
Common Time Range Selectors
Suffix | Description |
---|---|
ms | milliseconds |
s | seconds |
m | minutes |
h | hours |
d | days (always 24h) |
w | weeks (always 7d) |
y | years (always 365d) |
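Units can also be concatenated into one duration, and a range can be shifted into the past with the offset modifier. A minimal sketch, reusing the metric from the earlier examples (each expression is a separate query):

```promql
# Compound duration: units must run from largest to smallest
http_requests_total[1h30m]

# The same 5-minute window, shifted one day into the past
http_requests_total[5m] offset 1d
```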
Converting Range Vectors to Instant Vectors
Most PromQL functions require instant vectors as input. To convert a range vector to an instant vector, you can use functions like:
- rate(): Calculates the per-second average rate of increase
- increase(): Calculates the increase in the time series over the provided time range
- avg_over_time(): Calculates the average value over the provided range
Example:
rate(http_requests_total[5m])
Output (illustrative values; rate() drops the metric name from its result):
{instance="server1:9090", job="api-server", endpoint="/users"} => 0.9
{instance="server1:9090", job="api-server", endpoint="/products"} => 0.53
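The other range functions follow the same pattern: a range vector goes in, an instant vector comes out. The sketch below assumes that instance_cpu_usage is a gauge, which is what averaging and taking a maximum are meant for. Each expression is a separate query.

```promql
# Total increase of the counter over the last hour
increase(http_requests_total[1h])

# Average and peak of an assumed gauge over the last 10 minutes
avg_over_time(instance_cpu_usage[10m])
max_over_time(instance_cpu_usage[10m])

# Because rate() returns an instant vector, it can be aggregated directly
sum by (endpoint) (rate(http_requests_total[5m]))
```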
Scalar
A scalar is a simple numeric floating-point value. Unlike vector types, a scalar has no labels or timestamps associated with it.
Syntax and Usage
Scalars can be:
- Literal numeric values
- The result of scalar operations
- The result of functions that return scalars
100
Key Characteristics
- Single numeric value (float64)
- No associated labels
- Can be combined with vectors in binary operations
- Used for constants, thresholds, or simple calculations
Example
Scalar literals:
42
3.14159
Results of vector-to-scalar operations:
scalar(sum(http_requests_total))
Output:
2667
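A few more expressions that involve scalars may help here. In this sketch, process_start_time_seconds is an assumed metric name (many client libraries expose it, but it may not exist in your setup). Each expression is a separate query.

```promql
# Arithmetic between scalar literals yields a scalar
(1 + 2) * 3

# time() returns the evaluation timestamp (Unix seconds) as a scalar
time()

# Scalar minus instant vector yields an instant vector:
# seconds since each process started (metric name is an assumption)
time() - process_start_time_seconds
```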
Practical Uses of Scalars
Scalars are often used in combination with vectors for normalization or comparison:
# Normalize request counts by total requests
http_requests_total / scalar(sum(http_requests_total))
Output (arithmetic operators drop the metric name from the result):
{instance="server1:9090", job="api-server", endpoint="/users"} => 0.463
{instance="server1:9090", job="api-server", endpoint="/products"} => 0.199
{instance="server2:9090", job="api-server", endpoint="/users"} => 0.338
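scalar() collapses everything into one global total. When the denominator should instead be computed per group, vector matching is a possible alternative. A sketch, assuming the job label from the sample data is the grouping you want:

```promql
# Share of each series within its own job, without scalar()
http_requests_total
  / on(job) group_left
sum by (job) (http_requests_total)
```

With a single job, as in the sample data, this should give the same proportions as the scalar() version. With several jobs, each series is divided by its own job's total.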
String
The string data type represents a simple string value. This data type is rarely used directly in PromQL queries, as Prometheus is primarily designed for numeric time series data.
Key Characteristics
- Simple string value
- Cannot be directly queried or plotted
- Limited use in actual PromQL queries
- Mostly used as arguments to functions that operate on labels
Example
While you cannot query for string values directly, strings appear in label values:
http_requests_total{endpoint="/users"}
Here, "/users" is a string value used in label matching.
Common Use Cases for Strings
- Label selectors in queries
- Label manipulation in functions like label_replace()
- String arguments (label names and separators) in functions like label_join()
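To make the role of string arguments concrete, here is a small sketch of both functions applied to the sample series from earlier sections. The new label names host and target are made up for illustration. Each expression is a separate query.

```promql
# Copy the part of "instance" before the colon into a new "host" label
# (for instance="server1:9090" this gives host="server1")
label_replace(http_requests_total, "host", "$1", "instance", "(.*):.*")

# Join the values of "job" and "endpoint" with "-" into a new "target" label
label_join(http_requests_total, "target", "-", "job", "endpoint")
```

Both functions return an instant vector. The strings only steer how labels are rewritten.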
Type Conversion
PromQL allows conversion between different data types in certain situations:
Scalar to Instant Vector
A scalar can be converted to an instant vector with no labels:
vector(42)
Output:
{} => 42
Instant Vector to Scalar
You can convert a single-element instant vector to a scalar:
scalar(some_metric{label="value"})
Note: If the instant vector doesn't contain exactly one element, scalar() returns NaN rather than raising an error.
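Because scalar() yields NaN whenever its input has zero elements or more than one, it can help to reduce the vector to a single element first. A sketch, reusing the metric from earlier sections (each expression is a separate query):

```promql
# sum() without a grouping clause returns at most one series,
# so scalar() receives a single element whenever any series match
scalar(sum(http_requests_total))

# Fall back to 0 when no series match at all
scalar(sum(http_requests_total) or vector(0))
```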
Data Type Selection Guide
Here's a quick guide for when to use each data type:
Data Type | When to Use |
---|---|
Instant Vector | For current values, filtering, and most operations |
Range Vector | For calculating rates, trends, and time-based aggregations |
Scalar | For constants, thresholds, and simple numeric values |
String | For label matchers and string arguments to label functions |
Data Type Operations
Different operations are available depending on the data types involved:
Operations on Instant Vectors
- Arithmetic: +, -, *, /, %, ^
- Comparison: ==, !=, >, <, >=, <=
- Logical/Set: and, or, unless
- Aggregation: sum, min, max, avg, etc.
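A few of these operations, sketched against the sample metric. The last query assumes the series carry the job and instance labels of their scrape target, so they can be matched against the built-in up metric. Each expression is a separate query.

```promql
# Comparison as a filter: keep only series whose value exceeds 1000
http_requests_total > 1000

# With the bool modifier, every series is kept and the value becomes 0 or 1
http_requests_total > bool 1000

# Aggregation: one total per job
sum by (job) (http_requests_total)

# Set operation: keep request series only for targets that are currently up
http_requests_total and on(job, instance) (up == 1)
```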
Operations on Range Vectors
- Range functions: rate, increase, avg_over_time, etc.
Operations on Scalars
- Arithmetic between scalars: 1 + 2
- Arithmetic between scalar and instant vector: http_requests_total * 2
Practical Examples
Let's explore some real-world examples that demonstrate how to use different data types effectively.
Monitoring HTTP Error Rates Using Instant and Range Vectors
# Calculate the error rate as a percentage, per job
100 * sum by (job) (rate(http_requests_total{status=~"5.."}[5m])) / sum by (job) (rate(http_requests_total[5m]))
In this example:
- http_requests_total{status=~"5.."}[5m] is a range vector of error requests over 5 minutes
- http_requests_total[5m] is a range vector of all requests over 5 minutes
- rate() converts these range vectors to instant vectors representing per-second rates
- sum by (job) aggregates both sides to the same label set; without it, the division would only pair series with identical labels (including status), so each error series would be divided by itself
- The scalar 100 is used to convert to a percentage
- The resulting expression gives the error rate percentage for each job
Using Scalars for Dynamic Thresholds
# Flag instances that use more than 20% of total CPU
instance_cpu_usage / scalar(sum(instance_cpu_usage)) > 0.2
Here:
- instance_cpu_usage is an instant vector with CPU usage per instance
- scalar(sum(instance_cpu_usage)) converts the total CPU usage to a scalar
- We divide each instance's usage by the total to get a proportion
- The comparison > 0.2 filters for instances using more than 20% of resources
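The same idea works with a threshold relative to the average rather than the total. A sketch, where the factor 1.5 is an arbitrary value chosen for illustration:

```promql
# Instances whose CPU usage is more than 1.5x the fleet-wide average.
# "*" binds tighter than ">", so the scalar threshold is computed first.
instance_cpu_usage > scalar(avg(instance_cpu_usage)) * 1.5
```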
Using Range Vectors for Trend Analysis
# Detect whether request rate is increasing
deriv(rate(http_requests_total[5m])[30m:5m])
In this example:
- rate(http_requests_total[5m]) calculates the per-second rate
- [30m:5m] is a subquery that evaluates this rate every 5 minutes over a 30-minute window
- deriv() calculates the derivative (slope) of this trend
- Positive values indicate an increasing trend, negative values a decreasing trend
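Another way to look at trends is to compare the current rate with the rate at an earlier point, using the offset modifier. A sketch, where the one-week lookback is an arbitrary choice:

```promql
# Ratio of the current request rate to the rate one week earlier;
# values above 1 mean traffic is higher now than it was then
rate(http_requests_total[5m]) / rate(http_requests_total[5m] offset 1w)
```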
Summary
Understanding PromQL data types is fundamental to writing effective queries:
- Instant Vectors are the most common type, representing current values with labels
- Range Vectors are essential for rate calculations and time-based analysis
- Scalars are useful for constants, thresholds, and normalizations
- Strings are primarily used in label matching
Each data type has its specific use cases and supports different operations. By understanding these types and their relationships, you can construct more powerful and precise queries to monitor your systems effectively.
When writing PromQL queries, always consider:
- What data type do I need for my analysis?
- What operations are available for that data type?
- Do I need to convert between data types to achieve my goal?
Additional Resources
To deepen your understanding of PromQL data types:
- Prometheus Documentation: Query Basics
- Prometheus Documentation: Operators
- Prometheus Documentation: Functions
Exercises
- Write a PromQL query that shows the HTTP request rate per second for each service over the last 5 minutes.
- Create a query that calculates the percentage of memory used by each instance relative to the total memory used.
- Develop a query that shows services with an error rate greater than 1% in the last 10 minutes.
- Write a query that compares the current request rate with the rate 24 hours ago.
- Create an expression that identifies instances where CPU usage has increased by more than 20% in the last hour.