Rust Benchmark Tests
Introduction
Performance is often a critical aspect of software development, especially in systems programming languages like Rust. While functional correctness is verified through unit and integration tests, benchmark tests help us measure and optimize the performance of our code.
Benchmark tests in Rust allow developers to:
- Measure execution time of functions and code blocks
- Compare different implementations of the same functionality
- Detect performance regressions during development
- Make data-driven optimization decisions
In this guide, we'll explore how to write, run, and interpret benchmark tests in Rust, providing you with the tools to ensure your Rust code not only works correctly but also performs efficiently.
Understanding Benchmark Tests
Benchmark tests measure the performance of your code, typically in terms of execution time. Unlike regular tests that verify functional correctness, benchmarks focus on how fast your code runs.
Why Write Benchmark Tests?
- Optimization: Identify bottlenecks in your code
- Comparison: Compare different algorithms or implementations
- Regression Detection: Ensure code changes don't decrease performance
- Decision Making: Make informed choices based on actual performance metrics
Setting Up Benchmark Tests in Rust
Rust's built-in benchmark tests are currently only available on the nightly channel through the unstable `test` feature. The Criterion library used in this guide, however, works on stable Rust. Let's set things up step by step.
Prerequisites
- Install Rust nightly (only needed for the built-in `#[bench]` harness shown later; Criterion runs on stable):

```bash
rustup install nightly
```

- Create a new Rust project or navigate to an existing one:

```bash
cargo new benchmark_example
cd benchmark_example
```
Project Configuration
To enable benchmarks, update your Cargo.toml file:

```toml
[package]
name = "benchmark_example"
version = "0.1.0"
edition = "2021"

[dev-dependencies]
criterion = "0.5"

[[bench]]
name = "my_benchmark"
harness = false
```
This configuration:
- Adds Criterion as a development dependency (a popular benchmarking library for Rust)
- Creates a benchmark target named "my_benchmark" with the default test harness disabled, so Criterion can provide its own main function
Writing Your First Benchmark
Let's create a simple function to benchmark. Add this to your src/lib.rs file:

```rust
#[inline]
pub fn fibonacci(n: u64) -> u64 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

// Alternative implementation using iteration
#[inline]
pub fn fibonacci_iterative(n: u64) -> u64 {
    if n <= 1 {
        return n;
    }
    let mut a = 0;
    let mut b = 1;
    for _ in 2..=n {
        let temp = a + b;
        a = b;
        b = temp;
    }
    b
}
```
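Before benchmarking, it can be worth a quick check that both implementations agree. A minimal test sketch (the test module and name are our own, not part of the original project):

```rust
#[cfg(test)]
mod tests {
    use super::*;

    // Sanity check: the recursive and iterative implementations should match.
    #[test]
    fn implementations_agree() {
        for n in 0..20 {
            assert_eq!(fibonacci(n), fibonacci_iterative(n));
        }
    }
}
```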
Now, create a benches directory and a file named my_benchmark.rs inside it:

```bash
mkdir -p benches
touch benches/my_benchmark.rs
```

Add the following code to benches/my_benchmark.rs:

```rust
use benchmark_example::*;
use criterion::{black_box, criterion_group, criterion_main, Criterion};

pub fn fibonacci_benchmark(c: &mut Criterion) {
    // Simple benchmark
    c.bench_function("fibonacci 10", |b| b.iter(|| fibonacci(black_box(10))));

    // Benchmark comparing two implementations
    let mut group = c.benchmark_group("fibonacci_comparison");
    group.bench_function("recursive", |b| b.iter(|| fibonacci(black_box(20))));
    group.bench_function("iterative", |b| b.iter(|| fibonacci_iterative(black_box(20))));
    group.finish();
}

criterion_group!(benches, fibonacci_benchmark);
criterion_main!(benches);
```
Understanding the Code
- `black_box` prevents the compiler from optimizing away the function call
- `bench_function` runs a benchmarked function multiple times
- `benchmark_group` groups related benchmarks for better comparison
- `criterion_group` and `criterion_main` are macros for setting up the benchmark infrastructure
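To see why `black_box` matters, consider benchmarking a function whose result the compiler could compute at compile time. A minimal illustrative sketch (the `square` helper is our own example, not part of the project above):

```rust
use criterion::{black_box, Criterion};

// Hypothetical helper used only for this illustration.
fn square(x: u64) -> u64 {
    x * x
}

pub fn black_box_demo(c: &mut Criterion) {
    // Without black_box, the optimizer may fold square(4) into the constant 16,
    // so this benchmark could end up measuring almost nothing.
    c.bench_function("square constant-folded", |b| b.iter(|| square(4)));

    // With black_box, the input is opaque to the optimizer, so the call is actually measured.
    c.bench_function("square black_box", |b| b.iter(|| square(black_box(4))));
}
```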
Running Benchmark Tests
Run your benchmarks with:

```bash
cargo bench
```

Example output:

```
fibonacci 10            time:   [227.91 ns 229.39 ns 231.06 ns]
fibonacci_comparison/recursive
                        time:   [27.518 µs 27.644 µs 27.791 µs]
fibonacci_comparison/iterative
                        time:   [12.839 ns 12.917 ns 12.998 ns]
```
Interpreting the Results
The output shows:
- The benchmark name
- Time measurements in a [lower bound, estimate, upper bound] format
- Units (ns = nanoseconds, µs = microseconds, ms = milliseconds)
In our example, we can see that:
- The recursive Fibonacci calculation for n=20 takes approximately 27.6 microseconds
- The iterative implementation takes only 12.9 nanoseconds (about 2,000 times faster!)
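You don't have to run every benchmark each time. Criterion treats the first positional argument as a name filter, so a subset can be run like this:

```bash
# Runs only benchmarks whose names contain "fibonacci"
cargo bench -- fibonacci
```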
Advanced Benchmarking Techniques
Parameterized Benchmarks
Test a function with multiple inputs:

```rust
pub fn parameterized_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("fibonacci_sizes");
    for size in [5, 10, 15, 20].iter() {
        group.bench_with_input(format!("iterative_{}", size), size, |b, &size| {
            b.iter(|| fibonacci_iterative(black_box(size)))
        });
    }
    group.finish();
}
```
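Criterion also provides a `BenchmarkId` type for naming parameterized benchmarks, which the throughput example below uses as well. An equivalent sketch (the function name and group name here are our own):

```rust
use benchmark_example::fibonacci_iterative;
use criterion::{black_box, BenchmarkId, Criterion};

pub fn parameterized_benchmark_with_ids(c: &mut Criterion) {
    let mut group = c.benchmark_group("fibonacci_sizes_ids");
    for size in [5u64, 10, 15, 20].iter() {
        // BenchmarkId::new("iterative", 20) renders as "fibonacci_sizes_ids/iterative/20".
        group.bench_with_input(BenchmarkId::new("iterative", size), size, |b, &size| {
            b.iter(|| fibonacci_iterative(black_box(size)))
        });
    }
    group.finish();
}
```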
Throughput Benchmarks
Measure performance in terms of throughput:

```rust
use criterion::BenchmarkId;
use criterion::Throughput;

pub fn throughput_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("process_vector");
    for size in [1000, 10000, 100000].iter() {
        // Create input data
        let data = vec![1u64; *size];
        // Set throughput in bytes
        group.throughput(Throughput::Bytes(*size as u64 * std::mem::size_of::<u64>() as u64));
        group.bench_with_input(BenchmarkId::from_parameter(size), size, |b, &_| {
            b.iter(|| {
                // Example function that processes a vector
                black_box(data.iter().map(|&x| x * 2).sum::<u64>())
            })
        });
    }
    group.finish();
}
```
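If element counts are more meaningful than byte counts for your workload, Criterion also offers `Throughput::Elements`. Swapping one line in the sketch above would report elements per second instead:

```rust
// Report throughput as elements processed per second rather than bytes per second.
group.throughput(Throughput::Elements(*size as u64));
```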
Custom Measurements
Criterion also supports measurement types beyond wall-clock time through its measurement API, although the example below still measures the time spent allocating:

```rust
pub fn memory_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("memory_usage");
    // This is a simplified example. In practice, measuring memory would require
    // custom measurement tools or libraries.
    group.bench_function("allocation", |b| {
        b.iter(|| {
            // Allocate and use memory
            let v = black_box(vec![0u8; 1_000_000]);
            black_box(v.len())
        })
    });
    group.finish();
}
```
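You can also tune how Criterion samples a group, which helps when allocation or I/O makes individual iterations noisy. A sketch using the group configuration methods `sample_size` and `measurement_time` (the group name here is our own):

```rust
use std::time::Duration;

use criterion::{black_box, Criterion};

pub fn tuned_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("memory_usage_tuned");
    // Collect fewer samples, but measure for longer, to smooth out allocator noise.
    group.sample_size(50);
    group.measurement_time(Duration::from_secs(10));
    group.bench_function("allocation", |b| {
        b.iter(|| {
            let v = black_box(vec![0u8; 1_000_000]);
            black_box(v.len())
        })
    });
    group.finish();
}
```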
Real-World Example: String Concatenation
Let's benchmark different ways to concatenate strings:
First, add these functions to src/lib.rs:

```rust
pub fn concat_with_push(strings: &[&str]) -> String {
    let mut result = String::new();
    for s in strings {
        result.push_str(s);
    }
    result
}

pub fn concat_with_format(strings: &[&str]) -> String {
    let mut result = String::new();
    for s in strings {
        result = format!("{}{}", result, s);
    }
    result
}

pub fn concat_with_join(strings: &[&str]) -> String {
    strings.join("")
}
```
Now, add this benchmark to benches/my_benchmark.rs:

```rust
pub fn string_concatenation_benchmark(c: &mut Criterion) {
    let strings = ["Hello", ", ", "World", "! ", "This", " ", "is", " ", "a", " ", "test"];
    let mut group = c.benchmark_group("string_concatenation");
    group.bench_function("push_str", |b| {
        b.iter(|| concat_with_push(black_box(&strings)))
    });
    group.bench_function("format", |b| {
        b.iter(|| concat_with_format(black_box(&strings)))
    });
    group.bench_function("join", |b| {
        b.iter(|| concat_with_join(black_box(&strings)))
    });
    group.finish();
}

// Update the criterion_group! macro
criterion_group!(benches, fibonacci_benchmark, string_concatenation_benchmark);
```
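If you want to see how pre-allocation affects these numbers, you could also benchmark a fourth variant that reserves capacity up front. This function is our own addition, not part of the example above:

```rust
// Hypothetical variant: pre-allocate the output buffer before appending.
pub fn concat_with_capacity(strings: &[&str]) -> String {
    let total_len: usize = strings.iter().map(|s| s.len()).sum();
    let mut result = String::with_capacity(total_len);
    for s in strings {
        result.push_str(s);
    }
    result
}
```

Adding `group.bench_function("with_capacity", |b| b.iter(|| concat_with_capacity(black_box(&strings))))` to the group lets you compare it directly against the other three approaches.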
Visualizing Results
Criterion automatically generates HTML reports with graphs in the target/criterion
directory. These include:
- Line graphs showing performance over time
- Violin plots showing the distribution of measurements
- Comparison to previous runs
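Assuming a default setup and a recent Criterion version, the entry point for these reports is a single index page you can open in a browser:

```bash
# Report index generated by Criterion (exact path may vary with version and configuration)
xdg-open target/criterion/report/index.html
```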
Best Practices for Benchmark Tests
- Isolate Environment: Run benchmarks on stable hardware with minimal background processes
- Multiple Iterations: Run benchmarks multiple times to ensure consistency. When per-iteration setup work should not be measured, use `iter_batched`:

  ```rust
  c.bench_function("my_function", |b| {
      b.iter_batched(
          || setup_test_data(),     // Setup code (not measured)
          |data| my_function(data), // Benchmarked code
          criterion::BatchSize::SmallInput,
      )
  });
  ```

- Avoid Microbenchmarking Pitfalls:
  - Be aware of compiler optimizations
  - Use `black_box` to prevent optimizations from invalidating results
  - Benchmark realistic use cases, not just isolated functions
- Compare Similar Things: When comparing implementations, ensure they have the same inputs and outputs
- Version Control Benchmarks: Keep benchmark code in version control to track performance over time
- Warm-up Runs: Allow for warm-up iterations before measuring (Criterion handles this automatically)
Other Benchmarking Libraries
While Criterion is the most popular benchmarking library for Rust, other options include:
- Built-in benchmark tests: Require nightly Rust

  ```rust
  #![feature(test)]
  extern crate test;

  #[cfg(test)]
  mod tests {
      use super::*;
      use test::Bencher;

      #[bench]
      fn bench_fibonacci(b: &mut Bencher) {
          b.iter(|| fibonacci(20));
      }
  }
  ```

- test-bench: A simpler alternative to Criterion
- iai: Counts CPU instructions instead of measuring time (more stable across environments)
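As a rough sketch of what an iai benchmark looks like, following the crate's documented pattern (it also needs its own `[[bench]]` target with `harness = false`, and the function name here is our own):

```rust
use benchmark_example::fibonacci_iterative;
use iai::{black_box, main};

// iai benchmarks are plain functions; the framework counts instructions, not time.
fn iai_fibonacci() -> u64 {
    fibonacci_iterative(black_box(20))
}

main!(iai_fibonacci);
```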
Common Performance Bottlenecks in Rust
When using benchmark tests to optimize Rust code, look for these common issues:
- Excessive Allocation: Creating too many objects on the heap
- Inefficient Algorithms: Using O(n²) algorithms where O(n log n) would suffice
- Locks and Contention: Synchronization overhead in multi-threaded code
- Inefficient I/O: Blocking I/O operations or too many small operations
- Unnecessary Cloning: Creating copies of data when references would suffice (see the sketch after this list)
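As an example of the last point, a small benchmark along these lines (our own illustration, not part of the guide's project) can make the cost of cloning visible:

```rust
use criterion::{black_box, Criterion};

pub fn cloning_benchmark(c: &mut Criterion) {
    let data: Vec<u64> = (0..10_000).collect();
    let mut group = c.benchmark_group("clone_vs_borrow");

    // Clones the whole vector on every iteration before summing it.
    group.bench_function("clone", |b| {
        b.iter(|| {
            let owned = black_box(&data).clone();
            owned.iter().sum::<u64>()
        })
    });

    // Sums through a shared reference; no allocation or copy.
    group.bench_function("borrow", |b| {
        b.iter(|| black_box(&data).iter().sum::<u64>())
    });

    group.finish();
}
```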
Summary
Benchmark tests are a powerful tool in the Rust developer's toolkit:
- They provide objective measurements of code performance
- They help identify bottlenecks and optimization opportunities
- They prevent performance regressions as code evolves
- They allow for data-driven decisions when choosing between implementations
By incorporating benchmark tests into your development workflow, you can ensure your Rust code not only functions correctly but also performs efficiently.
Exercises
- Write benchmark tests comparing different sorting algorithms (e.g., quicksort vs. mergesort) for various input sizes.
- Create a benchmark to compare the performance of different data structures (Vec, LinkedList, HashSet) for your specific use case.
- Benchmark the performance difference between using iterators and traditional for loops for a specific task.
- Profile a function from one of your existing projects and use benchmark tests to measure the impact of your optimizations.