PostgreSQL pg_stat_statements
Introduction
Have you ever wondered which SQL queries are slowing down your PostgreSQL database? The pg_stat_statements
extension is your answer! This powerful extension tracks execution statistics for all SQL statements executed by your server, giving you visibility into query performance that would otherwise be difficult to obtain.
In this guide, we'll explore how to set up, use, and analyze data from the pg_stat_statements
extension to optimize your database performance. By the end, you'll have a solid understanding of how to identify problematic queries and improve your application's database interactions.
What is pg_stat_statements?
pg_stat_statements
is an official PostgreSQL extension that tracks execution statistics for all SQL statements. Unlike regular PostgreSQL logs that only show query text, this extension collects detailed metrics about each query's execution including:
- How many times a query was executed
- Total, mean, and maximum time spent on each query
- Number of rows affected
- Buffer usage statistics
- Temporary file usage
Think of it as a fitness tracker for your database - it measures how hard your queries are working and helps identify which ones need optimization.
Prerequisites
Before we begin, make sure you have:
- PostgreSQL installed (version 9.2 or later)
- Administrative access to your PostgreSQL server
- Basic knowledge of SQL commands
Setting Up pg_stat_statements
Step 1: Install the Extension
The extension is typically included with standard PostgreSQL installations, but we need to activate it. Connect to your database using psql
or another PostgreSQL client and run:
CREATE EXTENSION pg_stat_statements;
If successful, PostgreSQL will respond with:
CREATE EXTENSION
Step 2: Configure postgresql.conf
To enable the extension to collect data, you need to modify your postgresql.conf
file. Add or update these settings:
# Add pg_stat_statements to shared_preload_libraries
shared_preload_libraries = 'pg_stat_statements'
# Configure the module (optional but recommended)
pg_stat_statements.max = 10000 # maximum number of statements to track
pg_stat_statements.track = all # track all statements, including nested ones
Step 3: Restart PostgreSQL
After modifying the configuration, restart your PostgreSQL server for the changes to take effect:
# For systems using systemd
sudo systemctl restart postgresql
# For Debian/Ubuntu systems with specific version
sudo service postgresql restart
# For macOS with Homebrew
brew services restart postgresql
Step 4: Verify Installation
To confirm the extension is working properly, connect to your database and run:
SELECT * FROM pg_available_extensions WHERE name = 'pg_stat_statements';
You should see output like:
name | default_version | installed_version | comment
--------------------+-----------------+-------------------+--------------------------------------------------------
pg_stat_statements | 1.10 | 1.10 | track planning and execution statistics of all SQL statements executed
Using pg_stat_statements
Now that the extension is set up, let's learn how to use it to analyze query performance.
Basic Query to View Statement Statistics
The extension creates a view called pg_stat_statements
that you can query:
SELECT
queryid,
query,
calls,
total_exec_time,
min_exec_time,
max_exec_time,
mean_exec_time,
stddev_exec_time,
rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
This query shows your top 10 most time-consuming queries with execution statistics.
Understanding the Output
Let's break down the key columns in the result:
queryid
- A hash identifier for the normalized queryquery
- The SQL text of the query (with parameters replaced by 2, etc.)calls
- How many times the query has been executedtotal_exec_time
- Total time spent executing this query (in milliseconds)mean_exec_time
- Average execution time (in milliseconds)rows
- Total number of rows retrieved or affected by the query
For newer PostgreSQL versions (13+), you'll also see additional fields like:
total_plan_time
- Time spent planning the queryshared_blks_hit
andshared_blks_read
- Buffer cache hits and readstemp_blks_written
- Temporary files created
Practical Examples
Let's explore some common use cases for pg_stat_statements
:
Example 1: Finding Slow Queries
To identify queries that are taking the most time on average:
SELECT
substring(query, 1, 100) AS short_query,
round(mean_exec_time::numeric, 2) AS avg_time_ms,
calls,
round(total_exec_time::numeric, 2) AS total_time_ms
FROM pg_stat_statements
WHERE calls > 100 -- Ignore rarely run queries
ORDER BY mean_exec_time DESC
LIMIT 10;
This query helps you find slow operations that might benefit from optimization, even if they don't consume the most total time.
Example 2: Identifying High-Impact Queries
To find queries that have the highest overall impact (combining execution time and frequency):
SELECT
substring(query, 1, 100) AS short_query,
calls,
round(total_exec_time::numeric, 2) AS total_time_ms,
round(mean_exec_time::numeric, 2) AS avg_time_ms,
round((100 * total_exec_time / sum(total_exec_time) OVER ()), 2) AS percent_overall
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
This shows which queries consume the highest percentage of your database's execution time.
Example 3: Finding I/O Intensive Queries
In PostgreSQL 13+, you can identify queries with high disk I/O:
SELECT
substring(query, 1, 100) AS short_query,
calls,
shared_blks_hit,
shared_blks_read,
shared_blks_hit + shared_blks_read AS total_blocks,
round((shared_blks_hit::numeric / (shared_blks_hit + shared_blks_read)) * 100, 2) AS hit_percent
FROM pg_stat_statements
WHERE shared_blks_read > 1000
ORDER BY shared_blks_read DESC
LIMIT 10;
This helps identify queries that might benefit from additional indexing or memory tuning.
Visualizing Query Performance
To better understand query performance patterns, let's create a diagram showing how different factors contribute to overall query execution time:
Maintenance and Reset
The pg_stat_statements
view accumulates statistics until you explicitly reset it. To clear the statistics:
SELECT pg_stat_statements_reset();
This is useful when you want to measure performance changes after optimization or during specific application activities.
Common Issues and Troubleshooting
Missing Extension Error
If you see an error like:
ERROR: relation "pg_stat_statements" does not exist
Make sure:
- The extension is properly installed (
CREATE EXTENSION pg_stat_statements;
) - You've added it to
shared_preload_libraries
inpostgresql.conf
- You've restarted the PostgreSQL server
Memory Consumption
If the extension is consuming too much memory, adjust these settings in postgresql.conf
:
pg_stat_statements.max = 5000 # Lower the number of tracked statements
pg_stat_statements.track = top # Track only top-level statements to reduce memory usage
Best Practices for Using pg_stat_statements
-
Periodic Analysis: Schedule regular checks of your top queries to catch performance regressions early.
-
Reset Before Testing: Reset statistics before testing changes to get accurate measurements.
-
Focus on High-Impact Queries: Rather than optimizing every slow query, focus on those with the highest total impact (high frequency × high average time).
-
Look for Patterns: Sometimes similar queries have similar problems. Look for patterns in slow queries.
-
Combine with EXPLAIN: When you find a slow query, use PostgreSQL's
EXPLAIN ANALYZE
to dive deeper into execution plans.
Summary
The pg_stat_statements
extension is an invaluable tool for PostgreSQL performance tuning. By providing detailed statistics on query execution, it helps identify performance bottlenecks that might otherwise go unnoticed.
In this guide, we've learned:
- How to set up and configure the extension
- How to query it for performance insights
- How to identify slow, frequent, and resource-intensive queries
- Best practices for database performance monitoring
With these skills, you can significantly improve your application's database performance by targeting optimization efforts where they'll have the most impact.
Additional Resources
To further develop your PostgreSQL performance tuning skills:
- Explore the official PostgreSQL documentation on pg_stat_statements
- Learn about PostgreSQL's EXPLAIN command for detailed query plan analysis
- Study other PostgreSQL monitoring extensions like
pg_stat_activity
andauto_explain
Practice Exercises
- Install and configure
pg_stat_statements
on your development database - Run a variety of queries, then analyze which ones consume the most resources
- Identify your top 3 most time-consuming queries and practice optimizing them
- Create a periodic monitoring script that emails you when certain queries exceed performance thresholds
If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)