Skip to main content

PostgreSQL Window Functions

Introduction

Window functions are a powerful feature in PostgreSQL that allow you to perform calculations across a set of table rows that are somehow related to the current row. Unlike regular aggregate functions that group rows into a single output row, window functions retain all rows in the result while adding calculated values based on a set of rows related to each row.

Window functions are perfect when you need to:

  • Calculate running totals or moving averages
  • Rank rows within a partition
  • Access data from preceding or following rows
  • Compare values across related rows

If you've ever needed to compare each row to others or calculate cumulative values without losing the detail of individual rows, window functions are the solution you've been looking for.

Basic Syntax

The general syntax for a window function is:

sql
SELECT
column1,
column2,
window_function(arg1, arg2, ...) OVER (
PARTITION BY partition_expression
ORDER BY sort_expression
frame_clause
)
FROM table_name;

Let's break down the components:

  • window_function: The function to apply (like SUM, AVG, ROW_NUMBER, etc.)
  • OVER clause: Defines the "window" of rows to operate on
  • PARTITION BY: Optional; divides rows into groups or partitions
  • ORDER BY: Optional; defines the order of rows within each partition
  • frame_clause: Optional; further restricts the set of rows within the partition

Common Window Functions

Row Numbering Functions

PostgreSQL provides several functions for numbering rows:

ROW_NUMBER()

Assigns a unique number to each row in the result set.

sql
SELECT 
product_name,
category,
price,
ROW_NUMBER() OVER (ORDER BY price DESC) as row_num
FROM products;

Output:

  product_name  | category  | price | row_num 
----------------+-----------+-------+---------
Smartphone X | Electronics| 999.99| 1
Laptop Pro | Electronics| 849.50| 2
Coffee Machine | Appliances | 129.99| 3
Desk Lamp | Home | 45.50| 4
Tea Kettle | Appliances | 29.99| 5

RANK() and DENSE_RANK()

These functions assign ranks to rows, with RANK() leaving gaps for ties and DENSE_RANK() assigning consecutive ranks.

sql
SELECT 
product_name,
category,
price,
RANK() OVER (ORDER BY price DESC) as rank,
DENSE_RANK() OVER (ORDER BY price DESC) as dense_rank
FROM products;

Output:

  product_name  | category  | price | rank | dense_rank 
----------------+-----------+-------+------+------------
Smartphone X | Electronics| 999.99| 1 | 1
Laptop Pro | Electronics| 849.50| 2 | 2
Coffee Machine | Appliances | 129.99| 3 | 3
Desk Lamp | Home | 45.50| 4 | 4
Tea Kettle | Appliances | 29.99| 5 | 5

If we add a product with the same price as another:

sql
SELECT 
product_name,
category,
price,
RANK() OVER (ORDER BY price DESC) as rank,
DENSE_RANK() OVER (ORDER BY price DESC) as dense_rank
FROM products;

Output with ties:

  product_name  | category  | price | rank | dense_rank 
----------------+-----------+-------+------+------------
Smartphone X | Electronics| 999.99| 1 | 1
Laptop Pro | Electronics| 849.50| 2 | 2
Tablet Mini | Electronics| 129.99| 3 | 3
Coffee Machine | Appliances | 129.99| 3 | 3
Desk Lamp | Home | 45.50| 5 | 4
Tea Kettle | Appliances | 29.99| 6 | 5

Notice how RANK() skips rank 4 after the tie, while DENSE_RANK() uses consecutive ranks.

NTILE(n)

This function divides rows into a specified number of approximately equal groups.

sql
SELECT 
product_name,
category,
price,
NTILE(3) OVER (ORDER BY price DESC) as price_tier
FROM products;

Output:

  product_name  | category  | price | price_tier 
----------------+-----------+-------+------------
Smartphone X | Electronics| 999.99| 1
Laptop Pro | Electronics| 849.50| 1
Tablet Mini | Electronics| 129.99| 2
Coffee Machine | Appliances | 129.99| 2
Desk Lamp | Home | 45.50| 3
Tea Kettle | Appliances | 29.99| 3

Aggregate Window Functions

These functions perform calculations across rows within the window.

SUM(), AVG(), MIN(), MAX()

sql
SELECT 
category,
product_name,
price,
SUM(price) OVER (PARTITION BY category) as category_total,
AVG(price) OVER (PARTITION BY category) as category_avg,
price / SUM(price) OVER (PARTITION BY category) * 100 as pct_of_category
FROM products
ORDER BY category, price DESC;

Output:

  category   | product_name  | price | category_total | category_avg | pct_of_category 
-------------+---------------+-------+----------------+--------------+----------------
Appliances | Coffee Machine| 129.99| 159.98 | 79.99 | 81.25
Appliances | Tea Kettle | 29.99| 159.98 | 79.99 | 18.75
Electronics | Smartphone X | 999.99| 1979.48 | 659.83 | 50.52
Electronics | Laptop Pro | 849.50| 1979.48 | 659.83 | 42.92
Electronics | Tablet Mini | 129.99| 1979.48 | 659.83 | 6.56
Home | Desk Lamp | 45.50| 45.50 | 45.50 | 100.00

This query shows:

  • Total price of all products in each category
  • Average price of products in each category
  • Each product's price as a percentage of its category total

Running Totals and Moving Calculations

Window functions can calculate running totals or moving averages:

sql
SELECT
order_date,
order_amount,
SUM(order_amount) OVER (ORDER BY order_date) as running_total,
AVG(order_amount) OVER (
ORDER BY order_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) as moving_avg_3days
FROM orders
ORDER BY order_date;

Output:

 order_date | order_amount | running_total | moving_avg_3days 
------------+--------------+---------------+------------------
2023-01-01 | 100.00 | 100.00 | 100.00
2023-01-02 | 150.00 | 250.00 | 125.00
2023-01-03 | 200.00 | 450.00 | 150.00
2023-01-04 | 125.00 | 575.00 | 158.33
2023-01-05 | 175.00 | 750.00 | 166.67

Working with Window Frame Clauses

The frame clause defines which rows within a partition are included in the calculation. The default frame depends on whether you have an ORDER BY clause:

  • With ORDER BY: Default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  • Without ORDER BY: Default is RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

Common Frame Clauses

sql
-- Current row and all previous rows
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

-- Current row and one previous row
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW

-- Current row, one previous, and one following row
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING

-- All rows in the partition
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

Example: Moving Averages

sql
SELECT
date,
daily_sales,
AVG(daily_sales) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) as seven_day_average
FROM daily_sales
ORDER BY date;

Output:

    date    | daily_sales | seven_day_average 
------------+-------------+-------------------
2023-01-01 | 1200.00| 1200.00
2023-01-02 | 1350.00| 1275.00
2023-01-03 | 1100.00| 1216.67
2023-01-04 | 1450.00| 1275.00
2023-01-05 | 1300.00| 1280.00
2023-01-06 | 1050.00| 1241.67
2023-01-07 | 1100.00| 1221.43
2023-01-08 | 1400.00| 1250.00

Working with PARTITION BY

The PARTITION BY clause divides the result set into partitions to which the window function is applied separately.

sql
SELECT
department,
employee_name,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) as dept_salary_rank,
AVG(salary) OVER (PARTITION BY department) as dept_avg_salary,
MAX(salary) OVER (PARTITION BY department) as dept_max_salary
FROM employees
ORDER BY department, dept_salary_rank;

Output:

 department | employee_name | salary  | dept_salary_rank | dept_avg_salary | dept_max_salary 
------------+---------------+---------+------------------+-----------------+----------------
Engineering| John Smith | 95000.00| 1 | 82500.00 | 95000.00
Engineering| Lisa Jones | 90000.00| 2 | 82500.00 | 95000.00
Engineering| Mike Johnson | 85000.00| 3 | 82500.00 | 95000.00
Engineering| Sarah Williams| 60000.00| 4 | 82500.00 | 95000.00
Marketing | David Lee | 78000.00| 1 | 68000.00 | 78000.00
Marketing | Emily Chen | 72000.00| 2 | 68000.00 | 78000.00
Marketing | Robert Garcia | 65000.00| 3 | 68000.00 | 78000.00
Marketing | Mary Brown | 57000.00| 4 | 68000.00 | 78000.00
Sales | James Wilson | 82000.00| 1 | 73250.00 | 82000.00
Sales | Patricia Moore| 80000.00| 2 | 73250.00 | 82000.00
Sales | Thomas Taylor | 75000.00| 3 | 73250.00 | 82000.00
Sales | Jennifer Lewis| 56000.00| 4 | 73250.00 | 82000.00

Advanced Window Function Examples

Percentiles

sql
SELECT
product_name,
category,
price,
PERCENT_RANK() OVER (ORDER BY price) as percent_rank,
CUME_DIST() OVER (ORDER BY price) as cumulative_distribution
FROM products;

Output:

  product_name  | category  | price | percent_rank | cumulative_distribution 
----------------+-----------+-------+--------------+-------------------------
Tea Kettle | Appliances| 29.99| 0.0 | 0.1667
Desk Lamp | Home | 45.50| 0.25 | 0.3333
Coffee Machine | Appliances| 129.99| 0.5 | 0.6667
Laptop Pro | Electronics| 849.50| 0.75 | 0.8333
Smartphone X | Electronics| 999.99| 1.0 | 1.0000

Lag and Lead Functions

These functions access data from preceding or following rows without requiring a self-join.

sql
SELECT
order_date,
order_amount,
LAG(order_amount, 1) OVER (ORDER BY order_date) as previous_day_amount,
order_amount - LAG(order_amount, 1) OVER (ORDER BY order_date) as day_over_day_change,
LEAD(order_amount, 1) OVER (ORDER BY order_date) as next_day_amount
FROM daily_orders
ORDER BY order_date;

Output:

 order_date | order_amount | previous_day_amount | day_over_day_change | next_day_amount 
------------+--------------+---------------------+---------------------+-----------------
2023-01-01 | 1200.00 | NULL | NULL | 1350.00
2023-01-02 | 1350.00 | 1200.00 | 150.00 | 1100.00
2023-01-03 | 1100.00 | 1350.00 | -250.00 | 1450.00
2023-01-04 | 1450.00 | 1100.00 | 350.00 | 1300.00
2023-01-05 | 1300.00 | 1450.00 | -150.00 | NULL

First_Value and Last_Value

These functions retrieve values from the first or last row in the window frame.

sql
SELECT
department,
employee_name,
salary,
FIRST_VALUE(employee_name) OVER (
PARTITION BY department
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as highest_paid_employee,
LAST_VALUE(employee_name) OVER (
PARTITION BY department
ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as lowest_paid_employee
FROM employees
ORDER BY department, salary DESC;

Output:

 department | employee_name | salary  | highest_paid_employee | lowest_paid_employee 
------------+---------------+---------+-----------------------+----------------------
Engineering| John Smith | 95000.00| John Smith | Sarah Williams
Engineering| Lisa Jones | 90000.00| John Smith | Sarah Williams
Engineering| Mike Johnson | 85000.00| John Smith | Sarah Williams
Engineering| Sarah Williams| 60000.00| John Smith | Sarah Williams
Marketing | David Lee | 78000.00| David Lee | Mary Brown
Marketing | Emily Chen | 72000.00| David Lee | Mary Brown
Marketing | Robert Garcia | 65000.00| David Lee | Mary Brown
Marketing | Mary Brown | 57000.00| David Lee | Mary Brown
Sales | James Wilson | 82000.00| James Wilson | Jennifer Lewis
Sales | Patricia Moore| 80000.00| James Wilson | Jennifer Lewis
Sales | Thomas Taylor | 75000.00| James Wilson | Jennifer Lewis
Sales | Jennifer Lewis| 56000.00| James Wilson | Jennifer Lewis

Real-World Applications

Analyzing Sales Data

Window functions are perfect for sales analysis:

sql
SELECT
product_category,
DATE_TRUNC('month', order_date) as month,
SUM(order_amount) as monthly_sales,
SUM(SUM(order_amount)) OVER (
PARTITION BY product_category
ORDER BY DATE_TRUNC('month', order_date)
ROWS UNBOUNDED PRECEDING
) as cumulative_sales,
SUM(order_amount) / SUM(SUM(order_amount)) OVER (
PARTITION BY product_category
) * 100 as percentage_of_category_total
FROM sales
GROUP BY product_category, DATE_TRUNC('month', order_date)
ORDER BY product_category, month;

Employee Salary Analysis

sql
SELECT
e.employee_id,
e.name,
e.department,
e.salary,
e.hire_date,
ROUND(e.salary / AVG(e.salary) OVER (PARTITION BY e.department) * 100, 2) as percent_of_dept_avg,
RANK() OVER (PARTITION BY e.department ORDER BY e.salary DESC) as dept_salary_rank,
DENSE_RANK() OVER (ORDER BY e.salary DESC) as company_salary_rank,
FIRST_VALUE(e.name) OVER (
PARTITION BY e.department
ORDER BY e.hire_date
ROWS UNBOUNDED PRECEDING
) as dept_most_senior,
LAG(e.salary) OVER (
PARTITION BY e.department
ORDER BY e.hire_date
) as previous_hire_salary
FROM employees e
ORDER BY e.department, e.salary DESC;

Time Series Analysis

Window functions excel at time series analysis:

sql
SELECT
date,
stock_price,
AVG(stock_price) OVER (
ORDER BY date
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
) as moving_avg_5day,
stock_price - LAG(stock_price, 5) OVER (ORDER BY date) as price_change_5day,
(stock_price / LAG(stock_price, 5) OVER (ORDER BY date) - 1) * 100 as percent_change_5day
FROM stock_prices
ORDER BY date;

Combining Multiple Window Functions

You can use multiple window functions in a single query:

sql
SELECT
department,
employee_name,
salary,
RANK() OVER dept_salary_window as salary_rank,
AVG(salary) OVER dept_window as dept_avg,
MAX(salary) OVER dept_window as dept_max,
MIN(salary) OVER dept_window as dept_min,
salary - AVG(salary) OVER dept_window as diff_from_avg
FROM employees
WINDOW
dept_window AS (PARTITION BY department),
dept_salary_window AS (PARTITION BY department ORDER BY salary DESC)
ORDER BY department, salary_rank;

This approach uses the WINDOW clause to define window specifications once and reuse them, making the query more readable.

Performance Considerations

Window functions are powerful but can be resource-intensive. Here are some tips:

  1. Use indexes: Ensure the columns in PARTITION BY and ORDER BY clauses are indexed.
  2. Limit partitions: Excessively large numbers of partitions can degrade performance.
  3. Consider materialized views: For frequently used window calculations, consider materialized views.
  4. Monitor execution plans: Use EXPLAIN ANALYZE to understand how PostgreSQL executes your window functions.

Summary

Window functions in PostgreSQL are a powerful tool for analytics and reporting. They allow you to:

  • Perform calculations across related rows without collapsing them into a single output row
  • Rank and number rows within partitions
  • Create running totals and moving averages
  • Compare current rows with preceding or following rows
  • Calculate percentiles and distributions

By mastering window functions, you can write more efficient queries and perform complex analytics directly in your database, reducing the need for application-side processing.

Additional Resources and Exercises

Resources

Exercises

  1. Basic Window Functions: Write a query that ranks products by price within each category.

  2. Running Totals: Calculate the running total of sales for each month of the year.

  3. Moving Averages: Calculate a 7-day moving average of daily website visits.

  4. Employee Analysis: For each employee, show their salary, the department average, and how much their salary differs from the department average as a percentage.

  5. Advanced Challenge: Write a query that identifies products whose price is higher than the average price of its category, and show by what percentage it exceeds that average.

Solutions to these exercises can be found by applying the concepts learned in this tutorial!



If you spot any mistakes on this website, please let me know at [email protected]. I’d greatly appreciate your feedback! :)