SQL Window Functions
What Are Window Functions?
Window functions perform calculations across a set of rows related to the current row. Unlike GROUP BY, they don't collapse rows - each row keeps its identity.
Basic Syntax
SELECT
column,
WINDOW_FUNCTION() OVER (
PARTITION BY partition_column
ORDER BY order_column
ROWS BETWEEN ... AND ...
) AS result
FROM table;
Common Window Functions
1. ROW_NUMBER() - Sequential Numbering
-- Number rows sequentially
SELECT
employee,
salary,
ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;
-- Result:
-- employee | salary | row_num
-- Alice | 95000 | 1
-- Bob | 85000 | 2
-- Charlie | 75000 | 3
2. RANK() and DENSE_RANK() - Ranking with Ties
SELECT
employee,
salary,
RANK() OVER (ORDER BY salary DESC) AS rank,
DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;
-- With ties:
-- employee | salary | rank | dense_rank
-- Alice | 90000 | 1 | 1
-- Bob | 90000 | 1 | 1 (same salary)
-- Charlie | 85000 | 3 | 2 (RANK skips 2)
-- David | 80000 | 4 | 3
-- RANK: Skips numbers after ties (1, 1, 3, 4)
-- DENSE_RANK: No gaps (1, 1, 2, 3)
3. PARTITION BY - Window Per Group
-- Rank employees within each department
SELECT
department,
employee,
salary,
RANK() OVER (
PARTITION BY department
ORDER BY salary DESC
) AS dept_rank
FROM employees;
-- Result:
-- dept | employee | salary | dept_rank
-- Sales | Alice | 90000 | 1
-- Sales | Bob | 85000 | 2
-- IT | Charlie | 95000 | 1 (ranking resets)
-- IT | David | 80000 | 2
4. LAG() and LEAD() - Access Previous/Next Row
-- Compare current salary to previous employee
SELECT
employee,
salary,
LAG(salary) OVER (ORDER BY salary) AS prev_salary,
LEAD(salary) OVER (ORDER BY salary) AS next_salary,
salary - LAG(salary) OVER (ORDER BY salary) AS diff_from_prev
FROM employees;
-- Result:
-- employee | salary | prev_salary | next_salary | diff_from_prev
-- Alice | 70000 | NULL | 75000 | NULL
-- Bob | 75000 | 70000 | 80000 | 5000
-- Charlie | 80000 | 75000 | NULL | 5000
5. Running Totals and Moving Averages
-- Running total
SELECT
date,
revenue,
SUM(revenue) OVER (
ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS running_total
FROM sales;
-- Moving average (last 3 days)
SELECT
date,
revenue,
AVG(revenue) OVER (
ORDER BY date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg_3day
FROM sales;
6. NTILE() - Divide Into Buckets
-- Divide employees into 4 quartiles by salary
SELECT
employee,
salary,
NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;
-- Use case: Top 25% earners
WHERE quartile = 1;
Frame Specification (ROWS vs RANGE)
-- ROWS: Physical row-based window
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
-- Looks at exactly 3 rows (2 before + current)
-- RANGE: Logical value-based window
RANGE BETWEEN 2 PRECEDING AND CURRENT ROW
-- Includes all rows with values within 2 of current
-- Common frames:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW -- All rows up to current
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING -- Current to end
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING -- Sliding window of 3
Real-World Examples
Example 1: Top N Per Group
-- Top 3 products per category by sales
WITH ranked_products AS (
SELECT
category,
product,
sales,
ROW_NUMBER() OVER (
PARTITION BY category
ORDER BY sales DESC
) AS rank
FROM products
)
SELECT * FROM ranked_products
WHERE rank <= 3;
Example 2: Percentage of Total
-- Each employee's percentage of total department salary
SELECT
department,
employee,
salary,
salary * 100.0 / SUM(salary) OVER (PARTITION BY department) AS pct_of_dept,
salary * 100.0 / SUM(salary) OVER () AS pct_of_company
FROM employees;
Example 3: First Value in Group
-- Compare each sale to first sale of the month
SELECT
date,
revenue,
FIRST_VALUE(revenue) OVER (
PARTITION BY DATE_TRUNC('month', date)
ORDER BY date
) AS first_revenue_of_month,
revenue - FIRST_VALUE(revenue) OVER (
PARTITION BY DATE_TRUNC('month', date)
ORDER BY date
) AS diff_from_first
FROM sales;
Example 4: Gap Analysis
-- Find gaps between transactions
SELECT
transaction_id,
timestamp,
LEAD(timestamp) OVER (ORDER BY timestamp) AS next_timestamp,
LEAD(timestamp) OVER (ORDER BY timestamp) - timestamp AS gap_duration
FROM transactions
WHERE LEAD(timestamp) OVER (ORDER BY timestamp) - timestamp > INTERVAL '1 hour';
Window Functions vs GROUP BY
| Aspect | GROUP BY | Window Functions |
|---|---|---|
| Row count | Collapses to groups | Keeps all rows |
| Use case | Aggregated summaries | Row-level + aggregate info |
| Example | Total sales per dept | Each employee + dept total |
Performance Tips
- Index ORDER BY columns: Window functions sort data
- Limit window size: Smaller frames = faster queries
- Combine windows: Reuse same window definition
-- Bad: Define window twice
SELECT
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC),
RANK() OVER (PARTITION BY dept ORDER BY salary DESC)
FROM employees;
-- Good: Define window once
SELECT
ROW_NUMBER() OVER w,
RANK() OVER w
FROM employees
WINDOW w AS (PARTITION BY dept ORDER BY salary DESC);
Common Use Cases
- Top N per group: Best products per category
- Running calculations: Cumulative sums, moving averages
- Ranking: Leaderboards, performance rankings
- Time series: Compare to previous period
- Percentiles: Quartiles, deciles for segmentation
Key Takeaways:
- Window functions don't collapse rows like GROUP BY
- PARTITION BY: Creates separate windows per group
- ORDER BY: Defines row order within window
- LAG/LEAD: Access previous/next rows
- ROW_NUMBER/RANK: Assign rankings
- Use window definition (WINDOW) to avoid repetition