Useful Data Tips

SQL Window Functions

💾 SQL ⏱️ 35 sec read

What Are Window Functions?

Window functions perform calculations across a set of rows related to the current row. Unlike GROUP BY, they don't collapse rows - each row keeps its identity.

Basic Syntax

SELECT
    column,
    WINDOW_FUNCTION() OVER (
        PARTITION BY partition_column
        ORDER BY order_column
        ROWS BETWEEN ... AND ...
    ) AS result
FROM table;

Common Window Functions

1. ROW_NUMBER() - Sequential Numbering

-- Number rows sequentially
SELECT
    employee,
    salary,
    ROW_NUMBER() OVER (ORDER BY salary DESC) AS row_num
FROM employees;

-- Result:
-- employee  | salary | row_num
-- Alice     | 95000  | 1
-- Bob       | 85000  | 2
-- Charlie   | 75000  | 3

2. RANK() and DENSE_RANK() - Ranking with Ties

SELECT
    employee,
    salary,
    RANK() OVER (ORDER BY salary DESC) AS rank,
    DENSE_RANK() OVER (ORDER BY salary DESC) AS dense_rank
FROM employees;

-- With ties:
-- employee  | salary | rank | dense_rank
-- Alice     | 90000  | 1    | 1
-- Bob       | 90000  | 1    | 1  (same salary)
-- Charlie   | 85000  | 3    | 2  (RANK skips 2)
-- David     | 80000  | 4    | 3

-- RANK: Skips numbers after ties (1, 1, 3, 4)
-- DENSE_RANK: No gaps (1, 1, 2, 3)

3. PARTITION BY - Window Per Group

-- Rank employees within each department
SELECT
    department,
    employee,
    salary,
    RANK() OVER (
        PARTITION BY department
        ORDER BY salary DESC
    ) AS dept_rank
FROM employees;

-- Result:
-- dept  | employee | salary | dept_rank
-- Sales | Alice    | 90000  | 1
-- Sales | Bob      | 85000  | 2
-- IT    | Charlie  | 95000  | 1  (ranking resets)
-- IT    | David    | 80000  | 2

4. LAG() and LEAD() - Access Previous/Next Row

-- Compare current salary to previous employee
SELECT
    employee,
    salary,
    LAG(salary) OVER (ORDER BY salary) AS prev_salary,
    LEAD(salary) OVER (ORDER BY salary) AS next_salary,
    salary - LAG(salary) OVER (ORDER BY salary) AS diff_from_prev
FROM employees;

-- Result:
-- employee | salary | prev_salary | next_salary | diff_from_prev
-- Alice    | 70000  | NULL        | 75000       | NULL
-- Bob      | 75000  | 70000       | 80000       | 5000
-- Charlie  | 80000  | 75000       | NULL        | 5000

5. Running Totals and Moving Averages

-- Running total
SELECT
    date,
    revenue,
    SUM(revenue) OVER (
        ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM sales;

-- Moving average (last 3 days)
SELECT
    date,
    revenue,
    AVG(revenue) OVER (
        ORDER BY date
        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
    ) AS moving_avg_3day
FROM sales;

6. NTILE() - Divide Into Buckets

-- Divide employees into 4 quartiles by salary
SELECT
    employee,
    salary,
    NTILE(4) OVER (ORDER BY salary DESC) AS quartile
FROM employees;

-- Use case: Top 25% earners
WHERE quartile = 1;

Frame Specification (ROWS vs RANGE)

-- ROWS: Physical row-based window
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
-- Looks at exactly 3 rows (2 before + current)

-- RANGE: Logical value-based window
RANGE BETWEEN 2 PRECEDING AND CURRENT ROW
-- Includes all rows with values within 2 of current

-- Common frames:
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW  -- All rows up to current
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING  -- Current to end
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING          -- Sliding window of 3

Real-World Examples

Example 1: Top N Per Group

-- Top 3 products per category by sales
WITH ranked_products AS (
    SELECT
        category,
        product,
        sales,
        ROW_NUMBER() OVER (
            PARTITION BY category
            ORDER BY sales DESC
        ) AS rank
    FROM products
)
SELECT * FROM ranked_products
WHERE rank <= 3;

Example 2: Percentage of Total

-- Each employee's percentage of total department salary
SELECT
    department,
    employee,
    salary,
    salary * 100.0 / SUM(salary) OVER (PARTITION BY department) AS pct_of_dept,
    salary * 100.0 / SUM(salary) OVER () AS pct_of_company
FROM employees;

Example 3: First Value in Group

-- Compare each sale to first sale of the month
SELECT
    date,
    revenue,
    FIRST_VALUE(revenue) OVER (
        PARTITION BY DATE_TRUNC('month', date)
        ORDER BY date
    ) AS first_revenue_of_month,
    revenue - FIRST_VALUE(revenue) OVER (
        PARTITION BY DATE_TRUNC('month', date)
        ORDER BY date
    ) AS diff_from_first
FROM sales;

Example 4: Gap Analysis

-- Find gaps between transactions
SELECT
    transaction_id,
    timestamp,
    LEAD(timestamp) OVER (ORDER BY timestamp) AS next_timestamp,
    LEAD(timestamp) OVER (ORDER BY timestamp) - timestamp AS gap_duration
FROM transactions
WHERE LEAD(timestamp) OVER (ORDER BY timestamp) - timestamp > INTERVAL '1 hour';

Window Functions vs GROUP BY

Aspect GROUP BY Window Functions
Row count Collapses to groups Keeps all rows
Use case Aggregated summaries Row-level + aggregate info
Example Total sales per dept Each employee + dept total

Performance Tips

-- Bad: Define window twice
SELECT
    ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC),
    RANK() OVER (PARTITION BY dept ORDER BY salary DESC)
FROM employees;

-- Good: Define window once
SELECT
    ROW_NUMBER() OVER w,
    RANK() OVER w
FROM employees
WINDOW w AS (PARTITION BY dept ORDER BY salary DESC);

Common Use Cases

Key Takeaways:

  • Window functions don't collapse rows like GROUP BY
  • PARTITION BY: Creates separate windows per group
  • ORDER BY: Defines row order within window
  • LAG/LEAD: Access previous/next rows
  • ROW_NUMBER/RANK: Assign rankings
  • Use window definition (WINDOW) to avoid repetition