Useful Data Tips

Table Partitioning for Large Datasets

⏱️ 31 sec read 📊 SQL

Partitioning splits large tables into smaller, manageable pieces based on column values. This improves query performance, simplifies maintenance, and enables efficient data archiving.

Range Partitioning (Most Common)

-- PostgreSQL example
CREATE TABLE orders (
    order_id SERIAL,
    order_date DATE,
    customer_id INT,
    total DECIMAL(10,2)
) PARTITION BY RANGE (order_date);

-- Create partitions for each year
CREATE TABLE orders_2023 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

List Partitioning

-- Partition by region
CREATE TABLE sales (
    sale_id INT,
    region VARCHAR(20),
    amount DECIMAL(10,2)
) PARTITION BY LIST (region);

CREATE TABLE sales_north PARTITION OF sales
    FOR VALUES IN ('North', 'Northeast');

CREATE TABLE sales_south PARTITION OF sales
    FOR VALUES IN ('South', 'Southeast');

Hash Partitioning

-- Distribute data evenly across partitions
CREATE TABLE customers (
    customer_id INT,
    name VARCHAR(100)
) PARTITION BY HASH (customer_id);

CREATE TABLE customers_p0 PARTITION OF customers
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);

CREATE TABLE customers_p1 PARTITION OF customers
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);

Querying Partitioned Tables

-- Query entire table (searches all partitions)
SELECT * FROM orders WHERE customer_id = 123;

-- Query with partition key (only searches relevant partition)
SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';
-- Only scans orders_2024 partition!

Benefits of Partitioning

When to Partition

Pro Tip: Always include the partition key in your WHERE clause for maximum performance. Partition by date for time-series data, and set up automatic partition creation for ongoing tables.

← Back to SQL Tips