UNION vs UNION ALL in SQL

Both UNION and UNION ALL combine the results of multiple queries into one result set. The key difference: UNION removes duplicate rows, while UNION ALL keeps every row. Deduplication costs extra work, so the choice can affect performance significantly on large result sets.

UNION (Removes Duplicates)

SELECT customer_id, name FROM active_customers
UNION
SELECT customer_id, name FROM vip_customers;

-- If the same (customer_id, name) row is in both tables, it appears only once

UNION ALL (Keeps Duplicates)

SELECT customer_id, name FROM active_customers
UNION ALL
SELECT customer_id, name FROM vip_customers;

-- If customer 123 is in both tables, it appears twice
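
A self-contained sketch of the difference, using the two table names from the queries above. The rows are made-up sample data for illustration only; the syntax is standard SQL and should run on SQLite or PostgreSQL.

```sql
-- Hypothetical sample data, for illustration only
CREATE TABLE active_customers (customer_id INTEGER, name TEXT);
CREATE TABLE vip_customers    (customer_id INTEGER, name TEXT);

INSERT INTO active_customers VALUES (123, 'Ada'), (456, 'Grace');
INSERT INTO vip_customers    VALUES (123, 'Ada'), (789, 'Linus');

-- UNION: 3 rows; the shared row (123, 'Ada') appears once
SELECT customer_id, name FROM active_customers
UNION
SELECT customer_id, name FROM vip_customers;

-- UNION ALL: 4 rows; (123, 'Ada') appears twice
SELECT customer_id, name FROM active_customers
UNION ALL
SELECT customer_id, name FROM vip_customers;
```

UNION compares entire rows, not just the first column. If customer 123 were stored as 'Ada' in one table and 'Ada L.' in the other, both rows would survive the UNION.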

Performance Comparison

-- SLOWER: UNION must also deduplicate the combined rows (typically a sort or hash step)
SELECT order_id FROM orders_2023
UNION
SELECT order_id FROM orders_2024;

-- FASTER: UNION ALL just appends results
SELECT order_id FROM orders_2023
UNION ALL
SELECT order_id FROM orders_2024;
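
One way to see where the extra cost comes from: UNION returns the same rows as UNION ALL followed by a DISTINCT over the combined result. The sketch below reuses the table names from the queries above; the alias all_orders is a name chosen here for illustration. How a given engine actually executes either form (sort, hash, or something else) depends on the database, so it is worth checking the query plan with that database's EXPLAIN facility.

```sql
-- Returns the same set of rows as the UNION query above:
-- append everything first, then deduplicate in a separate step
SELECT DISTINCT order_id
FROM (
    SELECT order_id FROM orders_2023
    UNION ALL
    SELECT order_id FROM orders_2024
) AS all_orders;
```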

Practical Example: Combining Tables

-- Get all transactions from current and archived tables
SELECT transaction_id, amount, transaction_date, 'Current' as source
FROM current_transactions
UNION ALL
SELECT transaction_id, amount, transaction_date, 'Archive' as source
FROM archived_transactions
ORDER BY transaction_date DESC;
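
The ORDER BY at the end applies to the whole combined result, not just to the second SELECT. To filter or aggregate the combined rows, one option is to wrap the UNION ALL in a derived table. A minimal sketch, reusing the tables and columns from the query above (the alias all_transactions and the output column names are chosen here for illustration):

```sql
-- Count and total per source across both tables
SELECT source,
       COUNT(*)    AS transaction_count,
       SUM(amount) AS total_amount
FROM (
    SELECT amount, 'Current' AS source FROM current_transactions
    UNION ALL
    SELECT amount, 'Archive' AS source FROM archived_transactions
) AS all_transactions
GROUP BY source;
```

Here UNION ALL matters for correctness, not just speed: a plain UNION would collapse transactions that share the same amount and source into one row, so the counts and totals would come out too low.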

When to Use Each

Use UNION when:
- You need distinct results only
- Data quality requires deduplication
- Working with small datasets

Use UNION ALL when:
- Duplicates are impossible or wanted
- Performance is critical
- Combining partitioned tables
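
To make the two cases concrete, here is a short sketch with hypothetical tables, customers(email) and newsletter_subscribers(email). Both table names and the origin column are assumptions made for this example.

```sql
-- UNION: one row per distinct email, e.g. for a send list
SELECT email FROM customers
UNION
SELECT email FROM newsletter_subscribers;

-- UNION ALL: keep every row, tagged by where it came from,
-- e.g. to audit which emails appear in both tables
SELECT email, 'customer' AS origin FROM customers
UNION ALL
SELECT email, 'subscriber' AS origin FROM newsletter_subscribers;
```

In the second query the origin tag makes rows from the two tables differ, so no row can match across them and UNION ALL avoids a deduplication step that could not merge anything between the tables anyway.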

Requirements
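
Both UNION and UNION ALL require the two queries to return the same number of columns, with compatible data types in corresponding positions. Columns are matched by position, not by name, and the result takes its column names from the first query. A small sketch with hypothetical tables, employees(employee_id, full_name) and contractors(contractor_id, display_name):

```sql
-- Columns line up by position; the result columns are named
-- person_id and person_name, after the first SELECT
SELECT employee_id AS person_id, full_name AS person_name
FROM employees
UNION ALL
SELECT contractor_id, display_name
FROM contractors;
```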

Pro Tip: Default to UNION ALL for better performance, and use UNION only when you specifically need duplicate removal. On large datasets, UNION can be much slower because of the deduplication step (typically a sort or hash).