Sets in Python: When and Why
Sets are unordered collections of unique, hashable elements. They're optimized for membership testing, removing duplicates, and mathematical set operations like unions and intersections.
Creating Sets
# Using curly braces
fruits = {'apple', 'banana', 'orange'}
# Using set() constructor
numbers = set([1, 2, 3, 2, 1]) # {1, 2, 3} - duplicates removed
# Empty set (can't use {} - that's a dict!)
empty = set()
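A few quick checks can confirm the points above. This is a minimal sketch; the variable names are illustrative and not part of the original examples.

```python
# {} creates a dict, so an empty set needs set()
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# set() accepts any iterable, such as a string or a tuple
letters = set("hello")              # 4 unique letters: h, e, l, o
from_tuple = set((1, 2, 2, 3))      # duplicates removed

# A set comprehension builds a set from an expression
squares = {n * n for n in range(-3, 4)}  # -3 and 3 both give 9

# Elements must be hashable, so a list cannot be a set element
unhashable_rejected = False
try:
    bad = {[1, 2], [3, 4]}
except TypeError:
    unhashable_rejected = True
print(unhashable_rejected)          # True
```

If you need a set of sequences, tuples work because they are hashable (as long as their contents are).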
Fast Membership Testing
# Sets use hash tables: O(1) average-case lookup
# Lists use linear search: O(n) lookup
# Slow with list (checks each element in turn)
allowed_list = list(range(1, 10001))
if 9999 in allowed_list: # Slow for large lists
    pass
# Fast with set (hash lookup)
allowed_set = set(range(1, 10001))
if 9999 in allowed_set: # Roughly constant time, regardless of size
    pass
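To see the difference on your own machine, you can time both lookups with the standard library's timeit module. This is a rough sketch, not a rigorous benchmark: the collection size and repeat count are arbitrary choices, and the actual numbers will vary by machine and Python version.

```python
import timeit

# Sizes chosen for illustration only
n = 10_000
allowed_list = list(range(1, n + 1))
allowed_set = set(allowed_list)
target = n - 1  # near the end of the list, close to the worst case for a scan

# Time 1,000 membership tests against each collection
list_time = timeit.timeit(lambda: target in allowed_list, number=1_000)
set_time = timeit.timeit(lambda: target in allowed_set, number=1_000)

print(f"list: {list_time:.6f}s for 1,000 lookups")
print(f"set:  {set_time:.6f}s for 1,000 lookups")
```

Building the set costs O(n) up front, so the conversion pays off when you test membership many times against the same collection.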
Removing Duplicates
# Remove duplicates from list
numbers = [1, 2, 2, 3, 4, 4, 5]
unique = list(set(numbers)) # e.g. [1, 2, 3, 4, 5]; set order is not guaranteed
# Keep original order with dict (dicts preserve insertion order in Python 3.7+)
unique_ordered = list(dict.fromkeys(numbers)) # [1, 2, 3, 4, 5]
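When uniqueness depends on a derived value (for example, comparing names case-insensitively), a common alternative is to track what you have already seen in a set while building the result list. The helper below, dedupe_by_key, is a hypothetical name used for illustration; it assumes the keys it computes are hashable.

```python
def dedupe_by_key(items, key=lambda item: item):
    """Return items in first-seen order, dropping later items whose key repeats."""
    seen = set()
    result = []
    for item in items:
        k = key(item)
        if k not in seen:  # average O(1) membership test
            seen.add(k)
            result.append(item)
    return result


names = ["Alice", "bob", "ALICE", "Bob", "carol"]
print(dedupe_by_key(names, key=str.lower))  # ['Alice', 'bob', 'carol']
```

The first occurrence of each key wins, so the original spelling and order are kept.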
Set Operations
Union (All Elements)
a = {1, 2, 3}
b = {3, 4, 5}
union = a | b # {1, 2, 3, 4, 5}
# or
union = a.union(b)
Intersection (Common Elements)
a = {1, 2, 3}
b = {2, 3, 4}
intersection = a & b # {2, 3}
# or
intersection = a.intersection(b)
Difference
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
diff = a - b # {1, 2} - elements in a but not b
# or
diff = a.difference(b)
Symmetric Difference (XOR)
a = {1, 2, 3}
b = {3, 4, 5}
sym_diff = a ^ b # {1, 2, 4, 5} - elements in either but not both
# or
sym_diff = a.symmetric_difference(b)
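The operator and method forms shown above are not quite interchangeable. The sketch below illustrates one difference: the methods accept any iterable, while the operators need a set on both sides. Results are passed through sorted() only to make the printed output predictable.

```python
a = {1, 2, 3}

# Methods accept any iterable (list, tuple, generator, ...)
print(sorted(a.union([3, 4, 5])))         # [1, 2, 3, 4, 5]
print(sorted(a.intersection((2, 3, 4))))  # [2, 3]

# Operators require both operands to be sets
operator_failed = False
try:
    a | [3, 4, 5]
except TypeError:
    operator_failed = True
print(operator_failed)                    # True

# Methods also take several iterables at once
combined = a.union([4], (5,), {6})
print(sorted(combined))                   # [1, 2, 3, 4, 5, 6]
```

If your data arrives as lists, the method form saves an explicit set() conversion on the right-hand side.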
Practical Examples
Find Common Customers
# Placeholder addresses for illustration
email_campaign = {'user1@example.com', 'user2@example.com', 'user3@example.com'}
purchasers = {'user2@example.com', 'user3@example.com', 'user4@example.com'}
# Who received email AND purchased?
converted = email_campaign & purchasers # {'user2@example.com', 'user3@example.com'}
Find Missing Items
required_fields = {'name', 'email', 'age', 'phone'}
provided_fields = {'name', 'email', 'age'}
missing = required_fields - provided_fields # {'phone'}
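The same difference operation can be wrapped in a small validation helper. This is a sketch under assumptions: find_missing_fields and the sample submission are made up for illustration, and the record is assumed to be a dict whose keys are the field names.

```python
def find_missing_fields(record, required_fields):
    """Return the required field names absent from record, sorted for stable output."""
    # set(record) yields the dict's keys; the difference leaves what is missing
    return sorted(set(required_fields) - set(record))


required_fields = {'name', 'email', 'age', 'phone'}
submission = {'name': 'Ada', 'email': 'ada@example.com', 'age': 36}  # made-up data

missing = find_missing_fields(submission, required_fields)
if missing:
    print(f"Missing fields: {', '.join(missing)}")  # Missing fields: phone
```

Sorting the result is a design choice for readable error messages, since a set has no guaranteed order.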
Set Methods
s = {1, 2, 3}
s.add(4) # Add single element
s.update([5, 6]) # Add multiple elements
s.remove(2) # Remove element (raises KeyError if not found)
s.discard(10) # Remove element (no error if not found)
s.clear() # Remove all elements
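The difference between remove() and discard() is easiest to see with an element that is not in the set. The short sketch below also shows pop(), a standard set method not listed above, which removes and returns an arbitrary element.

```python
s = {1, 2, 3}

s.discard(10)  # no error: 10 was never in the set

remove_raised = False
try:
    s.remove(10)  # raises KeyError because 10 is not in the set
except KeyError:
    remove_raised = True
print(remove_raised)    # True

s.update([3, 4], (5,))  # update() accepts one or more iterables
print(sorted(s))        # [1, 2, 3, 4, 5]

popped = s.pop()        # removes and returns an arbitrary element
print(popped in s)      # False

s.clear()
print(len(s))           # 0
```

Use remove() when a missing element indicates a bug you want to hear about, and discard() when absence is an acceptable outcome.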
When to Use Sets
- Removing duplicates from data
- Fast membership testing (is x in collection?)
- Finding common or unique elements between collections
- When order doesn't matter
Pro Tip: Use sets for membership testing when performance matters. A set lookup takes roughly constant time on average, while a list lookup scans elements one by one, so the gap widens as the collection grows.