Sets
Learn how to work with sets in Python — collections of unique values with powerful operations.
A set is a collection of unique values with no duplicates and no order. If you add the same value twice, it only appears once. Python automatically removes duplicates for you.
numbers = {1, 2, 3, 4, 5}
fruits = {"apple", "banana", "mango"}
Two things make sets special:
- No duplicates — every value is unique
- No order — items have no index, no position
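Both properties can be checked directly. The following is a minimal sketch; the variable name is illustrative only:

```python
# Duplicates in a set literal collapse to a single value.
numbers = {1, 2, 2, 3, 3, 3}
print(len(numbers))  # 3

# Order is not part of a set's identity, so these two sets are equal.
print({1, 2, 3} == {3, 2, 1})  # True

# Lists keep both duplicates and order, so the same comparison fails.
print([1, 2, 3] == [3, 2, 1])  # False
```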
Creating a set
# empty set — must use set(), not {} (that creates an empty dict)
empty = set()
# set with values
fruits = {"apple", "banana", "mango"}
# set from a list — duplicates removed automatically
numbers = set([1, 2, 2, 3, 3, 3, 4])
print(numbers) # {1, 2, 3, 4}
# set from a string — each character becomes an item
letters = set("hello")
print(letters) # {'h', 'e', 'l', 'o'} (only one 'l')
To create an empty set, always use set(), not {}. An empty {} creates an empty dictionary, not a set.
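If in doubt, the type can be inspected. A small sketch of the difference, with illustrative variable names:

```python
empty_braces = {}      # empty dictionary
empty_set = set()      # empty set
with_values = {1, 2}   # braces with values (and no colons) do create a set

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(type(with_values).__name__)   # set
```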
Sets remove duplicates automatically
This is the most common reason people use sets — quick and easy deduplication:
names = ["Ali", "Sara", "Ali", "Omar", "Sara", "Ali"]
unique_names = set(names)
print(unique_names) # {'Ali', 'Sara', 'Omar'}
If you want the result back as a list:
unique_names = list(set(names))
Sets are unordered: when you print a set, the order of items is not guaranteed. Do not rely on set order.
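Because list(set(names)) may return the names in any order, a different approach is needed when the original order matters. One common option, sketched here, relies on the fact that dictionary keys are unique and keep insertion order (guaranteed since Python 3.7):

```python
names = ["Ali", "Sara", "Ali", "Omar", "Sara", "Ali"]

# dict.fromkeys keeps the first occurrence of each name, in original order
unique_in_order = list(dict.fromkeys(names))
print(unique_in_order)  # ['Ali', 'Sara', 'Omar']

# sorted() is another option when alphabetical order is what you want
unique_sorted = sorted(set(names))
print(unique_sorted)  # ['Ali', 'Omar', 'Sara']
```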
Accessing items
Sets have no index — you cannot do fruits[0]. You loop over them or check membership:
fruits = {"apple", "banana", "mango"}
# check if something is in the set
print("apple" in fruits) # True
print("grape" in fruits) # False
# loop over the set
for fruit in fruits:
    print(fruit)
Adding and removing items
add() — add one item
fruits = {"apple", "banana"}
fruits.add("mango")
print(fruits) # {'apple', 'banana', 'mango'}
# adding a duplicate does nothing
fruits.add("apple")
print(fruits) # {'apple', 'banana', 'mango'} (unchanged)
update() - add multiple items
fruits = {"apple", "banana"}
fruits.update(["mango", "grape", "orange"])
print(fruits) # {'apple', 'banana', 'mango', 'grape', 'orange'}
remove() - remove an item, raises error if missing
fruits = {"apple", "banana", "mango"}
fruits.remove("banana")
print(fruits) # {'apple', 'mango'}
fruits.remove("grape") # KeyError: grape is not in the set
discard() - remove an item, no error if missing
fruits = {"apple", "banana", "mango"}
fruits.discard("banana") # removes it
fruits.discard("grape") # does nothing — no error
print(fruits) # {'apple', 'mango'}
Use discard() when you are not sure if the item exists. Use remove() when you expect it to be there and want an error if it is not.
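When a missing item should be handled rather than ignored, remove() can be wrapped in try/except. This is only a sketch: revoke_access and active_users are hypothetical names chosen for illustration.

```python
def revoke_access(active_users, name):
    """Hypothetical helper: remove a user and report whether they were present."""
    try:
        active_users.remove(name)
    except KeyError:
        # remove() raises KeyError when the item is not in the set
        return False
    return True


active_users = {"ali", "sara"}
print(revoke_access(active_users, "sara"))   # True
print(revoke_access(active_users, "grape"))  # False
print(active_users)                          # {'ali'}

# discard() gives the "ignore if missing" behavior without any try/except
active_users.discard("grape")
print(active_users)                          # {'ali'}
```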
pop() - remove and return an arbitrary item
fruits = {"apple", "banana", "mango"}
removed = fruits.pop()
print(removed) # some item — unpredictable which one
print(fruits) # remaining items
clear() - remove everything
fruits = {"apple", "banana", "mango"}
fruits.clear()
print(fruits) # set()
Set operations
This is where sets become really powerful. Set operations let you compare and combine sets mathematically.
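As a quick preview on small numeric sets (the names evens and small are illustrative), the four core operators behave roughly like this:

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

union = evens | small         # items in either set
intersection = evens & small  # items in both sets
difference = evens - small    # items in evens but not in small
symmetric = evens ^ small     # items in exactly one of the sets

# sorted() is used only to make the printed order predictable
print(sorted(union))         # [1, 2, 3, 4, 6, 8]
print(sorted(intersection))  # [2, 4]
print(sorted(difference))    # [6, 8]
print(sorted(symmetric))     # [1, 3, 6, 8]

# The symmetric difference is the union minus the intersection
print(symmetric == (union - intersection))  # True
```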
Let's use a real example. Two groups of students:
python_students = {"Ali", "Sara", "Omar", "Fatima"}
javascript_students = {"Sara", "Omar", "Zainab", "Ahmed"}
Union - all students from both groups
Everyone who studies either Python or JavaScript:
all_students = python_students | javascript_students
# or
all_students = python_students.union(javascript_students)
print(all_students)
# {'Ali', 'Sara', 'Omar', 'Fatima', 'Zainab', 'Ahmed'}
Intersection - students in both groups
Only students who study both Python and JavaScript:
both = python_students & javascript_students
# or
both = python_students.intersection(javascript_students)
print(both)
# {'Sara', 'Omar'}Difference — students in one group but not the other
Students who study Python but not JavaScript:
only_python = python_students - javascript_students
# or
only_python = python_students.difference(javascript_students)
print(only_python)
# {'Ali', 'Fatima'}
Students who study JavaScript but not Python:
only_javascript = javascript_students - python_students
print(only_javascript)
# {'Zainab', 'Ahmed'}
Symmetric difference - students in one group but not both
Everyone who studies only one language — not both:
exclusive = python_students ^ javascript_students
# or
exclusive = python_students.symmetric_difference(javascript_students)
print(exclusive)
# {'Ali', 'Fatima', 'Zainab', 'Ahmed'}
Subset and superset
Check if one set is completely contained within another:
admins = {"Ali", "Sara"}
all_users = {"Ali", "Sara", "Omar", "Fatima"}
# is admins a subset of all_users? (all admins are in all_users)
print(admins.issubset(all_users)) # True
print(admins <= all_users) # True — same thing
# is all_users a superset of admins? (all_users contains all admins)
print(all_users.issuperset(admins)) # True
print(all_users >= admins) # True (same thing)
isdisjoint() - check if two sets share nothing
a = {1, 2, 3}
b = {4, 5, 6}
c = {3, 4, 5}
print(a.isdisjoint(b)) # True — no common items
print(a.isdisjoint(c)) # False (both have 3)
Frozenset - an immutable set
A frozenset is a set that cannot be changed after creation. No adding, no removing:
permissions = frozenset({"read", "write", "execute"})
permissions.add("delete") # AttributeError: frozensets are immutable
Frozensets can be used as dictionary keys because they are immutable and therefore hashable; regular sets cannot.
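A brief sketch of what that looks like in practice. The pair_scores dictionary and its contents are hypothetical, chosen only to illustrate an order-independent key:

```python
# A frozenset works as a dictionary key; the order of its items is irrelevant
pair_scores = {frozenset({"ali", "sara"}): 3}
print(pair_scores[frozenset({"sara", "ali"})])  # 3

# A regular set is unhashable, so using one as a key raises TypeError
try:
    bad = {{"ali", "sara"}: 3}
    key_rejected = False
except TypeError:
    key_rejected = True
print(key_rejected)  # True

# Non-mutating operations still work and return a new frozenset
permissions = frozenset({"read", "write"})
extended = permissions | {"execute"}
print(type(extended).__name__)  # frozenset
print(sorted(extended))         # ['execute', 'read', 'write']
```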
When to use sets
| Situation | Use a set |
|---|---|
| Remove duplicates from a list | Yes |
| Check if a value exists quickly | Yes |
| Find common items between two collections | Yes |
| Find items in one collection but not another | Yes |
| You need order or index access | No — use a list |
| You need key-value pairs | No — use a dict |
Sets are extremely fast at checking membership. Checking x in list gets slower as the list grows, because Python compares x against the items one by one. Checking x in set takes roughly constant time on average regardless of size, because sets are stored internally as hash tables.
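One way to observe the difference is to time both lookups with the standard timeit module. This is a rough sketch, not a rigorous benchmark; the collection size and repeat count are arbitrary choices, and the actual numbers depend on your machine:

```python
import timeit

size = 100_000
big_list = list(range(size))
big_set = set(big_list)
target = size - 1  # last item: the worst case for a list scan

# Run each membership test 100 times and report the total time
list_time = timeit.timeit(lambda: target in big_list, number=100)
set_time = timeit.timeit(lambda: target in big_set, number=100)

print(f"list: {list_time:.6f} s for 100 lookups")
print(f"set:  {set_time:.6f} s for 100 lookups")
```

On typical hardware the set lookups finish far sooner, and the gap widens as size grows.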
# slow for large collections
big_list = list(range(1_000_000))
print(999999 in big_list) # has to check each item
# fast no matter the size
big_set = set(range(1_000_000))
print(999999 in big_set) # instant lookup
A real example
Given the visitors from two days, find who came back, who is new today, and who did not return:
yesterday_visitors = {"ali", "sara", "omar", "fatima", "zainab"}
today_visitors = {"sara", "omar", "ahmed", "ali", "hassan"}
# users who visited both days
returning = yesterday_visitors & today_visitors
print(f"Returning visitors: {returning}")
# Returning visitors: {'ali', 'sara', 'omar'}
# users who are new today
new_today = today_visitors - yesterday_visitors
print(f"New today: {new_today}")
# New today: {'ahmed', 'hassan'}
# users who only visited yesterday and did not come back
lost = yesterday_visitors - today_visitors
print(f"Did not return: {lost}")
# Did not return: {'fatima', 'zainab'}
# all unique visitors across both days
total_unique = yesterday_visitors | today_visitors
print(f"Total unique visitors: {len(total_unique)}")
# Total unique visitors: 7
Summary
| Method / Operator | What it does |
|---|---|
| add(x) | Add one item |
| update(iterable) | Add multiple items |
| remove(x) | Remove item (error if missing) |
| discard(x) | Remove item (no error if missing) |
| pop() | Remove and return an arbitrary item |
| clear() | Remove all items |
| a \| b | Union: all items from both |
| a & b | Intersection: items in both |
| a - b | Difference: items in a not in b |
| a ^ b | Symmetric difference: items in one but not both |
| a <= b | Is a a subset of b |
| a >= b | Is a a superset of b |
| isdisjoint(b) | True if no common items |
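One practical difference between the method and operator forms in the table: the named methods accept any iterable, while the operators require a set (or frozenset) on both sides. A minimal sketch with illustrative values:

```python
a = {1, 2, 3}

# Method form: the argument can be a list, range, or any other iterable
print(sorted(a.union([3, 4])))               # [1, 2, 3, 4]
print(sorted(a.intersection(range(2, 10))))  # [2, 3]

# Operator form: mixing a set with a list raises TypeError
try:
    a | [3, 4]
    operator_failed = False
except TypeError:
    operator_failed = True
print(operator_failed)  # True

# Converting the other operand to a set makes the operator form work
print(sorted(a | set([3, 4])))  # [1, 2, 3, 4]
```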