Sets

Learn how to work with sets in Python — collections of unique values with powerful operations.

A set is an unordered collection of unique values. If you add the same value twice, it appears only once: Python discards the duplicate for you.

numbers = {1, 2, 3, 4, 5}
fruits = {"apple", "banana", "mango"}

Two things make sets special:

  1. No duplicates — every value is unique
  2. No order — items have no index, no position

Creating a set

# empty set — must use set(), not {} (that creates an empty dict)
empty = set()

# set with values
fruits = {"apple", "banana", "mango"}

# set from a list — duplicates removed automatically
numbers = set([1, 2, 2, 3, 3, 3, 4])
print(numbers)   # {1, 2, 3, 4}

# set from a string — each character becomes an item
letters = set("hello")
print(letters)   # {'h', 'e', 'l', 'o'} — only one 'l'

To create an empty set, always use set() — not {}. An empty {} creates an empty dictionary, not a set.
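
A quick way to convince yourself of this is to check the types directly. This is a small sketch using only built-ins; the variable names are just for illustration:

```python
# {} is an empty dict; set() is an empty set
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)   # dict
print(type(empty_set).__name__)      # set

# braces with at least one value (and no key: value pairs) do create a set
one_item = {"apple"}
print(type(one_item).__name__)       # set
```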


Sets remove duplicates automatically

This is the most common reason people use sets — quick and easy deduplication:

names = ["Ali", "Sara", "Ali", "Omar", "Sara", "Ali"]

unique_names = set(names)
print(unique_names)   # {'Ali', 'Sara', 'Omar'}

If you want the result back as a list:

unique_names = list(set(names))

Sets are unordered — when you print a set, the order of items is not guaranteed. Do not rely on set order.
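
If you need the duplicates gone but the order kept, a set alone will not do it. One common approach, sketched here as an alternative and not the only option, is dict.fromkeys(), which relies on dictionaries keeping insertion order (Python 3.7 and later). Sorting the set is another way to get a predictable result:

```python
names = ["Ali", "Sara", "Ali", "Omar", "Sara", "Ali"]

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(names))
print(unique_in_order)   # ['Ali', 'Sara', 'Omar']

# sorting the set gives a predictable (alphabetical) order instead
unique_sorted = sorted(set(names))
print(unique_sorted)     # ['Ali', 'Omar', 'Sara']
```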


Accessing items

Sets have no index — you cannot do fruits[0]. You loop over them or check membership:

fruits = {"apple", "banana", "mango"}

# check if something is in the set
print("apple" in fruits)    # True
print("grape" in fruits)    # False

# loop over the set
for fruit in fruits:
    print(fruit)
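
The sketch below shows what happens if you try to index a set anyway, and one way to get a stable order when you need it: sorted() returns a new list, which does support positions. The variable names first and ordered are only for illustration:

```python
fruits = {"apple", "banana", "mango"}

# indexing is not supported on sets, so this raises TypeError
try:
    first = fruits[0]
except TypeError:
    first = None
    print("sets cannot be indexed")

# sorted() returns a list, so the loop order is predictable
for fruit in sorted(fruits):
    print(fruit)
# apple
# banana
# mango

# a sorted list also gives you positions if you really need them
ordered = sorted(fruits)
print(ordered[0])   # apple
```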

Adding and removing items

add() — add one item

fruits = {"apple", "banana"}

fruits.add("mango")
print(fruits)   # {'apple', 'banana', 'mango'}

# adding a duplicate does nothing
fruits.add("apple")
print(fruits)   # {'apple', 'banana', 'mango'} — unchanged

update() — add multiple items

fruits = {"apple", "banana"}

fruits.update(["mango", "grape", "orange"])
print(fruits)   # {'apple', 'banana', 'mango', 'grape', 'orange'}
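
One thing to watch for: update() accepts any iterable, and a string is an iterable of characters. The sketch below shows the difference between passing a bare string and passing a list that contains it. The copies letters_added and word_added exist only to keep the two cases separate:

```python
fruits = {"apple", "banana"}

# a bare string is treated as a sequence of characters
letters_added = set(fruits)          # copy so the original stays untouched
letters_added.update("kiwi")
print(sorted(letters_added))         # ['apple', 'banana', 'i', 'k', 'w']

# wrap the string in a list (or use add()) to add it as one item
word_added = set(fruits)
word_added.update(["kiwi"])
print(sorted(word_added))            # ['apple', 'banana', 'kiwi']
```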

remove() — remove an item, raises error if missing

fruits = {"apple", "banana", "mango"}

fruits.remove("banana")
print(fruits)   # {'apple', 'mango'}

fruits.remove("grape")   # KeyError — grape is not in the set

discard() — remove an item, no error if missing

fruits = {"apple", "banana", "mango"}

fruits.discard("banana")   # removes it
fruits.discard("grape")    # does nothing — no error

print(fruits)   # {'apple', 'mango'}

Use discard() when you are not sure if the item exists. Use remove() when you expect it to be there and want an error if it is not.
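
When you do want to know whether the item was there, you can catch the KeyError from remove(). The helper below, remove_or_report, is a hypothetical name used only for this sketch:

```python
fruits = {"apple", "banana", "mango"}

def remove_or_report(items, value):
    """Hypothetical helper: remove value and report whether it was present."""
    try:
        items.remove(value)
        return True
    except KeyError:
        # remove() raises KeyError when the value is not in the set
        return False

print(remove_or_report(fruits, "banana"))   # True
print(remove_or_report(fruits, "grape"))    # False
print(sorted(fruits))                       # ['apple', 'mango']
```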

pop() — remove and return an arbitrary item

fruits = {"apple", "banana", "mango"}

removed = fruits.pop()
print(removed)   # some item — unpredictable which one
print(fruits)    # remaining items
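
A typical use of pop() is draining a set when you do not care about the order. The sketch below uses made-up task names and also shows that calling pop() on an empty set raises KeyError:

```python
tasks = {"backup", "email", "report"}   # illustrative data
processed = []

# an empty set is falsy, so the loop stops once everything is popped
while tasks:
    processed.append(tasks.pop())

print(sorted(processed))   # ['backup', 'email', 'report']
print(tasks)               # set()

# pop() on an empty set raises KeyError
try:
    tasks.pop()
    pop_failed = False
except KeyError:
    pop_failed = True

print(pop_failed)          # True
```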

clear() — remove everything

fruits = {"apple", "banana", "mango"}
fruits.clear()
print(fruits)   # set()

Set operations

This is where sets become really powerful. Set operations let you compare and combine sets mathematically.

Operation             Operator   Notation   Result
Union                 A | B      A ∪ B      All items from both sets
Intersection          A & B      A ∩ B      Only items in both sets
Difference            A - B      A - B      Items in A but not in B
Symmetric difference  A ^ B      A △ B      Items in either set but not both

Let's use a real example. Two groups of students:

python_students = {"Ali", "Sara", "Omar", "Fatima"}
javascript_students = {"Sara", "Omar", "Zainab", "Ahmed"}

Union — all students from both groups

Everyone who studies either Python or JavaScript:

all_students = python_students | javascript_students
# or
all_students = python_students.union(javascript_students)

print(all_students)
# {'Ali', 'Sara', 'Omar', 'Fatima', 'Zainab', 'Ahmed'}

Intersection — students in both groups

Only students who study both Python and JavaScript:

both = python_students & javascript_students
# or
both = python_students.intersection(javascript_students)

print(both)
# {'Sara', 'Omar'}

Difference — students in one group but not the other

Students who study Python but not JavaScript:

only_python = python_students - javascript_students
# or
only_python = python_students.difference(javascript_students)

print(only_python)
# {'Ali', 'Fatima'}

Students who study JavaScript but not Python:

only_javascript = javascript_students - python_students
print(only_javascript)
# {'Zainab', 'Ahmed'}

Symmetric difference — students in one group but not both

Everyone who studies only one language — not both:

exclusive = python_students ^ javascript_students
# or
exclusive = python_students.symmetric_difference(javascript_students)

print(exclusive)
# {'Ali', 'Fatima', 'Zainab', 'Ahmed'}
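
The operator and method forms are not fully interchangeable. The methods accept any iterable, while the operators require both sides to be sets. A short sketch, where javascript_list is a made-up list version of the second group:

```python
python_students = {"Ali", "Sara", "Omar", "Fatima"}
javascript_list = ["Sara", "Omar", "Zainab", "Ahmed"]   # a list, not a set

# the method form accepts a list (or any other iterable)
both = python_students.intersection(javascript_list)
print(sorted(both))   # ['Omar', 'Sara']

# the operator form raises TypeError when one side is not a set
try:
    python_students & javascript_list
    operator_failed = False
except TypeError:
    operator_failed = True

print(operator_failed)   # True
```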

Subset and superset

Check if one set is completely contained within another:

admins = {"Ali", "Sara"}
all_users = {"Ali", "Sara", "Omar", "Fatima"}

# is admins a subset of all_users? (all admins are in all_users)
print(admins.issubset(all_users))      # True
print(admins <= all_users)             # True — same thing

# is all_users a superset of admins? (all_users contains all admins)
print(all_users.issuperset(admins))    # True
print(all_users >= admins)             # True — same thing
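
Note that <= and >= also return True when the two sets are equal. If you need a strict check (contained, but not identical), Python provides < and >. A minimal sketch with the same sets:

```python
admins = {"Ali", "Sara"}
all_users = {"Ali", "Sara", "Omar", "Fatima"}

# every set is a subset of itself
print(admins <= admins)       # True

# < is the strict version: subset AND not equal
print(admins < admins)        # False
print(admins < all_users)     # True

# > is the strict superset check
print(all_users > admins)     # True
```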

isdisjoint() — check if two sets share nothing

a = {1, 2, 3}
b = {4, 5, 6}
c = {3, 4, 5}

print(a.isdisjoint(b))   # True  — no common items
print(a.isdisjoint(c))   # False — both have 3

Frozenset — an immutable set

A frozenset is a set that cannot be changed after creation. No adding, no removing:

permissions = frozenset({"read", "write", "execute"})

permissions.add("delete")   # AttributeError — frozensets are immutable

Frozensets are hashable because they are immutable, so they can be used as dictionary keys or as items in another set. Regular sets cannot.
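
Here is a small sketch of a frozenset used as a dictionary key. The role names and the role_by_permissions mapping are invented for this example:

```python
# map a combination of permissions to a role name (illustrative data)
role_by_permissions = {
    frozenset({"read"}): "viewer",
    frozenset({"read", "write"}): "editor",
}

# element order does not matter when looking up a frozenset key
print(role_by_permissions[frozenset({"write", "read"})])   # editor

# a regular set is rejected as a key with TypeError
try:
    bad = {{"read"}: "viewer"}
    set_key_failed = False
except TypeError:
    set_key_failed = True

print(set_key_failed)   # True
```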


When to use sets

Situation                                      Use a set
Remove duplicates from a list                  Yes
Check if a value exists quickly                Yes
Find common items between two collections      Yes
Find items in one collection but not another   Yes
You need order or index access                 No, use a list
You need key-value pairs                       No, use a dict

Sets are very fast at checking membership. Checking x in list scans the items one by one, so it gets slower as the list grows. Checking x in set takes roughly the same time at any size, because sets are stored as hash tables.

# slow for large collections
big_list = list(range(1_000_000))
print(999999 in big_list)   # has to check each item

# fast no matter the size
big_set = set(range(1_000_000))
print(999999 in big_set)    # instant lookup
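
If you want to see the difference on your own machine, the standard timeit module can measure both lookups. This is a rough sketch, not a rigorous benchmark: the collection size and repeat count are arbitrary choices, and the exact timings will vary between machines.

```python
import timeit

size = 100_000
big_list = list(range(size))
big_set = set(big_list)
target = size - 1   # worst case for the list: the last item

# run each membership test 200 times and record the total seconds
list_seconds = timeit.timeit(lambda: target in big_list, number=200)
set_seconds = timeit.timeit(lambda: target in big_set, number=200)

print(f"list: {list_seconds:.4f} s")
print(f"set:  {set_seconds:.6f} s")
```

On a typical machine the set lookup should be faster by several orders of magnitude.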

A real example

Compare two days of site visitors to find who returned, who is new today, and who did not come back:

yesterday_visitors = {"ali", "sara", "omar", "fatima", "zainab"}
today_visitors = {"sara", "omar", "ahmed", "ali", "hassan"}

# users who visited both days
returning = yesterday_visitors & today_visitors
print(f"Returning visitors: {returning}")
# Returning visitors: {'ali', 'sara', 'omar'}

# users who are new today
new_today = today_visitors - yesterday_visitors
print(f"New today: {new_today}")
# New today: {'ahmed', 'hassan'}

# users who only visited yesterday and did not come back
lost = yesterday_visitors - today_visitors
print(f"Did not return: {lost}")
# Did not return: {'fatima', 'zainab'}

# all unique visitors across both days
total_unique = yesterday_visitors | today_visitors
print(f"Total unique visitors: {len(total_unique)}")
# Total unique visitors: 7
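
Sets are also handy for spotting repeats inside a single list, such as users who logged in more than once. The sketch below assumes a made-up login_log list and tracks names already seen:

```python
login_log = ["ali", "sara", "ali", "omar", "sara", "ali"]   # made-up data

seen = set()            # users encountered so far
repeat_logins = set()   # users encountered more than once

for user in login_log:
    if user in seen:
        repeat_logins.add(user)
    else:
        seen.add(user)

print(sorted(repeat_logins))   # ['ali', 'sara']
print(sorted(seen))            # ['ali', 'omar', 'sara']
```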

Summary

Method / Operator    What it does
add(x)               Add one item
update(iterable)     Add multiple items
remove(x)            Remove item, KeyError if missing
discard(x)           Remove item, no error if missing
pop()                Remove and return an arbitrary item
clear()              Remove all items
a | b                Union: all items from both
a & b                Intersection: items in both
a - b                Difference: items in a but not in b
a ^ b                Symmetric difference: items in one but not both
a <= b               True if a is a subset of b
a >= b               True if a is a superset of b
a.isdisjoint(b)      True if a and b have no common items
