Group Anagrams
Learn how to group words that are anagrams of each other in Python.
Group Anagrams
Problem
Two words are anagrams if they contain the same letters in any order.
"eat" and "tea" → anagrams (same letters: a, e, t)
"tan" and "nat" → anagrams (same letters: a, n, t)
"bat" → not an anagram of anything elseGiven a list of words, group all anagrams together.
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]
Input: ["abc", "bca", "cab", "xyz", "zyx"]
Output: [["abc", "bca", "cab"], ["xyz", "zyx"]]
Input: ["listen", "silent", "hello", "world", "enlist"]
Output: [["listen", "silent", "enlist"], ["hello"], ["world"]]
Input: ["a"]
Output: [["a"]]Logic
Key insight — sorted word as key:
If two words are anagrams, sorting their letters gives the same result:
"eat" → sorted → "aet"
"tea" → sorted → "aet"
"ate" → sorted → "aet"
"tan" → sorted → "ant"
"nat" → sorted → "ant"Use the sorted letters as the dictionary key. All anagrams will map to the same key.
- Create an empty dictionary
- For each word — sort its letters to get the key
- Add the word to the list at that key
- Return all the groups
Solution 1 — sorted letters as key
def group_anagrams(words):
groups = {} # key = sorted letters, value = list of anagrams
for word in words:
# sort the letters of the word — this is the key
# "eat" → ["a", "e", "t"] → "aet"
key = "".join(sorted(word))
if key in groups:
groups[key].append(word) # add to existing group
else:
groups[key] = [word] # start a new group
# return just the groups — not the keys
return list(groups.values())
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
words2 = ["abc", "bca", "cab", "xyz", "zyx"]
print(group_anagrams(words2))
# [['abc', 'bca', 'cab'], ['xyz', 'zyx']]
words3 = ["listen", "silent", "hello", "world", "enlist"]
print(group_anagrams(words3))
# [['listen', 'silent', 'enlist'], ['hello'], ['world']]Code Execution — Solution 1
Trace through group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]):
| Step | word | sorted(word) | key | Action | groups |
|---|---|---|---|---|---|
| 1st | "eat" | ["a","e","t"] | "aet" | New group | {"aet": ["eat"]} |
| 2nd | "tea" | ["a","e","t"] | "aet" | Add to group | {"aet": ["eat","tea"]} |
| 3rd | "tan" | ["a","n","t"] | "ant" | New group | {"aet": ["eat","tea"], "ant": ["tan"]} |
| 4th | "ate" | ["a","e","t"] | "aet" | Add to group | {"aet": ["eat","tea","ate"], "ant": ["tan"]} |
| 5th | "nat" | ["a","n","t"] | "ant" | Add to group | {"aet": [...], "ant": ["tan","nat"]} |
| 6th | "bat" | ["a","b","t"] | "abt" | New group | {"aet": [...], "ant": [...], "abt": ["bat"]} |
groups.values() = [["eat","tea","ate"], ["tan","nat"], ["bat"]]
"".join(sorted(word)) works in two steps. sorted("eat") returns ["a", "e", "t"] — a list of characters in alphabetical order. "".join(...) joins them back into a string "aet". This string is a reliable key — any anagram of "eat" will always produce "aet".
Solution 2 — using defaultdict
defaultdict(list) removes the need to check if a key exists — it creates an empty list automatically for new keys.
from collections import defaultdict
def group_anagrams(words):
# defaultdict(list) creates [] automatically for new keys
groups = defaultdict(list)
for word in words:
key = "".join(sorted(word))
groups[key].append(word) # no need to check if key exists
return list(groups.values())
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]Code Execution — Solution 2
Trace through group_anagrams(["eat", "tea", "bat"]):
| Step | word | key | groups[key] before | After .append() |
|---|---|---|---|---|
| 1st | "eat" | "aet" | [] (auto-created) | ["eat"] |
| 2nd | "tea" | "aet" | ["eat"] | ["eat", "tea"] |
| 3rd | "bat" | "abt" | [] (auto-created) | ["bat"] |
defaultdict automatically creates an empty list for "abt" when first accessed — no if key in groups needed.
Solution 3 — using character frequency as key
Instead of sorting, count the frequency of each letter. Two anagrams will always have the same character frequencies.
def group_anagrams(words):
from collections import defaultdict
groups = defaultdict(list)
for word in words:
# count frequency of each letter
# using a tuple of 26 counts (one per letter a-z)
counts = [0] * 26
for char in word:
# ord("a") = 97, ord("b") = 98, etc.
# char index = ord(char) - ord("a")
counts[ord(char) - ord("a")] += 1
# tuple is hashable — can be used as dict key
key = tuple(counts)
groups[key].append(word)
return list(groups.values())
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]Code Execution — Solution 3
Trace how the key is built for "eat" and "tea":
For "eat":
"e"→ index4→counts[4] += 1"a"→ index0→counts[0] += 1"t"→ index19→counts[19] += 1
counts = [1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0]
(index 0=a, 4=e, 19=t each have count 1)
For "tea":
"t"→ index19→counts[19] += 1"e"→ index4→counts[4] += 1"a"→ index0→counts[0] += 1
Same counts → same key → same group ✅
The character frequency approach avoids sorting — it is O(n) per word instead of O(n log n). For very long words, this is faster. For typical short words the difference is negligible, and the sorted approach is simpler to read.
Solution 4 — with sorting and case insensitive
Handle uppercase, lowercase, and punctuation in real sentences.
from collections import defaultdict
def group_anagrams(words, case_sensitive=False):
groups = defaultdict(list)
for word in words:
# normalize — lowercase and remove non-letters
clean = "".join(c for c in word.lower() if c.isalpha())
key = "".join(sorted(clean))
groups[key].append(word) # store original word
return list(groups.values())
# case insensitive — "Eat" and "tea" are anagrams
words = ["Eat", "Tea", "TAN", "ate", "Nat", "BAT"]
print(group_anagrams(words))
# [['Eat', 'Tea', 'ate'], ['TAN', 'Nat'], ['BAT']]
# real words
words2 = ["listen", "silent", "ENLIST", "hello", "world"]
print(group_anagrams(words2))
# [['listen', 'silent', 'ENLIST'], ['hello'], ['world']]Code Execution — Solution 4
Trace through ["Eat", "Tea", "TAN"]:
word | .lower() | isalpha() filter | sorted | key |
|---|---|---|---|---|
"Eat" | "eat" | "eat" | ["a","e","t"] | "aet" |
"Tea" | "tea" | "tea" | ["a","e","t"] | "aet" |
"TAN" | "tan" | "tan" | ["a","n","t"] | "ant" |
"Eat" and "Tea" share key "aet" → same group.
"TAN" gets key "ant" → own group.
Bonus — check if two words are anagrams
A simpler version — just check if two specific words are anagrams of each other.
def are_anagrams(word1, word2):
# clean both words
clean1 = "".join(sorted(word1.lower()))
clean2 = "".join(sorted(word2.lower()))
return clean1 == clean2
print(are_anagrams("listen", "silent")) # True
print(are_anagrams("hello", "world")) # False
print(are_anagrams("Astronomer", "Moon starer")) # False — spaces differBonus — anagram check ignoring spaces
def are_anagrams(word1, word2):
# remove spaces and sort
clean1 = "".join(sorted(word1.lower().replace(" ", "")))
clean2 = "".join(sorted(word2.lower().replace(" ", "")))
return clean1 == clean2
print(are_anagrams("Astronomer", "Moon starer")) # True
print(are_anagrams("listen", "silent")) # True
print(are_anagrams("The eyes", "They see")) # TrueBonus — find all anagram pairs in a list
from collections import defaultdict
from itertools import combinations
def find_anagram_pairs(words):
"""Return all pairs of words that are anagrams."""
pairs = []
for word1, word2 in combinations(words, 2):
if sorted(word1.lower()) == sorted(word2.lower()):
pairs.append((word1, word2))
return pairs
words = ["eat", "tea", "tan", "ate", "nat", "bat"]
pairs = find_anagram_pairs(words)
for pair in pairs:
print(f"{pair[0]} ↔ {pair[1]}")Output:
eat ↔ tea
eat ↔ ate
tea ↔ ate
tan ↔ natWhich solution to use?
| Solution | How | Best when |
|---|---|---|
| Solution 1 | Sort letters as key | Clear and readable — default choice |
| Solution 2 | defaultdict + sort | Cleaner version of Solution 1 |
| Solution 3 | Character frequency | Very long words where sort speed matters |
| Solution 4 | Case insensitive | Real world text with mixed case |
Output
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
[['abc', 'bca', 'cab'], ['xyz', 'zyx']]
[['listen', 'silent', 'enlist'], ['hello'], ['world']]
[['a']]