Group Anagrams

Problem

Two words are anagrams if they contain the same letters in any order.

"eat" and "tea" → anagrams   (same letters: a, e, t)
"tan" and "nat" → anagrams   (same letters: a, n, t)
"bat"           → not an anagram of anything else

Given a list of words, group all anagrams together.

Input:  ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]

Input:  ["abc", "bca", "cab", "xyz", "zyx"]
Output: [["abc", "bca", "cab"], ["xyz", "zyx"]]

Input:  ["listen", "silent", "hello", "world", "enlist"]
Output: [["listen", "silent", "enlist"], ["hello"], ["world"]]

Input:  ["a"]
Output: [["a"]]

Logic

Key insight — sorted word as key:

If two words are anagrams, sorting their letters gives the same result:

"eat" → sorted → "aet"
"tea" → sorted → "aet"
"ate" → sorted → "aet"

"tan" → sorted → "ant"
"nat" → sorted → "ant"

Use the sorted letters as the dictionary key. All anagrams will map to the same key.

Create an empty dictionary
For each word — sort its letters to get the key
Add the word to the list at that key
Return all the groups

Solution 1 — sorted letters as key

def group_anagrams(words):
    groups = {}   # key = sorted letters, value = list of anagrams

    for word in words:
        # sort the letters of the word — this is the key
        # "eat" → ["a", "e", "t"] → "aet"
        key = "".join(sorted(word))

        if key in groups:
            groups[key].append(word)    # add to existing group
        else:
            groups[key] = [word]        # start a new group

    # return just the groups — not the keys
    return list(groups.values())


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

words2 = ["abc", "bca", "cab", "xyz", "zyx"]
print(group_anagrams(words2))
# [['abc', 'bca', 'cab'], ['xyz', 'zyx']]

words3 = ["listen", "silent", "hello", "world", "enlist"]
print(group_anagrams(words3))
# [['listen', 'silent', 'enlist'], ['hello'], ['world']]

Code Execution — Solution 1

Trace through group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]):

Step	`word`	`sorted(word)`	`key`	Action	`groups`
1st	`"eat"`	`["a","e","t"]`	`"aet"`	New group	`{"aet": ["eat"]}`
2nd	`"tea"`	`["a","e","t"]`	`"aet"`	Add to group	`{"aet": ["eat","tea"]}`
3rd	`"tan"`	`["a","n","t"]`	`"ant"`	New group	`{"aet": ["eat","tea"], "ant": ["tan"]}`
4th	`"ate"`	`["a","e","t"]`	`"aet"`	Add to group	`{"aet": ["eat","tea","ate"], "ant": ["tan"]}`
5th	`"nat"`	`["a","n","t"]`	`"ant"`	Add to group	`{"aet": [...], "ant": ["tan","nat"]}`
6th	`"bat"`	`["a","b","t"]`	`"abt"`	New group	`{"aet": [...], "ant": [...], "abt": ["bat"]}`

groups.values() = [["eat","tea","ate"], ["tan","nat"], ["bat"]]

"".join(sorted(word)) works in two steps. sorted("eat") returns ["a", "e", "t"] — a list of characters in alphabetical order. "".join(...) joins them back into a string "aet". This string is a reliable key — any anagram of "eat" will always produce "aet".

Solution 2 — using defaultdict

defaultdict(list) removes the need to check if a key exists — it creates an empty list automatically for new keys.

from collections import defaultdict

def group_anagrams(words):
    # defaultdict(list) creates [] automatically for new keys
    groups = defaultdict(list)

    for word in words:
        key = "".join(sorted(word))
        groups[key].append(word)   # no need to check if key exists

    return list(groups.values())


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Code Execution — Solution 2

Trace through group_anagrams(["eat", "tea", "bat"]):

Step	`word`	`key`	`groups[key]` before	After `.append()`
1st	`"eat"`	`"aet"`	`[]` (auto-created)	`["eat"]`
2nd	`"tea"`	`"aet"`	`["eat"]`	`["eat", "tea"]`
3rd	`"bat"`	`"abt"`	`[]` (auto-created)	`["bat"]`

defaultdict automatically creates an empty list for "abt" when first accessed — no if key in groups needed.

Solution 3 — using character frequency as key

Instead of sorting, count the frequency of each letter. Two anagrams will always have the same character frequencies.

def group_anagrams(words):
    from collections import defaultdict

    groups = defaultdict(list)

    for word in words:
        # count frequency of each letter
        # using a tuple of 26 counts (one per letter a-z)
        counts = [0] * 26

        for char in word:
            # ord("a") = 97, ord("b") = 98, etc.
            # char index = ord(char) - ord("a")
            counts[ord(char) - ord("a")] += 1

        # tuple is hashable — can be used as dict key
        key = tuple(counts)
        groups[key].append(word)

    return list(groups.values())


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Code Execution — Solution 3

Trace how the key is built for "eat" and "tea":

For "eat":

"e" → index 4 → counts[4] += 1
"a" → index 0 → counts[0] += 1
"t" → index 19 → counts[19] += 1

counts = [1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0] (index 0=a, 4=e, 19=t each have count 1)

For "tea":

"t" → index 19 → counts[19] += 1
"e" → index 4 → counts[4] += 1
"a" → index 0 → counts[0] += 1

Same counts → same key → same group ✅

The character frequency approach avoids sorting — it is O(n) per word instead of O(n log n). For very long words, this is faster. For typical short words the difference is negligible, and the sorted approach is simpler to read.

Solution 4 — with sorting and case insensitive

Handle uppercase, lowercase, and punctuation in real sentences.

from collections import defaultdict

def group_anagrams(words, case_sensitive=False):
    groups = defaultdict(list)

    for word in words:
        # normalize — lowercase and remove non-letters
        clean = "".join(c for c in word.lower() if c.isalpha())
        key = "".join(sorted(clean))
        groups[key].append(word)   # store original word

    return list(groups.values())


# case insensitive — "Eat" and "tea" are anagrams
words = ["Eat", "Tea", "TAN", "ate", "Nat", "BAT"]
print(group_anagrams(words))
# [['Eat', 'Tea', 'ate'], ['TAN', 'Nat'], ['BAT']]

# real words
words2 = ["listen", "silent", "ENLIST", "hello", "world"]
print(group_anagrams(words2))
# [['listen', 'silent', 'ENLIST'], ['hello'], ['world']]

Code Execution — Solution 4

Trace through ["Eat", "Tea", "TAN"]:

`word`	`.lower()`	`isalpha()` filter	`sorted`	`key`
`"Eat"`	`"eat"`	`"eat"`	`["a","e","t"]`	`"aet"`
`"Tea"`	`"tea"`	`"tea"`	`["a","e","t"]`	`"aet"`
`"TAN"`	`"tan"`	`"tan"`	`["a","n","t"]`	`"ant"`

"Eat" and "Tea" share key "aet" → same group. "TAN" gets key "ant" → own group.

Bonus — check if two words are anagrams

A simpler version — just check if two specific words are anagrams of each other.

def are_anagrams(word1, word2):
    # clean both words
    clean1 = "".join(sorted(word1.lower()))
    clean2 = "".join(sorted(word2.lower()))
    return clean1 == clean2


print(are_anagrams("listen", "silent"))   # True
print(are_anagrams("hello", "world"))     # False
print(are_anagrams("Astronomer", "Moon starer"))   # False — spaces differ

Bonus — anagram check ignoring spaces

def are_anagrams(word1, word2):
    # remove spaces and sort
    clean1 = "".join(sorted(word1.lower().replace(" ", "")))
    clean2 = "".join(sorted(word2.lower().replace(" ", "")))
    return clean1 == clean2


print(are_anagrams("Astronomer", "Moon starer"))   # True
print(are_anagrams("listen", "silent"))            # True
print(are_anagrams("The eyes", "They see"))        # True

Bonus — find all anagram pairs in a list

from collections import defaultdict
from itertools import combinations

def find_anagram_pairs(words):
    """Return all pairs of words that are anagrams."""
    pairs = []

    for word1, word2 in combinations(words, 2):
        if sorted(word1.lower()) == sorted(word2.lower()):
            pairs.append((word1, word2))

    return pairs


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
pairs = find_anagram_pairs(words)

for pair in pairs:
    print(f"{pair[0]} ↔ {pair[1]}")

Output:

eat ↔ tea
eat ↔ ate
tea ↔ ate
tan ↔ nat

Which solution to use?

Solution	How	Best when
Solution 1	Sort letters as key	Clear and readable — default choice
Solution 2	`defaultdict` + sort	Cleaner version of Solution 1
Solution 3	Character frequency	Very long words where sort speed matters
Solution 4	Case insensitive	Real world text with mixed case

Output

[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
[['abc', 'bca', 'cab'], ['xyz', 'zyx']]
[['listen', 'silent', 'enlist'], ['hello'], ['world']]
[['a']]

Group Anagrams

On this page