DocsHub
PythonAdvanced

Group Anagrams

Learn how to group words that are anagrams of each other in Python.

Group Anagrams

Problem

Two words are anagrams if they contain the same letters in any order.

"eat" and "tea" → anagrams   (same letters: a, e, t)
"tan" and "nat" → anagrams   (same letters: a, n, t)
"bat"           → not an anagram of anything else

Given a list of words, group all anagrams together.

Input:  ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]

Input:  ["abc", "bca", "cab", "xyz", "zyx"]
Output: [["abc", "bca", "cab"], ["xyz", "zyx"]]

Input:  ["listen", "silent", "hello", "world", "enlist"]
Output: [["listen", "silent", "enlist"], ["hello"], ["world"]]

Input:  ["a"]
Output: [["a"]]

Logic

Key insight — sorted word as key:

If two words are anagrams, sorting their letters gives the same result:

"eat" → sorted → "aet"
"tea" → sorted → "aet"
"ate" → sorted → "aet"

"tan" → sorted → "ant"
"nat" → sorted → "ant"

Use the sorted letters as the dictionary key. All anagrams will map to the same key.

  1. Create an empty dictionary
  2. For each word — sort its letters to get the key
  3. Add the word to the list at that key
  4. Return all the groups

Solution 1 — sorted letters as key

def group_anagrams(words):
    groups = {}   # key = sorted letters, value = list of anagrams

    for word in words:
        # sort the letters of the word — this is the key
        # "eat" → ["a", "e", "t"] → "aet"
        key = "".join(sorted(word))

        if key in groups:
            groups[key].append(word)    # add to existing group
        else:
            groups[key] = [word]        # start a new group

    # return just the groups — not the keys
    return list(groups.values())


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

words2 = ["abc", "bca", "cab", "xyz", "zyx"]
print(group_anagrams(words2))
# [['abc', 'bca', 'cab'], ['xyz', 'zyx']]

words3 = ["listen", "silent", "hello", "world", "enlist"]
print(group_anagrams(words3))
# [['listen', 'silent', 'enlist'], ['hello'], ['world']]

Code Execution — Solution 1

Trace through group_anagrams(["eat", "tea", "tan", "ate", "nat", "bat"]):

Stepwordsorted(word)keyActiongroups
1st"eat"["a","e","t"]"aet"New group{"aet": ["eat"]}
2nd"tea"["a","e","t"]"aet"Add to group{"aet": ["eat","tea"]}
3rd"tan"["a","n","t"]"ant"New group{"aet": ["eat","tea"], "ant": ["tan"]}
4th"ate"["a","e","t"]"aet"Add to group{"aet": ["eat","tea","ate"], "ant": ["tan"]}
5th"nat"["a","n","t"]"ant"Add to group{"aet": [...], "ant": ["tan","nat"]}
6th"bat"["a","b","t"]"abt"New group{"aet": [...], "ant": [...], "abt": ["bat"]}

groups.values() = [["eat","tea","ate"], ["tan","nat"], ["bat"]]

"".join(sorted(word)) works in two steps. sorted("eat") returns ["a", "e", "t"] — a list of characters in alphabetical order. "".join(...) joins them back into a string "aet". This string is a reliable key — any anagram of "eat" will always produce "aet".


Solution 2 — using defaultdict

defaultdict(list) removes the need to check if a key exists — it creates an empty list automatically for new keys.

from collections import defaultdict

def group_anagrams(words):
    # defaultdict(list) creates [] automatically for new keys
    groups = defaultdict(list)

    for word in words:
        key = "".join(sorted(word))
        groups[key].append(word)   # no need to check if key exists

    return list(groups.values())


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Code Execution — Solution 2

Trace through group_anagrams(["eat", "tea", "bat"]):

Stepwordkeygroups[key] beforeAfter .append()
1st"eat""aet"[] (auto-created)["eat"]
2nd"tea""aet"["eat"]["eat", "tea"]
3rd"bat""abt"[] (auto-created)["bat"]

defaultdict automatically creates an empty list for "abt" when first accessed — no if key in groups needed.


Solution 3 — using character frequency as key

Instead of sorting, count the frequency of each letter. Two anagrams will always have the same character frequencies.

def group_anagrams(words):
    from collections import defaultdict

    groups = defaultdict(list)

    for word in words:
        # count frequency of each letter
        # using a tuple of 26 counts (one per letter a-z)
        counts = [0] * 26

        for char in word:
            # ord("a") = 97, ord("b") = 98, etc.
            # char index = ord(char) - ord("a")
            counts[ord(char) - ord("a")] += 1

        # tuple is hashable — can be used as dict key
        key = tuple(counts)
        groups[key].append(word)

    return list(groups.values())


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(group_anagrams(words))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Code Execution — Solution 3

Trace how the key is built for "eat" and "tea":

For "eat":

  • "e" → index 4counts[4] += 1
  • "a" → index 0counts[0] += 1
  • "t" → index 19counts[19] += 1

counts = [1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0] (index 0=a, 4=e, 19=t each have count 1)

For "tea":

  • "t" → index 19counts[19] += 1
  • "e" → index 4counts[4] += 1
  • "a" → index 0counts[0] += 1

Same counts → same key → same group ✅

The character frequency approach avoids sorting — it is O(n) per word instead of O(n log n). For very long words, this is faster. For typical short words the difference is negligible, and the sorted approach is simpler to read.


Solution 4 — with sorting and case insensitive

Handle uppercase, lowercase, and punctuation in real sentences.

from collections import defaultdict

def group_anagrams(words, case_sensitive=False):
    groups = defaultdict(list)

    for word in words:
        # normalize — lowercase and remove non-letters
        clean = "".join(c for c in word.lower() if c.isalpha())
        key = "".join(sorted(clean))
        groups[key].append(word)   # store original word

    return list(groups.values())


# case insensitive — "Eat" and "tea" are anagrams
words = ["Eat", "Tea", "TAN", "ate", "Nat", "BAT"]
print(group_anagrams(words))
# [['Eat', 'Tea', 'ate'], ['TAN', 'Nat'], ['BAT']]

# real words
words2 = ["listen", "silent", "ENLIST", "hello", "world"]
print(group_anagrams(words2))
# [['listen', 'silent', 'ENLIST'], ['hello'], ['world']]

Code Execution — Solution 4

Trace through ["Eat", "Tea", "TAN"]:

word.lower()isalpha() filtersortedkey
"Eat""eat""eat"["a","e","t"]"aet"
"Tea""tea""tea"["a","e","t"]"aet"
"TAN""tan""tan"["a","n","t"]"ant"

"Eat" and "Tea" share key "aet" → same group. "TAN" gets key "ant" → own group.


Bonus — check if two words are anagrams

A simpler version — just check if two specific words are anagrams of each other.

def are_anagrams(word1, word2):
    # clean both words
    clean1 = "".join(sorted(word1.lower()))
    clean2 = "".join(sorted(word2.lower()))
    return clean1 == clean2


print(are_anagrams("listen", "silent"))   # True
print(are_anagrams("hello", "world"))     # False
print(are_anagrams("Astronomer", "Moon starer"))   # False — spaces differ

Bonus — anagram check ignoring spaces

def are_anagrams(word1, word2):
    # remove spaces and sort
    clean1 = "".join(sorted(word1.lower().replace(" ", "")))
    clean2 = "".join(sorted(word2.lower().replace(" ", "")))
    return clean1 == clean2


print(are_anagrams("Astronomer", "Moon starer"))   # True
print(are_anagrams("listen", "silent"))            # True
print(are_anagrams("The eyes", "They see"))        # True

Bonus — find all anagram pairs in a list

from collections import defaultdict
from itertools import combinations

def find_anagram_pairs(words):
    """Return all pairs of words that are anagrams."""
    pairs = []

    for word1, word2 in combinations(words, 2):
        if sorted(word1.lower()) == sorted(word2.lower()):
            pairs.append((word1, word2))

    return pairs


words = ["eat", "tea", "tan", "ate", "nat", "bat"]
pairs = find_anagram_pairs(words)

for pair in pairs:
    print(f"{pair[0]}{pair[1]}")

Output:

eat ↔ tea
eat ↔ ate
tea ↔ ate
tan ↔ nat

Which solution to use?

SolutionHowBest when
Solution 1Sort letters as keyClear and readable — default choice
Solution 2defaultdict + sortCleaner version of Solution 1
Solution 3Character frequencyVery long words where sort speed matters
Solution 4Case insensitiveReal world text with mixed case

Output

[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
[['abc', 'bca', 'cab'], ['xyz', 'zyx']]
[['listen', 'silent', 'enlist'], ['hello'], ['world']]
[['a']]

On this page