Generators

Learn what generators are, how yield works, and why generators are useful in Python.

Before understanding generators, think about this problem.

You need to work with a million numbers. The normal approach — store them all in a list:

numbers = list(range(1_000_000))

This creates all one million numbers right now and stores them in memory. That is a lot of memory used at once, even if you only ever look at them one at a time.

A generator solves this. Instead of creating all the values at once and storing them, it creates values one at a time, only when you ask for the next one. The rest do not exist yet.

Think of it like a tap vs a bucket. A bucket holds all the water at once. A tap gives you water only when you turn it on, and only as much as you ask for.


A normal function vs a generator

A normal function runs, returns a value, and is done:

def get_numbers():
    return [1, 2, 3, 4, 5]

numbers = get_numbers()
print(numbers)   # [1, 2, 3, 4, 5]

All five numbers are created and returned at once.

A generator function uses yield instead of return:

def get_numbers():
    yield 1
    yield 2
    yield 3
    yield 4
    yield 5

gen = get_numbers()
print(gen)   # <generator object get_numbers at 0x...>

Notice — calling get_numbers() did not run the function at all. It gave back a generator object. The function is paused at the very beginning, waiting.
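
One way to confirm that the body really has not started is to give it a visible side effect. The sketch below uses a hypothetical events list and a hypothetical traced_numbers function (neither appears above) to record when the body actually runs:

```python
events = []  # hypothetical log, used only for this illustration

def traced_numbers():
    events.append("body started")
    yield 1
    events.append("resumed after first yield")
    yield 2

gen = traced_numbers()
events_after_call = list(events)         # snapshot: nothing has run yet

first = next(gen)
events_after_first_next = list(events)   # body ran up to the first yield

print(events_after_call)         # []
print(first)                     # 1
print(events_after_first_next)   # ['body started']
```

The body only starts on the first next() call, and it stops again as soon as it reaches the first yield.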


yield — the heart of generators

yield is what makes a generator. When Python hits a yield:

  1. It gives back that value to the caller
  2. It pauses the function right there — remembering exactly where it stopped
  3. Next time you ask for a value, it resumes from where it paused

def get_numbers():
    print("About to yield 1")
    yield 1
    print("About to yield 2")
    yield 2
    print("About to yield 3")
    yield 3

gen = get_numbers()

print(next(gen))   # prints "About to yield 1", then 1
print(next(gen))   # prints "About to yield 2", then 2
print(next(gen))   # prints "About to yield 3", then 3

Output:

About to yield 1
1
About to yield 2
2
About to yield 3
3

next() tells the generator — give me the next value. Each call resumes the function from where it last paused.

When there are no more values, calling next() raises a StopIteration exception:

print(next(gen))   # raises StopIteration
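
If you do call next() by hand, there are two common ways to handle the end of the generator. This sketch uses a shortened get_numbers and shows both: catching StopIteration, and passing a default as the second argument to next(). The value "done" is just an illustrative choice.

```python
def get_numbers():
    yield 1
    yield 2

gen = get_numbers()
values = [next(gen), next(gen)]   # consume both values

# Option 1: catch the exception explicitly
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True

# Option 2: give next() a default to return instead of raising
fallback = next(gen, "done")

print(values)     # [1, 2]
print(finished)   # True
print(fallback)   # done
```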

In practice you rarely call next() manually. You use a for loop — it handles StopIteration automatically and stops cleanly:

def get_numbers():
    yield 1
    yield 2
    yield 3

for number in get_numbers():
    print(number)

Output:

1
2
3
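
Roughly speaking, the for loop above behaves like the manual loop sketched below. This is a simplified picture of the iteration protocol, not the exact interpreter implementation; the collected list is added only so the result can be inspected.

```python
def get_numbers():
    yield 1
    yield 2
    yield 3

collected = []
iterator = iter(get_numbers())   # a generator is its own iterator
while True:
    try:
        number = next(iterator)
    except StopIteration:
        break                    # the for loop does this silently
    collected.append(number)
    print(number)
```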

yield vs return

| | return | yield |
|---|---|---|
| Gives back | One value, then done | One value, then pauses |
| Function state | Lost after return | Remembered until next call |
| Can be called again | Starts fresh each time | Resumes from where it paused |
| Returns | A value | A generator object |

A real generator — counting up

def count_up(start, end):
    current = start
    while current <= end:
        yield current
        current += 1

for number in count_up(1, 5):
    print(number)

Output:

1
2
3
4
5

Each iteration of the for loop asks for the next value. The generator runs until the yield, pauses, then resumes from there next time.


Why generators save memory

Here is the difference clearly:

# Normal list — creates all 1 million numbers RIGHT NOW
numbers = list(range(1_000_000))   # the list alone needs ~8 MB on 64-bit CPython, plus the int objects it points to

# Generator — creates numbers ONE AT A TIME as needed
def generate_numbers(n):
    for i in range(n):
        yield i

gen = generate_numbers(1_000_000)   # uses almost no memory

The generator object itself is tiny — it just remembers where it is and what to do next. The million numbers are never all in memory at the same time.
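
To see the gap on your own machine, sys.getsizeof gives a rough measurement. It reports only the size of the container object itself (not the integers a list points to), and the exact numbers vary by Python version and platform, so treat the output as indicative rather than exact.

```python
import sys

def generate_numbers(n):
    for i in range(n):
        yield i

numbers = list(range(1_000_000))
gen = generate_numbers(1_000_000)

list_size = sys.getsizeof(numbers)   # size of the list object only
gen_size = sys.getsizeof(gen)        # size of the generator object

print(f"list:      {list_size:,} bytes")
print(f"generator: {gen_size:,} bytes")
```

The generator's size stays the same no matter how large n is, because it stores only its current state.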

A practical example — reading a huge file line by line:

def read_large_file(filepath):
    with open(filepath, "r") as f:
        for line in f:
            yield line.strip()

for line in read_large_file("big_log_file.txt"):
    print(line)

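If the file has 10 million lines, this still holds only one line in memory at a time. Loading the whole file at once, for example with f.read() or f.readlines(), needs memory proportional to the file size and can fail with a MemoryError on very large files.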
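
A self-contained way to try this without a real log file is to write a small temporary file first. The file contents and the "ERROR" filter below are made up for the illustration.

```python
import os
import tempfile

def read_large_file(filepath):
    with open(filepath, "r") as f:
        for line in f:
            yield line.strip()

# Create a small stand-in for a large log file (hypothetical contents)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    path = tmp.name

# Count matching lines; only one line is held in memory at a time
error_count = sum(1 for line in read_large_file(path) if line.startswith("ERROR"))
print(error_count)   # 2

os.remove(path)      # clean up the temporary file
```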


Generator with logic — only yield what you need

Generators can have conditions, loops, anything a normal function can have:

def even_numbers(limit):
    for n in range(limit):
        if n % 2 == 0:
            yield n

for num in even_numbers(20):
    print(num, end=" ")

Output:

0 2 4 6 8 10 12 14 16 18

Generator expressions — one line generators

Just like list comprehensions create lists in one line, generator expressions create generators in one line. The only difference — use () instead of []:

# list comprehension — creates the whole list immediately
squares_list = [x ** 2 for x in range(10)]

# generator expression — creates values one at a time
squares_gen = (x ** 2 for x in range(10))

print(squares_list)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(squares_gen)    # <generator object <genexpr> at 0x...>

for square in squares_gen:
    print(square, end=" ")
# 0 1 4 9 16 25 36 49 64 81

Generator expressions are great when you want to pass a sequence to a function without building the whole list first:

# sum a million squares without building a list
total = sum(x ** 2 for x in range(1_000_000))
print(total)

This is memory efficient — sum() processes each value one at a time.

Notice there are no double parentheses here — sum(x ** 2 for x in range(10)) not sum((x ** 2 for x in range(10))). When a generator expression is the only argument to a function, you can drop the extra parentheses.
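
The shortcut applies only when the generator expression is the sole argument. As soon as you pass a second argument, the expression needs its own parentheses. The start value of 100 here is an arbitrary choice for illustration.

```python
# Sole argument: the call's own parentheses are enough
total = sum(x ** 2 for x in range(10))

# Second argument present: the generator expression needs its own parentheses
total_with_start = sum((x ** 2 for x in range(10)), 100)

print(total)              # 285
print(total_with_start)   # 385
```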


Chaining generators

Generators can feed into each other. Each one processes the output of the previous one — like a pipeline. None of them build a full list at any point:

def get_orders(orders):
    for order in orders:
        yield order

def filter_delivered(orders):
    for order in orders:
        if order["status"] == "delivered":
            yield order

def get_totals(orders):
    for order in orders:
        yield order["total"]

orders = [
    {"id": 1, "status": "delivered", "total": 250},
    {"id": 2, "status": "pending",   "total": 89},
    {"id": 3, "status": "delivered", "total": 540},
    {"id": 4, "status": "cancelled", "total": 120},
    {"id": 5, "status": "delivered", "total": 75},
]

pipeline = get_totals(filter_delivered(get_orders(orders)))

for total in pipeline:
    print(total)

Output:

250
540
75

Each generator does one job. They are chained together and data flows through one item at a time.
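
To check that items really move through the pipeline one at a time, you can record when each stage touches an order. The trace list and the shortened stage functions here are illustrative stand-ins for the functions above.

```python
trace = []  # hypothetical event log, used only for this illustration

def filter_delivered(orders):
    for order in orders:
        trace.append(f"filter saw {order['id']}")
        if order["status"] == "delivered":
            yield order

def get_totals(orders):
    for order in orders:
        trace.append(f"total for {order['id']}")
        yield order["total"]

orders = [
    {"id": 1, "status": "delivered", "total": 250},
    {"id": 2, "status": "pending",   "total": 89},
    {"id": 3, "status": "delivered", "total": 540},
]

totals = list(get_totals(filter_delivered(orders)))

print(totals)   # [250, 540]
for event in trace:
    print(event)
```

The trace shows order 1 passing through both stages before the filter even looks at order 2, which is what "one item at a time" means in practice.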


A real example — generating invoice numbers

def invoice_number_generator(prefix, start=1):
    number = start
    while True:
        yield f"{prefix}-{number:04d}"
        number += 1

invoices = invoice_number_generator("INV")

print(next(invoices))   # INV-0001
print(next(invoices))   # INV-0002
print(next(invoices))   # INV-0003

This generator never finishes on its own: while True with a yield is a common pattern for infinite sequences. Because it only produces the next value when you ask for it, it does no work between calls and only runs as far as you pull from it.

An infinite generator is fine as long as you control when to stop — with a break in a loop, or by only calling next() a specific number of times. Never iterate over an infinite generator with a plain for loop without a break — it will run forever.
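
One tidy way to take a bounded number of values from an infinite generator is itertools.islice from the standard library. The sketch below shows it next to an explicit break; the counts and the start value of 10 are arbitrary.

```python
from itertools import islice

def invoice_number_generator(prefix, start=1):
    number = start
    while True:
        yield f"{prefix}-{number:04d}"
        number += 1

# Take exactly three values, then stop asking
first_three = list(islice(invoice_number_generator("INV"), 3))
print(first_three)   # ['INV-0001', 'INV-0002', 'INV-0003']

# The same idea with an explicit break
batch = []
for invoice in invoice_number_generator("INV", start=10):
    if len(batch) == 2:
        break
    batch.append(invoice)
print(batch)   # ['INV-0010', 'INV-0011']
```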


When to use generators

| Situation | Use generator? |
|---|---|
| Working with a large dataset that does not fit in memory | Yes |
| Reading a large file line by line | Yes |
| Generating an infinite sequence | Yes |
| Building a data processing pipeline | Yes |
| Small list you need to access multiple times | No, use a list |
| You need to know the length upfront | No, use a list |
| You need random access like items[3] | No, use a list |
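
The three "No" cases can be seen directly with a small generator expression. This is a minimal sketch; the variable names are illustrative.

```python
squares = (x ** 2 for x in range(5))

# No length: generators do not support len()
try:
    len(squares)
    has_len = True
except TypeError:
    has_len = False

# No random access: generators do not support indexing
try:
    squares[3]
    indexable = True
except TypeError:
    indexable = False

# Single pass: once consumed, a generator stays empty
first_pass = list(squares)
second_pass = list(squares)

print(has_len, indexable)   # False False
print(first_pass)           # [0, 1, 4, 9, 16]
print(second_pass)          # []
```

If you need any of these, building a list once with list(...) is usually the simpler choice.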

Summary

| Concept | Example |
|---|---|
| Generator function | def gen(): yield value |
| Get next value | next(g), where g = gen() |
| Loop over generator | for x in gen(): |
| Generator expression | (x ** 2 for x in range(10)) |
| Infinite generator | while True: yield value |
| Memory efficient sum | sum(x ** 2 for x in range(n)) |
