Generators
Learn what generators are, how yield works, and why generators are useful in Python.
Before understanding generators, think about this problem.
You need to work with a million numbers. The normal approach — store them all in a list:
numbers = list(range(1_000_000))

This creates all one million numbers right now and stores them in memory. That is a lot of memory used at once, even if you only ever look at them one at a time.
A generator solves this. Instead of creating all the values at once and storing them, it creates values one at a time, only when you ask for the next one. The rest do not exist yet.
Think of it like a tap vs a bucket. A bucket holds all the water at once. A tap gives you water only when you turn it on — one flow at a time.
A normal function vs a generator
A normal function runs, returns a value, and is done:
def get_numbers():
    return [1, 2, 3, 4, 5]

numbers = get_numbers()
print(numbers)  # [1, 2, 3, 4, 5]

All five numbers are created and returned at once.
A generator function uses yield instead of return:
def get_numbers():
    yield 1
    yield 2
    yield 3
    yield 4
    yield 5

gen = get_numbers()
print(gen)  # <generator object get_numbers at 0x...>

Notice that calling get_numbers() did not run the function at all. It gave back a generator object. The function is paused at the very beginning, waiting.
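One way to see that the body has not started is to record a side effect. The sketch below is only an illustration: the `calls` list is a hypothetical helper added here to observe when the function body actually begins to run.

```python
calls = []  # hypothetical log, used only to observe when the body runs


def get_numbers():
    calls.append("body started")  # executes only once a value is requested
    yield 1
    yield 2


gen = get_numbers()
started_after_call = len(calls)   # creating the generator runs nothing

first = next(gen)                 # the body now runs up to the first yield
started_after_next = len(calls)

print(started_after_call, first, started_after_next)  # 0 1 1
```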
yield — the heart of generators
yield is what makes a generator. When Python hits a yield:
- It gives back that value to the caller
- It pauses the function right there — remembering exactly where it stopped
- Next time you ask for a value, it resumes from where it paused
def get_numbers():
    print("About to yield 1")
    yield 1
    print("About to yield 2")
    yield 2
    print("About to yield 3")
    yield 3

gen = get_numbers()
print(next(gen))  # About to yield 1 → 1
print(next(gen))  # About to yield 2 → 2
print(next(gen))  # About to yield 3 → 3

Output:

About to yield 1
1
About to yield 2
2
About to yield 3
3

next() tells the generator: give me the next value. Each call resumes the function from where it last paused.
When there are no more values, calling next() raises a StopIteration exception:
print(next(gen))  # raises StopIteration

In practice you rarely call next() manually. You use a for loop, which handles StopIteration automatically and stops cleanly:
def get_numbers():
    yield 1
    yield 2
    yield 3

for number in get_numbers():
    print(number)

Output:

1
2
3

yield vs return
| | return | yield |
|---|---|---|
| Gives back | One value, then done | One value, then pauses |
| Function state | Lost after return | Remembered until next call |
| Can be called again | Starts fresh each time | Resumes from where it paused |
| Returns | A value | A generator object |
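The "resumes" and "starts fresh" rows can be checked with a short experiment. This is a minimal sketch: it shows that one generator object can only be consumed once, that calling the generator function again gives a new, independent generator, and that `next()` accepts a default to return instead of raising StopIteration.

```python
def get_numbers():
    yield 1
    yield 2
    yield 3


gen = get_numbers()
first_pass = list(gen)             # consumes every value
second_pass = list(gen)            # same object, already exhausted
fresh_pass = list(get_numbers())   # a new call creates a new generator
after_end = next(gen, None)        # default returned instead of StopIteration

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fresh_pass)   # [1, 2, 3]
print(after_end)    # None
```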
A real generator — counting up
def count_up(start, end):
    current = start
    while current <= end:
        yield current
        current += 1

for number in count_up(1, 5):
    print(number)

Output:

1
2
3
4
5

Each iteration of the for loop asks for the next value. The generator runs until the yield, pauses, then resumes from there next time.
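What the for loop does on each iteration can be written out by hand. The sketch below is roughly equivalent to the loop above: it calls `next()` repeatedly and stops when StopIteration is raised. The `collected` list is only there to make the result easy to inspect.

```python
def count_up(start, end):
    current = start
    while current <= end:
        yield current
        current += 1


iterator = iter(count_up(1, 5))  # for a generator, iter() returns the generator itself
collected = []

while True:
    try:
        number = next(iterator)  # resume the generator until its next yield
    except StopIteration:
        break                    # the generator finished, so leave the loop
    collected.append(number)

print(collected)  # [1, 2, 3, 4, 5]
```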
Why generators save memory
Here is the difference clearly:
# Normal list: creates all 1 million numbers RIGHT NOW
numbers = list(range(1_000_000))  # the list alone takes ~8 MB on 64-bit CPython, plus the int objects it points to

# Generator: creates numbers ONE AT A TIME as needed
def generate_numbers(n):
    for i in range(n):
        yield i

gen = generate_numbers(1_000_000)  # uses almost no memory

The generator object itself is tiny: it just remembers where it is and what to do next. The million numbers are never all in memory at the same time.
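The size difference can be measured with `sys.getsizeof`. Treat this as a rough sketch: `getsizeof` reports only the size of the container object itself (not the integers a list refers to), and the exact numbers depend on the Python version and platform, so no specific output is shown.

```python
import sys


def generate_numbers(n):
    for i in range(n):
        yield i


numbers = list(range(1_000_000))
gen = generate_numbers(1_000_000)

list_bytes = sys.getsizeof(numbers)  # the list object only, not the ints inside it
gen_bytes = sys.getsizeof(gen)       # the paused generator object

print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")
```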
A practical example is reading a huge file line by line:

def read_large_file(filepath):
    with open(filepath, "r") as f:
        for line in f:
            yield line.strip()

for line in read_large_file("big_log_file.txt"):
    print(line)

If the file has 10 million lines, this still holds only one line in memory at a time. Loading the whole file at once, for example with f.read() or f.readlines(), could exhaust memory on very large files.
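To try this without a real log file, the sketch below writes a small temporary file and counts matching lines. The file contents and the "ERROR" prefix are made up for the illustration; only `read_large_file` comes from the example above.

```python
import os
import tempfile


def read_large_file(filepath):
    with open(filepath, "r") as f:
        for line in f:
            yield line.strip()


# Hypothetical sample data standing in for a large log file
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")
    path = tmp.name

# Lines are read and tested one at a time; no list of lines is ever built
error_count = sum(1 for line in read_large_file(path) if line.startswith("ERROR"))

os.remove(path)  # clean up the temporary file
print(error_count)  # 2
```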
Generator with logic — only yield what you need
Generators can have conditions, loops, anything a normal function can have:
def even_numbers(limit):
    for n in range(limit):
        if n % 2 == 0:
            yield n

for num in even_numbers(20):
    print(num, end=" ")

Output:

0 2 4 6 8 10 12 14 16 18

Generator expressions — one line generators
Just like list comprehensions create lists in one line, generator expressions create generators in one line. The only difference — use () instead of []:
# list comprehension: creates the whole list immediately
squares_list = [x ** 2 for x in range(10)]

# generator expression: creates values one at a time
squares_gen = (x ** 2 for x in range(10))

print(squares_list)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(squares_gen)   # <generator object <genexpr> at 0x...>

for square in squares_gen:
    print(square, end=" ")
# 0 1 4 9 16 25 36 49 64 81

Generator expressions are great when you want to pass a sequence to a function without building the whole list first:
# sum a million squares without building a list
total = sum(x ** 2 for x in range(1_000_000))
print(total)

This is memory efficient because sum() processes each value one at a time.
Notice there are no double parentheses here — sum(x ** 2 for x in range(10)) not sum((x ** 2 for x in range(10))). When a generator expression is the only argument to a function, you can drop the extra parentheses.
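The same shorthand works with other built-ins that consume an iterable. A small sketch follows; note that once the call has a second argument, the generator expression needs its own parentheses again, otherwise Python reports a syntax error.

```python
values = range(10)

largest = max(x ** 2 for x in values)           # sole argument: no extra parentheses
has_big = any(x ** 2 > 50 for x in values)      # stops at the first True
joined = ", ".join(str(x) for x in range(3))    # join accepts any iterable of strings

# With a second argument (the start value), the parentheses are required
total_with_start = sum((x ** 2 for x in range(4)), 100)

print(largest)           # 81
print(has_big)           # True
print(joined)            # 0, 1, 2
print(total_with_start)  # 114
```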
Chaining generators
Generators can feed into each other. Each one processes the output of the previous one — like a pipeline. None of them build a full list at any point:
def get_orders(orders):
    for order in orders:
        yield order

def filter_delivered(orders):
    for order in orders:
        if order["status"] == "delivered":
            yield order

def get_totals(orders):
    for order in orders:
        yield order["total"]

orders = [
    {"id": 1, "status": "delivered", "total": 250},
    {"id": 2, "status": "pending", "total": 89},
    {"id": 3, "status": "delivered", "total": 540},
    {"id": 4, "status": "cancelled", "total": 120},
    {"id": 5, "status": "delivered", "total": 75},
]

pipeline = get_totals(filter_delivered(get_orders(orders)))

for total in pipeline:
    print(total)

Output:

250
540
75

Each generator does one job. They are chained together and data flows through one item at a time.
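For short stages like these, the same pipeline could also be sketched with generator expressions instead of named functions. This is an alternative style rather than a replacement for the version above; the variable names `delivered`, `totals`, and `grand_total` are chosen here for illustration.

```python
orders = [
    {"id": 1, "status": "delivered", "total": 250},
    {"id": 2, "status": "pending", "total": 89},
    {"id": 3, "status": "delivered", "total": 540},
    {"id": 4, "status": "cancelled", "total": 120},
    {"id": 5, "status": "delivered", "total": 75},
]

# Each stage is lazy: nothing is computed until sum() starts pulling values
delivered = (order for order in orders if order["status"] == "delivered")
totals = (order["total"] for order in delivered)

grand_total = sum(totals)
print(grand_total)  # 865
```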
A real example — generating invoice numbers
def invoice_number_generator(prefix, start=1):
    number = start
    while True:
        yield f"{prefix}-{number:04d}"
        number += 1

invoices = invoice_number_generator("INV")
print(next(invoices))  # INV-0001
print(next(invoices))  # INV-0002
print(next(invoices))  # INV-0003

This generator can keep producing values indefinitely: while True with a yield is a common pattern for infinite sequences. Because it only produces the next value when you ask for it, the loop never runs away on its own.
An infinite generator is fine as long as you control when to stop — with a break in a loop, or by only calling next() a specific number of times. Never iterate over an infinite generator with a plain for loop without a break — it will run forever.
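Two ways of stopping safely are sketched below: `itertools.islice` from the standard library takes a fixed number of values, and a for loop with an explicit break stops on a condition. The starting number 98 and the limit of three items are arbitrary choices for the illustration.

```python
from itertools import islice


def invoice_number_generator(prefix, start=1):
    number = start
    while True:
        yield f"{prefix}-{number:04d}"
        number += 1


# Option 1: take a fixed number of values from the infinite generator
first_three = list(islice(invoice_number_generator("INV"), 3))
print(first_three)  # ['INV-0001', 'INV-0002', 'INV-0003']

# Option 2: a for loop with an explicit break
seen = []
for invoice in invoice_number_generator("INV", start=98):
    if len(seen) == 3:
        break  # without this, the loop would never end
    seen.append(invoice)
print(seen)  # ['INV-0098', 'INV-0099', 'INV-0100']
```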
When to use generators
| Situation | Use generator? |
|---|---|
| Working with a large dataset that does not fit in memory | Yes |
| Reading a large file line by line | Yes |
| Generating an infinite sequence | Yes |
| Building a data processing pipeline | Yes |
| Small list you need to access multiple times | No — use a list |
| You need to know the length upfront | No — use a list |
| You need random access like items[3] | No — use a list |
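The last two rows can be seen directly: a generator supports neither `len()` nor indexing, and both raise TypeError. The sketch below records which operations failed and then converts the generator to a list, which does support them. The `errors` list is just a helper for the demonstration.

```python
squares = (x ** 2 for x in range(10))
errors = []  # helper list recording which operations failed

try:
    len(squares)       # generators have no length
except TypeError:
    errors.append("len")

try:
    squares[3]         # generators cannot be indexed
except TypeError:
    errors.append("index")

items = list(squares)  # materialize the values when a list is really needed
print(errors)          # ['len', 'index']
print(len(items))      # 10
print(items[3])        # 9
```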
Summary
| Concept | Example |
|---|---|
| Generator function | def gen(): yield value |
| Get next value | next(gen) |
| Loop over generator | for x in gen(): |
| Generator expression | (x ** 2 for x in range(10)) |
| Infinite generator | while True: yield value |
| Memory efficient sum | sum(x ** 2 for x in range(n)) |