Python’s generators are like hidden superpowers waiting to be unleashed. Most of us know the basics, but there’s so much more beneath the surface. Let’s dive into some advanced generator patterns that’ll make your code sing.
First up, let’s talk about coroutines. These bad boys are generators on steroids. They can both consume and produce data, making them perfect for building pipelines or state machines. Here’s a simple example:
```python
def coroutine():
    while True:
        x = yield
        print(f"Received: {x}")

c = coroutine()
next(c)  # Prime the coroutine
c.send("Hello")
c.send("World")
```
This coroutine receives values and prints them. The `next(c)` call is crucial - it starts the coroutine up to the first yield point. After that, we can send values in with the `send()` method.
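One wrinkle: forgetting to prime a coroutine with `next()` is a classic bug. A small decorator can do the priming for you. Here's a sketch - the `coroutine` decorator and `accumulator` names are my own, not a standard API:

```python
import functools

def coroutine(func):
    """Decorator: advance a generator to its first yield so it's ready for .send()."""
    @functools.wraps(func)
    def primer(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # run up to the first yield
        return gen
    return primer

@coroutine
def accumulator():
    """Receives numbers, keeps a running total, yields the total back."""
    total = 0
    while True:
        value = yield total
        total += value

acc = accumulator()          # already primed, no next() needed
print(acc.send(10))  # 10
print(acc.send(5))   # 15
```

Because the coroutine yields its running total, `send()` both delivers a value and returns a result - handy for interactive pipelines.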
But wait, there’s more! We can use generators to create lazy sequences. This is great when you’re dealing with large datasets and don’t want to load everything into memory at once. Check this out:
```python
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
for i in range(10):
    print(next(fib))
```
This generates Fibonacci numbers on-the-fly. No need to pre-calculate or store a bunch of values. It’s efficient and memory-friendly.
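If you'd rather not write the `next()` loop by hand, `itertools.islice` takes a finite slice of any iterator, including an infinite one:

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# First ten Fibonacci numbers, no manual next() calls.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```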
Now, let’s talk about generator expressions. These are like list comprehensions but lazier (in a good way). They’re perfect for when you need to process a sequence of items but don’t need to store them all at once. Here’s a quick example:
```python
squares = (x*x for x in range(1000000))
print(next(squares))  # Prints 0
print(next(squares))  # Prints 1
```
This creates a generator that yields squared numbers. It doesn’t calculate all million squares upfront - just when you ask for them.
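Generator expressions really shine when fed straight into aggregate functions, which consume them lazily:

```python
# A generator expression can feed an aggregate directly: no intermediate list.
total = sum(x * x for x in range(1000))
print(total)  # 332833500

# any() short-circuits: it stops pulling values as soon as one matches.
found = any(x * x > 50 for x in range(1000000))
print(found)  # True
```

In the `any()` case only the first eight values are ever computed, even though the range covers a million numbers.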
But here’s where it gets really cool. We can chain generators together to create complex data processing pipelines. Each generator in the chain consumes the output of the previous one. It’s like building with LEGO blocks, but for data processing:
```python
def integers():
    for i in range(1, 10000):
        yield i

def squares(seq):
    for i in seq:
        yield i * i

def take(n, seq):
    for _ in range(n):
        try:
            yield next(seq)
        except StopIteration:
            return  # source exhausted early; stop cleanly (PEP 479)

pipeline = take(5, squares(integers()))
print(list(pipeline))  # Prints [1, 4, 9, 16, 25]
```
This pipeline generates integers, squares them, and then takes the first five. It’s modular, efficient, and damn cool.
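For comparison, the same pipeline can be assembled from `itertools` building blocks, roughly like this:

```python
from itertools import count, islice

# count(1) yields 1, 2, 3, ... forever; map squares it lazily; islice takes five.
pipeline = islice(map(lambda i: i * i, count(1)), 5)
result = list(pipeline)
print(result)  # [1, 4, 9, 16, 25]
```

Same laziness, same output, zero hand-rolled generators.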
Now, let’s talk about some real-world applications. Generators are fantastic for processing large files. Instead of reading the whole file into memory, you can process it line by line:
```python
def process_large_file(filename):
    with open(filename, 'r') as f:
        for line in f:
            yield line.strip().upper()

for processed_line in process_large_file('huge_file.txt'):
    print(processed_line)
```
This reads and processes the file one line at a time, keeping memory usage low even for gigantic files.
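The pipeline idea combines nicely with file processing: each cleanup step becomes its own generator stage. A sketch (the stage names are my own, and an in-memory list stands in for file lines, since the stages work on any iterable of strings):

```python
def strip_lines(lines):
    for line in lines:
        yield line.strip()

def skip_comments(lines):
    for line in lines:
        if not line.startswith('#'):
            yield line

def non_empty(lines):
    for line in lines:
        if line:
            yield line

# Any iterable of strings works here: a file object or a plain list.
raw = ['# config\n', 'host = example.com\n', '\n', 'port = 8080\n']
cleaned = list(non_empty(skip_comments(strip_lines(raw))))
print(cleaned)  # ['host = example.com', 'port = 8080']
```

Swap `raw` for an open file object and the same pipeline streams through gigabytes one line at a time.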
Generators can also be used for infinite sequences. Need a constant stream of random numbers? No problem:
```python
import random

def infinite_random():
    while True:
        yield random.random()

rand_gen = infinite_random()
for _ in range(5):
    print(next(rand_gen))
```
This will keep spitting out random numbers until the heat death of the universe (or until you stop it, whichever comes first).
But wait, there’s more! Generators can be used to implement your own itertools-like functions. Here’s a simple implementation of `groupby`:
```python
def groupby(iterable, key=None):
    if key is None:
        key = lambda x: x
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        return  # empty input: nothing to group
    current_key = key(first)
    current_group = [first]  # group the item itself, not its key
    for item in iterator:
        item_key = key(item)
        if item_key == current_key:
            current_group.append(item)
        else:
            yield current_key, current_group
            current_key = item_key
            current_group = [item]
    yield current_key, current_group

data = [1, 1, 2, 3, 3, 3, 4, 5, 5]
for key, group in groupby(data):
    print(f"Key: {key}, Group: {group}")
```
This groups consecutive elements with the same key. It’s lazy and efficient, just like we like our code.
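The standard library’s `itertools.groupby` does the same job; the difference is that its groups come back as iterators rather than lists:

```python
from itertools import groupby

data = [1, 1, 2, 3, 3, 3, 4, 5, 5]
# itertools.groupby yields (key, group_iterator) pairs for consecutive runs.
grouped = [(k, list(g)) for k, g in groupby(data)]
print(grouped)  # [(1, [1, 1]), (2, [2]), (3, [3, 3, 3]), (4, [4]), (5, [5, 5])]
```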
Now, let’s talk about some advanced tricks. Did you know you can use generators to implement cooperative multitasking? It’s like having multiple functions running concurrently, but without the complexity of threads:
```python
def task1():
    for i in range(3):
        print(f"Task 1: Step {i}")
        yield

def task2():
    for i in range(3):
        print(f"Task 2: Step {i}")
        yield

def run_tasks():
    t1, t2 = task1(), task2()
    try:
        while True:
            next(t1)
            next(t2)
    except StopIteration:
        pass

run_tasks()
```
This runs two tasks “concurrently”, switching between them at each yield point. It’s a simple form of cooperative multitasking.
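One limitation of the version above: it stops as soon as either task finishes. A small round-robin scheduler can keep the surviving tasks going. A sketch (the `round_robin` and `worker` names are my own):

```python
def round_robin(*tasks):
    """Cycle through generator tasks, dropping each one as it finishes."""
    queue = list(tasks)
    order = []  # record of which task ran at each step, for illustration
    while queue:
        task = queue.pop(0)
        try:
            label = next(task)
            order.append(label)
            queue.append(task)  # still alive: back of the queue
        except StopIteration:
            pass  # finished: don't requeue
    return order

def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"

# Task A has 2 steps, task B has 4 -- B keeps running after A finishes.
schedule = round_robin(worker("A", 2), worker("B", 4))
print(schedule)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2', 'B:3']
```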
Generators can also be used to implement state machines. This is super useful for parsing or processing structured data:
```python
def parser():
    state = 'START'
    result = None
    while True:
        # Yield any status message and receive the next character in one step,
        # so no input character gets swallowed after a state transition.
        char = yield result
        result = None
        if state == 'START':
            if char == '{':
                state = 'IN_OBJECT'
            elif char == '[':
                state = 'IN_ARRAY'
        elif state == 'IN_OBJECT':
            if char == '}':
                state = 'START'
                result = 'Object ended'
        elif state == 'IN_ARRAY':
            if char == ']':
                state = 'START'
                result = 'Array ended'

p = parser()
next(p)  # Prime the parser
for char in '{"foo": [1, 2, 3]}':
    result = p.send(char)
    if result:
        print(result)
```
This implements a simple JSON-like parser as a state machine using a generator.
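The same send-based pattern can track richer state. Here's a sketch of a bracket-depth tracker (the `depth_tracker` name is my own):

```python
def depth_tracker():
    """Coroutine state machine: receives characters, yields current nesting depth."""
    depth = 0
    result = 0
    while True:
        char = yield result
        if char in '{[':
            depth += 1
        elif char in '}]':
            depth -= 1
        result = depth

tracker = depth_tracker()
next(tracker)  # prime
depths = [tracker.send(c) for c in '{"a": [1, 2]}']
print(depths[-1])   # 0 -- brackets balanced at the end
print(max(depths))  # 2 -- deepest nesting level
```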
Generators can also be used to implement lazy property evaluation. This is great when you have properties that are expensive to compute but not always needed:
```python
class LazyProperty:
    def __init__(self, function):
        self.function = function
        self.name = function.__name__

    def __get__(self, obj, type=None):
        if obj is None:
            return self
        value = self.function(obj)
        setattr(obj, self.name, value)  # cache on the instance, shadowing the descriptor
        return value

class Person:
    def __init__(self, name):
        self.name = name

    @LazyProperty
    def friends(self):
        print("Expensive operation!")
        return [f"Friend of {self.name}"]

p = Person("Alice")
print(p.name)     # Prints immediately
print(p.friends)  # Prints "Expensive operation!" then the list
print(p.friends)  # Just prints the list, no expensive operation
```
This `LazyProperty` descriptor computes the value only on first access, then caches it on the instance - lazy evaluation in the same spirit as generators.
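Since Python 3.8 the standard library ships this exact pattern as `functools.cached_property`, so in modern code you can skip the hand-rolled descriptor:

```python
from functools import cached_property

class Person:
    def __init__(self, name):
        self.name = name

    @cached_property
    def friends(self):
        print("Expensive operation!")  # runs only on first access
        return [f"Friend of {self.name}"]

p = Person("Alice")
print(p.friends)  # triggers the expensive computation
print(p.friends)  # served from the instance cache
```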
Finally, let’s talk about asynchronous programming. Generators are the foundation of Python’s async/await syntax. Here’s a simple example using the `asyncio` library:
```python
import asyncio

async def countdown(name, n):
    while n > 0:
        print(f"{name}: {n}")
        await asyncio.sleep(1)
        n -= 1

async def main():
    await asyncio.gather(
        countdown("A", 3),
        countdown("B", 5)
    )

asyncio.run(main())
```
This runs two countdowns concurrently. The `async` and `await` keywords grew out of generator-based coroutines and build on the same suspend-and-resume machinery, making asynchronous programming far more intuitive.
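The generator machinery behind `await` is visible in `yield from` (PEP 380), which delegates to a sub-generator and captures its return value - the same delegation trick `await` performs for coroutines:

```python
def inner():
    yield 1
    yield 2
    return "inner done"  # becomes the value of the yield from expression

def outer():
    result = yield from inner()  # delegate: inner's yields pass straight through
    yield result

values = list(outer())
print(values)  # [1, 2, 'inner done']
```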
In conclusion, generators in Python are incredibly powerful and versatile. They can help you write more efficient, cleaner, and more expressive code. From lazy evaluation to cooperative multitasking, from data processing pipelines to asynchronous programming, generators have got you covered. So next time you’re coding, ask yourself: “Could a generator make this better?” Chances are, the answer is yes.