Python’s structural subtyping with Protocols is a game-changer for those of us who love the language’s flexibility but want more robustness in our code. It’s like having your cake and eating it too – you get the best of both dynamic and static typing worlds.
I’ve been using Python for years, and one of the things that always bothered me was the lack of a middle ground between the wild west of duck typing and the rigidity of strict type hierarchies. Protocols fill that gap beautifully.
Let’s start with the basics. Protocols allow us to define interfaces implicitly, focusing on what an object can do rather than what it is. This aligns perfectly with Python’s “duck typing” philosophy – if it walks like a duck and quacks like a duck, it’s a duck. But now, we can add a layer of safety on top of that.
Here’s a simple example to illustrate:
```python
from typing import Protocol

class Quacker(Protocol):
    def quack(self) -> str:
        ...

def make_noise(animal: Quacker) -> None:
    print(animal.quack())

class Duck:
    def quack(self) -> str:
        return "Quack!"

class Person:
    def quack(self) -> str:
        return "I'm imitating a duck!"

make_noise(Duck())    # Works fine
make_noise(Person())  # Also works fine
```
In this example, we define a `Quacker` protocol that requires any object to have a `quack` method returning a string. Both the `Duck` and `Person` classes satisfy this protocol, so they can be used interchangeably where a `Quacker` is expected.
What’s cool about this is that we didn’t need to explicitly inherit from `Quacker` or register our classes anywhere. The type checker infers compatibility based on the structure of our objects. This is what we mean by structural subtyping.
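The check cuts both ways. Here’s a quick sketch (reusing `Quacker` and `make_noise` from above, with a hypothetical `Cat` class) of what happens when a class doesn’t conform; a static checker such as mypy rejects the call, though the exact error wording varies by checker:

```python
class Cat:
    def meow(self) -> str:
        return "Meow!"

# mypy flags this line: Cat is missing the quack() method the
# Quacker protocol requires, so it's an incompatible argument.
make_noise(Cat())
```

At runtime this would still fail with an `AttributeError`; the point is that the checker catches it before the code ever runs.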
But Protocols aren’t just about static type checking. They can also support runtime checks via `isinstance()`, which is incredibly useful when you’re working with data from external sources or when you need to perform type checks at runtime.
Here’s how you can use runtime checks:
```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Sized(Protocol):
    def __len__(self) -> int:
        ...

print(isinstance([], Sized))       # True
print(isinstance("hello", Sized))  # True
print(isinstance(42, Sized))       # False
```
The `@runtime_checkable` decorator allows us to use `isinstance()` with our Protocol. Note that these checks only verify that the required methods exist, not that their signatures match. Still, this gives us the flexibility to perform dynamic checks when needed, while benefiting from static type checking during development.
One thing I’ve found particularly useful is how Protocols compare to abstract base classes (ABCs). While ABCs require explicit registration or inheritance, Protocols are much more flexible. They allow for structural compatibility without forcing a rigid class hierarchy.
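To make the contrast concrete, here’s a minimal sketch (the `QuackerABC` and `Robot` classes are hypothetical) showing how an ABC demands inheritance where a Protocol only demands the right shape:

```python
from abc import ABC, abstractmethod

class QuackerABC(ABC):
    @abstractmethod
    def quack(self) -> str:
        ...

class Robot:
    def quack(self) -> str:
        return "Beep. Quack."

# Nominal typing: Robot has the right method, but without inheriting
# from QuackerABC (or being registered), it is not a subtype.
print(issubclass(Robot, QuackerABC))  # False

# The structural Quacker protocol from earlier accepts Robot purely
# because it defines a matching quack() method.
make_noise(Robot())  # type-checks and runs fine
```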
Let’s look at a more complex example to see how Protocols can help us write more flexible and maintainable code:
```python
from typing import Protocol, List

class DataSource(Protocol):
    def fetch_data(self) -> List[dict]:
        ...

class DatabaseSource:
    def fetch_data(self) -> List[dict]:
        # Simulate fetching from a database
        return [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]

class APISource:
    def fetch_data(self) -> List[dict]:
        # Simulate fetching from an API
        return [{"id": 3, "name": "Charlie"}, {"id": 4, "name": "David"}]

def process_data(source: DataSource) -> None:
    data = source.fetch_data()
    for item in data:
        print(f"Processing: {item['name']}")

process_data(DatabaseSource())
process_data(APISource())
```
In this example, we define a `DataSource` protocol that requires a `fetch_data` method. We then have two different implementations: `DatabaseSource` and `APISource`. Our `process_data` function can work with any object that satisfies the `DataSource` protocol, making our code more modular and easier to extend.
This approach has saved me countless hours when working on large-scale applications. It allows for easy mocking in tests, swapping out implementations, and adding new data sources without changing existing code.
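As a sketch of that last point, here’s a hypothetical `InMemorySource` that plugs into the `process_data` function above without touching any existing code:

```python
from typing import List

class InMemorySource:
    """Hypothetical source backed by a plain list -- handy in tests."""

    def __init__(self, rows: List[dict]):
        self.rows = rows

    def fetch_data(self) -> List[dict]:
        return self.rows

# Satisfies DataSource structurally, so process_data accepts it as-is.
process_data(InMemorySource([{"id": 99, "name": "Test User"}]))
```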
But Protocols aren’t just for simple method definitions. They can also include attributes, making them even more powerful for defining complex interfaces:
```python
from typing import Protocol

class ConfigurableWorker(Protocol):
    max_retries: int
    timeout: float

    def process(self, data: str) -> bool:
        ...

class MyWorker:
    def __init__(self, max_retries: int, timeout: float):
        self.max_retries = max_retries
        self.timeout = timeout

    def process(self, data: str) -> bool:
        # Implementation here
        return True

def run_worker(worker: ConfigurableWorker, input_data: str) -> None:
    success = worker.process(input_data)
    if success:
        print(f"Processed with {worker.max_retries} retries and {worker.timeout}s timeout")

my_worker = MyWorker(max_retries=3, timeout=5.0)
run_worker(my_worker, "some data")
```
In this example, our `ConfigurableWorker` protocol includes both methods and attributes. This allows us to define more comprehensive interfaces that capture not just behavior but also configuration.
One of the things I love about Protocols is how they encourage thinking in terms of interfaces rather than implementations. This leads to more decoupled, modular code that’s easier to maintain and extend.
However, it’s important to note that Protocols aren’t a silver bullet. They work best when you’re defining clear, focused interfaces. If you find yourself creating Protocols with dozens of methods, it might be a sign that you need to break things down into smaller, more manageable pieces.
Another thing to keep in mind is that while Protocols provide static type checking, they don’t enforce runtime behavior. An object might satisfy a Protocol’s interface but still behave incorrectly. It’s up to you to ensure that objects implementing a Protocol do so correctly.
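For example, here’s a hypothetical `SketchySource` that type-checks cleanly against the `DataSource` protocol from earlier but still breaks `process_data` at runtime:

```python
class SketchySource:
    def fetch_data(self) -> List[dict]:
        # Structurally valid, behaviorally wrong: the rows are missing
        # the "name" key that process_data relies on.
        return [{"id": 5}]

process_data(SketchySource())  # passes type checking, raises KeyError at runtime
```

The protocol guarantees the shape of the interface, not the semantics behind it; your tests still have to cover the behavior.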
Let’s look at an example of how Protocols can help us write more generic, reusable code:
```python
from typing import Any, List, Protocol

class Comparable(Protocol):
    def __lt__(self, other: Any) -> bool:
        ...

def bubble_sort(items: List[Comparable]) -> List[Comparable]:
    n = len(items)
    for i in range(n):
        for j in range(0, n - i - 1):
            if items[j + 1] < items[j]:  # only __lt__ is required
                items[j], items[j + 1] = items[j + 1], items[j]
    return items

# Works with any type that implements __lt__
print(bubble_sort([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]))
print(bubble_sort(['banana', 'apple', 'cherry', 'date']))

class Person:
    def __init__(self, name: str, age: int):
        self.name = name
        self.age = age

    def __lt__(self, other: 'Person') -> bool:
        return self.age < other.age

    def __repr__(self) -> str:
        return f"Person('{self.name}', {self.age})"

people = [
    Person('Alice', 30),
    Person('Bob', 25),
    Person('Charlie', 35),
]
print(bubble_sort(people))
```
In this example, we define a `Comparable` protocol that requires only the `__lt__` method. Our `bubble_sort` function can then work with any list of items that satisfy this protocol, including built-in types like integers and strings, as well as our custom `Person` class.
This level of flexibility and reusability is hard to achieve with traditional inheritance-based approaches. Protocols allow us to write generic algorithms that work with a wide variety of types, as long as they support the necessary operations.
One area where I’ve found Protocols particularly useful is in testing. They make it much easier to create mock objects that satisfy specific interfaces:
```python
from typing import Protocol, List
from unittest.mock import Mock

class DataFetcher(Protocol):
    def fetch(self) -> List[dict]:
        ...

def process_data(fetcher: DataFetcher) -> List[str]:
    data = fetcher.fetch()
    return [item['name'].upper() for item in data if 'name' in item]

# In your test
mock_fetcher = Mock(spec=DataFetcher)
mock_fetcher.fetch.return_value = [
    {'name': 'Alice'},
    {'name': 'Bob'},
    {'id': 3},  # This one will be skipped
]

result = process_data(mock_fetcher)
assert result == ['ALICE', 'BOB']
mock_fetcher.fetch.assert_called_once()
```
By defining a `DataFetcher` protocol, we can easily create a mock object that satisfies the interface. This allows us to test our `process_data` function in isolation, without needing to set up a real data source.
As I’ve worked more with Protocols, I’ve discovered some best practices that have helped me use them effectively:
- Keep Protocols focused: Define Protocols that represent a single concept or capability. This makes them more reusable and easier to understand.
- Use Protocols for public interfaces: They’re great for defining the public API of your modules or packages. This allows users of your code to easily create compatible objects.
- Combine Protocols: You can use Protocol inheritance to create more complex interfaces from simpler ones. This promotes code reuse and helps keep your interfaces modular.
- Document your Protocols: While the interface is defined in code, it’s still important to document the expected behavior of methods and attributes.
- Use runtime checks sparingly: While `isinstance()` checks with Protocols are possible, they should be used judiciously. Overuse can lead to code that’s overly dependent on runtime type checking.
Here’s an example that puts some of these practices into action:
```python
from typing import Protocol, runtime_checkable

class Readable(Protocol):
    def read(self) -> str:
        ...

class Writable(Protocol):
    def write(self, data: str) -> None:
        ...

@runtime_checkable
class ReadWritable(Readable, Writable, Protocol):
    """An object that can be both read from and written to."""
    pass

class FileWrapper:
    def __init__(self, filename: str):
        self.filename = filename

    def read(self) -> str:
        with open(self.filename, 'r') as f:
            return f.read()

    def write(self, data: str) -> None:
        with open(self.filename, 'w') as f:
            f.write(data)

def process(rw: ReadWritable) -> None:
    data = rw.read()
    processed = data.upper()
    rw.write(processed)

# This works because FileWrapper satisfies the ReadWritable protocol
file = FileWrapper('example.txt')
process(file)

# We can also do runtime checks if needed
print(isinstance(file, ReadWritable))  # True
```
In this example, we define simple `Readable` and `Writable` protocols, then combine them into a `ReadWritable` protocol. We use the `@runtime_checkable` decorator to allow for `isinstance()` checks if needed. Our `FileWrapper` class implicitly satisfies the `ReadWritable` protocol, allowing it to be used with the `process` function.
As Python continues to evolve, features like Protocols are making it easier than ever to write type-safe, flexible, and maintainable code. They bridge the gap between dynamic and static typing, giving us the best of both worlds.
In my experience, adopting Protocols has led to more robust and easier-to-understand codebases. They encourage thinking in terms of interfaces and capabilities, which naturally leads to more modular and reusable code. While they do require a bit of upfront thought in designing your interfaces, the payoff in terms of code flexibility and maintainability is well worth it.
Whether you’re building large-scale applications, libraries for others to use, or just want to write cleaner, more idiomatic Python, I highly recommend giving Protocols a try. They’re a powerful tool that aligns perfectly with Python’s philosophy of simplicity and readability, while providing the safety and clarity of static typing. Happy coding!