gh-74690: typing: Call _get_protocol_attrs and _callable_members_only at protocol class creation time, not during isinstance() checks by AlexWaygood · Pull Request #103160 · python/cpython

This PR proposes caching the results of _get_protocol_attrs() and _callable_members_only(), so that they only need to be computed once for each protocol class. This hugely speeds up calling isinstance() against runtime-checkable protocols on the "second call", for all kinds of subtypes of a runtime-checkable protocol. There is, however, a small behaviour change:

    >>> from typing import *
    >>> @runtime_checkable
    ... class Bar(Protocol):
    ...     x: int
    ...
    >>> class Foo:
    ...     def __init__(self):
    ...         self.x = 42
    ...
    >>> isinstance(Foo(), Bar)
    True
    >>> Bar.__annotations__["y"] = int
    >>> isinstance(Foo(), Bar)  # Evaluates to False on main; True with this PR

It seems pretty unlikely that anybody would be doing that, though (monkey-patching methods or the __annotations__ dict on a protocol class itself). Do we care about the behaviour change? Is it worth documenting the behaviour change, if we do decide it's okay?
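To make the trade-off concrete, here is a toy model of the idea. It is not the code in this PR and not how typing.py is implemented: CachingProtocolMeta, collect_annotated_names and _cached_members are hypothetical names, and the check only looks at annotated attributes. It shows why a snapshot taken at class creation time does not see later mutations of the annotations dict.

```python
def collect_annotated_names(cls):
    """Gather annotated attribute names across the MRO (toy stand-in for
    the real member collection; hypothetical helper)."""
    names = set()
    for base in cls.__mro__:
        names.update(getattr(base, "__annotations__", {}))
    return frozenset(names)


class CachingProtocolMeta(type):
    """Toy metaclass: snapshot the member names once, at class creation."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Computed once per class instead of on every isinstance() call.
        cls._cached_members = collect_annotated_names(cls)

    def __instancecheck__(cls, instance):
        # Structural check against the snapshot, not the live annotations.
        return all(hasattr(instance, name) for name in cls._cached_members)


class HasX(metaclass=CachingProtocolMeta):
    x: int


class Foo:
    def __init__(self):
        self.x = 42


foo = Foo()
before = isinstance(foo, HasX)

# Monkey-patch the annotations after the class has been created.
HasX.__annotations__["y"] = int

# The cached check still uses the snapshot taken at class creation.
after_cached = isinstance(foo, HasX)

# Recomputing the members on every call would notice the new name.
after_fresh = all(hasattr(foo, name) for name in collect_annotated_names(HasX))
```

In this sketch, after_cached corresponds to the behaviour with caching and after_fresh to recomputing the members on each check.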

Here are the benchmark results on my machine with this PR (times in seconds):

Time taken for objects with a property: 1.76
Time taken for objects with a classvar: 1.69
Time taken for objects with an instance var: 2.38
Time taken for objects with no var: 7.28
Time taken for nominal subclass instances: 19.92
Time taken for registered subclass instances: 11.60

And here's the same benchmark on main:

Time taken for objects with a property: 3.14
Time taken for objects with a classvar: 3.14
Time taken for objects with an instance var: 11.57
Time taken for objects with no var: 15.26
Time taken for nominal subclass instances: 24.60
Time taken for registered subclass instances: 21.32

(The benchmark is pretty skewed towards showing a good result for caching, since it just calls isinstance() 500,000 times against the same runtime-checkable protocol.)

Benchmark script

    import time
    from typing import Protocol, runtime_checkable


    @runtime_checkable
    class HasX(Protocol):
        x: int


    # x provided by a property
    class Foo:
        @property
        def x(self) -> int:
            return 42


    # x provided by a class variable
    class Bar:
        x = 42


    # x provided by an instance variable
    class Baz:
        def __init__(self):
            self.x = 42


    # no x at all
    class Egg: ...


    # explicitly subclasses the protocol
    class Nominal(HasX):
        def __init__(self):
            self.x = 42


    # registered as a virtual subclass
    class Registered: ...


    HasX.register(Registered)


    num_instances = 500_000
    foos = [Foo() for _ in range(num_instances)]
    bars = [Bar() for _ in range(num_instances)]
    bazzes = [Baz() for _ in range(num_instances)]
    basket = [Egg() for _ in range(num_instances)]
    nominals = [Nominal() for _ in range(num_instances)]
    registereds = [Registered() for _ in range(num_instances)]


    def bench(objs, title):
        start_time = time.perf_counter()
        for obj in objs:
            isinstance(obj, HasX)
        elapsed = time.perf_counter() - start_time
        print(f"{title}: {elapsed:.2f}")


    bench(foos, "Time taken for objects with a property")
    bench(bars, "Time taken for objects with a classvar")
    bench(bazzes, "Time taken for objects with an instance var")
    bench(basket, "Time taken for objects with no var")
    bench(nominals, "Time taken for nominal subclass instances")
    bench(registereds, "Time taken for registered subclass instances")
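Since the benchmark above only measures repeated isinstance() calls against one protocol, a complementary measurement could look at the other side of the trade-off: the cost of creating a protocol class, and of a single check against a freshly created one. The following is a rough sketch along those lines, not something that was run for this PR; make_protocol, time_class_creation and time_create_and_check are hypothetical names, and no results are claimed here.

```python
import timeit
from typing import Protocol, runtime_checkable


def make_protocol():
    """Create a brand-new runtime-checkable protocol class on each call."""
    @runtime_checkable
    class HasX(Protocol):
        x: int
    return HasX


class Foo:
    x = 42


def time_class_creation(number=1_000):
    # Total seconds spent creating `number` protocol classes.
    return timeit.timeit(make_protocol, number=number)


def time_create_and_check(number=1_000):
    # Total seconds for class creation plus one isinstance() call each,
    # so no check ever benefits from an earlier call on the same class.
    foo = Foo()
    return timeit.timeit(lambda: isinstance(foo, make_protocol()), number=number)


creation_seconds = time_class_creation()
create_and_check_seconds = time_create_and_check()
print(f"Time taken to create protocol classes: {creation_seconds:.4f}")
print(f"Time taken to create and check once: {create_and_check_seconds:.4f}")
```

Running this on main and on the PR branch would indicate how much work moves from the isinstance() call to class creation.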