gh-103193: cache calls to inspect._shadowed_dict in inspect.getattr_static by AlexWaygood · Pull Request #104267 · python/cpython (original) (raw)

EDIT: These performance numbers are out of date with the current PR; read further down the thread for the up-to-date numbers.

This dramatically speeds up calls to inspect.getattr_static.

Here are benchmark results on main, using @sobolevn's benchmark script from #103193 (comment):

type[Foo]                :   88 ±  0 ns
Foo                      :  158 ±  0 ns
type[Bar]                :   89 ±  0 ns
Bar                      :  158 ±  0 ns
WithParentClassX         :  224 ±  0 ns
Baz                      :  205 ±  0 ns
WithParentX              :  271 ±  1 ns
type[Missing]            :  252 ±  0 ns
Missing                  :  207 ±  0 ns
Slotted                  :  200 ±  1 ns
Method                   :  160 ±  1 ns
StMethod                 :  158 ±  0 ns
ClsMethod                :  158 ±  0 ns

And here are results with this PR:

type[Foo]                :   54 ±  0 ns
Foo                      :   85 ±  0 ns
type[Bar]                :   53 ±  0 ns
Bar                      :   85 ±  0 ns
WithParentClassX         :  102 ±  0 ns
Baz                      :   96 ±  0 ns
WithParentX              :  113 ±  0 ns
type[Missing]            :  110 ±  0 ns
Missing                  :   98 ±  0 ns
Slotted                  :  137 ±  0 ns
Method                   :   85 ±  0 ns
StMethod                 :   85 ±  0 ns
ClsMethod                :   85 ±  0 ns

With this PR, inspect.getattr_static is fast enough that even isinstance() calls like this, with "pathological" runtime-checkable protocols, are faster than they were on Python 3.11:

Pathological protocol

from typing import Protocol, runtime_checkable

@runtime_checkable class Foo(Protocol): a: int b: int c: int d: int e: int f: int g: int h: int i: int j: int k: int l: int m: int n: int o: int p: int q: int r: int s: int t: int u: int v: int w: int x: int y: int z: int

class Bar: def init(self): for attrname in 'abcdefghijklmnopqrstuvwxyz': setattr(self, attrname, 42)

isinstance(Bar(), Foo)

This approach makes me a little nervous. There are two plausible reasons I thought of why adding a cache here might not be a good idea:

  1. It could cause references to the klass argument to be held by the cache even after they've been deleted elsewhere, which could be unexpected behaviour for a low-level function like getattr_static
  2. Perhaps it's possible that whether or not a __dict__ attribute is shadowed could change at some point during the lifetime of a class.

However, I think we're okay on both points.

For objection (1): _shadowed_dict is only ever called on type objects. The vast majority of classes are defined once in the global namespace and never deleted, so it shouldn't be an issue that the cache is holding strong references to the type objects. (If we do think this is an issue, I also experimented with a version of this PR that uses a weakref.WeakKeyDictionary as a cache. It also sped things up, but not by nearly as much.)

For objection (2): It doesn't seem to be possible, once a class has been created, to change the class's __dict__ attribute, at least from Python code. So for any given class klass, _shadowed_dict(klass) should always return the same result.

@carljm, what do you think?