gh-103193: Micro-optimise helper functions for inspect.getattr_static
by AlexWaygood · Pull Request #103195 · python/cpython (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation14 Commits2 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Micro-optimising inspect._static_getmro
and inspect._shadowed_dict
leads to a significant improvement in the time taken to call isinstance()
on a runtime-checkable protocol. (This is a useful benchmark, as it's a real-world use of inspect.getattr_static
in a tight loop, that's found in the stdlib.)
Benchmark results on a0305c5:
Time taken for objects with a property: 3.34
Time taken for objects with a classvar: 3.08
Time taken for objects with an instance var: 12.15
Time taken for objects with no var: 15.38
Time taken for nominal subclass instances: 25.13
Time taken for registered subclass instances: 21.65
Benchmark results with this PR:
Time taken for objects with a property: 2.82
Time taken for objects with a classvar: 2.85
Time taken for objects with an instance var: 8.62
Time taken for objects with no var: 14.01
Time taken for nominal subclass instances: 21.80
Time taken for registered subclass instances: 20.23
Benchmark script:
import time from typing import Protocol, runtime_checkable
@runtime_checkable class HasX(Protocol): x: int
class Foo: @property def x(self) -> int: return 42
class Bar: x = 42
class Baz: def init(self): self.x = 42
class Egg: ...
class Nominal(HasX): def init(self): self.x = 42
class Registered: ...
HasX.register(Registered)
num_instances = 500_000 foos = [Foo() for _ in range(num_instances)] bars = [Bar() for _ in range(num_instances)] bazzes = [Baz() for _ in range(num_instances)] basket = [Egg() for _ in range(num_instances)] nominals = [Nominal() for _ in range(num_instances)] registereds = [Registered() for _ in range(num_instances)]
def bench(objs, title): start_time = time.perf_counter() for obj in objs: isinstance(obj, HasX) elapsed = time.perf_counter() - start_time print(f"{title}: {elapsed:.2f}")
bench(foos, "Time taken for objects with a property") bench(bars, "Time taken for objects with a classvar") bench(bazzes, "Time taken for objects with an instance var") bench(basket, "Time taken for objects with no var") bench(nominals, "Time taken for nominal subclass instances") bench(registereds, "Time taken for registered subclass instances")
Hello, Alex!
Just out of curiosity, why do you use a x.__dict__
instead of vars(x)
, it's related to a performance issues?
Just out of curiosity, why do you use a
x.__dict__
instead ofvars(x)
, it's related to a performance issues?
There's not really any difference between the two; it's just a matter of taste. Some people consider vars(x)
to be "more Pythonic" than accessing the __dict__
variable directly, but I don't really have a strong preference either way.
In this case, the main reason I'm accessing it via .__dict__
is just because that's what the existing code already does, and I don't see any reason to change the style here. It's also arguably more explicit in this case -- but again, that's just a subjective opinion about code readability, really.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several things I can think of to speed up getattr_static
.
- Change
_check_instance
to be
def _check_instance(obj, attr): try: instance_dict = object.getattribute(obj, "dict") except AttributeError: return _sentinel return dict.get(instance_dict, attr, _sentinel)
- Here
type
is called twice:
if (_check_class(type(klass_result), 'get') is not _sentinel and _check_class(type(klass_result), 'set') is not _sentinel):
Ideally we can also call _check_class
once with several attributes 🤔
Because in this case it will allow us to decrease the amount of _static_mro
calls.
I think that a variable might be a bit faster
There are several things I can think of to speed up
getattr_static
.
Those sound great! My instinct is to tackle them in a separate PR, so that each change can be evaluated and benchmarked independently. Would you like to submit a PR?
Yes, I will do it tomorrow 👍
Would it be ok to have _dunder_dict_descriptor_get = type.__dict__["__dict__"].__get__
instead of _dunder_dict_descriptor = type.__dict__["__dict__"]
in order to minimize the number of accesses to __get__
when iterating over the classes ?
Would it be ok to have
_dunder_dict_descriptor_get = type.__dict__["__dict__"].__get__
instead of_dunder_dict_descriptor = type.__dict__["__dict__"]
in order to minimize the number of accesses to__get__
when iterating over the classes ?
I considered it, but felt like it would make it significantly less readable (_dunder_dict_descriptor
already isn't a great name, and _dunder_dict_descriptor_get
feels even worse). Happy to reconsider if others agree that we should go that way, though.
_dunder_dict_descriptor
already isn't a great name, and_dunder_dict_descriptor_get
feels even worse
IMO, since we are already micro-optimizing an internal helper of an internal function, we may not necessarily need to be exact so perhaps we can name it _dunder_dict_of
?
IMO, since we are already micro-optimizing an internal helper of an internal function, we may not necessarily need to be exact so perhaps we can name it
_dunder_dict_of
?
That's a decent name. Or maybe _get_dunder_dict_of_class
.
I'll make the change — thanks!
I pushed the change and updated the benchmark results in the PR description (it's maybe a teeny tiny bit faster than it was, but not by much)
it's maybe a teeny tiny bit faster than it was, but not by much
I think the impact will be clearer when we have a complicated inheritance diagram (e.g., large projects with abstract interfaces and/or mixins here and there).
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice!
AlexWaygood deleted the microoptimise-getattr_static-helpers branch
There are several things I can think of to speed up
getattr_static
.
- Change
_check_instance
to bedef _check_instance(obj, attr): try: instance_dict = object.getattribute(obj, "dict") except AttributeError: return _sentinel return dict.get(instance_dict, attr, _sentinel)
- Here
type
is called twice:if (_check_class(type(klass_result), 'get') is not _sentinel and _check_class(type(klass_result), 'set') is not _sentinel):
Ideally we can also call
_check_class
once with several attributes 🤔 Because in this case it will allow us to decrease the amount of_static_mro
calls.I think that a variable might be a bit faster
I tried these out locally, but unfortunately I can't measure any speedup from them :/
It's possible I'm not using the right benchmark to show a speedup, however -- happy to be proven wrong if you can get a speedup somewhere!
This was referenced
Apr 6, 2023
gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this pull request
warsaw pushed a commit to warsaw/cpython that referenced this pull request