gh-103193: Micro-optimise helper functions for inspect.getattr_static by AlexWaygood · Pull Request #103195 · python/cpython (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation14 Commits2 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

AlexWaygood

Micro-optimising inspect._static_getmro and inspect._shadowed_dict leads to a significant improvement in the time taken to call isinstance() on a runtime-checkable protocol. (This is a useful benchmark, as it's a real-world use of inspect.getattr_static in a tight loop, that's found in the stdlib.)

Benchmark results on a0305c5:

Time taken for objects with a property: 3.34
Time taken for objects with a classvar: 3.08
Time taken for objects with an instance var: 12.15
Time taken for objects with no var: 15.38
Time taken for nominal subclass instances: 25.13
Time taken for registered subclass instances: 21.65

Benchmark results with this PR:

Time taken for objects with a property: 2.82
Time taken for objects with a classvar: 2.85
Time taken for objects with an instance var: 8.62
Time taken for objects with no var: 14.01
Time taken for nominal subclass instances: 21.80
Time taken for registered subclass instances: 20.23

Benchmark script:

import time from typing import Protocol, runtime_checkable

@runtime_checkable class HasX(Protocol): x: int

class Foo: @property def x(self) -> int: return 42

class Bar: x = 42

class Baz: def init(self): self.x = 42

class Egg: ...

class Nominal(HasX): def init(self): self.x = 42

class Registered: ...

HasX.register(Registered)

num_instances = 500_000 foos = [Foo() for _ in range(num_instances)] bars = [Bar() for _ in range(num_instances)] bazzes = [Baz() for _ in range(num_instances)] basket = [Egg() for _ in range(num_instances)] nominals = [Nominal() for _ in range(num_instances)] registereds = [Registered() for _ in range(num_instances)]

def bench(objs, title): start_time = time.perf_counter() for obj in objs: isinstance(obj, HasX) elapsed = time.perf_counter() - start_time print(f"{title}: {elapsed:.2f}")

bench(foos, "Time taken for objects with a property") bench(bars, "Time taken for objects with a classvar") bench(bazzes, "Time taken for objects with an instance var") bench(basket, "Time taken for objects with no var") bench(nominals, "Time taken for nominal subclass instances") bench(registereds, "Time taken for registered subclass instances")

@AlexWaygood

@AlexWaygood

@Eclips4

Hello, Alex!
Just out of curiosity, why do you use a x.__dict__ instead of vars(x), it's related to a performance issues?

@AlexWaygood

Just out of curiosity, why do you use a x.__dict__ instead of vars(x), it's related to a performance issues?

There's not really any difference between the two; it's just a matter of taste. Some people consider vars(x) to be "more Pythonic" than accessing the __dict__ variable directly, but I don't really have a strong preference either way.

In this case, the main reason I'm accessing it via .__dict__ is just because that's what the existing code already does, and I don't see any reason to change the style here. It's also arguably more explicit in this case -- but again, that's just a subjective opinion about code readability, really.

sobolevn

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several things I can think of to speed up getattr_static.

  1. Change _check_instance to be

def _check_instance(obj, attr): try: instance_dict = object.getattribute(obj, "dict") except AttributeError: return _sentinel return dict.get(instance_dict, attr, _sentinel)

  1. Here type is called twice:

if (_check_class(type(klass_result), 'get') is not _sentinel and _check_class(type(klass_result), 'set') is not _sentinel):

Ideally we can also call _check_class once with several attributes 🤔
Because in this case it will allow us to decrease the amount of _static_mro calls.

I think that a variable might be a bit faster

@AlexWaygood

There are several things I can think of to speed up getattr_static.

Those sound great! My instinct is to tackle them in a separate PR, so that each change can be evaluated and benchmarked independently. Would you like to submit a PR?

@sobolevn

Yes, I will do it tomorrow 👍

@picnixz

Would it be ok to have _dunder_dict_descriptor_get = type.__dict__["__dict__"].__get__ instead of _dunder_dict_descriptor = type.__dict__["__dict__"] in order to minimize the number of accesses to __get__ when iterating over the classes ?

@AlexWaygood

Would it be ok to have _dunder_dict_descriptor_get = type.__dict__["__dict__"].__get__ instead of _dunder_dict_descriptor = type.__dict__["__dict__"] in order to minimize the number of accesses to __get__ when iterating over the classes ?

I considered it, but felt like it would make it significantly less readable (_dunder_dict_descriptor already isn't a great name, and _dunder_dict_descriptor_get feels even worse). Happy to reconsider if others agree that we should go that way, though.

@picnixz

_dunder_dict_descriptor already isn't a great name, and _dunder_dict_descriptor_get feels even worse

IMO, since we are already micro-optimizing an internal helper of an internal function, we may not necessarily need to be exact so perhaps we can name it _dunder_dict_of ?

@AlexWaygood

IMO, since we are already micro-optimizing an internal helper of an internal function, we may not necessarily need to be exact so perhaps we can name it _dunder_dict_of ?

That's a decent name. Or maybe _get_dunder_dict_of_class.

I'll make the change — thanks!

@AlexWaygood

@AlexWaygood

I pushed the change and updated the benchmark results in the PR description (it's maybe a teeny tiny bit faster than it was, but not by much)

@picnixz

it's maybe a teeny tiny bit faster than it was, but not by much

I think the impact will be clearer when we have a complicated inheritance diagram (e.g., large projects with abstract interfaces and/or mixins here and there).

carljm

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

@AlexWaygood AlexWaygood deleted the microoptimise-getattr_static-helpers branch

April 5, 2023 07:27

@AlexWaygood

There are several things I can think of to speed up getattr_static.

  1. Change _check_instance to be

def _check_instance(obj, attr): try: instance_dict = object.getattribute(obj, "dict") except AttributeError: return _sentinel return dict.get(instance_dict, attr, _sentinel)

  1. Here type is called twice:

if (_check_class(type(klass_result), 'get') is not _sentinel and _check_class(type(klass_result), 'set') is not _sentinel):

Ideally we can also call _check_class once with several attributes 🤔 Because in this case it will allow us to decrease the amount of _static_mro calls.

I think that a variable might be a bit faster

I tried these out locally, but unfortunately I can't measure any speedup from them :/

It's possible I'm not using the right benchmark to show a speedup, however -- happy to be proven wrong if you can get a speedup somewhere!

This was referenced

Apr 6, 2023

gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this pull request

Apr 8, 2023

@AlexWaygood @gaogaotiantian

warsaw pushed a commit to warsaw/cpython that referenced this pull request

Apr 11, 2023

@AlexWaygood @warsaw