Improve `__setattr__` performance of Pydantic models by caching setter functions by MarkusSintonen · Pull Request #10868 · pydantic/pydantic

Change Summary

Setting attributes on a BaseModel has been fairly slow because __setattr__ ran an extensive series of checks on every call. This PR speeds up __setattr__ by memoizing the attribute-specific handlers on the model class, which makes attribute assignment roughly 7x faster in the benchmark below (1.048 s before, 0.147 s after, for 1,000,000 assignments). It also adds the missing benchmarks for attribute access and assignment.

    from timeit import timeit

    from pydantic import BaseModel


    class Model(BaseModel):
        field: int


    model = Model(field=1)


    def run():
        model.field = 2


    # Before 1.048
    # After 0.147
    print(timeit(run, number=1000000))

Related issue number

fix #10853

Checklist

The pull request title is a good summary of the changes - it will be used in the changelog
Unit tests for the changes exist
Tests pass on CI
Documentation reflects the changes where applicable
My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @sydney-runkle

CodSpeed Performance Report

Merging #10868 will not alter performance

Comparing MarkusSintonen:fast-setattr (404b8b7) with main (30ee4f4)

Summary

✅ 44 untouched benchmarks

🆕 2 new benchmarks

Benchmarks breakdown

| | Benchmark | main | MarkusSintonen:fast-setattr | Change |
|---|---|---|---|---|
| 🆕 | test_getattr | N/A | 54 µs | N/A |
| 🆕 | test_setattr | N/A | 87.7 µs | N/A |

Hi @MarkusSintonen,

Cool idea, thanks! I think memoization could be helpful here. Let me circle back with some colleagues to verify.

Specifically, @dmontagu, wdyt about this? I recall you've done a lot of work on these setattr branches.

This is great!

Smart idea to memoize most of the checks that are really only dependent on the class and attribute name. I made a couple notes and generally defer to Sydney on specific style nits, but the overall approach seems sensible to me and a good improvement.

Consider the PR approved by me, at least conceptually; I'm just not explicitly approving due to the nit comments maybe meriting some minor changes before merging.

> Consider the PR approved by me, at least conceptually

Fantastic, thanks for the prompt review.

I'll give this a nit-picky review this evening, then we can move forward!

This makes me think more about what else we could potentially memoize in the schema gen department...

One other thing I want to make sure of - this doesn't leave us with any pickling issues? I don't think so, given passing tests, but we should check.
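On the pickling question, one general Python fact is relevant: by default, pickling an instance serializes the instance's own state and refers to its class by name, so objects stored only on the class are not part of the payload. The sketch below is a plain-Python illustration and says nothing definitive about Pydantic, which defines its own pickling hooks; `Cached` and `_handlers` are hypothetical names. It inspects the reduction that pickle would request for an instance and checks that the lambdas cached on the class do not appear in it.

```python
from typing import Any, Callable, ClassVar, Dict


class Cached:
    # Hypothetical class-level cache holding lambdas, which pickle
    # cannot serialize by value.
    _handlers: ClassVar[Dict[str, Callable[[Any, Any], None]]] = {}

    def __setattr__(self, name: str, value: Any) -> None:
        handler = self._handlers.get(name)
        if handler is None:
            # Bind the attribute name as a default argument.
            handler = lambda obj, val, _name=name: object.__setattr__(obj, _name, val)
            type(self)._handlers[name] = handler
        handler(self, value)


obj = Cached()
obj.x = 1
obj.x = 2

# Default reduction for protocol 2: (callable, args, state, ...).
# `state` is the instance state that would end up in the pickle.
_, _, state, *_ = obj.__reduce_ex__(2)
```

Here `state` holds only the instance attribute `x`, while the lambda stays on `Cached._handlers`. Whether the same holds for real models is what the PR's passing tests are relied on to confirm.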

Thanks, I think this is a smart idea. We might have to worry about the size of the cache for large models (with a lot of fields). If we encounter such issues, we could use proper functions defined once instead of creating lambdas every time.

> Thanks, I think this is a smart idea. We might have to worry about the size of the cache for large models (with a lot of fields). If we encounter such issues, we could use proper functions defined once instead of creating lambdas every time

I wouldn't worry about its size, since all the fields are already listed in various ClassVars anyway. However, if we want to remove the tiny overhead of the field name strs, we could push the handler fn into e.g. FieldInfo/ModelPrivateAttr.

> functions defined once

I purposely didn't want to touch the model generation side, so as not to make it any heavier than it already is. With the large models mentioned above, it would do work for no good reason whenever the fields are never assigned like this.

Not sure exactly what you mean by "field name strs" / "didn't want to touch the model generation side", but what I wanted to say is that we could do something like this:

    HANDLERS = {
        'descriptor': lambda m, val: attribute.__set__(m, val),
        'cached_property': lambda m, val: m.__dict__.__setitem__(name, val),
        ...
    }

    def _setattr_handler(name: str, value: Any):
        ...
        if hasattr(attribute, '__set__'):
            return HANDLERS['descriptor']
        ...
        elif isinstance(attr, cached_property):
            return HANDLERS['cached_property']

So that we don't create a new function every time.
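A self-contained variant of that idea, as a sketch only: `SIMPLE_HANDLERS`, `classify`, `make_closure`, and `Demo` are hypothetical names, and each handler takes the attribute name as an argument (an assumption of this sketch) so that one function can serve every attribute of its kind. It illustrates that a lookup in a shared table returns the same function object each time, whereas building a lambda per attribute allocates a new function on every call.

```python
import inspect
from functools import cached_property
from typing import Any, Callable, Dict

Handler = Callable[[Any, str, Any], None]

# Defined once at import time; `name` is passed in rather than captured.
SIMPLE_HANDLERS: Dict[str, Handler] = {
    "descriptor": lambda m, name, val: inspect.getattr_static(type(m), name).__set__(m, val),
    "cached_property": lambda m, name, val: m.__dict__.__setitem__(name, val),
    "plain": lambda m, name, val: object.__setattr__(m, name, val),
}


def classify(cls: type, name: str) -> Handler:
    # Look the attribute up on the class without triggering descriptors.
    attribute = inspect.getattr_static(cls, name, None)
    if isinstance(attribute, cached_property):
        return SIMPLE_HANDLERS["cached_property"]
    if hasattr(attribute, "__set__"):
        return SIMPLE_HANDLERS["descriptor"]
    return SIMPLE_HANDLERS["plain"]


def make_closure(name: str) -> Callable[[Any, Any], None]:
    # Per-attribute alternative: a fresh function object on every call.
    return lambda m, val: m.__dict__.__setitem__(name, val)


class Demo:
    def __init__(self) -> None:
        self._celsius = 0.0

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        self._celsius = float(value)

    @cached_property
    def doubled(self) -> float:
        return self._celsius * 2


demo = Demo()
classify(Demo, "celsius")(demo, "celsius", 21)   # goes through the property setter
classify(Demo, "doubled")(demo, "doubled", 5.0)  # overwrites the cached value
classify(Demo, "note")(demo, "note", "hello")    # ordinary attribute
```

The trade-off in this sketch is one extra argument per call in exchange for a fixed number of function objects, regardless of how many fields a model has.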

Viicos changed the title ~~Fix slow BaseModel.__setattr__~~ Improve __setattr__ performance of Pydantic models by caching setter functions

Nov 19, 2024

> but what I wanted to say is we could do something like this

Ah yes I see! That would make sense yes

Thanks for adding the simple dict. This looks good to me.

Overall this approach can look a bit weird as calling __setattr__ (i.e. at the instance level) mutates a class variable that will be valid for every instance. But functionally it makes sense.
