PERF: Add type-hints in tzconversion.pyx by rhshadrach · Pull Request #55241 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation6 Commits2 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
- closes #xxxx (Replace xxxx with the GitHub issue number)
- Tests added and passed if fixing a bug or adding a new feature
- All code checks passed.
- Added type annotations to new arguments/methods/functions.
- Added an entry in the latest
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.
Part of #55179. With this, I'm seeing time plummet in the benchmark
tslibs.tz_convert.TimeTZConvert.time_tz_localize_to_utc(1000000, tzfile('/usr/share/zoneinfo/Asia/Tokyo'))
from 159ms (Cython 0.29) and 271ms (Cython 3.0.2) to ~5-6ms on both.
Any suggestions on how to write a whatsnew note here? Perf improvement when localizing time to UTC?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, nice
and yeah, listing a perf improvement in whatsnew sounds good
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice find
This is great. One potential future issue to be aware of though: this locks us in to having only 1D ndarrays passed to _get_utc_bounds
. This isn't a problem ATM since the calling function tz_localize_to_utc
is already locked in. But if we ever wanted to make tz_localize_to_utc
handle unknown-ndim (which is on my radar bc it would allow removing the potentially-copying @dtl.ravel_compat
from DatetimeArray.tz_localize
) this would become an issue. So no worries, just heads up.
In that case to keep perf we'd have to resort to fused types, is that right?
AFAIK there isn't a nice solution to the arbitrary-ndim use case. In practice it is always either 1 or 2 (maybe 0 in some corner cases) so we could make a fused type for just those. This won't be an issue anytime soon so this is just a "be aware of". im kind of hoping cython adds support for dtype-declaring ndarrays without declaring the dimension