PERF: Add type-hints in tzconversion.pyx by rhshadrach · Pull Request #55241 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation6 Commits2 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

rhshadrach

Part of #55179. With this, I'm seeing time plummet in the benchmark

tslibs.tz_convert.TimeTZConvert.time_tz_localize_to_utc(1000000, tzfile('/usr/share/zoneinfo/Asia/Tokyo'))

from 159ms (Cython 0.29) and 271ms (Cython 3.0.2) to ~5-6ms on both.

@rhshadrach

@rhshadrach

Any suggestions on how to write a whatsnew note here? Perf improvement when localizing time to UTC?

MarcoGorelli

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, nice

and yeah, listing a perf improvement in whatsnew sounds good

@rhshadrach

WillAyd

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find

@jbrockmendel

This is great. One potential future issue to be aware of though: this locks us in to having only 1D ndarrays passed to _get_utc_bounds. This isn't a problem ATM since the calling function tz_localize_to_utc is already locked in. But if we ever wanted to make tz_localize_to_utc handle unknown-ndim (which is on my radar bc it would allow removing the potentially-copying @dtl.ravel_compat from DatetimeArray.tz_localize) this would become an issue. So no worries, just heads up.

@rhshadrach

In that case to keep perf we'd have to resort to fused types, is that right?

@jbrockmendel

AFAIK there isn't a nice solution to the arbitrary-ndim use case. In practice it is always either 1 or 2 (maybe 0 in some corner cases) so we could make a fused type for just those. This won't be an issue anytime soon so this is just a "be aware of". im kind of hoping cython adds support for dtype-declaring ndarrays without declaring the dimension