POC/ENH: infer resolution in array_to_datetime by jbrockmendel · Pull Request #55741 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation12 Commits15 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

@jbrockmendel

xref #55564 implements the relevant logic in array_to_datetime, does not change any user-facing behavior yet.

Have not yet done any profiling, but we have to expect a slowdown.

@jbrockmendel

@jbrockmendel

@jbrockmendel

@jbrockmendel

@jbrockmendel

WillAyd

@jbrockmendel

WillAyd

tz_out = timezone(timedelta(seconds=tz_offset))
if infer_reso:
if state.creso_ever_changed:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what to do here but my first read of state.creso_ever_changed was confusion over the scope of ever; I am inferring from the way the function is currently written that ever refers to the lifetime of this function, but if we have parse states that can persist across multiple function calls that terminology can get a little vague

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have parse states that can persist across multiple function calls

we do not

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to keep it as part of the state then instead of just local to the function?

To be clear not a hold up on this PR for me. Just a consideration point as this continues to evolve

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bc it gets set inside a state-updating method that i dont want to duplicate in a bunch of places. also there are a couple of different places where we use DatetimeParseState and im trying to iron out the kinks in behavior differences between them.

@jbrockmendel

@jbrockmendel

@jbrockmendel

mroeschke

def test_infer_heterogeneous(self):
dtstr = "2023-10-27 18:03:05.678000"
arr = np.array([dtstr, dtstr[:-3], dtstr[:-7], None], dtype=object)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow up, it would be good to also add a test for different objects too e.g. np.array([dtstr, pydate, pydatetime, np.datetime64, int])

mroeschke

@mroeschke

jbrockmendel added a commit to jbrockmendel/pandas that referenced this pull request

Nov 15, 2023

@jbrockmendel

Labels

Non-Nano

datetime64/timedelta64 with non-nanosecond resolution