BUG+DEPR: undeprecate item, fix dt64/td64 output type by jbrockmendel · Pull Request #30175 · pandas-dev/pandas
- closes #29250 (API: consider undeprecating Series.item() ?)
- tests added / passed
- passes `black pandas`
- passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
- whatsnew entry
Does the un-deprecation need a dedicated discussion, or is there consensus on that?
I'd be happy to split this into separate bug/depr pieces if reviewers prefer.
| stacklevel=2, |
|---|
| ) |
| return self.values.item() |
| if len(self) == 1: |
Was adding this condition discussed somewhere? I would have thought just keep existing behaviour
This is the bugfix part. For dt64, dt64tz, and td64 we're currently incorrectly returning int
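A minimal sketch of where the int comes from, assuming the values are nanosecond-resolution NumPy arrays (the variable names here are illustrative, not from the PR). `ndarray.item()` cannot represent nanoseconds as `datetime.datetime` or `datetime.timedelta`, so it falls back to a plain Python int:

```python
import numpy as np

# Nanosecond-resolution arrays, the resolution pandas used for dt64/td64 data.
dt_arr = np.array(["2019-01-01"], dtype="datetime64[ns]")
td_arr = np.array([1], dtype="timedelta64[ns]")

# For the "ns" unit, item() returns the underlying integer count of nanoseconds.
dt_item = dt_arr.item()
td_item = td_arr.item()

print(type(dt_item), dt_item)
print(type(td_item), td_item)
```

This is why calling `self.values.item()` directly loses the datetime-like type for these dtypes.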
Hmm doesn't this break non-DTA though?
    >>> type(pd.Series(range(1)).item())
    <class 'int'>
    >>> type(pd.Series(range(1))[0])
    <class 'numpy.int64'>
I thought one of the points of item was to return a Python object (at least in the Numpy world)
^ current behavior
Yes, I think we should keep the behaviour of item to return a python scalar (where possible of course, so for datetime/timedelta it is fine to return a pandas Timestamp/Timedelta I think)
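The behaviour described here can be checked with a short script. This is a sketch that assumes a pandas version that includes this change; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

int_ser = pd.Series(range(1))
dt_ser = pd.Series(pd.to_datetime(["2019-01-01"]))
td_ser = pd.Series(pd.to_timedelta(["1 day"]))

# item() gives a Python scalar for plain numeric data ...
int_item = int_ser.item()
# ... whereas positional indexing gives a NumPy scalar.
int_elem = int_ser.iloc[0]

# For datetime-like data, the pandas scalar types are returned instead of ints.
dt_item = dt_ser.item()
td_item = td_ser.item()

print(type(int_item), type(int_elem), type(dt_item), type(td_item))
```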
ok, will update.
Has this been resolved?
Co-Authored-By: Stephan Hoyer shoyer@google.com
| if not needs_i8_conversion(self.dtype): |
|---|
| # numpy returns ints instead of datetime64/timedelta64 objects, |
| # which we need to wrap in Timestamp/Timedelta/Period regardless. |
| return self.values.item() |
This won't work for ExtensionArrays. We can discuss adding item to the interface, but I would rather (or at least for now) let ExtensionArrays take the path you have below that uses iteration (which should already handle the conversion to a python scalar)
I would be +1 on adding to EA arrays, why have inconsistency in code paths.
> This won't work for ExtensionArrays
This uses .values, so will convert to ndarray and then call item. So it shouldn't be any more broken than what we have now.
i have no objection to adding item to EAs separately
> So it shouldn't be any more broken than what we have now.
And to fix that, the only thing that is needed is adding an `and not is_extension_array_dtype(self.dtype):` to the above `if` check.
I'm happy to do that here, will need tests in a follow-up
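The control flow discussed in this thread can be sketched as a standalone function. This is not the code merged in the PR: `item_sketch` is a hypothetical name, and the `dtype.kind` check is a stand-in for `needs_i8_conversion` on NumPy dtypes, used here so that the example relies only on public pandas API.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_extension_array_dtype


def item_sketch(ser: pd.Series):
    """Return the single element of ``ser`` as a scalar (illustrative only)."""
    dtype = ser.dtype
    # Stand-in for needs_i8_conversion: kinds "M"/"m" are datetime64/timedelta64.
    is_np_datetimelike = isinstance(dtype, np.dtype) and dtype.kind in "mM"
    if not (is_extension_array_dtype(dtype) or is_np_datetimelike):
        # Plain NumPy data: ndarray.item() already yields a Python scalar
        # and raises ValueError when the array does not have exactly one element.
        return ser.to_numpy().item()
    if len(ser) == 1:
        # Iterating a Series boxes datetime-likes into Timestamp/Timedelta
        # and lets extension arrays produce their own scalar type.
        return next(iter(ser))
    raise ValueError("can only convert an array of size 1 to a Python scalar")


print(type(item_sketch(pd.Series([1]))))
print(type(item_sketch(pd.Series(pd.to_datetime(["2019-01-01"])))))
```

Routing extension arrays through iteration keeps the conversion to a scalar in one place, which is the option suggested above until `item` is considered for the ExtensionArray interface.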
lgtm, question
| ) |
|---|
| return self.values.item() |
| if not ( |
| is_extension_array_dtype(self.dtype) or needs_i8_conversion(self.dtype) |
isn't this redundant? as all needs_i8_conversion are already EA
no, dt64 and td64 need i8 conversion but are not EA
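The distinction can be checked directly on a few dtypes. A small sketch, with the caveat that `needs_i8_conversion` is imported from `pandas.core.dtypes.common`, an internal module whose location and signature are not guaranteed to be stable across versions:

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_extension_array_dtype
from pandas.core.dtypes.common import needs_i8_conversion  # internal helper

dtypes = {
    "datetime64[ns]": np.dtype("datetime64[ns]"),
    "timedelta64[ns]": np.dtype("timedelta64[ns]"),
    "datetime64[ns, UTC]": pd.DatetimeTZDtype(tz="UTC"),
    "category": pd.CategoricalDtype(),
    "int64": np.dtype("int64"),
}

# Map each dtype name to (needs_i8_conversion, is_extension_array_dtype).
flags = {
    name: (needs_i8_conversion(dt), is_extension_array_dtype(dt))
    for name, dt in dtypes.items()
}

for name, (needs_i8, is_ea) in flags.items():
    print(f"{name:22} needs_i8={needs_i8!s:5} is_ea={is_ea}")
```

Neither check implies the other: the tz-naive NumPy dtypes need i8 conversion without being extension dtypes, and categorical is an extension dtype that does not need i8 conversion.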
proost pushed a commit to proost/pandas that referenced this pull request