BUG: fix concat of Sparse with non-sparse dtypes by jorisvandenbossche · Pull Request #34338 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation8 Commits7 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
Closes #34336
This adds tests with the behaviour as it was on 0.25.3 / 1.0.3, and some changes to get back to that behaviour.
(but whether this behaviour is fully "sane", I am not sure ..)
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
I'm not totally sure about the expected behavior of concat([Sparse, Categorical])
. I suspect it was never properly discussed / intentionally implemented. Happy to just match 1.0.3 behavior for now htough.
Co-authored-by: Tom Augspurger TomAugspurger@users.noreply.github.com
OK, will go forward with this PR as is (matching the previous behaviour) to unblock the other concat PRs, but will open a set of follow-up issues.
@@ -1063,7 +1063,8 @@ def astype(self, dtype=None, copy=True): |
---|
""" |
dtype = self.dtype.update_dtype(dtype) |
subtype = dtype._subtype_with_str |
sp_values = astype_nansafe(self.sp_values, subtype, copy=copy) |
# TODO copy=False is broken for astype_nansafe with int -> float |
sp_values = astype_nansafe(self.sp_values, subtype, copy=True) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #34456 for this (and added link in the comment)
if is_sparse(arr.dtype) and not is_sparse(dtype): |
---|
# problem case: SparseArray.astype(dtype) doesn't follow the specified |
# dtype exactly, but converts this to Sparse[dtype] -> first manually |
# convert to dense array |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And opened #34459 for the general "what should concat(sparse, categorical) do?" question
@@ -20,7 +21,7 @@ |
---|
) |
from pandas.core.dtypes.generic import ABCCategoricalIndex, ABCRangeIndex, ABCSeries |
from pandas.core.arrays import ExtensionArray |
from pandas.core.arrays import ExtensionArray, SparseArray |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I already pushed a fix.
I don't really understand why, though, as SparseArray
is included in the pandas.core.arrays
init, just as ExtensionArray.
2 participants