REGR: Series.duplicated with category dtype and nulls raises ValueError · Issue #44351 · pandas-dev/pandas
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
import pandas as pd

print(pd.__version__)
tc = pd.Series(
    pd.Categorical(
        [True, False, True, False, pd.NA], categories=[True, False], ordered=True
    )
)
print(tc.duplicated())
Issue Description
xref #44292 (comment)
The code sample is based on test_drop_duplicates_categorical_bool.
On 1.3.4 (and master), the code sample gives:
1.3.4
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_47357/1277064552.py in <module>
7 )
8 )
----> 9 print(tc.duplicated())
~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/series.py in duplicated(self, keep)
2215 dtype: bool
2216 """
-> 2217 res = self._duplicated(keep=keep)
2218 result = self._constructor(res, index=self.index)
2219 return result.__finalize__(self, method="duplicated")
~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/base.py in _duplicated(self, keep)
1230 self, keep: Literal["first", "last", False] = "first"
1231 ) -> np.ndarray:
-> 1232 return duplicated(self._values, keep=keep)
~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/algorithms.py in duplicated(values, keep)
925 duplicated : ndarray[bool]
926 """
--> 927 values, _ = _ensure_data(values)
928 return htable.duplicated(values, keep=keep)
929
~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/algorithms.py in _ensure_data(values)
139 # i.e. all-bool Categorical, BooleanArray
140 try:
--> 141 return np.asarray(values).astype("uint8", copy=False), values.dtype
142 except TypeError:
143 # GH#42107 we have pd.NAs present
ValueError: cannot convert float NaN to integer
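The traceback suggests why the existing guard does not help: the cast is wrapped in `except TypeError`, but the exception raised is a `ValueError`. A minimal pure-Python sketch of that control flow follows. It does not use pandas or NumPy; `cast_to_int` and `ensure_data_sketch` are hypothetical stand-ins, and it assumes the missing entry surfaces as float NaN, as the error message indicates.

```python
import math

# Stand-in for the values after conversion to an object array; the missing
# entry is assumed to surface as float NaN, as the error message suggests.
values = [True, False, True, False, math.nan]


def cast_to_int(items):
    # Hypothetical stand-in for an element-wise integer cast.
    return [int(item) for item in items]


def ensure_data_sketch(items):
    # Mirrors the shape of the guarded cast in the traceback: only TypeError
    # is handled, so any other exception propagates to the caller.
    try:
        return cast_to_int(items)
    except TypeError:
        return list(items)


try:
    ensure_data_sketch(values)
    outcome = "no error"
except ValueError as exc:
    outcome = f"ValueError: {exc}"

print(outcome)
```

The sketch ends with the same message as the traceback, because converting NaN to an integer raises `ValueError` rather than `TypeError`, so the fallback branch is never reached.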
Expected Behavior
On 1.2.5, the code sample gives:
1.2.5
0 False
1 False
2 True
3 True
4 False
dtype: bool
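For reference, the 1.2.5 output matches the usual keep="first" semantics, where a value is flagged only if it has already been seen and the missing entry counts as a value of its own. A rough pure-Python sketch of those semantics, not the pandas implementation; `duplicated_keep_first` and `MISSING` are hypothetical names, and `None` stands in for `pd.NA`:

```python
def duplicated_keep_first(values):
    # Flag each element that has already appeared earlier in the sequence.
    seen = set()
    flags = []
    for value in values:
        flags.append(value in seen)
        seen.add(value)
    return flags


MISSING = None  # stand-in for pd.NA, treated as an ordinary distinct value
sample = [True, False, True, False, MISSING]
result = duplicated_keep_first(sample)
print(result)
```

Only the second `True` and the second `False` are flagged; the single missing entry is not, which lines up with the boolean column shown above.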
Installed Versions
.