REGR: Series.duplicated with category dtype and nulls raises ValueError · Issue #44351 · pandas-dev/pandas

Reproducible Example

import pandas as pd

print(pd.__version__)
tc = pd.Series(
    pd.Categorical(
        [True, False, True, False, pd.NA], categories=[True, False], ordered=True
    )
)
print(tc.duplicated())

Issue Description

xref #44292 (comment)

The code sample is based on test_drop_duplicates_categorical_bool.

On 1.3.4 (and master), the code sample raises:

1.3.4
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_47357/1277064552.py in <module>
      7     )
      8 )
----> 9 print(tc.duplicated())

~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/series.py in duplicated(self, keep)
   2215         dtype: bool
   2216         """
-> 2217         res = self._duplicated(keep=keep)
   2218         result = self._constructor(res, index=self.index)
   2219         return result.__finalize__(self, method="duplicated")

~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/base.py in _duplicated(self, keep)
   1230         self, keep: Literal["first", "last", False] = "first"
   1231     ) -> np.ndarray:
-> 1232         return duplicated(self._values, keep=keep)

~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/algorithms.py in duplicated(values, keep)
    925     duplicated : ndarray[bool]
    926     """
--> 927     values, _ = _ensure_data(values)
    928     return htable.duplicated(values, keep=keep)
    929 

~/miniconda3/envs/pandas-1.3.4/lib/python3.9/site-packages/pandas/core/algorithms.py in _ensure_data(values)
    139             # i.e. all-bool Categorical, BooleanArray
    140             try:
--> 141                 return np.asarray(values).astype("uint8", copy=False), values.dtype
    142             except TypeError:
    143                 # GH#42107 we have pd.NAs present

ValueError: cannot convert float NaN to integer
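
The traceback shows a `try` / `except TypeError` around the `uint8` cast, with a comment about `pd.NA` being present, while the exception that actually escapes is a `ValueError` about float NaN. A minimal pure-Python sketch of that mismatch follows. It is not the pandas code: `to_uint8_like` and `FakeNA` are hypothetical stand-ins, and the assumption that the missing value reaches the cast as a float NaN is inferred only from the error message.

```python
def to_uint8_like(values):
    """Hypothetical stand-in for the try/except visible in the traceback."""
    try:
        return [int(v) for v in values]
    except TypeError:
        # Only TypeError is handled here, mirroring the handler in the traceback.
        return "fallback"


class FakeNA:
    """Hypothetical stand-in for a missing-value object that int() rejects."""


# An object that int() cannot convert raises TypeError, so the fallback is used.
print(to_uint8_like([True, False, FakeNA()]))

# A float NaN raises ValueError instead, which the handler does not catch.
try:
    to_uint8_like([True, False, float("nan")])
except ValueError as exc:
    print(type(exc).__name__, exc)
```

In this sketch, widening the handler to `except (TypeError, ValueError)` would route the NaN case to the fallback as well. Whether that is the right fix in pandas is not established by this issue text.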

Expected Behavior

On 1.2.5, the code sample gives:

1.2.5
0    False
1    False
2     True
3     True
4    False
dtype: bool
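
For reference, the expected output above matches "mark every repeat after the first occurrence, treating the missing value as an ordinary value". A rough pure-Python sketch of that semantics is given below. It assumes the default `keep="first"` behaviour, uses `None` in place of `pd.NA`, and `duplicated_first` is a hypothetical helper, not how pandas implements `duplicated`.

```python
import math

_MISSING = object()  # single key shared by all missing values


def duplicated_first(values):
    """Return a list of bools: True where a value was already seen earlier."""
    seen = set()
    result = []
    for v in values:
        is_missing = v is None or (isinstance(v, float) and math.isnan(v))
        key = _MISSING if is_missing else v
        result.append(key in seen)
        seen.add(key)
    return result


# None stands in for pd.NA in this sketch.
print(duplicated_first([True, False, True, False, None]))
```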

Installed Versions

.