ENH: Handle extension arrays in algorithms.diff by TomAugspurger · Pull Request #31025 · pandas-dev/pandas

Hmm, one problem with BooleanArray though: we don't allow `-` between two BooleanArrays.

    In [4]: pd.Series([True, False, True, None, True], dtype='boolean').diff()

    TypeError                                 Traceback (most recent call last)
    ----> 1 pd.Series([True, False, True, None, True], dtype='boolean').diff()

    ~/sandbox/pandas/pandas/core/series.py in diff(self, periods)
       2317         dtype: float64
       2318         """
    -> 2319         result = algorithms.diff(self.array, periods)
       2320         return self._constructor(result, index=self.index).__finalize__(self)
       2321

    ~/sandbox/pandas/pandas/core/algorithms.py in diff(arr, n, axis)
       1836
       1837     if is_extension_array_dtype(dtype):
    -> 1838         return arr - arr.shift(n)
       1839
       1840     is_timedelta = False

    ~/sandbox/pandas/pandas/core/arrays/boolean.py in boolean_arithmetic_method(self, other)
        750
        751         with np.errstate(all="ignore"):
    --> 752             result = op(self._data, other)
        753
        754         # divmod returns a tuple

    TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

We already do something similar for NumPy bool dtype in algorithms.diff, using an xor instead of a subtraction:

    elif is_bool:
        out_arr[res_indexer] = arr[res_indexer] ^ arr[lag_indexer]
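
For illustration, here is a minimal NumPy-only sketch of that xor-based diff. The helper name `bool_diff` and the slice-based indexers are assumptions made for this example, not the actual pandas internals. The object-dtype output with NaN in the unfilled positions only roughly mirrors what `Series.diff` returns for bool data.

```python
import numpy as np


def bool_diff(arr: np.ndarray, n: int = 1) -> np.ndarray:
    """Hypothetical helper: diff a 1-D bool array with xor instead of subtraction."""
    # Object dtype so NaN can mark positions that have no lagged partner.
    out = np.full(arr.shape, np.nan, dtype=object)
    if n >= 0:
        res_indexer = slice(n, None)
        lag_indexer = slice(None, -n or None)
    else:
        res_indexer = slice(None, n)
        lag_indexer = slice(-n, None)
    # xor is True exactly where the value changed between the two positions.
    out[res_indexer] = arr[res_indexer] ^ arr[lag_indexer]
    return out


values = np.array([True, False, True, True])

# Plain subtraction of bool arrays is rejected by NumPy.
try:
    values[1:] - values[:-1]
    subtraction_failed = False
except TypeError:
    subtraction_failed = True

result = bool_diff(values)
print(subtraction_failed, result)
```

The first slot stays NaN because there is no earlier element to compare against; the remaining slots flag where consecutive values differ.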

So perhaps we check dtype.kind and do xor for boolean kinds. I think that's reasonable.
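
A rough sketch of what that check could look like for extension arrays. The function name `diff_extension_array` is hypothetical and this is not necessarily the code that ended up in the PR; it only assumes that the array supports `shift`, `^`, and `-`, and that its dtype reports `kind == "b"` for booleans.

```python
import pandas as pd


def diff_extension_array(arr, n=1):
    """Hypothetical helper: diff an extension array, using xor for boolean kinds."""
    shifted = arr.shift(n)  # positions shifted in are filled with the dtype's NA value
    if arr.dtype.kind == "b":
        # Subtraction is not defined for booleans, so fall back to xor.
        return arr ^ shifted
    return arr - shifted


bools = pd.array([True, False, True, None, True], dtype="boolean")
bool_result = diff_extension_array(bools)

ints = pd.array([1, 3, None, 10], dtype="Int64")
int_result = diff_extension_array(ints)

print(bool_result)
print(int_result)
```

With the nullable boolean dtype, any position where either operand is missing comes out missing, so only the second and third entries of `bool_result` hold a value.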