any/all reductions on boolean object-typed Series · Issue #27709 · pandas-dev/pandas

While implementing a boolean-based ExtensionArray, I stumbled on the fact that boolean arrays with missing values (which can only be object-typed in pandas) have effectively undefined behaviour in pandas reductions with skipna=False.

According to the docstring of Series.any(skipna=False), each of the following cases should return True, but none of them does:

pd.Series([False, None]).any(skipna=False)

None

pd.Series([None, False]).any(skipna=False)

False

pd.Series([False, np.nan]).any(skipna=False)

nan

pd.Series([np.nan, False]).any(skipna=False)

nan
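One plausible reading of these four results, offered as an assumption rather than a confirmed diagnosis of the pandas internals, is that the reduction over object values behaves like chaining Python's `or`, which returns one of its operands instead of a bool. The sketch below uses a hypothetical helper, `chained_or`, that reproduces the same values in plain Python:

```python
from functools import reduce
import math


def chained_or(values):
    """Hypothetical helper: fold values with Python's `or`.

    `a or b` returns `a` if `a` is truthy, otherwise `b`, so the
    result is one of the operands and not necessarily a bool.
    """
    return reduce(lambda a, b: a or b, values)


nan = float("nan")

print(chained_or([False, None]))  # None: False is falsy, so `or` yields None
print(chained_or([None, False]))  # False: None is falsy, so `or` yields False
print(chained_or([False, nan]))   # nan: NaN is truthy
print(chained_or([nan, False]))   # nan: NaN is truthy, so it is returned first
```

If this reading is right, the result depends on both the position and the kind of missing value, which matches the asymmetry between None and np.nan shown above.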

By contrast, the same operation on float columns behaves as documented:

pd.Series([np.nan, 0.]).any(skipna=False)

True

pd.Series([0, np.nan]).any(skipna=False)

True

As I have not found a unit test covering this case with a boolean object column, I suspect that this is undefined behaviour rather than intended.

Three possible solutions come to mind:

  1. Document this behaviour in the Series.any() docstring.
  2. Align the behaviour of pd.Series(booleans, dtype=object).any(…) with pd.Series(booleans, dtype=object).astype(float).any(…).
  3. Raise an error when calling any/all on a mixed-type boolean Series.
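To make option 2 concrete, here is a minimal sketch of the semantics it would imply. It assumes that, with skipna=False, missing values count as True (as they do for float columns). The helper name `reduce_bool_object` and its `how` parameter are hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd


def reduce_bool_object(series, how="any", skipna=True):
    """Hypothetical helper: any/all over an object-typed boolean Series.

    With skipna=False, missing values (None or NaN) are counted as True,
    mirroring what the float examples above return.
    """
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")

    na_mask = pd.isna(series).to_numpy()
    # Coerce the non-missing entries to plain Python bools.
    present = [bool(v) for v in series.to_numpy(dtype=object)[~na_mask]]
    if not skipna:
        # Each missing value contributes a True, as NaN does for floats.
        present.extend([True] * int(na_mask.sum()))

    return any(present) if how == "any" else all(present)


print(reduce_bool_object(pd.Series([False, None]), skipna=False))    # True
print(reduce_bool_object(pd.Series([None, False]), skipna=False))    # True
print(reduce_bool_object(pd.Series([False, np.nan]), skipna=False))  # True
print(reduce_bool_object(pd.Series([False, None]), skipna=True))     # False
```

The result is always a plain bool and no longer depends on the position or kind of the missing value.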