any/all reductions on boolean object-typed Series · Issue #27709 · pandas-dev/pandas
While implementing a boolean-based ExtensionArray, I stumbled on the case that boolean arrays with missing values (which can only be object-typed in pandas) are effectively undefined behaviour in pandas reductions with `skipna=False`:
According to the docstring of `Series.any(skipna=False)`, each of the following cases should return `True`:
pd.Series([False, None]).any(skipna=False)
None
pd.Series([None, False]).any(skipna=False)
False
pd.Series([False, np.nan]).any(skipna=False)
nan
pd.Series([np.nan, False]).any(skipna=False)
nan
Whereas when you do the same operation on `float` columns, the behaviour is as documented:
pd.Series([np.nan, 0.]).any(skipna=False)
True
pd.Series([0, np.nan]).any(skipna=False)
True
As I have not found a unit test for the above-mentioned case with a boolean object column, I suspect that this is undefined behaviour rather than intended.
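One possible explanation for the four object-typed results, which is an assumption and not verified against the pandas source, is that the object reduction ends up chaining Python's `or`, which returns one of its operands instead of a strict boolean. The pure-Python sketch below (the helper name `or_reduce` is hypothetical) reproduces the same four values:

```python
from functools import reduce
import math

def or_reduce(values):
    # Chain Python's `or` left to right. `a or b` returns `a` if it is
    # truthy, otherwise `b`, so the result is an operand, not a bool.
    return reduce(lambda a, b: a or b, values)

nan = math.nan  # NaN is truthy in Python: bool(math.nan) is True

print(or_reduce([False, None]))  # None
print(or_reduce([None, False]))  # False
print(or_reduce([False, nan]))   # nan
print(or_reduce([nan, False]))   # nan
```

If this is indeed what happens, the result depends on element order and on which missing-value marker is used, which matches the outputs above.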
Three solutions come to my mind:
- Document this behaviour in the `Series.any()` docstring.
- Align the behaviour of `pd.Series(booleans, dtype=object).any(…)` with `pd.Series(booleans, dtype=object).astype(float).any(…)`.
- Raise an error when calling `any`/`all` on a mixed-typed boolean series.
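To make the second and third options concrete, here is a rough user-level sketch of what each would mean. The helper names `any_as_float` and `any_strict` are hypothetical and not part of pandas; this only illustrates the intended semantics, not how a fix inside pandas would be implemented.

```python
import numpy as np
import pandas as pd

def any_as_float(values, skipna=False):
    """Option 2 (sketch): reduce via a float cast, as for float columns."""
    s = pd.Series(values, dtype=object)
    # None and np.nan both become NaN; True/False become 1.0/0.0.
    return bool(s.astype(float).any(skipna=skipna))

def any_strict(values, skipna=False):
    """Option 3 (sketch): refuse to reduce booleans mixed with missing values."""
    s = pd.Series(values, dtype=object)
    if s.isna().any():
        raise TypeError("any() is ambiguous for booleans mixed with missing values")
    return bool(s.any(skipna=skipna))

print(any_as_float([False, None]))    # True
print(any_as_float([np.nan, False]))  # True
```

With the float cast, the result no longer depends on element order or on whether the missing value is `None` or `np.nan`.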