Bool Ops: One bug hides another · Issue #22092 · pandas-dev/pandas

Boolean ops (&, |, ^) are less well-tested than most of the other operators. Going through them, some raise for weird reasons:

ser = pd.Series([True, False, True])
idx = pd.Index([False, True, True])

Starting with ser & idx, we'd expect to get Series([False, False, True]). Instead we get a ValueError because we accidentally go down the path that treats the Index as a scalar:

>>> ser & idx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/ops.py", line 1481, in wrapper
    res_values = na_op(self.values, other)
  File "pandas/core/ops.py", line 1439, in na_op
    if not isna(y):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
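For reference, the expected element-wise result can be computed without going through the pandas operator dispatch at all. This is only a sketch of the expectation stated above, not the fix; the names expected_values and expected are made up for illustration.

```python
import numpy as np
import pandas as pd

ser = pd.Series([True, False, True])
idx = pd.Index([False, True, True])

# Operate on the underlying NumPy arrays so that Series.__and__ is never
# called; this sidesteps the scalar code path that raises above.
expected_values = np.logical_and(
    ser.to_numpy(dtype=bool), idx.to_numpy(dtype=bool)
)

# Re-wrap with the Series' own index (illustrative name, not pandas internals).
expected = pd.Series(expected_values, index=ser.index)
print(expected.tolist())  # [False, False, True]
```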

OK, that's fixed easily enough. What about the reversed operation idx & ser? That means something entirely different, since Index.__and__ is an alias for Index.intersection, so we'd expect either a Series or Index containing both False and True. Instead we get a ValueError:

>>> idx & ser
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/core/indexes/base.py", line 2658, in __and__
    return self.intersection(other)
  File "pandas/core/indexes/base.py", line 2823, in intersection
    if self.is_monotonic and other.is_monotonic:
  File "pandas/core/indexes/base.py", line 1407, in is_monotonic
    return self.is_monotonic_increasing
  File "pandas/core/indexes/base.py", line 1424, in is_monotonic_increasing
    return self._engine.is_monotonic_increasing
  File "pandas/_libs/index.pyx", line 214, in pandas._libs.index.IndexEngine.is_monotonic_increasing.__get__
    self._do_monotonic_check()
  File "pandas/_libs/index.pyx", line 228, in pandas._libs.index.IndexEngine._do_monotonic_check
    values = self._get_index_values()
  File "pandas/_libs/index.pyx", line 244, in pandas._libs.index.IndexEngine._get_index_values
    return self.vgetter()
  File "pandas/core/indexes/base.py", line 1847, in <lambda>
    return self._engine_type(lambda: self._ndarray_values, len(self))
  File "pandas/core/base.py", line 752, in _ndarray_values
    if is_extension_array_dtype(self):
  File "pandas/core/dtypes/common.py", line 1718, in is_extension_array_dtype
    arr_or_dtype = pandas_dtype(arr_or_dtype)
  File "pandas/core/dtypes/common.py", line 2026, in pandas_dtype
    if dtype in [object, np.object_, 'object', 'O']:
  File "pandas/core/generic.py", line 1726, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
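To make the set-style expectation concrete, here is a small sketch that calls Index.intersection explicitly on two Index objects instead of relying on &. Wrapping the Series values in a second Index (other, an illustrative name) is an assumption about what the reversed op should compare against, and the meaning of & on an Index may differ between pandas versions, so the explicit method call is used here.

```python
import pandas as pd

ser = pd.Series([True, False, True])
idx = pd.Index([False, True, True])

# Assumption: treat the Series' values as the "other" set of labels.
other = pd.Index(ser.to_numpy())

# Set intersection of the labels, not an element-wise AND.
common = idx.intersection(other)

# Both labels appear on each side, so both survive the intersection.
print(set(common) == {False, True})
```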

The comparison dtype in [object, np.object_, 'object', 'O'] raises when dtype is array-like, because the membership test compares with == and then needs the truth value of the element-wise result. If we change that condition to is_hashable(dtype) and dtype in [object, np.object_, 'object', 'O'], we get Index([False, True, True]), which is what we expected from the Series op, but not from the Index op.
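A minimal sketch of why the is_hashable guard avoids the exception, written outside of pandas internals. The helper names and the OBJECT_ALIASES constant are hypothetical; only the condition itself comes from the description above.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_hashable

# Hypothetical stand-in for the list used in the membership test.
OBJECT_ALIASES = [object, np.object_, "object", "O"]


def is_object_alias_unguarded(dtype):
    # `in` falls back to ==; for a Series that yields an element-wise
    # result whose truth value is ambiguous.
    return dtype in OBJECT_ALIASES


def is_object_alias_guarded(dtype):
    # A Series is not hashable, so the membership test is skipped entirely.
    return is_hashable(dtype) and dtype in OBJECT_ALIASES


ser = pd.Series([True, False, True])

try:
    is_object_alias_unguarded(ser)
    unguarded_raised = False
except ValueError:
    unguarded_raised = True

guarded_result = is_object_alias_guarded(ser)
print(unguarded_raised, guarded_result)
```

The guard only stops the exception from being raised at this point; it does not decide what idx & ser should return.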