ExtensionArray Ops discussion · Issue #19577 · pandas-dev/pandas
Split from #19520
How to handle things like `__eq__`, `__add__`, etc.
Neither the Python 2 nor the Python 3 object defaults are appropriate, so I think we should either make these methods abstract or provide a default implementation for some / all of them.
We should be able to do a default implementation for the simple case of `binop(extension_array, extension_array)` by
- masking nulls in either self | other
- casting self / other to ndarrays
- applying the op and then the null mask
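A minimal sketch of those three steps, using plain NumPy arrays in place of real extension arrays. The helper name `masked_binop`, its signature, and the `fill_value` placeholder for null positions are all assumptions made for illustration, not part of any proposed pandas API.

```python
import operator

import numpy as np


def masked_binop(left, left_mask, right, right_mask, op, fill_value=None):
    """Hypothetical helper: apply `op` elementwise, skipping null positions.

    `left` / `right` are sequences of values, `left_mask` / `right_mask`
    are boolean arrays that are True where the value is null.
    """
    # Step 1: a position is null in the result if it is null on either side.
    mask = np.asarray(left_mask, dtype=bool) | np.asarray(right_mask, dtype=bool)

    # Step 2: cast both operands to object ndarrays.
    left = np.asarray(left, dtype=object)
    right = np.asarray(right, dtype=object)

    # Step 3: run the op on the valid positions only; null positions keep
    # the placeholder `fill_value`.
    result = np.full(len(left), fill_value, dtype=object)
    valid = ~mask
    result[valid] = op(left[valid], right[valid])
    return result, mask


values, mask = masked_binop(
    [1, None, 3], [False, True, False],
    [1, 2, None], [False, False, True],
    operator.eq,
)
```

What to put in the null positions (for example `False` for comparisons, a missing-value marker for arithmetic) is left open here; the sketch simply returns the mask alongside the values so the caller can decide.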
@jorisvandenbossche raises the point that for some extension arrays, casting to an object ndarray can be expensive. It'd be unfortunate to do the casting if we're just going to raise a `TypeError` when the binop is done. In this case the subclass will need to override all those ops (or we could add a class attribute like `_comparable` and check that before doing `__eq__` or `__lt__`, etc).
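The class-attribute idea could look roughly like the sketch below. Only the `_comparable` name comes from the discussion; the classes `SketchArray` and `UnorderedArray`, the `_to_object_ndarray` method, and the `casts` counter are hypothetical and exist only to show that the expensive cast is skipped.

```python
import numpy as np


class SketchArray:
    """Hypothetical stand-in for an ExtensionArray subclass."""

    # Subclasses that cannot be compared set this to False.
    _comparable = True

    def __init__(self, values):
        self._values = list(values)
        self.casts = 0  # counts how often the expensive cast runs

    def _to_object_ndarray(self):
        # Stand-in for a potentially expensive conversion.
        self.casts += 1
        return np.asarray(self._values, dtype=object)

    def __lt__(self, other):
        # Check the flag first, so we fail before doing any casting.
        if not self._comparable:
            raise TypeError(f"{type(self).__name__} does not support ordering")
        if not isinstance(other, type(self)):
            return NotImplemented
        return self._to_object_ndarray() < other._to_object_ndarray()


class UnorderedArray(SketchArray):
    _comparable = False
```

With this layout, `UnorderedArray([1, 2]) < UnorderedArray([2, 1])` raises `TypeError` without ever calling `_to_object_ndarray`, while the base class still performs the elementwise comparison.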
How much coercion do we want to attempt? I would vote to stay simple and only attempt a comparison if
- other is an instance of the same ExtensionArray type
- other is a scalar of `self.dtype.type`.
Otherwise we return NotImplemented. Subclasses can of course choose more or less aggressive coercion rules.
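A sketch of that conservative rule, under the assumption of a toy integer-backed array. `IntBoxArray` and its `scalar_type` attribute (standing in for `self.dtype.type`) are hypothetical names, and length mismatches are not handled.

```python
import numpy as np


class IntBoxArray:
    """Hypothetical array type used only to illustrate the coercion rule."""

    scalar_type = int  # stands in for self.dtype.type

    def __init__(self, values):
        self._data = np.asarray(values, dtype="int64")

    def __eq__(self, other):
        # Case 1: other is an instance of the same array type.
        if isinstance(other, type(self)):
            return self._data == other._data
        # Case 2: other is a scalar of the array's scalar type.
        if isinstance(other, self.scalar_type):
            return self._data == other
        # Anything else: let Python try the reflected operation.
        return NotImplemented


arr = IntBoxArray([1, 2, 3])
same_type = arr == IntBoxArray([1, 0, 3])  # elementwise comparison
scalar = arr == 2                          # elementwise comparison
unsupported = arr == "a"                   # falls back to Python's default
```

Returning `NotImplemented` rather than raising leaves the decision to the other operand. For `==`, if neither side handles the comparison, Python falls back to an identity check, so `arr == "a"` evaluates to plain `False` here.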
When boxed in a `Series` or `Index`, I think the goal should be
- dispatch to the underlying `.values` comparison
- reconstruct the appropriate output (box in series / index with name).
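The two steps can be sketched with a toy container. `MiniSeries` is a hypothetical stand-in, not the real `Series`; the rule for picking the result name (keep it when both operands agree, otherwise drop it) is an assumption of this sketch.

```python
import operator

import numpy as np


class MiniSeries:
    """Hypothetical minimal container: some array-like values plus a name."""

    def __init__(self, values, name=None):
        self.values = values
        self.name = name

    def _binop(self, other, op):
        # Step 1: unwrap and dispatch to the underlying values.
        if isinstance(other, MiniSeries):
            other_values = other.values
            # Assumed rule: keep the name only when both sides agree.
            name = self.name if self.name == other.name else None
        else:
            other_values = other
            name = self.name
        result = op(self.values, other_values)

        # Step 2: re-box the raw result with the chosen name.
        return MiniSeries(result, name=name)

    def __eq__(self, other):
        return self._binop(other, operator.eq)


left = MiniSeries(np.array([1, 2, 3]), name="a")
right = MiniSeries(np.array([1, 0, 3]), name="a")
result = left == right
```

Because the container only unwraps and re-boxes, all of the coercion decisions stay inside the array type's own `__eq__`.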