ExtensionArray Ops discussion · Issue #19577 · pandas-dev/pandas

Split from #19520

How to handle things like __eq__, __add__, etc.

Neither the Python 2 nor the Python 3 object defaults are appropriate, so I think we should either make these methods abstract or provide a default implementation for some / all of them.

We should be able to do a default implementation for the simple case of binop(extension_array, extension_array) by

@jorisvandenbossche raises the point that for some extension arrays, casting to an object ndarray can be expensive. It'd be unfortunate to do the casting if we're just going to raise a TypeError when the binop is done. In this case the subclass will need to override all those ops (or we could add a class attribute like _comparable and check that before doing __eq__ or __lt__, etc.).
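A minimal sketch of the _comparable idea, assuming a hypothetical stand-in class rather than the real ExtensionArray. The names SketchArray, _to_objects, and cast_count are invented for illustration; the point is only that the flag is checked before any expensive conversion happens.

```python
import operator


class SketchArray:
    """Hypothetical stand-in for an ExtensionArray (not the pandas class)."""

    # Flag proposed in the discussion; a subclass sets it to False to opt out.
    _comparable = True

    def __init__(self, values):
        self._values = list(values)
        self.cast_count = 0  # counts how often the "expensive" cast runs

    def _to_objects(self):
        # Stands in for the potentially expensive cast to an object ndarray.
        self.cast_count += 1
        return list(self._values)

    def _compare(self, other, op):
        # Check the cheap flag first so we never cast just to raise.
        if not self._comparable:
            raise TypeError(f"{type(self).__name__} does not support comparison")
        if not isinstance(other, type(self)):
            return NotImplemented
        return [op(a, b) for a, b in zip(self._to_objects(), other._to_objects())]

    def __eq__(self, other):
        return self._compare(other, operator.eq)

    def __lt__(self, other):
        return self._compare(other, operator.lt)


class UnorderedArray(SketchArray):
    """Hypothetical subclass that opts out of comparisons entirely."""

    _comparable = False
```

With this layout a subclass opts out by flipping one attribute instead of overriding every operator, and the cast is skipped whenever the op would fail anyway.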

How much coercion do we want to attempt? I would vote to stay simple and only attempt a comparison if

  - other is an instance of the same ExtensionArray type, or
  - other is a scalar of self.dtype.type.

Otherwise we return NotImplemented. Subclasses can of course choose more or less aggressive coercion rules.
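The coercion rule above could look roughly like the following. This is a sketch under stated assumptions: IntDtype and IntArray are hypothetical toy classes, not pandas API, and the result is a plain list where a real implementation would return a boolean array.

```python
import operator


class IntDtype:
    """Hypothetical dtype whose scalar type is int."""

    type = int


class IntArray:
    """Hypothetical toy array; only the dispatch rule matters here."""

    dtype = IntDtype()

    def __init__(self, values):
        self._values = list(values)

    def __len__(self):
        return len(self._values)

    def _cmp(self, other, op):
        # Case 1: other is an instance of the same array type.
        if isinstance(other, type(self)):
            if len(other) != len(self):
                raise ValueError("Lengths must match to compare")
            return [op(a, b) for a, b in zip(self._values, other._values)]
        # Case 2: other is a scalar of self.dtype.type.
        if isinstance(other, self.dtype.type):
            return [op(a, other) for a in self._values]
        # Anything else: let Python try the reflected op on other.
        return NotImplemented

    def __eq__(self, other):
        return self._cmp(other, operator.eq)

    def __lt__(self, other):
        return self._cmp(other, operator.lt)
```

Returning NotImplemented rather than raising means `IntArray([1, 2]) == "a"` falls back to Python's default and evaluates to False, while `IntArray([1, 2]) < "a"` raises TypeError once both operands have declined.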

When boxed in a Series or Index, I think the goal should be

  1. Dispatch to the underlying .values comparison
  2. Reconstruct the appropriate output (box in series / index with name).
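The two steps above can be sketched with a toy container. TinyArray and SketchSeries are hypothetical names, not pandas classes, and keeping the left operand's name is a simplification chosen for the example.

```python
class TinyArray:
    """Hypothetical array with an element-wise __eq__."""

    def __init__(self, data):
        self.data = list(data)

    def __eq__(self, other):
        if not isinstance(other, TinyArray):
            return NotImplemented
        return TinyArray(a == b for a, b in zip(self.data, other.data))


class SketchSeries:
    """Hypothetical minimal box around an array plus a name."""

    def __init__(self, values, name=None):
        self.values = values
        self.name = name

    def __eq__(self, other):
        # Unbox the right-hand side if it is also a SketchSeries.
        other_values = other.values if isinstance(other, SketchSeries) else other
        # Step 1: dispatch to the underlying .values comparison.
        result = self.values == other_values
        # Step 2: rebox the result, carrying the name along.
        return SketchSeries(result, name=self.name)


left = SketchSeries(TinyArray([1, 2, 3]), name="x")
right = SketchSeries(TinyArray([1, 0, 3]), name="y")
boxed = left == right
```

Here the container holds no comparison logic of its own: whatever the array's `__eq__` decides, including its coercion rules, is what the boxed result reflects.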