Make SparseArray an ExtensionArray · Issue #21978 · pandas-dev/pandas

We should make SparseArray a proper ExtensionArray.

Doing this properly looks difficult as long as SparseArray subclasses ndarray. Basic things like np.asarray(sparse_array) don't match the required ExtensionArray API (#14167), and because we subclass ndarray, I can't override the behavior of np.asarray(sparse_array) in Python.
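To illustrate the np.asarray problem, here is a minimal sketch. The class `Sub` and the `calls` list are hypothetical names used only for this demonstration, not pandas code. It assumes NumPy's usual coercion path, in which an object that already is an ndarray (or a subclass) is used directly and its `__array__` method is not consulted.

```python
import numpy as np

calls = []  # records whether our __array__ hook was ever invoked


class Sub(np.ndarray):
    """Hypothetical ndarray subclass that tries to customize conversion."""

    def __array__(self, dtype=None, copy=None):
        # If NumPy consulted this hook, we could return a densified array here.
        calls.append("__array__")
        return np.zeros(len(self))


a = np.arange(3).view(Sub)

# np.asarray strips the subclass and exposes the raw buffer;
# the __array__ override above is bypassed.
out = np.asarray(a)
print(type(out).__name__, out, calls)
```

For a sparse array this means np.asarray would hand back the stored values as they sit in the buffer instead of a densified array.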

So, some questions:

  1. Do people rely on SparseArray being an ndarray subclass?
  2. Do we want to make a clean break, or introduce deprecations for things that will need changing (but with no clear upgrade path)?

My current preference is to just break things, though I don't use sparse myself. With a clean break, SparseArray would compose an ndarray of dense values and a SparseIndex, but it would no longer subclass ndarray.
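A rough sketch of what the composition approach could look like. `ComposedSparse` is a hypothetical name, and a plain integer position array stands in for a real SparseIndex. Nothing here is the actual pandas implementation. Because the class is not an ndarray, NumPy does call its `__array__`, so densification can be controlled from Python.

```python
import numpy as np


class ComposedSparse:
    """Hypothetical sketch: holds its data by composition, not by subclassing ndarray."""

    def __init__(self, sp_values, sp_index, length, fill_value=0):
        self.sp_values = np.asarray(sp_values)  # only the stored (non-fill) values
        self.sp_index = np.asarray(sp_index, dtype=np.intp)  # stand-in for SparseIndex
        self.length = length
        self.fill_value = fill_value

    def __len__(self):
        return self.length

    def __array__(self, dtype=None, copy=None):
        # Densify: start from the fill value, then scatter the stored values.
        dense = np.full(self.length, self.fill_value, dtype=self.sp_values.dtype)
        dense[self.sp_index] = self.sp_values
        return dense if dtype is None else dense.astype(dtype)


arr = ComposedSparse([1.0, 2.0], [1, 4], length=5, fill_value=0.0)
dense = np.asarray(arr)  # goes through ComposedSparse.__array__
print(dense)
```

The trade-off is that anything relying on isinstance(sparse_array, np.ndarray) would stop working, which is what question 1 above is asking about.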

CCing some people who seem to use pandas' sparse: @hexgnu @kernc @Licht-T