API: _constructor and subclasses · Issue #32638 · pandas-dev/pandas

Followup to #31925.

Over there, we removed _ensure_type because it broke things like

    import pandas as pd
    import pandas.testing as pdt

    class MyDataFrame(pd.DataFrame):
        pass

    mdf = MyDataFrame()
    df = pd.DataFrame()

    # In DataFrame.reindex_like, the derived class is compared to
    # the base class. The following line will throw
    # 'AssertionError: <class 'pandas.core.frame.DataFrame'>'
    mdf.reindex_like(df)

However, I don't think that example is very compelling. It's strange for MyDataFrame.reindex to return anything other than MyDataFrame. The root cause is MyDataFrame._constructor returning DataFrame, rather than MyDataFrame.
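To make the root cause concrete, here is a minimal sketch contrasting a subclass that leaves `_constructor` alone with one that overrides it. The class names `PlainFrame` and `PreservingFrame` are made up for illustration; the override follows the pattern pandas documents for subclassing, and exact return types for the non-overriding case may vary by pandas version.

```python
import pandas as pd


class PlainFrame(pd.DataFrame):
    # Hypothetical subclass that does not override _constructor,
    # so it inherits DataFrame's property, which returns DataFrame.
    pass


class PreservingFrame(pd.DataFrame):
    # Hypothetical subclass that overrides _constructor so that
    # methods building new objects hand back the subclass.
    @property
    def _constructor(self):
        return PreservingFrame


plain = PlainFrame({"a": [1, 2, 3]})
preserving = PreservingFrame({"a": [1, 2, 3]})

# The inherited property still points at the base class.
print(plain._constructor)

# With the override, derived results keep the subclass type.
print(type(preserving.head(2)))
```

With the override in place, operations such as `head` return the subclass, which is the behavior one would expect from `MyDataFrame.reindex` as well.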

I think we should somehow get subclasses to have their _constructor return something that returns their subclass. We can either:

  1. Require this, by doing a runtime check on the DataFrame._constructor
  2. Change DataFrame._constructor to return type(self)

Option 2 is nicer, but it requires that the subclass's __init__ method be compatible enough with DataFrame's. It's probably safe to assume that, but it might be an API-breaking change.
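The two options, and the `__init__` compatibility caveat, can be modeled with a toy class. This is only a sketch: `Frame`, `MyFrame`, and `IncompatibleFrame` are hypothetical stand-ins rather than pandas code, and where pandas would actually place a runtime check is an assumption.

```python
class Frame:
    """Toy stand-in for DataFrame; not pandas code."""

    def __init__(self, data=None):
        self.data = list(data or [])

    @property
    def _constructor(self):
        # Option 2: return the runtime class instead of hard-coding Frame.
        return type(self)

    def head(self, n):
        result = self._constructor(self.data[:n])
        # Option 1: runtime check that the constructor produced our own type.
        if not isinstance(result, type(self)):
            raise TypeError(
                f"_constructor returned {type(result).__name__}, "
                f"expected {type(self).__name__}"
            )
        return result


class MyFrame(Frame):
    # Compatible __init__: inherits Frame's signature unchanged.
    pass


class IncompatibleFrame(Frame):
    # Incompatible __init__: adds a required argument that
    # Frame.head knows nothing about.
    def __init__(self, data, label):
        super().__init__(data)
        self.label = label


# A compatible subclass keeps its type with no extra work.
subset = MyFrame([1, 2, 3]).head(2)

# An incompatible subclass breaks, because head() calls
# type(self)(data) without the extra `label` argument.
try:
    IncompatibleFrame([1, 2, 3], "x").head(2)
    incompatible_failed = False
except TypeError:
    incompatible_failed = True
```

In this model, option 2 works transparently for `MyFrame` but raises a `TypeError` for `IncompatibleFrame`, which is the kind of breakage that would make the change API-breaking for subclasses with custom constructors.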