API: _constructor and subclasses · Issue #32638 · pandas-dev/pandas
Followup to #31925.
Over there, we removed _ensure_type because it broke things like:
    import pandas as pd
    import pandas.testing as pdt

    class MyDataFrame(pd.DataFrame):
        pass

    mdf = MyDataFrame()
    df = pd.DataFrame()

    # In DataFrame.reindex_like, the derived class is compared to
    # the base class. The following line will throw
    # 'AssertionError: <class 'pandas.core.frame.DataFrame'>'
    mdf.reindex_like(df)
However, I don't think that example is very compelling. It's strange for MyDataFrame.reindex to return anything other than MyDataFrame. The root cause is MyDataFrame._constructor returning DataFrame, rather than MyDataFrame.
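To make the root cause concrete, here is a minimal sketch of the two behaviors. The class names PlainFrame and TypedFrame are hypothetical and only for illustration; the sketch assumes a pandas version where DataFrame._constructor is a property that returns DataFrame.

```python
import pandas as pd


class PlainFrame(pd.DataFrame):
    """Hypothetical subclass that inherits DataFrame._constructor unchanged."""


class TypedFrame(pd.DataFrame):
    """Hypothetical subclass that overrides _constructor to return itself."""

    @property
    def _constructor(self):
        return TypedFrame


plain = PlainFrame({"a": [1, 2]})
typed = TypedFrame({"a": [1, 2]})

# The inherited property hands back the base class, so the result
# of reindex loses the subclass.
plain_result = plain.reindex([1, 0])

# The overridden property keeps the result in the subclass.
typed_result = typed.reindex([1, 0])

print(type(plain_result).__name__)
print(type(typed_result).__name__)
```

The only difference between the two classes is the _constructor override, which is what the rest of this issue is about.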
I think we should somehow get subclasses to have their _constructor return something that returns their subclass. We can
1. Require this, by doing a runtime check on DataFrame._constructor
2. Change DataFrame._constructor to return type(self)
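A rough sketch of what the two options could look like, written outside pandas itself. Everything here is an assumption for illustration: check_constructor is a hypothetical helper, not a pandas API, and SelfConstructingFrame stands in for a DataFrame whose _constructor has been changed. The check is deliberately simplistic and would reject a _constructor that returns a callable rather than a class.

```python
import pandas as pd


def check_constructor(obj):
    """Hypothetical option 1: require _constructor to be type(obj) or a subclass."""
    ctor = obj._constructor
    if not (isinstance(ctor, type) and issubclass(ctor, type(obj))):
        raise TypeError(
            f"{type(obj).__name__}._constructor returns {ctor!r}, "
            f"expected {type(obj).__name__} or a subclass"
        )


class SelfConstructingFrame(pd.DataFrame):
    """Hypothetical stand-in for option 2: _constructor returns type(self)."""

    @property
    def _constructor(self):
        return type(self)


class MyDataFrame(SelfConstructingFrame):
    pass


class LegacyFrame(pd.DataFrame):
    """Subclass that never overrides _constructor."""


df = pd.DataFrame({"a": [1, 2]}, index=[1, 0])
mdf = MyDataFrame({"a": [1, 2]})

# Option 2: the subclass is preserved without MyDataFrame doing anything.
check_constructor(mdf)
result = mdf.reindex_like(df)
print(type(result).__name__)

# Option 1: a subclass that still returns DataFrame is rejected.
try:
    check_constructor(LegacyFrame({"a": [1]}))
    rejected = False
except TypeError as exc:
    rejected = True
    print(exc)
```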
Option 2 is nicer, but it requires that the subclass's __init__ method be compatible enough with DataFrame's. It's probably safe to assume that, but it might be an API-breaking change.
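As an illustration of the compatibility concern, here is a sketch of a subclass whose __init__ is not compatible with DataFrame's. UnitsFrame and its units argument are hypothetical. The sketch assumes pandas builds results by calling the constructor with the data alone, so the required keyword is missing.

```python
import pandas as pd


class UnitsFrame(pd.DataFrame):
    """Hypothetical subclass whose __init__ needs an extra required argument."""

    # Register the attribute so pandas treats it as subclass metadata.
    _metadata = ["units"]

    def __init__(self, data=None, *, units, **kwargs):
        super().__init__(data, **kwargs)
        self.units = units

    @property
    def _constructor(self):
        # Mimics the proposed default: pandas calls this with the data only.
        return type(self)


frame = UnitsFrame({"a": [1, 2]}, units="m")

# reindex builds its result through _constructor, which cannot supply
# the required `units` keyword, so the call fails.
try:
    frame.reindex([1, 0])
    error = None
except TypeError as exc:
    error = exc

print(error)
```

A subclass like this would need to give its extra arguments default values, or keep overriding _constructor itself, for a type(self) default to work.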