Cumulative Methods with mixed frames casts to object · Issue #19296 · pandas-dev/pandas
Looks like we cast the entire frame to object dtype, and then perform the cumulative functions:
In [18]: x = pd.DataFrame({
    ...:     "A": [1, 2, 3],
    ...:     "B": [1, 2, 3.],
    ...:     "C": [True, False, False],
    ...: })

In [19]: x.cumsum()
Out[19]:
   A  B     C
0  1  1  True
1  3  3     1
2  6  6     1

In [20]: x.cumsum().dtypes
Out[20]:
A    object
B    object
C    object
dtype: object

In [21]: x.cummin().dtypes
Out[21]:
A    object
B    object
C    object
dtype: object
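One way to see where the object dtype may come from: a rough sketch, assuming the cumulative methods first collapse the whole frame into a single 2-D array (that is my reading of the behaviour above, not something confirmed here). Converting the mixed frame to one array yields object, while the int/float subset only upcasts to float64. The variable names `mixed_dtype` and `numeric_dtype` are just for illustration.

```python
import pandas as pd

x = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, 3.0],
    "C": [True, False, False],
})

# Collapsing int + float + bool columns into one array has no common
# numeric dtype, so pandas falls back to object.
mixed_dtype = x.to_numpy().dtype

# Without the bool column, int and float share float64.
numeric_dtype = x[["A", "B"]].to_numpy().dtype

print(mixed_dtype)
print(numeric_dtype)
```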
I think it'd be better to do these block-wise? The possible downside I see is that
In [24]: x[['A', 'B']].cumsum()
Out[24]:
     A    B
0  1.0  1.0
1  3.0  3.0
2  6.0  6.0
will now be slower (presumably), since we'd have two cumsums to apply instead of one (after upcasting), but I think that'd be worth it for preserving the correct dtypes.
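To illustrate the idea, here is a minimal user-level sketch that applies the cumulative method one column at a time, so each column keeps a dtype appropriate to its own values. `columnwise_accumulate` is a hypothetical helper, not a pandas API and not how a block-wise fix inside pandas would necessarily look; it also assumes unique column labels.

```python
import pandas as pd


def columnwise_accumulate(df, method="cumsum"):
    """Hypothetical helper: run a cumulative Series method per column.

    Assumes column labels are unique. Each column is processed on its own,
    so no cross-column upcasting to object takes place.
    """
    pieces = {name: getattr(df[name], method)() for name in df.columns}
    return pd.DataFrame(pieces, index=df.index)


x = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, 3.0],
    "C": [True, False, False],
})

cumsum_result = columnwise_accumulate(x, "cumsum")
cummin_result = columnwise_accumulate(x, "cummin")

print(cumsum_result.dtypes)
print(cummin_result.dtypes)
```

Going column by column is the simplest version of this; grouping columns by dtype first would mean fewer calls on wide frames, at the cost of some bookkeeping to restore the column order.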