Cumulative Methods with mixed frames casts to object · Issue #19296 · pandas-dev/pandas

Looks like we cast the entire frame to object dtype, and then perform the cumulative functions:

    In [18]: x = pd.DataFrame({
        ...:     "A": [1, 2, 3],
        ...:     "B": [1, 2, 3.],
        ...:     "C": [True, False, False],
        ...: })

    In [19]: x.cumsum()
    Out[19]:
       A  B     C
    0  1  1  True
    1  3  3     1
    2  6  6     1

    In [20]: x.cumsum().dtypes
    Out[20]:
    A    object
    B    object
    C    object
    dtype: object

    In [21]: x.cummin().dtypes
    Out[21]:
    A    object
    B    object
    C    object
    dtype: object

I think it'd be better to do these block-wise? The possible downside I see is that

    In [24]: x[['A', 'B']].cumsum()
    Out[24]:
         A    B
    0  1.0  1.0
    1  3.0  3.0
    2  6.0  6.0

will now be slower (presumably), since we'd have two cumsums to apply (one per dtype block) instead of one (after upcasting), but I think that'd be worth it for preserving the correct dtypes.
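To make the block-wise idea concrete, here is a rough user-level sketch. It is not the pandas internals: `blockwise_cumsum` is a hypothetical helper, and it approximates "blocks" by grouping columns with the same dtype, assuming unique column labels. Each group is accumulated separately, so no group is upcast to a dtype shared with the others.

```python
import pandas as pd


def blockwise_cumsum(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: cumsum each same-dtype group of columns separately.

    Assumes column labels are unique.
    """
    # Group column labels by dtype, roughly mimicking how a frame is stored in blocks.
    groups = {}
    for col, dtype in df.dtypes.items():
        groups.setdefault(dtype, []).append(col)

    # Each sub-frame is homogeneous, so cumsum never sees a mixed-dtype frame.
    pieces = [df[cols].cumsum() for cols in groups.values()]

    # Reassemble and restore the original column order.
    return pd.concat(pieces, axis=1)[list(df.columns)]


x = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [1, 2, 3.],
    "C": [True, False, False],
})
result = blockwise_cumsum(x)
print(result.dtypes)
```

With this approach `A` stays an integer column and `B` stays float, at the cost of one cumsum call per dtype group, which is the trade-off described above.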