DEPR: GroupBy.cumsum etc with axis=1 · Issue #51046 · pandas-dev/pandas

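import numpy as np
import pandas as pd
import pandas._testing as tm  # pandas' internal testing helpers

df = pd.DataFrame(np.random.randn(10_000_000, 3))
grps = ["A", "B", "C", "D", "E"] * 2_000_000
gb = df.groupby(grps)

# Row-wise cumulative sum via the GroupBy path and via the plain DataFrame path
res = gb.cumsum(axis=1)
alt = df.cumsum(axis=1)

tm.assert_frame_equal(res, alt)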

AFAICT GroupBy.cumsum with axis=1 is equivalent to DataFrame.cumsum(axis=1), just slower because it operates group-by-group and then reconstructs the result (timeit: 1.34s for the DataFrame version vs 3.68s for the GroupBy version).
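To illustrate why the two should agree, here is a minimal sketch that emulates the group-by-group path by hand rather than calling GroupBy.cumsum(axis=1) itself. It assumes a much smaller frame than the one above, and the split/concat/reindex steps are only an approximation of what the GroupBy code does internally, not its actual implementation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((1_000, 3)))
grps = np.array(["A", "B", "C", "D", "E"] * 200)

# Emulated group-by-group path (an assumption about the general shape of the
# work, not pandas internals): cumsum across columns within each group's
# sub-frame, then stitch the pieces back into the original row order.
pieces = [sub.cumsum(axis=1) for _, sub in df.groupby(grps)]
res = pd.concat(pieces).reindex(df.index)

# Direct path: a row's cumulative sum across columns never looks at other rows,
# so the grouping cannot change the result.
alt = df.cumsum(axis=1)

pd.testing.assert_frame_equal(res, alt)
```

Since each output row depends only on the values in that same row, splitting the rows into groups first only adds the cost of splitting and reassembling.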

Are there scenarios where these are not equivalent? If not, we should either deprecate the GroupBy version or at least have it dispatch to the more performant version.

Same goes for cumprod, cummin, cummax, and probably pct_change, shift, and rank.
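A rough check of the same idea for cumprod, cummin and cummax, again emulating the per-group path by hand. `rowwise_via_groups` is a hypothetical helper written for this sketch, not a pandas function; pct_change, shift and rank are deliberately left out here since they have not been checked.

```python
import numpy as np
import pandas as pd


def rowwise_via_groups(df, grps, method):
    """Hypothetical helper: apply `method` with axis=1 per group, then restore row order."""
    pieces = [getattr(sub, method)(axis=1) for _, sub in df.groupby(grps)]
    return pd.concat(pieces).reindex(df.index)


rng = np.random.default_rng(1)
df = pd.DataFrame(rng.standard_normal((500, 4)))
grps = np.array(["A", "B", "C", "D", "E"] * 100)

for method in ["cumprod", "cummin", "cummax"]:
    res = rowwise_via_groups(df, grps, method)
    alt = getattr(df, method)(axis=1)
    pd.testing.assert_frame_equal(res, alt)
```

This only covers dense float data with a unique index; it says nothing about whether other dtypes, missing values, or duplicate index labels could make the two paths diverge.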