API: Table-wise rolling / expanding / EWM function application

In #11603 (comment) (the main PR implementing the deferred API for rolling / expanding / ewm), we discussed how to specify table-wise applies. Groupby.apply(f) feeds the entire group (all columns) to f. For backwards compatibility, .rolling(n).apply(f) needed to stay column-wise.

#11603 (comment) mentions a possible API similar to what I added for .style.

So it'd be df.rolling(n).apply(f, axis=None).
Do people like the axis=0 / 1 / None idiom? Is it obvious enough?
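For anyone unsure what the three values would mean, here is a minimal sketch of the same idiom as NumPy reductions already use it. This only illustrates the axis semantics on a plain array; it is not the proposed rolling API, and the variable names are made up for the example.

```python
import numpy as np

np.random.seed(0)
values = np.random.randint(0, 10, size=(10, 2))  # same data as the example frame

per_column = values.max(axis=0)   # one result per column
per_row = values.max(axis=1)      # one result per row
whole_table = values.max(axis=None)  # a single result for the entire table

print(per_column.shape, per_row.shape, whole_table)
```

Under the proposal, axis=None would play the same "whole table" role, but applied to each window rather than to the full frame.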

This is prompted by @josef-pkt's post on the mailing list about needing a rolling OLS.

An example:

    In [2]: import numpy as np
       ...: import pandas as pd
       ...:
       ...: np.random.seed(0)
       ...: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=["A", "B"])
       ...: df
       ...:
    Out[2]:
       A  B
    0  5  0
    1  3  3
    2  7  9
    3  3  5
    4  2  4
    5  7  6
    6  8  8
    7  1  6
    8  7  7
    9  8  1

For a concrete example, take the table-wise max (this is equivalent to df.rolling(4).max().max(1)):

    In [10]: df.rolling(4).apply(np.max, axis=None)
    Out[10]:
    0    NaN
    1    NaN
    2    NaN
    3    9.0
    4    9.0
    5    9.0
    6    8.0
    7    8.0
    8    8.0
    9    8.0
    dtype: float64
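Since axis=None on rolling apply is only a proposal here, the intended semantics can be emulated with a plain loop. The sketch below is an assumption about what "table-wise" should mean (each full window of rows, all columns, passed to f at once); rolling_table_apply is a hypothetical helper, not a pandas function.

```python
import numpy as np
import pandas as pd


def rolling_table_apply(df, window, func):
    """Hypothetical helper: call func on every full window of rows (all columns).

    Positions without a full window get NaN, mirroring .rolling(window).
    """
    values = []
    for end in range(1, len(df) + 1):
        if end < window:
            values.append(np.nan)  # not enough rows yet
        else:
            values.append(func(df.iloc[end - window:end]))
    return pd.Series(values, index=df.index, dtype="float64")


np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=["A", "B"])

# Table-wise max: f sees a (4, 2) block and returns one scalar.
table_max = rolling_table_apply(df, 4, lambda w: w.to_numpy().max())

# The column-wise route mentioned above, for comparison.
column_then_row_max = df.rolling(4).max().max(axis=1)

print(table_max)
```

For max the two routes agree, which is why it is only a toy case; the table-wise form matters when f cannot be split into per-column pieces.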

A real example is something like a rolling OLS:

    import statsmodels.api as sm
    f = lambda x: sm.OLS.from_formula('A ~ B', data=x).fit()  # wrong, but w/e

    df.rolling(5).apply(f, axis=None)
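To make the OLS case concrete without the proposed API, here is a rough sketch that fits A ~ 1 + B on each window and keeps only the slope, so that each window yields a scalar. It swaps in np.linalg.lstsq for statsmodels to stay dependency-free, and rolling_ols_slope is a hypothetical helper written for this illustration.

```python
import numpy as np
import pandas as pd


def rolling_ols_slope(df, window, y="A", x="B"):
    """Hypothetical helper: slope of y ~ 1 + x fitted on each full window."""
    slopes = []
    for end in range(1, len(df) + 1):
        if end < window:
            slopes.append(np.nan)  # not enough rows for a full window
            continue
        chunk = df.iloc[end - window:end]
        # Design matrix with an intercept column and the regressor.
        design = np.column_stack(
            [np.ones(window), chunk[x].to_numpy(dtype=float)]
        )
        coef, *_ = np.linalg.lstsq(
            design, chunk[y].to_numpy(dtype=float), rcond=None
        )
        slopes.append(coef[1])  # coef[0] is the intercept
    return pd.Series(slopes, index=df.index, dtype="float64")


np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 2)), columns=["A", "B"])

slopes = rolling_ols_slope(df, 5)
print(slopes)
```

This is exactly the kind of function that needs both columns of the window at once, which a column-wise .rolling(n).apply(f) cannot provide.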