API/ENH: master issue for pd.rolling_apply · Issue #8659 · pandas-dev/pandas
Description
Catchall for rolling_* issues:
- ENH: window functions need to exclude nuiscance columns #12537 exclude nuisance columns
- ENH: support timedeltas with window functions #12536 fully support timedeltas
- Non-Reducing Return Values in Rolling .apply #4130 (return type of rolling_apply)
- PERF/ENH: np.apply_along_axis -> reduce in moments.py/nanops.py #3185 (stop using np.apply_along_axis)
- ENH/API: rolling_apply to pass frames to the rolled function (rather than ndarrays) #5071, Functions applied on .expanding() receive ndarrays rather than pandas objects #12950 (pass frames, operate 2-D)
- BUG: pandas rolling_quantile does not use interpolation #9413 rolling_quantile
- Implement high performance rolling_rank #9481 rolling_idxmax
- Rolling qcut #10759 rolling_qcut
- ENH: Support returning the same dtype as the caller for window ops (including extension dtypes) #11446, rolling.apply or rolling.agg doesn't work on string columns #20773 casting in .apply
- Implementing rolling min/max functions that can retain the original type #12595 casting in all window functions (min/max are good)
- ENH: Add the moment function as DataFrame and Series method WITH namespacing #10702 namespacing (though prob just create a Rolling object), API: provide Rolling/Expanding/EWM objects for deferred rolling type calculations #10702 #11603
- TST: test_window/Datetimelike adjustments #12535 testing of datetimelike with nans
- BUG: window function count should count anything non-null #12541 count should be integer dtype (and deal with infs)
- API: Table-wise rolling / expanding / EWM function application #15095 Table-wise
Hi all,
I want to apply a function that, for each day, ranks the columns by their means over the previous n days of data. The natural way seems to be pd.rolling_apply. A toy example:
In [93]: df = pd.DataFrame(np.random.randint(10, size=20).reshape(4, 5))

In [94]: df
Out[94]:
   0  1  2  3  4
0  2  0  0  2  0
1  9  5  5  6  1
2  2  3  6  8  8
3  5  1  2  9  0

In [95]: import bottleneck as bn

In [96]: bn.nanrankdata(df.mean())
Out[96]: array([ 4. ,  1.5,  3. ,  5. ,  1.5])
So far, so good. Then:
In [97]: pd.rolling_apply(df, 2, lambda x: bn.nanrankdata(bn.nanmean(x, axis=0)))
Out[97]:
    0   1   2   3   4
0 NaN NaN NaN NaN NaN
1   1   1   1   1   1
2   1   1   1   1   1
3   1   1   1   1   1
This is clearly wrong. Is this a bug?
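
A likely explanation, consistent with the "pass frames, operate 2-D" items in the list above: the rolled function is called once per column with a 1-D window, so the mean of each window is a single number and the rank of a single number is always 1. The sketch below is not the original rolling_apply call. It assumes a newer pandas where the .rolling() method and the raw argument of .apply() are available, hard-codes the toy data shown above, and swaps bottleneck for plain pandas calls. It first checks the dimensionality of the windows, then computes the intended ranking by taking the rolling mean per column and ranking across columns afterwards.

```python
import pandas as pd

# Same toy data as in the example above, hard-coded so the result is reproducible.
df = pd.DataFrame([[2, 0, 0, 2, 0],
                   [9, 5, 5, 6, 1],
                   [2, 3, 6, 8, 8],
                   [5, 1, 2, 9, 0]])

# Each window handed to the applied function is a 1-D slice of one column,
# not a 2-row block of the whole frame.
window_ndim = df[0].rolling(2).apply(lambda w: w.ndim, raw=True)
print(window_ndim)

# Workaround: rolling mean per column first, then rank across columns
# within each row. rank() averages ties and keeps all-NaN rows as NaN.
ranks = df.rolling(2).mean().rank(axis=1)
print(ranks)
```

Splitting the computation this way avoids needing a 2-D window at all, which only works because the mean is computed independently per column. A function that truly needs all columns of the window at once would still require the table-wise application discussed in the list above.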