Unexpected behavior for rolling moments when center=True and min_periods < window · Issue #8269 · pandas-dev/pandas
I noticed while using rolling moment functions with center=True and with min_periods < window that the values at the end of the series were showing up as NaN when they should have been finite floating point values. The reason for this behavior is that _rolling_moment() concatenates extra NaN values to the end of the input series when center=True to allow the roll_generic() function to continue on past the end of the original series and compute the values that will "come within view" when the result is centered. However, adding those NaN values into the window can bork the calculation and change a result that should be finite into a NaN.
This example illustrates the behavior:
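import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

n = 12
s = pd.Series(np.random.randn(n))
s.plot(color='b')  # original series
win = 7
minp = 5
# trailing window (green) vs. centered window (red)
pd.rolling_mean(s, win, min_periods=minp, center=False).plot(color='g')
pd.rolling_mean(s, win, min_periods=minp, center=True).plot(color='r')
ticks = plt.xticks(np.arange(0, n, 1.0))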
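To make the mechanism concrete without a plot, here is a minimal pure-NumPy sketch of the padding approach described above. It is not the pandas implementation: the helper name centered_rolling_padded is hypothetical, and the sketch assumes the function applied to each window may or may not be NaN-aware, which is what decides whether the padding hurts.

```python
import numpy as np

def centered_rolling_padded(values, window, min_periods, func):
    """Hypothetical illustration: pad with NaN, roll a trailing window, then shift."""
    values = np.asarray(values, dtype=float)
    offset = (window - 1) // 2
    # Append `offset` NaNs so the trailing window can run past the end of the data.
    padded = np.concatenate([values, np.full(offset, np.nan)])
    out = np.full(len(padded), np.nan)
    for i in range(len(padded)):
        chunk = padded[max(0, i - window + 1): i + 1]
        # Only non-NaN observations count towards min_periods ...
        if np.count_nonzero(~np.isnan(chunk)) >= min_periods:
            # ... but func still receives the padding NaNs in its window.
            out[i] = func(chunk)
    # Drop the first `offset` results so each output lines up with its window center.
    return out[offset:]

data = np.arange(12.0)
bad = centered_rolling_padded(data, window=7, min_periods=5, func=np.mean)
good = centered_rolling_padded(data, window=7, min_periods=5, func=np.nanmean)
```

With these inputs, positions 9 and 10 have at least 5 real observations in their centered window, yet bad is NaN there because np.mean sees the padding; good, which uses the NaN-aware np.nanmean, stays finite at the same positions.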
I was able to mitigate this behavior, for one calculation at least, by modifying the Cython function to essentially strip the trailing NaN values for the calculations made towards the end of the series. This requires passing in the offset calculated in _rolling_moment().
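One way to picture that mitigation is the pure-NumPy sketch below. It is an assumption about the general shape of the idea, not the actual Cython change: the helper name centered_rolling_trimmed is hypothetical, and instead of padding and then stripping NaN values it uses the offset to clip each window to the real data, which has the same effect.

```python
import numpy as np

def centered_rolling_trimmed(values, window, min_periods, func):
    """Hypothetical sketch: clip each centered window at the end of the data."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    offset = (window - 1) // 2
    out = np.full(n, np.nan)
    for j in range(n):
        end = j + offset + 1            # one past the right edge of the centered window
        start = max(0, end - window)
        # min(end, n) drops the positions that padding would have filled with NaN.
        chunk = values[start:min(end, n)]
        if np.count_nonzero(~np.isnan(chunk)) >= min_periods:
            out[j] = func(chunk)
    return out

data = np.arange(12.0)
result = centered_rolling_trimmed(data, window=7, min_periods=5, func=np.mean)
```

Here np.mean is not NaN-aware, but the windows near the end never contain padding, so positions 9 and 10 are finite. The last position still comes back as NaN because only 4 real observations fall inside its window, which is below min_periods.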
Fixing this is going to touch a lot of functions that depend on _rolling_moment(). Before I go much further with this, I thought I'd raise the issue here and see if anyone else has insight or better ideas about how to fix this.