Unexpected behavior for rolling moments when center=True and min_periods < window · Issue #8269 · pandas-dev/pandas

While using the rolling moment functions with center=True and min_periods < window, I noticed that values at the end of the series show up as NaN when they should have been finite floating-point values. The reason is that _rolling_moment() concatenates extra NaN values onto the end of the input series when center=True, so that roll_generic() can continue past the end of the original series and compute the values that will "come within view" once the result is centered. However, those added NaN values fall inside the window, where they can break the calculation and turn a result that should be finite into NaN.

This example illustrates the behavior:

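import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

n = 12
s = pd.Series(np.random.randn(n))
s.plot(color='b')  # original series
win = 7
minp = 5
# pd.rolling_mean was the rolling API at the time; newer pandas uses s.rolling(...).mean()
# trailing window (green) vs. centered window (red)
pd.rolling_mean(s, win, min_periods=minp, center=False).plot(color='g')
pd.rolling_mean(s, win, min_periods=minp, center=True).plot(color='r')
ticks = plt.xticks(np.arange(0, n, 1.0))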

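[image: plot of the series (blue), its trailing rolling mean (green), and its centered rolling mean (red)]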
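
The padding mechanism described above can be reproduced outside pandas with a toy model. The sketch below is not the pandas implementation: naive_centered_mean is a hypothetical name, and the kernel is assumed to count non-NaN observations against min_periods but then take a plain mean of the whole window, padding included. It uses the same window and min_periods as the example above.

```python
import numpy as np

def naive_centered_mean(values, window, min_periods):
    """Toy model (hypothetical, not pandas code): pad with NaN, roll, then shift."""
    values = np.asarray(values, dtype=float)
    offset = (window - 1) // 2
    # Append `offset` NaN values so the trailing window can run past the end.
    padded = np.concatenate([values, np.full(offset, np.nan)])
    out = np.full(len(padded), np.nan)
    for i in range(len(padded)):
        chunk = padded[max(0, i - window + 1): i + 1]
        # min_periods is checked against the non-NaN count ...
        if np.count_nonzero(~np.isnan(chunk)) >= min_periods:
            # ... but a plain mean propagates any NaN that sits in the window.
            out[i] = chunk.mean()
    # Drop the first `offset` results so that each value is centered.
    return out[offset:]

result = naive_centered_mean(np.arange(12.0), window=7, min_periods=5)
print(result)
```

With this input, positions 9 and 10 come out as NaN even though their windows hold 6 and 5 real observations, which satisfies min_periods=5. Only position 11, with 4 real observations, should be NaN.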

I was able to mitigate this behavior, for one calculation at least, by modifying the Cython function to essentially strip the trailing NaN values from the calculations made towards the end of the series. This requires passing in the offset calculated in _rolling_moment().
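
A rough sketch of that idea, written in plain NumPy rather than Cython. It is an illustration under assumptions, not the actual patch: centered_mean_strip_padding is a hypothetical name, and the offset is recomputed inside the function instead of being passed in from _rolling_moment().

```python
import numpy as np

def centered_mean_strip_padding(values, window, min_periods):
    """Hypothetical sketch: centered rolling mean that ignores padded positions."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    offset = (window - 1) // 2
    out = np.full(n, np.nan)
    for j in range(n):
        i = j + offset                 # end of the trailing window in the padded array
        start = max(0, i - window + 1)
        stop = min(i + 1, n)           # clip the window so padding is never included
        chunk = values[start:stop]
        # Only real observations count towards min_periods.
        if np.count_nonzero(~np.isnan(chunk)) >= min_periods:
            out[j] = chunk.mean()
    return out

result = centered_mean_strip_padding(np.arange(12.0), window=7, min_periods=5)
print(result)
```

For the 12-point ramp this gives finite values at positions 9 and 10 (8.5 and 9.0), while position 11 stays NaN because only 4 observations remain in its window. A NaN that is genuinely present in the input still propagates through the mean in this sketch.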

Fixing this is going to touch a lot of functions that depend on _rolling_moment(). Before I go much further with this, I thought I'd raise the issue here and see if anyone else has insight or better ideas about how to fix this.