rolling_mean with freq='D' returns all NaNs when there is exactly 1 data point per day · Issue #5955 · pandas-dev/pandas

related to #3020

    $ python
    Python 2.7.4 (default, Apr 23 2013, 12:22:04)
    [GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import datetime
    >>> import pandas as pd
    >>> pd.__version__
    '0.12.0'
    >>> indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
    >>> series = pd.Series(range(1, 6), index=indices)
    >>> series = series.map(lambda x: float(x))  # range() returns ints, so force to float
    >>> series = series.sort_index()  # already sorted, but just to be clear
    >>> series  # here's what our input series looks like
    1975-01-01 12:00:00    1
    1975-01-02 12:00:00    2
    1975-01-03 12:00:00    3
    1975-01-04 12:00:00    4
    1975-01-05 12:00:00    5
    dtype: float64
    >>> pd.rolling_mean(series, window=2, freq='D')  # these results will be wrong
    1975-01-01   NaN
    1975-01-02   NaN
    1975-01-03   NaN
    1975-01-04   NaN
    1975-01-05   NaN
    Freq: D, dtype: float64
    >>> better_series = series.append(pd.Series([3.0], index=[datetime.datetime(1975, 1, 3, 6, 0)]))
    >>> better_series = better_series.sort_index()
    >>> better_series  # here's a revised input with more than one datapoint on one of the days
    1975-01-01 12:00:00    1
    1975-01-02 12:00:00    2
    1975-01-03 06:00:00    3
    1975-01-03 12:00:00    3
    1975-01-04 12:00:00    4
    1975-01-05 12:00:00    5
    dtype: float64
    >>> pd.rolling_mean(better_series, window=2, freq='D')  # these results will be correct and are what I expected above
    1975-01-01    NaN
    1975-01-02    1.5
    1975-01-03    2.5
    1975-01-04    3.5
    1975-01-05    4.5
    Freq: D, dtype: float64
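
For anyone reproducing this on a recent pandas release, where `pd.rolling_mean` and `Series.append` are no longer available, the expected output can be sketched as below. This is only an approximation under one assumption: that `freq='D'` is meant to bucket the data into daily means before the 2-period rolling mean is applied. The `resample(...).mean().rolling(...)` chain is a stand-in for that behaviour, not the code path the 0.12.0 function actually took.

```python
import datetime

import pandas as pd

# Same input as in the report: exactly one observation per day, at noon.
indices = [datetime.datetime(1975, 1, i, 12, 0) for i in range(1, 6)]
series = pd.Series([float(v) for v in range(1, 6)], index=indices)

# Assumed semantics of rolling_mean(series, window=2, freq='D'):
# bucket into daily means first, then average over a 2-row window.
expected = series.resample("D").mean().rolling(window=2).mean()
print(expected)

# Second input from the report: one extra observation on 1975-01-03.
# pd.concat is used here because Series.append is gone in pandas 2.x.
extra = pd.Series([3.0], index=[datetime.datetime(1975, 1, 3, 6, 0)])
better_series = pd.concat([series, extra]).sort_index()
better = better_series.resample("D").mean().rolling(window=2).mean()
print(better)
```

Under that assumption both inputs give NaN for the first day followed by 1.5, 2.5, 3.5, 4.5, which matches the output reported as correct for `better_series`.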