BUG: ewma() doesn't adjust weights properly for missing values · Issue #7543 · pandas-dev/pandas
ewma() simply ignores missing values: it effectively calculates the exponentially weighted moving average on the compacted series, i.e. the series with the missing values removed. I think this is incorrect, and that each value should be weighted according to its absolute position in the original series, so that a gap still counts as elapsed time.
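To make the two weightings concrete, consider a three-element series $[x_0, \text{missing}, x_2]$ with `adjust=True`. This is only a sketch of the arithmetic, using the values from the session below ($x_0 = 0$, $x_2 = 100$, and $\alpha = 1/3$ for `com = 2`):

$$
y_2^{\text{compacted}} = \frac{(1-\alpha)\,x_0 + x_2}{(1-\alpha) + 1} = \frac{100}{5/3} = 60,
\qquad
y_2^{\text{absolute}} = \frac{(1-\alpha)^2\,x_0 + x_2}{(1-\alpha)^2 + 1} = \frac{100}{13/9} \approx 69.23
$$

The only difference is the exponent on $(1-\alpha)$: in the second form the gap counts as one elapsed step, so $x_0$ is discounted twice rather than once.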
In the code below, I reproduce the ewma() calculation using "wrong" weights, and then show what I believe the correct result should be using the "right" weights.
In [1]: from pandas import Series, ewma
In [2]: def simple_wma(x, w):
   ...:     return x.multiply(w).cumsum() / w.cumsum()
   ...:
In [3]: s = Series([0, None, 100])
In [4]: com = 2
In [5]: alpha = 1.0/(1+com)  # float division, so alpha is 1/3 on Python 2 as well
In [6]: wrong_weights_adjust_false = Series([(1-alpha), None, alpha])  # gap ignored: first value aged by one step
In [7]: wrong_weights_adjust_true = Series([(1-alpha), None, 1])
In [8]: right_weights_adjust_false = Series([(1-alpha)**2, None, alpha])  # gap counted: first value aged by two steps
In [9]: right_weights_adjust_true = Series([(1-alpha)**2, None, 1])
In [10]: ewma(s, com=com, adjust=False)
Out[10]:
0 0.000000
1 0.000000
2 33.333333
dtype: float64
In [11]: simple_wma(s, wrong_weights_adjust_false)
Out[11]:
0 0.000000
1 NaN
2 33.333333
dtype: float64
In [12]: simple_wma(s, right_weights_adjust_false)
Out[12]:
0 0.000000
1 NaN
2 42.857143
dtype: float64
In [13]: ewma(s, com=com, adjust=True)
Out[13]:
0 0
1 0
2 60
dtype: float64
In [14]: simple_wma(s, wrong_weights_adjust_true)
Out[14]:
0 0
1 NaN
2 60
dtype: float64
In [15]: simple_wma(s, right_weights_adjust_true)
Out[15]:
0 0.000000
1 NaN
2 69.230769
dtype: float64
In [16]: s2 = Series([0, 100])
In [17]: ewma(s2, com=com, adjust=False)
Out[17]:
0 0.000000
1 33.333333
dtype: float64
In [18]: ewma(s2, com=com, adjust=True)
Out[18]:
0 0
1 60
dtype: float64