pd.DataFrame.expanding bug with axis = 1 · Issue #23372 · pandas-dev/pandas
import pandas as pd
import numpy as np
df = pd.DataFrame(np.ones((10, 20)))
df.expanding(3, axis = 1).sum()
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
1 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
2 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
3 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
4 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
5 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
6 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
7 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
8 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
9 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 10.0 10.0 10.0 10.0
14 15 16 17 18 19
0 10.0 10.0 10.0 10.0 10.0 10.0
1 10.0 10.0 10.0 10.0 10.0 10.0
2 10.0 10.0 10.0 10.0 10.0 10.0
3 10.0 10.0 10.0 10.0 10.0 10.0
4 10.0 10.0 10.0 10.0 10.0 10.0
5 10.0 10.0 10.0 10.0 10.0 10.0
6 10.0 10.0 10.0 10.0 10.0 10.0
7 10.0 10.0 10.0 10.0 10.0 10.0
8 10.0 10.0 10.0 10.0 10.0 10.0
9 10.0 10.0 10.0 10.0 10.0 10.0
axis = 1 on pd.DataFrame.expanding doesn't seem to work as expected, i.e. as a horizontal expanding window. Instead, the window size appears to be capped at the number of rows (10 here): the sums stop growing at 10.0 from column 9 onward.
axis = 0 works fine as a vertical expanding window.
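One way to probe the "capped at the number of rows" reading is to reproduce the reported output without using axis = 1 at all. The sketch below is an assumption about the symptom, not a statement about pandas internals: it applies a fixed rolling window of len(df) with min_periods=3 to the transposed frame and compares the result with the values pasted above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.ones((10, 20)))

# Hypothesis: the window stops growing once it reaches len(df) == 10.
# A fixed rolling window of that size along the columns mimics this.
capped = df.T.rolling(window=len(df), min_periods=3).sum().T

# Values reported in the issue for every row: NaN, NaN, 3..10, then 10 repeated.
reported_row = np.array(
    [np.nan, np.nan] + list(range(3, 11)) + [10] * 10, dtype=float
)
reported = np.tile(reported_row, (len(df), 1))

# True if the capped-window hypothesis reproduces the reported numbers.
matches = np.allclose(capped.to_numpy(), reported, equal_nan=True)
print(matches)
```

If this comparison holds, the reported numbers are at least consistent with a window limited to the row count rather than the column count.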
Expected Output
df.T.expanding(3, axis = 0).sum().T
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
1 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
2 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
3 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
4 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
5 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
6 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
7 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
8 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
9 NaN NaN 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 13.0 14.0
14 15 16 17 18 19
0 15.0 16.0 17.0 18.0 19.0 20.0
1 15.0 16.0 17.0 18.0 19.0 20.0
2 15.0 16.0 17.0 18.0 19.0 20.0
3 15.0 16.0 17.0 18.0 19.0 20.0
4 15.0 16.0 17.0 18.0 19.0 20.0
5 15.0 16.0 17.0 18.0 19.0 20.0
6 15.0 16.0 17.0 18.0 19.0 20.0
7 15.0 16.0 17.0 18.0 19.0 20.0
8 15.0 16.0 17.0 18.0 19.0 20.0
9 15.0 16.0 17.0 18.0 19.0 20.0
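Until axis = 1 behaves, the transposed call above can be wrapped in a small helper. This is only a sketch: the function name expanding_sum_across_columns is hypothetical, and the cumsum cross-check assumes the frame contains no NaN values.

```python
import numpy as np
import pandas as pd


def expanding_sum_across_columns(frame, min_periods=1):
    """Expanding sum along each row, computed via the transpose workaround."""
    return frame.T.expanding(min_periods=min_periods).sum().T


df = pd.DataFrame(np.ones((10, 20)))
result = expanding_sum_across_columns(df, min_periods=3)

# Independent cross-check: cumulative sum along the columns, with the first
# min_periods - 1 columns masked out (only valid when the data has no NaNs).
check = df.cumsum(axis=1)
check.iloc[:, :2] = np.nan

agrees = np.allclose(result.to_numpy(), check.to_numpy(), equal_nan=True)
print(agrees)
```

The transpose copies the data twice, which is fine for a frame this size but may matter for very wide or very large inputs.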
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.17.19-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_SG.UTF-8
LOCALE: en_SG.UTF-8
pandas: 0.23.4
pytest: None
pip: 18.0
setuptools: 40.4.3
Cython: 0.29
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.12
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None