BUG: groupby.rolling: Original index is not preserved when grouping by a date part of a DatetimeIndex, and the as_index keyword seems to have no effect · Issue #39433 · pandas-dev/pandas


Code Sample, a copy-pastable example

import pandas as pd

data = [
    ["A", "2018-01-01", 100],
    ["A", "2018-01-02", 200],
    ["B", "2018-01-01", 150],
    ["B", "2018-01-02", 250],
]
df = pd.DataFrame(data, columns=["id", "date", "num"])
df["date"] = pd.to_datetime(df["date"])
df.set_index(["date"], inplace=True)

df_res1 = df.groupby([df.id, df.index.weekday]).rolling(window=2, min_periods=1).mean()
df_res2 = df.groupby([df.id, df.index.weekday], as_index=False).rolling(window=2, min_periods=1).mean()
df_res3 = df.groupby([df.id]).rolling(window=2, min_periods=1).mean()
df_res4 = df.groupby([df.id], as_index=False).rolling(window=2, min_periods=1).mean()

pd.show_versions()
print(df_res1.head())
print(df_res1.index.names)
print(df_res2.index.names)
print(df_res3.index.names)
print(df_res4.index.names)

if __name__ == '__main__': ...

Problem description

Comparing the output under pandas 1.0.1 and 1.2.1 shows two differences in 1.2.1:

  1. When grouping by a date part such as df.index.weekday and applying a rolling window function such as .mean(), the original datetime index is no longer part of the resulting frame's index.
  2. The groupby keyword argument as_index no longer seems to have any effect in that situation.
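
A possible interim workaround, which is not part of the original report and is only a sketch: computing the rolling mean through groupby(...).transform returns a result aligned to the original date index, so it does not depend on how groupby.rolling assembles its result index. The frame is rebuilt from the sample data above; rolling_mean is a hypothetical helper name, and selecting the num column explicitly is an assumption made for this example.

```python
import pandas as pd

# Rebuild the frame from the code sample above.
df = pd.DataFrame(
    {"id": ["A", "A", "B", "B"], "num": [100, 200, 150, 250]},
    index=pd.DatetimeIndex(
        ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02"], name="date"
    ),
)


def rolling_mean(s):
    # Hypothetical helper: same window settings as in the report.
    return s.rolling(window=2, min_periods=1).mean()


# transform() returns a Series aligned to the original index,
# so the "date" index survives grouping by id and weekday.
by_id_weekday = df.groupby([df["id"], df.index.weekday])["num"].transform(rolling_mean)
by_id = df.groupby("id")["num"].transform(rolling_mean)

print(by_id_weekday.index.name)
print(by_id.tolist())
```

Unlike groupby.rolling, this does not add the group keys as index levels; if they are needed, they would have to be added back separately.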

Expected Output

['id', 'date', 'date']
[None, 'date']
['id', 'date']
[None, 'date']
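
To compare a given pandas version against the expected index names listed above, a small check along these lines may help. It is a sketch only: check_index_names is a hypothetical helper, the frame is rebuilt from the sample data, and only the third case (grouping by id alone) is run here. The same call could be applied to the other three results with their expected lists.

```python
import pandas as pd

# Rebuild the frame from the code sample above.
df = pd.DataFrame(
    {"id": ["A", "A", "B", "B"], "num": [100, 200, 150, 250]},
    index=pd.DatetimeIndex(
        ["2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02"], name="date"
    ),
)


def check_index_names(result, expected):
    # Hypothetical helper: compare a result's index level names
    # with the expected list and return both the verdict and the names.
    actual = list(result.index.names)
    return actual == expected, actual


# Third case from the report: group by id only.
df_res3 = df.groupby([df["id"]]).rolling(window=2, min_periods=1).mean()
matches, actual = check_index_names(df_res3, ["id", "date"])
print(pd.__version__, matches, actual)
```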

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 9d598a5
python : 3.8.4.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-45-generic
Version : #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.2.1
numpy : 1.19.1