BUG: DataFrameGroupBy.apply ignores group_keys setting when empty · Issue #60471 · pandas-dev/pandas
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd

# Non-empty frame: group_keys=False keeps the original index.
df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})
g1 = df.groupby('A', group_keys=False)

# Empty frame, grouped once with each group_keys setting.
df = pd.DataFrame({'A': [], 'B': [], 'C': []})
g2 = df.groupby('A', group_keys=False)
g3 = df.groupby('A', group_keys=True)

r1 = g1.apply(lambda x: x / x.sum())
r2 = g2.apply(lambda x: x / x.sum())
r3 = g3.apply(lambda x: x / x.sum())

print(r1.index)  # Index([0, 1, 2], dtype='int64')
print(r2.index)  # Index([], dtype='float64', name='A')
print(r3.index)  # Index([], dtype='float64', name='A')
Issue Description
The group_keys parameter has no effect when the source DataFrame is empty: the result index is named after the group key 'A' whether group_keys is True or False.
Expected Behavior
group_keys=False should not include the group keys in the index, regardless of whether the source DataFrame is empty.
I would expect either of the following results:
print(r2.index)  # Index([], dtype='float64')
print(r2.index)  # RangeIndex(start=0, stop=0, step=1)
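Until the behavior changes, one possible workaround is to normalize the index of an empty result by hand. The sketch below is only an illustration: `apply_without_group_keys` is a hypothetical helper, not part of pandas, and it assumes that dropping the index of an empty result is acceptable for the caller.

```python
import pandas as pd


def apply_without_group_keys(df, key, func):
    """Hypothetical helper: groupby(key, group_keys=False).apply(func),
    dropping the group-key index that an empty result may carry."""
    result = df.groupby(key, group_keys=False).apply(func)
    if len(result) == 0:
        # An empty result has no rows to relabel, so resetting the index
        # only removes the leftover group-key name.
        result = result.reset_index(drop=True)
    return result


def normalize(group):
    # Select the numeric columns explicitly so the function does not
    # depend on whether the grouping column is passed to it.
    values = group[['B', 'C']]
    return values / values.sum()


full = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})
empty = pd.DataFrame({'A': [], 'B': [], 'C': []})

full_result = apply_without_group_keys(full, 'A', normalize)
empty_result = apply_without_group_keys(empty, 'A', normalize)

print(full_result.index)
print(empty_result.index)
```

With this helper the empty case no longer exposes 'A' as the index name, while non-empty results are returned as pandas produced them.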
Installed Versions
INSTALLED VERSIONS
commit : 0691c5c
python : 3.10.11
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 141 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : es_ES.cp1252
pandas : 2.2.3
numpy : 1.26.4
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 23.0.1
Cython : None
sphinx : None
IPython : 8.30.0
adbc-driver-postgresql: None
...
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None