BUG: groupby.describe on a frame with duplicate column names · Issue #50806 · pandas-dev/pandas
xref #46944
import pandas as pd
from pandas import DataFrame

pd.set_option("display.max_columns", None)
df = DataFrame([[0, 1, 2, 3]])
df.columns = [0, 1, 2, 0]  # the label 0 appears twice
gb = df.groupby(df[1])
print(gb.describe(percentiles=[]).to_string())
#        0                                                                 2
#    count  mean  std  min  50%  max  count  mean  std  min  50%  max  count  mean  std  min  50%  max
# 1
# 1    1.0   0.0  NaN  0.0  0.0  0.0    1.0   3.0  NaN  3.0  3.0  3.0    1.0   2.0  NaN  2.0  2.0  2.0
With duplicate column names, describe only outputs the values for the first column. There should be two columns named 0 here. This issue occurs for any operation where the groupby code uses _selected_obj within the _group_selection_context.
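
For context, here is a minimal sketch of why duplicate labels are awkward for code that looks columns up by name. It only uses the frame from the example above and public pandas calls; it does not touch the groupby internals and is not meant as a diagnosis of the bug. The variable names are illustrative.

```python
import pandas as pd

# Same frame as in the report: the label 0 is used for two columns.
df = pd.DataFrame([[0, 1, 2, 3]])
df.columns = [0, 1, 2, 0]

# Duplicate labels can be detected up front.
has_duplicates = not df.columns.is_unique
duplicated_labels = df.columns[df.columns.duplicated()].unique().tolist()

# Selecting a duplicated label by name returns a DataFrame holding every
# matching column, not a single Series.
selected = df[0]

# Positional selection stays unambiguous.
first_zero = df.iloc[:, 0]
last_zero = df.iloc[:, 3]

print(has_duplicates, duplicated_labels, type(selected).__name__, selected.shape)
```

Any code path that iterates over column names and assumes one column per name has to account for this, which may be relevant to the operations mentioned above.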