BUG: ValueError on groupby with categoricals · Issue #34951 · pandas-dev/pandas


In specific situations involving categorical columns, calling groupby() on two or more columns raises a ValueError:

Code Sample

import pandas as pd

col = pd.Categorical([0, 1])
df = pd.DataFrame({'A': col, 'B': col, 'C': col})
grouped = df.groupby(['A', 'B']).first()

ValueError: Shape of passed values is (4, 1), indices imply (2, 1)

Expected Output

Expected output would be something like:

>>> grouped
       C
A B     
0 0    0
  1  NaN
1 0  NaN
  1    1

Instead, a ValueError is raised with the message shown above.
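The two shapes in the error message look like they correspond to the full set of category combinations (4 rows) versus the combinations that actually occur in the data (2 rows). This reading is an assumption and is not confirmed in the report. A minimal sketch that counts both, and that also tries `observed=True` as a possible workaround (whether it avoids the error on the affected pandas version is likewise an assumption):

```python
import pandas as pd

col = pd.Categorical([0, 1])
df = pd.DataFrame({'A': col, 'B': col, 'C': col})

# Every combination of the categories of A and B: 2 x 2 = 4 rows.
full_index = pd.MultiIndex.from_product(
    [df['A'].cat.categories, df['B'].cat.categories], names=['A', 'B']
)

# Combinations of (A, B) that actually appear in the data: 2 rows.
n_observed = len(df[['A', 'B']].drop_duplicates())

print(len(full_index), n_observed)

# Possible workaround (assumption): group only on observed combinations,
# which yields one row per combination present in the data.
observed = df.groupby(['A', 'B'], observed=True).first()
print(observed)
```

With `observed=True` the result has only the (0, 0) and (1, 1) rows, so it does not match the four-row expected output above; it only sidesteps the unobserved combinations.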