BUG: Groupby(sort=False) with datetime-like Categorical raises ValueError · Issue #10505 · pandas-dev/pandas

Related to #10501, but not the same issue. groupby accepts a Categorical as the grouping key, with or without the sort keyword.

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

# OK
df.groupby(pd.Categorical(['A', 'B', 'A', 'B'])).groups
# {'A': [0, 2], 'B': [1, 3]}

# OK
df.groupby(pd.Categorical(['A', 'B', 'A', 'B']), sort=False).groups
# {'A': [0, 2], 'B': [1, 3]}

If the Categorical has datetime-like categories, groupby raises ValueError when sort=False is specified.

# OK
df.groupby(pd.Categorical(pd.DatetimeIndex(['2011', '2012', '2011', '2012']))).groups
# {numpy.datetime64('2011-01-01T09:00:00.000000000+0900'): [0, 2], 
#  numpy.datetime64('2012-01-01T09:00:00.000000000+0900'): [1, 3]}

# NG (fails)
df.groupby(pd.Categorical(pd.DatetimeIndex(['2011', '2012', '2011', '2012'])), sort=False).groups
# ValueError: items in new_categories are not the same as in old categories
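
The expected behavior can be sketched as a small regression check. This is only an illustration, not a test taken from pandas itself: it assumes a recent pandas version in which this issue is resolved, the helper name sums_by_key is hypothetical, and observed=True is an addition that does not appear in the report (it is passed only so the result does not depend on the default handling of unobserved categories). Full date strings are used instead of '2011' and '2012' to avoid relying on partial-date parsing.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

# Categorical whose categories are datetime-like, as in the failing case above.
key = pd.Categorical(
    pd.DatetimeIndex(['2011-01-01', '2012-01-01', '2011-01-01', '2012-01-01'])
)


def sums_by_key(sort):
    """Sum column 'A' per category (hypothetical helper for this sketch)."""
    # observed=True is an assumption of this sketch, not part of the report.
    result = df.groupby(key, sort=sort, observed=True)['A'].sum()
    # Map each category (a Timestamp) to its aggregated value.
    return result.to_dict()


sorted_sums = sums_by_key(sort=True)
unsorted_sums = sums_by_key(sort=False)

# sort should only affect group order, never the aggregated values,
# and sort=False should not raise.
assert sorted_sums == unsorted_sums
```

On an affected version, the sort=False call would be expected to raise the ValueError shown above instead of reaching the assertion.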