BUG: sort=False ignored when grouping with a categorical column

Hello everyone,

I stumbled upon the following behavior of groupby with categorical columns, which seems at least inconsistent with the way groupby usually operates.

When grouping on a string type column with sort=False, the order of the groups is the order in which the keys first appear in the column.
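A minimal sketch of that first-appearance ordering, using the same `foo` and `baz` values as the example further down. The helper name `first_seen_order` is made up for illustration and is not part of pandas:

```python
import pandas as pd


def first_seen_order(keys):
    """Return the distinct values of `keys` in order of first appearance.

    Hypothetical helper for illustration only; relies on dicts keeping
    insertion order (Python 3.7+).
    """
    return list(dict.fromkeys(keys))


df = pd.DataFrame({'foo': [10, 8, 5, 6, 4, 1, 7],
                   'baz': ['d', 'c', 'e', 'a', 'a', 'd', 'c']})

# Order in which each key first shows up in the column
expected_order = first_seen_order(df['baz'])

# Order of the groups when grouping on the string column without sorting
actual_order = df.groupby('baz', sort=False)['foo'].mean().index.tolist()

print(expected_order)  # ['d', 'c', 'e', 'a']
print(actual_order)    # ['d', 'c', 'e', 'a']
```

For a plain string key the two lists agree, which is the behavior I am taking as the baseline here.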

However, when grouping on a categorical column, the groups always seem to be ordered by category, even when sort=False.

    import numpy as np
    import pandas as pd

    d = {'foo': [10, 8, 5, 6, 4, 1, 7],
         'bar': [10, 20, 30, 40, 50, 60, 70],
         'baz': ['d', 'c', 'e', 'a', 'a', 'd', 'c']}
    df = pd.DataFrame(d)

    # Bin 'foo' into four equal-width intervals; the result is categorical
    cat = pd.cut(df['foo'], np.linspace(0, 10, 5))
    df['range'] = cat

    # Expected behaviour
    groups = df.groupby('range', sort=True)
    result = groups.agg('mean')

    # Why are the categories still sorted in this case?
    groups2 = df.groupby('range', sort=False)
    result2 = groups2.agg('mean')

    # I would expect an output like this one: keep the order in which the
    # groups are first encountered
    groups3 = df.groupby('baz', sort=False)
    result3 = groups3.agg('mean')

result:

               bar  foo
    range
    (0, 2.5]    60  1.0
    (2.5, 5]    40  4.5
    (5, 7.5]    55  6.5
    (7.5, 10]   15  9.0

result2:

               bar  foo
    range
    (0, 2.5]    60  1.0
    (2.5, 5]    40  4.5
    (5, 7.5]    55  6.5
    (7.5, 10]   15  9.0

result3:

         bar  foo
    baz
    d     35  5.5
    c     45  7.5
    e     30  5.0
    a     45  5.0

    In [110]: pd.__version__
    Out[110]: '0.15.1'

Setting as_index=False does not change the behavior presented above.
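To check a given pandas version for this, the report can be condensed into a comparison between the group order and the first-appearance order of the categorical key. This is only a sketch: the variable names are illustrative, and I am deliberately not stating an expected result, since it depends on the pandas version in use.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'foo': [10, 8, 5, 6, 4, 1, 7],
                   'bar': [10, 20, 30, 40, 50, 60, 70]})

# Same binning as in the report: four equal-width intervals over (0, 10]
df['range'] = pd.cut(df['foo'], np.linspace(0, 10, 5))

# Distinct intervals in the order they first appear in the column
# (dicts keep insertion order on Python 3.7+)
first_seen = list(dict.fromkeys(df['range']))

# Order of the groups when grouping on the categorical key without sorting
group_order = df.groupby('range', sort=False)['bar'].mean().index.tolist()

# True if sort=False kept first-appearance order, False if the groups
# came back in category order instead
keeps_first_seen_order = group_order == first_seen

print(first_seen)
print(group_order)
print(keeps_first_seen_order)
```

If the last line prints False while the same check on a string column prints True, that is the inconsistency described in this report.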