BUG: sort=False ignored when grouping with a categorical column

Hello everyone,

I stumbled upon the following behavior of groupby with categoricals, which seems at least inconsistent with the way groupby usually operates.

When grouping on a string type column with sort=False, the order of the groups is the order in which the keys first appear in the column.

However, when grouping on a categorical column, the groups always seem to be ordered by the categories, even when sort=False.
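To make the two candidate orders concrete, the sketch below compares the order of the categories with the order in which each bin first appears in the data. It assumes the same foo values and bins as the example that follows; the names category_order, first_seen_codes and first_seen_order are only illustrative.

```python
import numpy as np
import pandas as pd

foo = pd.Series([10, 8, 5, 6, 4, 1, 7])
cat = pd.cut(foo, np.linspace(0, 10, 5))

# Order of the categories themselves (ascending bins).
category_order = list(cat.cat.categories)

# Integer code of each row's bin, reduced to the order of first appearance.
first_seen_codes = cat.cat.codes.unique().tolist()  # [3, 1, 2, 0]

# The same first-appearance order, expressed as bins.
first_seen_order = [category_order[code] for code in first_seen_codes]

print(category_order)
print(first_seen_order)
```

The first list is the order I get from groupby in both cases; the second is the order I would expect with sort=False.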

    import numpy as np
    import pandas as pd

    d = {'foo': [10, 8, 5, 6, 4, 1, 7],
         'bar': [10, 20, 30, 40, 50, 60, 70],
         'baz': ['d', 'c', 'e', 'a', 'a', 'd', 'c']}
    df = pd.DataFrame(d)

    # Bin 'foo' into four equal-width intervals; the result is categorical
    cat = pd.cut(df['foo'], np.linspace(0, 10, 5))
    df['range'] = cat

    # Expected behaviour
    groups = df.groupby('range', sort=True)
    result = groups.agg('mean')

    # Why are the categories still sorted in this case?
    groups2 = df.groupby('range', sort=False)
    result2 = groups2.agg('mean')

    # I would expect an output like this one: keep the order in which
    # the groups are first encountered
    groups3 = df.groupby('baz', sort=False)
    result3 = groups3.agg('mean')

Output of result:

               bar  foo
    range
    (0, 2.5]    60  1.0
    (2.5, 5]    40  4.5
    (5, 7.5]    55  6.5
    (7.5, 10]   15  9.0

Output of result2:

               bar  foo
    range
    (0, 2.5]    60  1.0
    (2.5, 5]    40  4.5
    (5, 7.5]    55  6.5
    (7.5, 10]   15  9.0

Output of result3:

         bar  foo
    baz
    d     35  5.5
    c     45  7.5
    e     30  5.0
    a     45  5.0
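For comparison, the order I expected for result2 can be reproduced by hand. The sketch below is only a possible workaround, not a description of how pandas should implement this: it assumes that grouping on the integer category codes behaves like grouping on any other non-categorical key, and the name by_code is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'foo': [10, 8, 5, 6, 4, 1, 7],
                   'bar': [10, 20, 30, 40, 50, 60, 70]})
df['range'] = pd.cut(df['foo'], np.linspace(0, 10, 5))

# Group on the integer codes of the categorical rather than on the
# categorical itself, so the keys are plain integers.
codes = df['range'].cat.codes
by_code = df.groupby(codes, sort=False)[['bar', 'foo']].mean()

# Translate the integer codes in the index back into the bins.
categories = df['range'].cat.categories
by_code.index = categories.take(np.asarray(by_code.index))
by_code.index.name = 'range'

print(by_code)
```

With this, the rows come out in the order (7.5, 10], (2.5, 5], (5, 7.5], (0, 2.5], which is the order in which the bins first appear in foo.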

pd.__version__
Out[110]: '0.15.1'

Setting as_index=False does not change the behavior shown above.
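For anyone who wants to check the as_index=False case, here is a minimal sketch. It reuses the data from the example above without the 'baz' column, and the name flat is just illustrative. The row order it prints will depend on the pandas version, so no expected output is shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'foo': [10, 8, 5, 6, 4, 1, 7],
                   'bar': [10, 20, 30, 40, 50, 60, 70]})
df['range'] = pd.cut(df['foo'], np.linspace(0, 10, 5))

# With as_index=False the group key is returned as a regular column
# instead of becoming the index of the result.
flat = df.groupby('range', sort=False, as_index=False).mean()

# Inspect the order of the 'range' column to see whether sort=False
# was honoured.
print(flat)
```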