Categorical in dataframe is sorted lexically. · Issue #7848 · pandas-dev/pandas
code
import pandas as pd
df = pd.DataFrame({"id": [6, 5, 4, 3, 2, 1], "raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})
df["grade"] = pd.Categorical(df["raw_grade"])
df['grade'].cat.reorder_levels(['b', 'e', 'a'])
sorts 'grade' according to the order of the levels
df.sort(columns=['grade'])
correct output
   id raw_grade grade
4   5         b     b
3   4         b     b
0   1         e     e
2   6         a     a
1   3         a     a
5   2         a     a
code
sorts 'grade' lexically
df.sort(columns=['grade', 'id'])
wrong output
   id raw_grade grade
4   2         a     a
3   3         a     a
0   6         a     a
2   4         b     b
1   5         b     b
5   1         e     e
When the columns list contains more than one element, Categorical columns are sorted lexically instead of by the order of their levels.
pandas: 0.14.1-78-g24b309f
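For comparison, here is a minimal sketch of the expected multi-column behaviour, written against the API of later pandas releases rather than the 0.14.1 development build reported above. It assumes a version where `DataFrame.sort_values` has replaced `DataFrame.sort` and where `pd.Categorical` accepts `categories` and `ordered` arguments. The variable name `result` is illustrative only.

```python
import pandas as pd

# Same data as in the report above.
df = pd.DataFrame({"id": [6, 5, 4, 3, 2, 1],
                   "raw_grade": ['a', 'b', 'b', 'a', 'a', 'e']})

# Assumption: a later pandas API, where the category order is passed
# directly to the constructor instead of calling reorder_levels.
df["grade"] = pd.Categorical(df["raw_grade"],
                             categories=['b', 'e', 'a'],
                             ordered=True)

# Sort by the categorical column first, then by id within each grade.
result = df.sort_values(by=["grade", "id"])

# The grade column should follow the category order b, e, a
# rather than the lexical order a, b, e.
print(result)
```

If the categorical order is respected, all 'b' rows come first, then 'e', then 'a', with `id` ascending inside each group.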