pd.concat reordering categorical levels lexically · Issue #7864 · pandas-dev/pandas

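Look at dfx in the session below: after pd.concat, its categorical levels come back in lexical order (a, b, e) instead of the order that was set (e, a, b).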

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"id":[1,2,3,4,5,6], "raw_grade":['a', 'b', 'b', 'a', 'a', 'e']})
   ...: df["grade"] = pd.Categorical(df["raw_grade"])
   ...: df['grade'].cat.reorder_levels(['e', 'a', 'b'])
   ...:

In [3]: df1 = df[0:3]
   ...: df2 = df[3:]
   ...:

In [4]: df['grade'].cat.levels
Out[4]: Index([u'e', u'a', u'b'], dtype='object')

In [5]: df1['grade'].cat.levels
Out[5]: Index([u'e', u'a', u'b'], dtype='object')

In [6]: df2['grade'].cat.levels
Out[6]: Index([u'e', u'a', u'b'], dtype='object')

In [7]: dfx = pd.concat([df1, df2])

In [8]: dfx['grade'].cat.levels
Out[8]: Index([u'a', u'b', u'e'], dtype='object')

version: pandas: 0.14.1-78-g24b309f

The behavior is the same with either PR #7768 or PR #7850 applied.
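
For anyone re-running this on a later pandas release, here is a sketch of the same reproduction as a plain script. It assumes the renamed accessor API (cat.categories in place of cat.levels, cat.reorder_categories in place of cat.reorder_levels) and that reorder_categories returns a new Series rather than modifying in place, so the result is assigned back. The names expected and actual are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6],
                   "raw_grade": ["a", "b", "b", "a", "a", "e"]})
df["grade"] = pd.Categorical(df["raw_grade"])

# Assumption: reorder_categories returns a new Series, so assign it back.
df["grade"] = df["grade"].cat.reorder_categories(["e", "a", "b"])

# Split the frame and glue it back together, as in the session above.
df1 = df[0:3]
df2 = df[3:]
dfx = pd.concat([df1, df2])

expected = ["e", "a", "b"]                    # the order set explicitly
actual = list(dfx["grade"].cat.categories)    # the order after concat

print("expected:", expected)
print("actual:  ", actual)
print("order preserved:", actual == expected)
```

If the last line prints False, the concatenated column has lost the explicit ordering, which is the behavior reported here.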