rank incorrectly orders ordered categories · Issue #15420 · pandas-dev/pandas

Code Sample, a copy-pastable example if possible

import pandas as pd

a = pd.DataFrame(['first', 'second', 'third', 'fourth', 'fifth', 'sixth'], columns=['A'])
a['A'] = a['A'].astype('category').cat.set_categories(
    ['first', 'second', 'third', 'fourth', 'fifth', 'sixth'], ordered=True)
a['A'].rank()

outputs:

0    2.0
1    4.0
2    6.0
3    3.0
4    1.0
5    5.0

Problem description

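rank seems to ignore the order of ordered categories: the output above matches a lexicographic sort of the string labels ('fifth' < 'first' < 'fourth' < 'second' < 'sixth' < 'third') rather than the order passed to set_categories.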

Expected Output

0    1.0
1    2.0
2    3.0
3    4.0
4    5.0
5    6.0
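
A possible workaround, offered as a sketch rather than a confirmed fix, is to rank the integer category codes instead of the values themselves. The names `labels`, `s`, `codes`, and `ranks` are only illustrative, and the masking of `-1` codes assumes missing values should stay unranked.

```python
import pandas as pd

labels = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
s = pd.Series(pd.Categorical(labels, categories=labels, ordered=True))

# .cat.codes gives each value's integer position in the category list,
# so ranking the codes follows the declared category order.
codes = s.cat.codes

# Missing values get code -1; mask them so they are left as NaN
# instead of being ranked first (assumption: NaN should stay unranked).
ranks = codes.where(codes >= 0).rank()

print(ranks.tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

This gives the ranks I would expect from `rank()` directly on the ordered categorical.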

Output of pd.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-59-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.2.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None