PERF: categorical rank · Issue #15498 · pandas-dev/pandas

@jreback

xref #15422 (comment)

After #15422 it should be easy enough to rank the categories themselves rather than the expanded values; this is probably most relevant for object dtypes.

In [15]: s = Series(tm.makeCategoricalIndex(100000))

In [16]: res = Series(np.array(s.cat.rename_categories(Series(s.cat.categories).rank()))).rank()

In [17]: res2 = s.rank()

In [18]: res.equals(res2)
Out[18]: True

In [19]: %timeit Series(np.array(s.cat.rename_categories(Series(s.cat.categories).rank()))).rank()
100 loops, best of 3: 4.39 ms per loop

In [20]: %timeit s.rank()
10 loops, best of 3: 132 ms per loop
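
For illustration, the same idea can be sketched as a standalone helper. This is only a rough sketch, not the actual pandas implementation: the name `rank_via_categories` is hypothetical, and it assumes an unordered categorical with at least one category, where the categories should rank by their own values (as in the example above).

```python
import numpy as np
import pandas as pd


def rank_via_categories(s: pd.Series) -> pd.Series:
    """Rank a categorical Series by ranking its categories first.

    Hypothetical helper; assumes `s` is an unordered categorical
    with at least one category.
    """
    # Rank the unique categories once. This is the only step that
    # compares the original (possibly object-dtype) values.
    cat_ranks = pd.Series(s.cat.categories).rank().to_numpy()

    # Integer codes point into the categories; -1 marks a missing value.
    codes = s.cat.codes.to_numpy()

    # Replace every element by the float rank of its category,
    # keeping missing values as NaN.
    values = np.where(codes >= 0, cat_ranks.take(codes), np.nan)

    # Ranking plain floats is much cheaper than ranking objects.
    return pd.Series(values, index=s.index).rank()


s = pd.Series(["b", "a", "c", "a", None], dtype="category")
print(rank_via_categories(s))
```

The expensive comparison of object values happens only once per category instead of once per row, so the gain should grow with the ratio of rows to categories.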