PERF: categorical rank · Issue #15498 · pandas-dev/pandas
Description
xref #15422 (comment)
After #15422 it should be easy enough to rank the categories themselves rather than the expanded values; this is probably most relevant for object dtypes.
In [15]: s = Series(tm.makeCategoricalIndex(100000))
In [16]: res = Series(np.array(s.cat.rename_categories(Series(s.cat.categories).rank()))).rank()
In [17]: res2 = s.rank()
In [18]: res.equals(res2)
Out[18]: True
In [19]: %timeit Series(np.array(s.cat.rename_categories(Series(s.cat.categories).rank()))).rank()
100 loops, best of 3: 4.39 ms per loop
In [20]: %timeit s.rank()
10 loops, best of 3: 132 ms per loop
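For reference, the same idea can be sketched as a standalone helper that does not depend on `tm.makeCategoricalIndex`. This is only an illustration, not the proposed patch: `rank_via_categories` is a hypothetical name, and the sketch assumes an unordered categorical with at least one category, so that categories are ranked by their values as in the session above.

```python
import numpy as np
import pandas as pd


def rank_via_categories(s: pd.Series) -> pd.Series:
    """Rank a categorical Series by ranking its categories first.

    Hypothetical helper for illustration; assumes an unordered categorical
    with at least one category.
    """
    # Rank each distinct category once (cheap: one value per category).
    cat_ranks = pd.Series(s.cat.categories).rank().to_numpy()

    # Integer codes point into the categories; -1 marks a missing value.
    codes = s.cat.codes.to_numpy()

    # Replace every code by the rank of its category, keeping NaN for missing.
    mapped = np.where(codes >= 0, cat_ranks[codes], np.nan)

    # Ranking the resulting float array avoids ranking expanded object values.
    return pd.Series(mapped, index=s.index).rank()
```

Calling `rank_via_categories(s)` on a Series like the one above should agree with `s.rank()` under the default `method="average"`; the saving comes from ranking a float array instead of the expanded object values.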