PERF: Series(pyarrow-backed).rank by lukemanley · Pull Request #50264 · pandas-dev/pandas

Yes, we could. However, I observed this:

import pandas as pd 
import pandas.core.algorithms as algos
import numpy as np

arr = pd.array(np.random.randn(10**6), dtype="float64[pyarrow]")

%timeit algos.rank(arr)
# 1.41 s ± 12.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

%timeit algos.rank(arr.to_numpy())
# 525 ms ± 9.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

It's probably worth a look in algos.rank to see what's going on. OK to do that as a separate PR, or should I look at it first?
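For anyone who wants to reproduce the comparison outside IPython, here is a rough sketch using the standard-library timeit module instead of %timeit. The helper names best_time and compare_rank are hypothetical, made up for this illustration, and the default array size is smaller than the 10**6 used above so it finishes quickly. It assumes pandas, numpy, and pyarrow are installed, and that the private pandas.core.algorithms.rank still accepts these inputs as it does in the snippet above.

```python
import timeit


def best_time(fn, repeat=3, number=1):
    """Return the best per-call wall time in seconds over `repeat` runs."""
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number


def compare_rank(n=10**5):
    """Time algos.rank on a pyarrow-backed array vs. its NumPy conversion.

    Hypothetical helper for illustration only. Third-party imports are kept
    local so that best_time stays usable with only the standard library.
    """
    import numpy as np
    import pandas as pd
    import pandas.core.algorithms as algos  # private module, as used above

    arr = pd.array(np.random.randn(n), dtype="float64[pyarrow]")

    # Rank the pyarrow-backed extension array directly.
    direct = best_time(lambda: algos.rank(arr))
    # Convert to a NumPy array first; the conversion is included in the timing.
    via_numpy = best_time(lambda: algos.rank(arr.to_numpy()))
    return direct, via_numpy


if __name__ == "__main__":
    direct, via_numpy = compare_rank()
    print(f"algos.rank(arr):            {direct:.3f} s")
    print(f"algos.rank(arr.to_numpy()): {via_numpy:.3f} s")
```

Timings will differ by machine and pandas version, so this only shows how to measure the gap, not what the gap should be.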