PERF: Series(pyarrow-backed).rank by lukemanley · Pull Request #50264 · pandas-dev/pandas
Yes, we could. However, I observed this:
import pandas as pd
import pandas.core.algorithms as algos
import numpy as np
arr = pd.array(np.random.randn(10**6), dtype="float64[pyarrow]")
%timeit algos.rank(arr)
# 1.41 s ± 12.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit algos.rank(arr.to_numpy())
# 525 ms ± 9.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
It's probably worth a look in algos.rank to see what's going on. Is it OK to do that as a separate PR, or should I look at it first?