PERF: DataFrame.join with how=left|right and sort=True by lukemanley · Pull Request #56919 · pandas-dev/pandas

import pandas as pd

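# 200,000 unique string keys; idx2 lacks only the first key of idx1.
# The "string[pyarrow_numpy]" dtype requires pyarrow to be installed.
data = [f"i-{i}" for i in range(200_000)]
idx1 = pd.Index(data, dtype="string[pyarrow_numpy]")
idx2 = pd.Index(data[1:], dtype="string[pyarrow_numpy]")

# IPython magic: time a left join that also sorts the join keys.
%timeit idx1.join(idx2, how="left", sort=True)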

# 330 ms ± 6.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)     -> main
# 182 ms ± 2.68 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)  -> PR
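
For anyone reproducing this outside IPython, the following is a minimal sketch using the standard-library `timeit` module. The helper name `time_left_sorted_join` and its parameters are hypothetical, and the dtype is left at the pandas default so that pyarrow is not needed, which means the absolute timings will not match the numbers above.

```python
import timeit

import pandas as pd


def time_left_sorted_join(n=200_000, repeat=3):
    """Hypothetical helper: build the two indexes and time the sorted left join."""
    # Same key pattern as the benchmark above. The dtype is left at the
    # pandas default (an assumption made here to avoid requiring pyarrow).
    data = [f"i-{i}" for i in range(n)]
    idx1 = pd.Index(data)
    idx2 = pd.Index(data[1:])

    # Run the join once to keep the result for inspection.
    result = idx1.join(idx2, how="left", sort=True)

    # Best wall-clock time over `repeat` single runs, in seconds.
    timings = timeit.repeat(
        lambda: idx1.join(idx2, how="left", sort=True),
        number=1,
        repeat=repeat,
    )
    return idx1, result, min(timings)


if __name__ == "__main__":
    idx1, result, best = time_left_sorted_join()
    print(f"best of 3: {best * 1e3:.1f} ms")
    print("result sorted:", result.is_monotonic_increasing)
```

Note that the keys are not in lexicographic order to begin with (`"i-10"` sorts before `"i-9"`), so `sort=True` does real work in this benchmark.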