PERF: ArrowExtensionArray comparison methods by lukemanley · Pull Request #50524 · pandas-dev/pandas

Performance improvement in ArrowExtensionArray comparison methods when the array contains nulls. The improvement comes from avoiding a conversion to an object array.

import pandas as pd
import numpy as np

data = np.random.randn(10**6)
data[0] = np.nan

arr = pd.array(data, dtype="float64[pyarrow]")

%timeit arr > 0

# 80.5 ms ± 619 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)    <- main
# 4.05 ms ± 11.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  <- PR