PERF: keep using ObjectEngine for ExtensionArrays for 1.5 by jorisvandenbossche · Pull Request #48472 · pandas-dev/pandas

Closes #45652, see #45514 (comment) for context.

@jbrockmendel I think we discussed this a few months ago, but I forgot about it: if we didn't improve ExtensionEngine for nullable dtypes before 1.5, we could continue falling back to ObjectEngine for now, as we did before. @phofl is working on that improvement now, but it won't land in 1.5.

This PR just does that for all EAs, but maybe we could also do this only for our built-in ones? (For those we know that conversion to object dtype is expensive but not prohibitively so, while for external EAs we don't really know the cost of that conversion.)
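To make the "built-in only" option concrete, here is a rough sketch of how built-in and external extension dtypes could be told apart. The helper name `is_builtin_ea_dtype` is hypothetical and the module-name check is only an assumption for illustration; it is not what this PR implements.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


def is_builtin_ea_dtype(dtype) -> bool:
    """Hypothetical helper: True for extension dtypes defined inside pandas.

    Assumption: a dtype counts as "built-in" when its class lives in a
    ``pandas.*`` module. NumPy dtypes are not extension dtypes at all.
    """
    return isinstance(dtype, ExtensionDtype) and type(dtype).__module__.startswith(
        "pandas."
    )


class ExternalDtype(ExtensionDtype):
    """Stand-in for a third-party extension dtype (illustration only)."""

    name = "external"
    type = object


builtin_result = is_builtin_ea_dtype(pd.Int64Dtype())     # nullable integer dtype
external_result = is_builtin_ea_dtype(ExternalDtype())    # defined outside pandas
numpy_result = is_builtin_ea_dtype(np.dtype("int64"))     # plain NumPy dtype
```

With a check along these lines, the ObjectEngine fallback could be limited to dtypes whose conversion cost is known, leaving external EAs on their current path.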

import numpy as np
import pandas as pd

idx_ea = pd.Index(np.arange(1_000_000), dtype="Int64")
indexer = np.arange(500, 1000)

In [2]: %timeit idx_ea.get_indexer(indexer)
25.3 ms ± 3.51 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)    # <-- main
211 µs ± 9.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)    # <-- PR
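For anyone who wants to reproduce this outside IPython, the following is a minimal script version of the same benchmark, using the standard-library `timeit` module in place of `%timeit`. The repeat counts are arbitrary choices, and absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Same setup as the benchmark above.
idx_ea = pd.Index(np.arange(1_000_000), dtype="Int64")
indexer = np.arange(500, 1000)

# Sanity check: the index holds 0..999_999, so each target value
# should be found at the position equal to the value itself.
result = idx_ea.get_indexer(indexer)
assert (result == indexer).all()

# Script equivalent of %timeit: best of 3 repeats, 10 calls per repeat.
best = min(
    timeit.repeat(lambda: idx_ea.get_indexer(indexer), number=10, repeat=3)
) / 10
print(f"get_indexer: {best * 1e3:.3f} ms per call")
```

Running it once on main and once on this branch should show the same kind of gap as the numbers above.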

This PR might need to bring over some additional changes from #45514.