PERF: don't sort data twice in groupby apply when not using libreduction fast_apply by jorisvandenbossche · Pull Request #40176 · pandas-dev/pandas

See #40171 (comment) for context. I noticed that we were calling splitter._get_sorted_data() twice when using the non-fast_apply fallback.
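
To illustrate the general idea (not the actual diff in this PR), here is a minimal sketch of computing the sorted data once and reusing it. The `Splitter` class below is a hypothetical stand-in written for this example, not the pandas class, and the `sort_calls` counter and `sorted_data` property are assumptions added only to make the effect visible.

```python
from functools import cached_property


class Splitter:
    """Hypothetical stand-in for a groupby splitter; not the pandas class."""

    def __init__(self, labels, values):
        self.labels = labels
        self.values = values
        self.sort_calls = 0  # instrumentation for this sketch only

    def _get_sorted_data(self):
        # The expensive step: reorder rows so each group is contiguous.
        self.sort_calls += 1
        order = sorted(range(len(self.labels)), key=self.labels.__getitem__)
        return [(self.labels[i], self.values[i]) for i in order]

    @cached_property
    def sorted_data(self):
        # Computed on first access, then reused by every later caller.
        return self._get_sorted_data()


labels = [2, 1, 2, 1]
values = [10, 20, 30, 40]

# Calling the method from two code paths sorts twice.
twice = Splitter(labels, values)
twice._get_sorted_data()  # e.g. before attempting a fast path
twice._get_sorted_data()  # e.g. again in the fallback
print(twice.sort_calls)

# Going through the cached property sorts only once.
once = Splitter(labels, values)
once.sorted_data
once.sorted_data
print(once.sort_calls)
```

Passing the already-sorted data down to the fallback as an argument would achieve the same thing without caching on the object.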

Using the benchmark case from groupby.Apply.time_scalar_function_single/multi_col (as in #40171 (comment)), but with bigger data (10 ** 6 rows instead of 10 ** 4):

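import numpy as np
from pandas import DataFrame

N = 10 ** 6
labels = np.random.randint(0, 2000, size=N)  # 2000 distinct groups
labels2 = np.random.randint(0, 3, size=N)
df = DataFrame(
    {
        "key": labels,
        "key2": labels2,
        "value1": np.random.randn(N),
        "value2": ["foo", "bar", "baz", "qux"] * (N // 4),
    }
)
# ArrayManager-backed version of the same frame (private, experimental API)
df_am = df._as_manager("array")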

In [2]: %timeit df_am.groupby("key").apply(lambda x: 1)
252 ms ± 17.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)  <-- master
166 ms ± 5.7 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)  <-- PR
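
As a rough reading of the two timings above, the following sketch computes the relative improvement from the reported means only, ignoring the reported standard deviations. The variable names are introduced here for illustration.

```python
# Mean timings reported above, in milliseconds.
master_ms = 252.0
pr_ms = 166.0

speedup = master_ms / pr_ms        # how many times faster the PR is
reduction = 1 - pr_ms / master_ms  # fraction of runtime removed

print(f"{speedup:.2f}x faster, {reduction:.0%} less time")
```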