PERF: repeated slicing along index in groupby by jbrockmendel · Pull Request #40353 · pandas-dev/pandas

import numpy as np
import pandas as pd

N = 10 ** 4
labels = np.random.randint(0, 2000, size=N)
labels2 = np.random.randint(0, 3, size=N)
df = pd.DataFrame(
    {
        "key": labels,
        "key2": labels2,
        "value1": np.random.randn(N),
        "value2": ["foo", "bar", "baz", "qux"] * (N // 4),
    }
)

%prun -s cumtime for i in range(100): df.groupby("key").apply(lambda x: 1)  # on master as of this writing

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    4.170    4.170 {built-in method builtins.exec}
        1    0.001    0.001    4.170    4.170 <string>:1(<module>)
      100    0.001    0.000    4.156    0.042 groupby.py:911(apply)
      100    0.004    0.000    4.150    0.042 groupby.py:960(_python_apply_general)
      100    0.497    0.005    4.084    0.041 ops.py:263(apply)
   199000    0.176    0.000    3.223    0.000 ops.py:1005(__iter__)
   198900    0.248    0.000    2.986    0.000 ops.py:1046(_chop)
   198900    0.339    0.000    2.522    0.000 managers.py:796(get_slice)
   198900    0.188    0.000    1.631    0.000 managers.py:803(<listcomp>)
   596700    0.823    0.000    1.443    0.000 blocks.py:324(getitem_block)
   198900    0.135    0.000    0.409    0.000 base.py:4638(_getitem_slice)
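The call counts line up with the shape of the benchmark: each `apply()` chops the frame once per group, so 100 iterations over ~2000 groups yield the ~199,000 `__iter__`/`_chop` calls in the profile. A quick sketch of that arithmetic (same setup as above, with a seeded RNG so the group count is reproducible):

```python
import numpy as np
import pandas as pd

N = 10 ** 4
rng = np.random.default_rng(0)
labels = rng.integers(0, 2000, size=N)
df = pd.DataFrame({"key": labels, "value1": rng.standard_normal(N)})

# number of distinct labels actually drawn (close to, but under, 2000)
ngroups = df["key"].nunique()

# each groupby("key").apply(...) slices the block manager once per group,
# so 100 repetitions perform roughly 100 * ngroups block slices
total_chops = 100 * ngroups
print(ngroups, total_chops)
```

That is why shaving even a small constant off each slice matters: the per-slice work is multiplied by hundreds of thousands of calls.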

we're still spending ~8% of the runtime in get_slice itself, which is why I think avoiding re-doing these validation checks is worthwhile.

sure, could update get_slice to also use getitem_block_index (I think the first draft of this PR started before get_slice reliably received a slice object)
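For readers outside pandas internals, the idea behind a slice-only block helper can be sketched with a toy class; the real `Block.getitem_block_index` has a different signature and lives in blocks.py, so this is purely illustrative:

```python
import numpy as np

class FakeBlock:
    """Toy stand-in for a pandas Block holding 2D values.

    Illustrative only: the real getitem_block_index differs in signature.
    """

    def __init__(self, values: np.ndarray):
        self.values = values

    def getitem_block_index(self, slobj: slice) -> "FakeBlock":
        # slice only along the index (last) axis; no validation needed
        # because callers like _chop always pass a well-formed slice
        return FakeBlock(self.values[..., slobj])

def chop(block: FakeBlock, start: int, stop: int) -> FakeBlock:
    # mimics the shape of ops._chop: hand each group a view, not a copy
    return block.getitem_block_index(slice(start, stop))
```

Because the fast path assumes a slice, the result is a view into the parent block's values, so the per-group frames are cheap to construct.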