bpo-4356: Add key parameter to functions in bisect module by remilapeyre · Pull Request #11781 · python/cpython

Hi Raymond, thanks for your comment.

I may be missing something, but I'm not convinced the key function will get called multiple times per value.

I did some research before implementing this, and I think your first point comes from the implementation of the key parameter in sorted(), where the result of key on each element of the iterable is cached to avoid computing it multiple times. ISTM that this is only necessary because the worst-case complexity for sorting is n*ln(n), and a given element will get compared to multiple others to find its place in the resulting collection.

In binary search, most elements are not touched, and each element e that is compared to the input x is only touched once, so key(e) should only be computed once. As I commented in the code, I cached the result of key(x) so it would not be computed at each iteration. After all, every comparison is made against the same value (key(x)), so it would be wasteful to compute it more than once, whether we use an auxiliary function or not.
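To make the argument concrete, here is a minimal pure-Python sketch of the idea, not the code from the pull request. The names bisect_right_with_key and CountingKey are hypothetical helpers made up for this illustration. It assumes key is applied to x once up front and to each probed element inside the loop, and it counts the calls to check that no value goes through key more than once.

```python
from collections import Counter


def bisect_right_with_key(a, x, key=None, lo=0, hi=None):
    """Illustrative bisect_right with an optional key (hypothetical helper)."""
    if hi is None:
        hi = len(a)
    if key is None:
        # Original code path: no key function involved.
        while lo < hi:
            mid = (lo + hi) // 2
            if x < a[mid]:
                hi = mid
            else:
                lo = mid + 1
        return lo
    # key(x) is computed once, outside the loop.
    x_key = key(x)
    while lo < hi:
        mid = (lo + hi) // 2
        # Each index is probed at most once, so key(a[mid]) is
        # computed at most once per element.
        if x_key < key(a[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo


class CountingKey:
    """Wrap a key function and record how often each value is passed in."""

    def __init__(self, func):
        self.func = func
        self.calls = Counter()

    def __call__(self, value):
        self.calls[value] += 1
        return self.func(value)


if __name__ == "__main__":
    data = list(range(1000))
    counting = CountingKey(lambda e: e)
    # 25.5 is not in data, so the call for x is easy to tell apart
    # from the calls for probed elements.
    index = bisect_right_with_key(data, 25.5, key=counting)
    print(index, dict(counting.calls))
```

Because the search interval always excludes the index that was just probed, the loop never revisits an element, which is why a cache like the one in sorted() would have nothing to save here.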

About your second point, I think you say this because of the branching in the hot path. I did some tests before posting the pull request. Performance when no key is passed seems to be unchanged (I'm not sure this is a good way to measure it though; I would love some input on that):

➜  cpython git:(add-key-argument-to-bisect) python3 -m timeit -s "import bisect" "bisect.bisect(range(1_000_000_000_000_000), 25)"
50000 loops, best of 5: 5.23 usec per loop
➜  cpython git:(add-key-argument-to-bisect) ./python.exe -m timeit -s "import bisect" "bisect.bisect(range(1_000_000_000_000_000), 25)"
50000 loops, best of 5: 4.74 usec per loop
➜  cpython git:(add-key-argument-to-bisect) ./python.exe -m timeit -s "import bisect" "bisect.bisect(range(1_000_000_000_000_000), 25, key=lambda e: e)"
20000 loops, best of 5: 10 usec per loop

I guess the branch predictor does a good job here (?), which would explain why no change in performance is seen when no key is passed (does someone know a good reference on branch predictors, out-of-order execution and other low-level performance details? I would like to learn more about them).
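For anyone who wants to repeat the measurement from a script rather than the command line, here is a rough sketch using timeit.repeat. The helper name best_usec_per_loop and its default number and repeat values are assumptions chosen for illustration. Taking the minimum of several runs mirrors the "best of 5" figure that python -m timeit reports.

```python
import timeit


def best_usec_per_loop(stmt, setup="import bisect", number=20_000, repeat=5):
    """Return the best (minimum) time per loop, in microseconds."""
    # timeit.repeat returns one total time (in seconds) per repetition.
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(totals) / number * 1e6


if __name__ == "__main__":
    stmt = "bisect.bisect(range(1_000_000_000_000_000), 25)"
    print(f"{best_usec_per_loop(stmt):.2f} usec per loop")
```

The keyed statement from the runs above can be passed in the same way on an interpreter built from this branch. Running each variant several times on an otherwise idle machine would help tell a real difference from noise.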

If I'm not making any mistake, the key argument can safely be added here and the sorted collection is not necessary.

Am I missing the point completely?