bpo-26301: Add a fast-path for list[index] in BINARY_SUBSCR opcode by pablogsal · Pull Request #9853 · python/cpython
Based on a patch by Zach Byrne and @vstinner.
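For readers unfamiliar with the opcode in question, the following is a small sketch (not part of the patch) that uses the standard `dis` module to list the bytecode generated for a list subscript. The function name `index_list` is made up for illustration. On the interpreter versions this PR targets the subscript compiles to `BINARY_SUBSCR`; much newer CPython releases may show a different opcode name, so the exact output depends on the interpreter used.

```python
import dis


def index_list(l):
    # Hypothetical helper: a plain list[index] expression, the case
    # the fast path in this PR is meant to speed up.
    return l[3]


# Collect the opcode names emitted for the function body.
opnames = [ins.opname for ins in dis.get_instructions(index_list)]
print(opnames)
```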
./python -m perf timeit "l = [1,2,3,4,5,6]" "l[3]" -o old_index.json
./python -m perf timeit "l = [1,2,3,4,5,6]" "l[3]" -o new_index.json
./python -m perf compare_to old_index.json ../cpython/new_index.json
Mean +- std dev: [old_index] 209 ns +- 8 ns -> [new_index] 194 ns +- 4 ns: 1.08x faster (-7%)
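As a quick check of the summary line above, this sketch recomputes the speedup factor and the relative change from the two reported means. The variable names are illustrative only; the numbers are copied from the `compare_to` output.

```python
# Mean timings reported by `perf compare_to`, in nanoseconds.
old_mean_ns = 209
new_mean_ns = 194

# Speedup factor: how many times faster the patched build is.
speedup = old_mean_ns / new_mean_ns

# Relative change of the new mean with respect to the old one.
relative_change = (new_mean_ns - old_mean_ns) / old_mean_ns

print(f"{speedup:.2f}x faster ({relative_change:+.0%})")
```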
Cache information
BASELINE
Performance counter stats for './python -c
l = [1,2,3,4,5,6]
for _ in range(1000):
    l[3]' (200 runs):
665,682 cache-references:u ( +- 0.08% )
4,030 cache-misses:u # 0.605 % of all cache refs ( +- 7.73% )
60,529,180 cycles:u ( +- 0.12% )
87,335,803 instructions:u # 1.44 insn per cycle ( +- 0.01% )
22,946,153 branches:u ( +- 0.01% )
1,018 faults:u ( +- 0.01% )
0 migrations:u
0.024963 +- 0.000185 seconds time elapsed ( +- 0.74% )
PATCH
Performance counter stats for './python -c
l = [1,2,3,4,5,6]
for _ in range(1000):
    l[3]' (200 runs):
670,231 cache-references:u ( +- 0.09% )
3,261 cache-misses:u # 0.487 % of all cache refs ( +- 7.23% )
60,717,884 cycles:u ( +- 0.13% )
86,998,772 instructions:u # 1.43 insn per cycle ( +- 0.01% )
22,873,947 branches:u ( +- 0.01% )
1,017 faults:u ( +- 0.01% )
0 migrations:u
0.025440 +- 0.000182 seconds time elapsed ( +- 0.72% )
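To make the two counter listings easier to compare, the sketch below computes the relative change of each counter from BASELINE to PATCH. The dictionary layout and names are assumptions made for illustration; the values are copied from the `perf stat` output above.

```python
# Counter values copied from the BASELINE and PATCH listings.
baseline = {
    "cache-references": 665_682,
    "cache-misses": 4_030,
    "cycles": 60_529_180,
    "instructions": 87_335_803,
    "branches": 22_946_153,
}
patch = {
    "cache-references": 670_231,
    "cache-misses": 3_261,
    "cycles": 60_717_884,
    "instructions": 86_998_772,
    "branches": 22_873_947,
}

# Relative change of each counter, PATCH versus BASELINE.
changes = {
    name: (patch[name] - baseline[name]) / baseline[name]
    for name in baseline
}

for name, change in changes.items():
    print(f"{name:>16}: {change:+.2%}")
```

Read this way, instructions and branches go down slightly and cycles go up slightly, all by well under 1%. The cache-miss count shows the largest change, but it is also the counter with the largest reported run-to-run variation (above 7% in both listings).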