Message 259570 - Python tracker

tl;dr: I'm disappointed. According to the statistics module, running the bm_regex_v8.py benchmark more times with more iterations makes the benchmark more unstable... I expected the opposite...

Patch version 2:

To measure the stability of perf.py, I pinned perf.py to CPU cores which are isolated from the system using the Linux "isolcpus" kernel parameter. I also forced the CPU frequency governor to "performance" and enabled "nohz_full" on these cores.
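For readers who want to reproduce this kind of setup, here is a minimal sketch of pinning a process to a set of CPUs and checking the frequency governor from Python. It is not what perf.py does: the helper names pin_to_cpus() and read_governor() are hypothetical, os.sched_setaffinity() is Linux-only, and the sysfs path is the usual cpufreq location, which may be missing on some machines (in that case the helper returns None).

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict a process (0 = the current one) to the given CPU numbers.

    Hypothetical helper; relies on os.sched_setaffinity(), Linux only.
    """
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


def read_governor(cpu):
    """Return the cpufreq governor of a CPU, or None if sysfs does not expose it."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_governor"
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    # Example: pin to the lowest-numbered CPU we are currently allowed to use.
    # On a tuned machine this would be one of the CPUs listed in isolcpus.
    allowed = os.sched_getaffinity(0)
    cpu = min(allowed)
    print("affinity:", pin_to_cpus({cpu}))
    print("governor:", read_governor(cpu))
```

The same pinning can be done from the shell with taskset; the Python version is only convenient when the benchmark runner itself wants to set its affinity.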

I ran perf.py 5 times on regex_v8.

Calibration (original => patched):

Approximated duration of the benchmark (original => patched):

(I made a mistake, so I was unable to get the exact duration.)

Hum, maybe timings are not well chosen because the benchmark is really slow (minutes vs seconds) :-/

Standard deviation, --fast:

Standard deviation, (no option):

Variance, --fast:

Variance, (no option):

Legend:

It's not easy to compare these values since the number of iterations is very different (1 => 16), and so the timings are very different (e.g. 0.059 sec => 0.950 sec). I guess that it's OK to compare percentages, i.e. the deviation relative to the mean.

I used the stability.py script, attached to this issue, to compute deviation and variance from the "Min" line of the 5 runs. The script takes the output of perf.py as input.
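A minimal sketch of this kind of computation, for illustration only: it is not the attached stability.py. The function names parse_min_lines() and summarize() are hypothetical, and the use of the sample standard deviation (statistics.stdev) rather than the population one is an assumption. The percentage is the standard deviation divided by the mean, which is what makes runs with different loop counts comparable.

```python
import re
import statistics

# Matches perf.py lines such as "Min: 0.059236 -> 0.045948: 1.29x faster".
MIN_LINE = re.compile(r"^Min: ([0-9.]+) -> ([0-9.]+):")


def parse_min_lines(text):
    """Return (base, changed) lists of "Min" timings in seconds."""
    base, changed = [], []
    for line in text.splitlines():
        match = MIN_LINE.match(line.strip())
        if match:
            base.append(float(match.group(1)))
            changed.append(float(match.group(2)))
    return base, changed


def summarize(values):
    """Compute mean, sample stdev, sample variance and stdev as % of the mean."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # assumption: sample, not population
    return {
        "mean": mean,
        "stdev": stdev,
        "variance": statistics.variance(values),
        "stdev_percent": 100.0 * stdev / mean,
    }


if __name__ == "__main__":
    import sys

    # Usage: python this_sketch.py < patched.fast
    base, changed = parse_min_lines(sys.stdin.read())
    for name, values in (("base", base), ("changed", changed)):
        print(name, summarize(values))
```

statistics.stdev() needs at least two values, so the sketch fails on an input with fewer than two "Min" lines.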

I'm not sure that 5 runs are enough to compute statistics.

--

Raw data.

Original perf.py.

$ grep ^Min original.fast
Min: 0.059236 -> 0.045948: 1.29x faster
Min: 0.059005 -> 0.044654: 1.32x faster
Min: 0.059601 -> 0.044547: 1.34x faster
Min: 0.060605 -> 0.044600: 1.36x faster

$ grep ^Min original
Min: 0.060479 -> 0.044762: 1.35x faster
Min: 0.059002 -> 0.045689: 1.29x faster
Min: 0.058991 -> 0.044587: 1.32x faster
Min: 0.060231 -> 0.044364: 1.36x faster
Min: 0.059165 -> 0.044464: 1.33x faster

Patched perf.py.

$ grep ^Min patched.fast
Min: 0.950717 -> 0.711018: 1.34x faster
Min: 0.968413 -> 0.730810: 1.33x faster
Min: 0.976092 -> 0.847388: 1.15x faster
Min: 0.964355 -> 0.711083: 1.36x faster
Min: 0.976573 -> 0.712081: 1.37x faster

$ grep ^Min patched
Min: 0.968810 -> 0.729109: 1.33x faster
Min: 0.973615 -> 0.731308: 1.33x faster
Min: 0.974215 -> 0.734259: 1.33x faster
Min: 0.978781 -> 0.709915: 1.38x faster
Min: 0.955977 -> 0.729387: 1.31x faster

$ grep ^Calibration patched.fast
Calibration: num_runs=50, num_loops=16 (0.73 sec per run > min_time 0.50 sec, estimated total: 36.4 sec)
Calibration: num_runs=50, num_loops=16 (0.75 sec per run > min_time 0.50 sec, estimated total: 37.3 sec)
Calibration: num_runs=50, num_loops=16 (0.75 sec per run > min_time 0.50 sec, estimated total: 37.4 sec)
Calibration: num_runs=50, num_loops=16 (0.73 sec per run > min_time 0.50 sec, estimated total: 36.6 sec)
Calibration: num_runs=50, num_loops=16 (0.73 sec per run > min_time 0.50 sec, estimated total: 36.7 sec)

$ grep ^Calibration patched
Calibration: num_runs=100, num_loops=16 (0.73 sec per run > min_time 0.50 sec, estimated total: 73.0 sec)
Calibration: num_runs=100, num_loops=16 (0.75 sec per run > min_time 0.50 sec, estimated total: 75.3 sec)
Calibration: num_runs=100, num_loops=16 (0.73 sec per run > min_time 0.50 sec, estimated total: 73.2 sec)
Calibration: num_runs=100, num_loops=16 (0.74 sec per run > min_time 0.50 sec, estimated total: 73.7 sec)
Calibration: num_runs=100, num_loops=16 (0.73 sec per run > min_time 0.50 sec, estimated total: 72.9 sec)
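To make the calibration lines easier to read, here is a rough sketch of how such a calibration could work. It is an assumption, not the code of the patch: the doubling strategy is only suggested by num_loops=16 being a power of two, and calibrate() and estimated_total() are hypothetical names. The estimated total in the output is consistent with num_runs multiplied by the time per run (for example 100 runs at about 0.73 sec gives about 73 sec).

```python
import time


def calibrate(func, min_time=0.5, timer=time.perf_counter):
    """Double the number of loops until one run takes longer than min_time.

    Returns (num_loops, seconds_per_run). Hypothetical sketch of a
    calibration step, not the actual perf.py patch.
    """
    num_loops = 1
    while True:
        start = timer()
        for _ in range(num_loops):
            func()
        elapsed = timer() - start
        if elapsed > min_time:
            return num_loops, elapsed
        num_loops *= 2


def estimated_total(num_runs, seconds_per_run):
    """Estimated duration of the benchmark: runs times seconds per run."""
    return num_runs * seconds_per_run


if __name__ == "__main__":
    # Stand-in workload; any callable can be calibrated.
    def workload():
        sum(range(10000))

    loops, per_run = calibrate(workload)
    print("num_loops=%d (%.2f sec per run), estimated total: %.1f sec"
          % (loops, per_run, estimated_total(50, per_run)))
```

The timer parameter exists so that the sketch can be exercised with a fake clock instead of real time.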