Message 347649 - Python tracker

Decreasing the total wall time for a default --enable-optimizations build would be a good thing for everyone, provided the resulting interpreter remains "effectively similar" in speed. If you somehow manage to find something that actually speeds up the resulting interpreter, amazing!

I spent quite a lot of time making different PGO builds and comparing with pyperformance. The current PGO task is really slow. Just running the PROFILE_TASK takes 24 minutes on my decently fast PC.

Using this set of tests seems to work pretty well:

PROFILE_TASK=-m test.regrtest --pgo
test_collections
test_dataclasses
test_difflib
test_embed
test_float
test_functools
test_generators
test_int
test_itertools
test_json
test_logging
test_long
test_ordered_dict
test_pickle
test_pprint
test_re
test_set
test_statistics
test_struct
test_tabnanny
test_xml_etree

Instead of 24 minutes, the above task takes one and a half minutes, about 16 times faster. pyperformance results seem largely unchanged; the comparison is below. Tuning the tests to get the best pyperformance result is a bit dangerous, and perhaps running the whole test suite is safer (i.e. we would not be optimizing for specific benchmarks). I didn't tweak the list too much. I added test_int, test_long, test_struct and test_itertools as a result of my pyperformance runs. It is not too surprising that those are important modules.

I think the set of tests above should do a pretty good job of covering the hot code paths in most Python programs. So, maybe it is good enough given the massive speedup in build time.
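For anyone who wants to try the shorter task locally, here is a minimal sketch that assembles the PROFILE_TASK value from the test list in this message and prints a make invocation. It assumes you are in a CPython source tree already configured with --enable-optimizations, and that overriding PROFILE_TASK on the make command line is acceptable for your build (ordinary make variable overriding; not something this message verifies). The helper names `profile_task` and `make_command` are made up for illustration.

```python
import shlex

# Test modules copied from the list in this message.
PGO_TESTS = [
    "test_collections", "test_dataclasses", "test_difflib", "test_embed",
    "test_float", "test_functools", "test_generators", "test_int",
    "test_itertools", "test_json", "test_logging", "test_long",
    "test_ordered_dict", "test_pickle", "test_pprint", "test_re",
    "test_set", "test_statistics", "test_struct", "test_tabnanny",
    "test_xml_etree",
]


def profile_task(tests):
    """Return the PROFILE_TASK value: regrtest in --pgo mode plus the chosen tests."""
    return "-m test.regrtest --pgo " + " ".join(tests)


def make_command(tests):
    """Return an argv list that overrides PROFILE_TASK on the make command line.

    Assumption: the build honors a command-line override of PROFILE_TASK.
    """
    return ["make", "PROFILE_TASK=" + profile_task(tests)]


if __name__ == "__main__":
    # Only prints a copy-pasteable shell command; nothing is executed here.
    print(shlex.join(make_command(PGO_TESTS)))
```

Keeping the list in one place makes it easy to add or drop a module and rebuild when comparing variants with pyperformance.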

+-------------------------+----------+------------------------------+
| Benchmark               | task-all | task-short                   |
+=========================+==========+==============================+
| 2to3                    | 311 ms   | 315 ms: 1.01x slower (+1%)   |
+-------------------------+----------+------------------------------+
| chaos                   | 111 ms   | 108 ms: 1.02x faster (-2%)   |
+-------------------------+----------+------------------------------+
| crypto_pyaes            | 114 ms   | 112 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| dulwich_log             | 78.0 ms  | 78.7 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| fannkuch                | 470 ms   | 452 ms: 1.04x faster (-4%)   |
+-------------------------+----------+------------------------------+
| float                   | 118 ms   | 117 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| go                      | 253 ms   | 255 ms: 1.01x slower (+1%)   |
+-------------------------+----------+------------------------------+
| json_dumps              | 12.5 ms  | 11.8 ms: 1.06x faster (-6%)  |
+-------------------------+----------+------------------------------+
| json_loads              | 26.3 us  | 25.4 us: 1.04x faster (-3%)  |
+-------------------------+----------+------------------------------+
| logging_format          | 9.53 us  | 9.66 us: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| logging_silent          | 198 ns   | 196 ns: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| mako                    | 15.2 ms  | 15.8 ms: 1.04x slower (+4%)  |
+-------------------------+----------+------------------------------+
| meteor_contest          | 98.2 ms  | 96.8 ms: 1.01x faster (-1%)  |
+-------------------------+----------+------------------------------+
| nbody                   | 135 ms   | 133 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| nqueens                 | 97.2 ms  | 96.6 ms: 1.01x faster (-1%)  |
+-------------------------+----------+------------------------------+
| pathlib                 | 19.4 ms  | 19.7 ms: 1.02x slower (+2%)  |
+-------------------------+----------+------------------------------+
| pickle                  | 8.10 us  | 9.07 us: 1.12x slower (+12%) |
+-------------------------+----------+------------------------------+
| pickle_dict             | 23.1 us  | 18.6 us: 1.25x faster (-20%) |
+-------------------------+----------+------------------------------+
| pickle_list             | 3.64 us  | 2.81 us: 1.30x faster (-23%) |
+-------------------------+----------+------------------------------+
| pickle_pure_python      | 470 us   | 460 us: 1.02x faster (-2%)   |
+-------------------------+----------+------------------------------+
| pidigits                | 169 ms   | 173 ms: 1.02x slower (+2%)   |
+-------------------------+----------+------------------------------+
| python_startup          | 7.94 ms  | 8.02 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| python_startup_no_site  | 5.44 ms  | 5.49 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| raytrace                | 495 ms   | 490 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| regex_dna               | 172 ms   | 179 ms: 1.04x slower (+4%)   |
+-------------------------+----------+------------------------------+
| regex_effbot            | 2.95 ms  | 2.85 ms: 1.04x faster (-3%)  |
+-------------------------+----------+------------------------------+
| regex_v8                | 20.7 ms  | 21.5 ms: 1.04x slower (+4%)  |
+-------------------------+----------+------------------------------+
| richards                | 68.9 ms  | 69.8 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| scimark_sparse_mat_mult | 4.57 ms  | 4.29 ms: 1.07x faster (-6%)  |
+-------------------------+----------+------------------------------+
| spectral_norm           | 134 ms   | 133 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| sqlalchemy_declarative  | 161 ms   | 163 ms: 1.01x slower (+1%)   |
+-------------------------+----------+------------------------------+
| sqlalchemy_imperative   | 30.6 ms  | 31.0 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| sqlite_synth            | 2.90 us  | 2.95 us: 1.02x slower (+2%)  |
+-------------------------+----------+------------------------------+
| sympy_expand            | 422 ms   | 418 ms: 1.01x faster (-1%)   |
+-------------------------+----------+------------------------------+
| sympy_integrate         | 19.0 ms  | 19.2 ms: 1.01x slower (+1%)  |
+-------------------------+----------+------------------------------+
| sympy_sum               | 89.6 ms  | 91.7 ms: 1.02x slower (+2%)  |
+-------------------------+----------+------------------------------+
| telco                   | 6.06 ms  | 6.28 ms: 1.04x slower (+4%)  |
+-------------------------+----------+------------------------------+
| tornado_http            | 178 ms   | 181 ms: 1.02x slower (+2%)   |
+-------------------------+----------+------------------------------+
| unpickle_list           | 3.97 us  | 3.78 us: 1.05x faster (-5%)  |
+-------------------------+----------+------------------------------+
| unpickle_pure_python    | 326 us   | 324 us: 1.00x faster (-0%)   |
+-------------------------+----------+------------------------------+
| xml_etree_generate      | 90.6 ms  | 91.0 ms: 1.00x slower (+0%)  |
+-------------------------+----------+------------------------------+
| xml_etree_process       | 72.0 ms  | 71.4 ms: 1.01x faster (-1%)  |
+-------------------------+----------+------------------------------+

Not significant (15): deltablue; django_template; hexiom; html5lib; logging_simple; regex_compile; scimark_fft; scimark_lu; scimark_monte_carlo; scimark_sor; sympy_str; unpack_sequence; unpickle; xml_etree_parse; xml_etree_iterparse
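
One way to condense a comparison like this into a single number is the geometric mean of the per-benchmark timing ratios (task-short divided by task-all). The sketch below is only an illustration: the four sample rows are copied from the table in this message, the function names are hypothetical, and a mean over a handful of hand-picked rows says nothing about the full table. Paste in all rows for a real summary.

```python
import math
import re

# A few rows copied from the comparison table; extend with the rest as needed.
ROWS = """
| fannkuch    | 470 ms  | 452 ms: 1.04x faster (-4%)   |
| pickle      | 8.10 us | 9.07 us: 1.12x slower (+12%) |
| pickle_list | 3.64 us | 2.81 us: 1.30x faster (-23%) |
| regex_dna   | 172 ms  | 179 ms: 1.04x slower (+4%)   |
"""

# name | task-all value unit | task-short value unit: ...
ROW_RE = re.compile(r"\|\s*(\w+)\s*\|\s*([\d.]+) (\w+)\s*\|\s*([\d.]+) (\w+):")


def parse_ratios(text):
    """Map benchmark name -> task-short time / task-all time (below 1.0 is faster)."""
    ratios = {}
    for name, base, base_unit, new, new_unit in ROW_RE.findall(text):
        if base_unit != new_unit:
            continue  # skip rows whose two timings use different units
        ratios[name] = float(new) / float(base)
    return ratios


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    ratios = parse_ratios(ROWS)
    for name, ratio in sorted(ratios.items()):
        print(f"{name}: {ratio:.3f}")
    print(f"geometric mean: {geometric_mean(ratios.values()):.3f}")
```

The geometric mean is used rather than the arithmetic mean so that a 2x slowdown and a 2x speedup cancel out.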