[Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build)
Brett Cannon brett at python.org
Mon Oct 1 01:12:47 CEST 2012
I am presenting the talk "Python 3.3: Trust Me, It's Better Than 2.7" at PyCon Argentina and Brasil (and US if they accept the talk). As part of that talk I need to be able to benchmark Python 3.3 against 2.7 (both from tip) using the unladen benchmarks (which now include benchmarks from PyPy that can be relatively easily ported to Python 3).
To make sure the unladen benchmarks run fine against Python 3.3, I did a fast run of the benchmarks. I figured people might be interested in the quick-and-dirty results on my 2 GHz Intel Core i7 MacBook Pro w/ 8 GB RAM and no attempt to control for performance beyond not actively browsing the web. As I said, quick-and-dirty and not authoritative; all done just to make sure all the benchmarks could run to completion (including the django, html5lib, and genshi benchmarks which are only on my laptop ATM until those projects cut a release with official Python 3 support).
One thing to keep in mind is that many benchmarks use a raw str for things, so the benchmarks often compare Python 2.7 str vs. Python 3.3 str (i.e. str vs. unicode). While this might seem unfair, this is what real-world performance comparisons by users will look like, so it's a (somewhat unfair) comparison that we just have to live with. I might take the time to try to make some tests run under both raw strings and unicode so both comparisons are available.
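To make the str vs. unicode point concrete, here is a minimal sketch that is not part of the benchmark suite. The helper names `literal_kinds` and `time_both` are hypothetical, and the `.upper()` workload is only an illustrative stand-in for what a real benchmark does. It reports what an unprefixed literal means on the running interpreter and times the same operation on explicit bytes and explicit unicode, which is roughly what running a test "under both raw strings and unicode" would involve.

```python
import timeit


def literal_kinds():
    """Report what an unprefixed string literal is on this interpreter."""
    raw = "spam"
    return {
        # True on 2.7, where str is the bytes type.
        "raw_is_bytes": isinstance(raw, bytes),
        # True on 3.3, where str is the unicode type.
        "raw_is_text": isinstance(raw, type(u"")),
    }


def time_both(number=10000):
    """Time one illustrative operation on explicit unicode and bytes values."""
    text_time = timeit.timeit(lambda: u"spam eggs".upper(), number=number)
    bytes_time = timeit.timeit(lambda: b"spam eggs".upper(), number=number)
    return {"text": text_time, "bytes": bytes_time}


if __name__ == "__main__":
    print(literal_kinds())
    print(time_both())
```

A benchmark written with a bare `"spam eggs"` literal exercises the `bytes` row on 2.7 and the `text` row on 3.3, so a single run mixes the interpreter change with the string-type change.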
If you care about helping out with the benchmarks (e.g. helping spot where the iteration counts should be higher, etc.) then head over to the speed@ mailing list.
python3 perf.py -T --basedir ../benchmarks -f -b py3k ../cpython/builds/2.7-wide/bin/python ../cpython/builds/3.3/bin/python3.3
... output about the command line for the benchmarks ...
2to3
0.785234 -> 0.722169: 1.09x faster

call_method
Min: 0.491433 -> 0.414841: 1.18x faster
Avg: 0.493640 -> 0.416564: 1.19x faster
Significant (t=127.21)
Stddev: 0.00170 -> 0.00162: 1.0513x smaller

call_method_slots
Min: 0.492749 -> 0.416280: 1.18x faster
Avg: 0.497888 -> 0.419275: 1.19x faster
Significant (t=61.72)
Stddev: 0.00433 -> 0.00237: 1.8304x smaller

call_method_unknown
Min: 0.575536 -> 0.427234: 1.35x faster
Avg: 0.577286 -> 0.433428: 1.33x faster
Significant (t=66.09)
Stddev: 0.00117 -> 0.00835: 7.1621x larger

call_simple
Min: 0.413011 -> 0.338923: 1.22x faster
Avg: 0.415862 -> 0.340699: 1.22x faster
Significant (t=111.94)
Stddev: 0.00223 -> 0.00134: 1.6616x smaller

chaos
Min: 0.375286 -> 0.435456: 1.16x slower
Avg: 0.382798 -> 0.459515: 1.20x slower
Significant (t=-5.01)
Stddev: 0.01116 -> 0.03234: 2.8980x larger

fastpickle
Min: 0.853560 -> 0.770580: 1.11x faster
Avg: 0.879498 -> 0.776249: 1.13x faster
Significant (t=8.24)
Stddev: 0.02771 -> 0.00407: 6.7995x smaller

float
Min: 0.476596 -> 0.391101: 1.22x faster
Avg: 0.486164 -> 0.411553: 1.18x faster
Significant (t=9.07)
Stddev: 0.01049 -> 0.01511: 1.4411x larger

formatted_logging
Min: 0.346703 -> 0.451643: 1.30x slower
Avg: 0.351218 -> 0.454626: 1.29x slower
Significant (t=-51.50)
Stddev: 0.00376 -> 0.00246: 1.5265x smaller

genshi
Min: 0.275107 -> 0.294309: 1.07x slower
Avg: 0.287433 -> 0.299026: 1.04x slower
Significant (t=-3.82)
Stddev: 0.01077 -> 0.00467: 2.3044x smaller

go
Min: 0.719160 -> 0.781042: 1.09x slower
Avg: 0.729322 -> 0.798135: 1.09x slower
Significant (t=-8.54)
Stddev: 0.01300 -> 0.01248: 1.0415x smaller

hexiom2
203.842661 -> 187.107363: 1.09x faster

iterative_count
Min: 0.145088 -> 0.153285: 1.06x slower
Avg: 0.146369 -> 0.154425: 1.06x slower
Significant (t=-9.21)
Stddev: 0.00134 -> 0.00142: 1.0569x larger

json_dump_v2
Min: 3.512367 -> 4.040813: 1.15x slower
Avg: 3.521879 -> 4.057966: 1.15x slower
Significant (t=-64.29)
Stddev: 0.01071 -> 0.01526: 1.4247x larger

json_load
Min: 1.024560 -> 0.642353: 1.60x faster
Avg: 1.025255 -> 0.644000: 1.59x faster
Significant (t=426.59)
Stddev: 0.00049 -> 0.00194: 3.9240x larger

mako_v2
Min: 0.137584 -> 0.287701: 2.09x slower
Avg: 0.140620 -> 0.293204: 2.09x slower
Significant (t=-296.14)
Stddev: 0.00243 -> 0.00272: 1.1195x larger

meteor_contest
Min: 0.284739 -> 0.254285: 1.12x faster
Avg: 0.286174 -> 0.255323: 1.12x faster
Significant (t=38.02)
Stddev: 0.00124 -> 0.00133: 1.0725x larger

nbody
Min: 0.491416 -> 0.336127: 1.46x faster
Avg: 0.493339 -> 0.337467: 1.46x faster
Significant (t=185.50)
Stddev: 0.00164 -> 0.00092: 1.7927x smaller

normal_startup
Min: 0.639285 -> 0.898157: 1.40x slower
Avg: 0.645513 -> 0.901586: 1.40x slower
Significant (t=-90.10)
Stddev: 0.00575 -> 0.00270: 2.1309x smaller

nqueens
Min: 0.399351 -> 0.429575: 1.08x slower
Avg: 0.403643 -> 0.430284: 1.07x slower
Significant (t=-9.83)
Stddev: 0.00603 -> 0.00053: 11.3092x smaller

pathlib
Min: 0.137462 -> 0.170506: 1.24x slower
Avg: 0.145370 -> 0.172849: 1.19x slower
Significant (t=-11.09)
Stddev: 0.01232 -> 0.00128: 9.6403x smaller

pidigits
Min: 0.400265 -> 0.379307: 1.06x faster
Avg: 0.401755 -> 0.381171: 1.05x faster
Significant (t=14.65)
Stddev: 0.00259 -> 0.00178: 1.4496x smaller

raytrace
Min: 1.770596 -> 1.958350: 1.11x slower
Avg: 1.773719 -> 1.968401: 1.11x slower
Significant (t=-44.19)
Stddev: 0.00439 -> 0.00882: 2.0099x larger

regex_effbot
Min: 0.076566 -> 0.098124: 1.28x slower
Avg: 0.077491 -> 0.098696: 1.27x slower
Significant (t=-54.47)
Stddev: 0.00052 -> 0.00069: 1.3227x larger

regex_v8
Min: 0.091530 -> 0.109116: 1.19x slower
Avg: 0.092308 -> 0.113627: 1.23x slower
Significant (t=-5.72)
Stddev: 0.00088 -> 0.00829: 9.4271x larger

richards
Min: 0.257974 -> 0.232134: 1.11x faster
Avg: 0.259248 -> 0.234325: 1.11x faster
Significant (t=23.80)
Stddev: 0.00144 -> 0.00185: 1.2823x larger

simple_logging
Min: 0.326569 -> 0.416797: 1.28x slower
Avg: 0.331694 -> 0.418844: 1.26x slower
Significant (t=-36.32)
Stddev: 0.00523 -> 0.00122: 4.3004x smaller

spectral_norm
Min: 0.483011 -> 0.741558: 1.54x slower
Avg: 0.487128 -> 0.749741: 1.54x slower
Significant (t=-57.40)
Stddev: 0.00512 -> 0.00886: 1.7299x larger

startup_nosite
Min: 0.220444 -> 0.374521: 1.70x slower
Avg: 0.222773 -> 0.376785: 1.69x slower
Significant (t=-176.17)
Stddev: 0.00166 -> 0.00221: 1.3331x larger

threaded_count
Min: 0.171352 -> 0.151892: 1.13x faster
Avg: 0.183180 -> 0.153634: 1.19x faster
Significant (t=8.12)
Stddev: 0.00801 -> 0.00140: 5.7241x smaller

unpack_sequence
Min: 0.000075 -> 0.000061: 1.23x faster
Avg: 0.000101 -> 0.000065: 1.54x faster
Significant (t=206.90)
Stddev: 0.00001 -> 0.00000: 3.2374x smaller

The following not significant results are hidden, use -v to show them: chameleon, fannkuch, fastunpickle, regex_compile, silent_logging
django
Min: 0.868956 -> 0.894571: 1.03x slower
Avg: 0.873620 -> 0.905274: 1.04x slower
Significant (t=-6.97)
Stddev: 0.00313 -> 0.00966: 3.0912x larger

genshi
Min: 0.269615 -> 0.286348: 1.06x slower
Avg: 0.272206 -> 0.290708: 1.07x slower
Significant (t=-12.29)
Stddev: 0.00253 -> 0.00526: 2.0793x larger

html5lib
12.279808 -> 11.862586: 1.04x faster
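For anyone reading the numbers above, the "faster"/"slower" factors appear to be the ratio of the larger timing to the smaller one, with the 2.7 timing on the left and the 3.3 timing on the right. The sketch below reproduces that arithmetic for a few of the Min timings; `describe_change` is a hypothetical helper written for this note, not code taken from perf.py.

```python
def describe_change(old, new):
    """Format a before/after timing pair as 'N.NNx faster' or 'N.NNx slower'."""
    if old <= 0 or new <= 0:
        raise ValueError("timings must be positive")
    if new < old:
        return "%.2fx faster" % (old / new)
    if new > old:
        return "%.2fx slower" % (new / old)
    return "no change"


# Min timings copied from the results above (2.7 first, 3.3 second).
samples = {
    "call_method": (0.491433, 0.414841),
    "mako_v2": (0.137584, 0.287701),
    "startup_nosite": (0.220444, 0.374521),
}

for name, (old, new) in sorted(samples.items()):
    print("%s: %f -> %f: %s" % (name, old, new, describe_change(old, new)))
```

Under this reading, "2.09x slower" for mako_v2 means the 3.3 run took about 2.09 times as long as the 2.7 run, not that it lost 209% of its speed.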