[Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build) (original) (raw)

Brett Cannon brett at python.org
Mon Oct 1 01:12:47 CEST 2012

Previous message: [Python-Dev] benchmarks: Force map to a list to guarantee the calculations are performed under
Next message: [Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build)
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

I am presenting the talk "Python 3.3: Trust Me, It's Better Than 2.7" as PyCon Argentina and Brasil (and US if they accept the talk). As part of that talk I need to be able to benchmark Python 3.3 against 2.7 (both from tip) using the unladen benchmarks (which now include benchmarks from PyPy that can be relatively easily ported to Python 3).

To make sure the unladen benchmarks run fine against Python 3.3, I did a fast run of the benchmarks. I figured people might be interested in the quick-and-dirty results on my 2 GHz Intel Core i7 MacBook Pro w/ 8 GB RAM and no attempt to control for performance beyond not actively browsing the web. As I said, quick-and-dirty and not authoritative; all done just to make sure all the benchmarks could run to completion (including the django, html5lib, and genshi benchmarks which are only on my laptop ATM until those projects cut a release with official Python 3 support).

One thing to keep in mind is that many benchmarks use a raw str for things, so the benchmarks often compare Python 2.7 str vs. Python 3.3 str (i.e. str vs. unicode). While this might seem unfair, this is what real-world comparisons in performance will be from users so it's an (somewhat unfair) comparison that we just have to live with. I might take the time to try to make some tests run under both raw strings and unicode so both comparisons are available.

If you care about helping out with the benchmarks (e.g. helping spot where the iteration counts should be higher, etc.) then head over to the speed at mailing list.

python3 perf.py -T --basedir ../benchmarks -f -b py3k ../cpython/builds/2.7-wide/bin/python ../cpython/builds/3.3/bin/python3.3

... output about the command line for the benchmarks ...

2to3

0.785234 -> 0.722169: 1.09x faster

call_method

Min: 0.491433 -> 0.414841: 1.18x faster Avg: 0.493640 -> 0.416564: 1.19x faster Significant (t=127.21) Stddev: 0.00170 -> 0.00162: 1.0513x smaller

call_method_slots

Min: 0.492749 -> 0.416280: 1.18x faster Avg: 0.497888 -> 0.419275: 1.19x faster Significant (t=61.72) Stddev: 0.00433 -> 0.00237: 1.8304x smaller

call_method_unknown

Min: 0.575536 -> 0.427234: 1.35x faster Avg: 0.577286 -> 0.433428: 1.33x faster Significant (t=66.09) Stddev: 0.00117 -> 0.00835: 7.1621x larger

call_simple

Min: 0.413011 -> 0.338923: 1.22x faster Avg: 0.415862 -> 0.340699: 1.22x faster Significant (t=111.94) Stddev: 0.00223 -> 0.00134: 1.6616x smaller

chaos

Min: 0.375286 -> 0.435456: 1.16x slower Avg: 0.382798 -> 0.459515: 1.20x slower Significant (t=-5.01) Stddev: 0.01116 -> 0.03234: 2.8980x larger

fastpickle

Min: 0.853560 -> 0.770580: 1.11x faster Avg: 0.879498 -> 0.776249: 1.13x faster Significant (t=8.24) Stddev: 0.02771 -> 0.00407: 6.7995x smaller

float

Min: 0.476596 -> 0.391101: 1.22x faster Avg: 0.486164 -> 0.411553: 1.18x faster Significant (t=9.07) Stddev: 0.01049 -> 0.01511: 1.4411x larger

formatted_logging

Min: 0.346703 -> 0.451643: 1.30x slower Avg: 0.351218 -> 0.454626: 1.29x slower Significant (t=-51.50) Stddev: 0.00376 -> 0.00246: 1.5265x smaller

genshi

Min: 0.275107 -> 0.294309: 1.07x slower Avg: 0.287433 -> 0.299026: 1.04x slower Significant (t=-3.82) Stddev: 0.01077 -> 0.00467: 2.3044x smaller

go

Min: 0.719160 -> 0.781042: 1.09x slower Avg: 0.729322 -> 0.798135: 1.09x slower Significant (t=-8.54) Stddev: 0.01300 -> 0.01248: 1.0415x smaller

hexiom2

203.842661 -> 187.107363: 1.09x faster

iterative_count

Min: 0.145088 -> 0.153285: 1.06x slower Avg: 0.146369 -> 0.154425: 1.06x slower Significant (t=-9.21) Stddev: 0.00134 -> 0.00142: 1.0569x larger

json_dump_v2

Min: 3.512367 -> 4.040813: 1.15x slower Avg: 3.521879 -> 4.057966: 1.15x slower Significant (t=-64.29) Stddev: 0.01071 -> 0.01526: 1.4247x larger

json_load

Min: 1.024560 -> 0.642353: 1.60x faster Avg: 1.025255 -> 0.644000: 1.59x faster Significant (t=426.59) Stddev: 0.00049 -> 0.00194: 3.9240x larger

mako_v2

Min: 0.137584 -> 0.287701: 2.09x slower Avg: 0.140620 -> 0.293204: 2.09x slower Significant (t=-296.14) Stddev: 0.00243 -> 0.00272: 1.1195x larger

meteor_contest

Min: 0.284739 -> 0.254285: 1.12x faster Avg: 0.286174 -> 0.255323: 1.12x faster Significant (t=38.02) Stddev: 0.00124 -> 0.00133: 1.0725x larger

nbody

Min: 0.491416 -> 0.336127: 1.46x faster Avg: 0.493339 -> 0.337467: 1.46x faster Significant (t=185.50) Stddev: 0.00164 -> 0.00092: 1.7927x smaller

normal_startup

Min: 0.639285 -> 0.898157: 1.40x slower Avg: 0.645513 -> 0.901586: 1.40x slower Significant (t=-90.10) Stddev: 0.00575 -> 0.00270: 2.1309x smaller

nqueens

Min: 0.399351 -> 0.429575: 1.08x slower Avg: 0.403643 -> 0.430284: 1.07x slower Significant (t=-9.83) Stddev: 0.00603 -> 0.00053: 11.3092x smaller

pathlib

Min: 0.137462 -> 0.170506: 1.24x slower Avg: 0.145370 -> 0.172849: 1.19x slower Significant (t=-11.09) Stddev: 0.01232 -> 0.00128: 9.6403x smaller

pidigits

Min: 0.400265 -> 0.379307: 1.06x faster Avg: 0.401755 -> 0.381171: 1.05x faster Significant (t=14.65) Stddev: 0.00259 -> 0.00178: 1.4496x smaller

raytrace

Min: 1.770596 -> 1.958350: 1.11x slower Avg: 1.773719 -> 1.968401: 1.11x slower Significant (t=-44.19) Stddev: 0.00439 -> 0.00882: 2.0099x larger

regex_effbot

Min: 0.076566 -> 0.098124: 1.28x slower Avg: 0.077491 -> 0.098696: 1.27x slower Significant (t=-54.47) Stddev: 0.00052 -> 0.00069: 1.3227x larger

regex_v8

Min: 0.091530 -> 0.109116: 1.19x slower Avg: 0.092308 -> 0.113627: 1.23x slower Significant (t=-5.72) Stddev: 0.00088 -> 0.00829: 9.4271x larger

richards

Min: 0.257974 -> 0.232134: 1.11x faster Avg: 0.259248 -> 0.234325: 1.11x faster Significant (t=23.80) Stddev: 0.00144 -> 0.00185: 1.2823x larger

simple_logging

Min: 0.326569 -> 0.416797: 1.28x slower Avg: 0.331694 -> 0.418844: 1.26x slower Significant (t=-36.32) Stddev: 0.00523 -> 0.00122: 4.3004x smaller

spectral_norm

Min: 0.483011 -> 0.741558: 1.54x slower Avg: 0.487128 -> 0.749741: 1.54x slower Significant (t=-57.40) Stddev: 0.00512 -> 0.00886: 1.7299x larger

startup_nosite

Min: 0.220444 -> 0.374521: 1.70x slower Avg: 0.222773 -> 0.376785: 1.69x slower Significant (t=-176.17) Stddev: 0.00166 -> 0.00221: 1.3331x larger

threaded_count

Min: 0.171352 -> 0.151892: 1.13x faster Avg: 0.183180 -> 0.153634: 1.19x faster Significant (t=8.12) Stddev: 0.00801 -> 0.00140: 5.7241x smaller

unpack_sequence

Min: 0.000075 -> 0.000061: 1.23x faster Avg: 0.000101 -> 0.000065: 1.54x faster Significant (t=206.90) Stddev: 0.00001 -> 0.00000: 3.2374x smaller

The following not significant results are hidden, use -v to show them: chameleon, fannkuch, fastunpickle, regex_compile, silent_logging

django

Min: 0.868956 -> 0.894571: 1.03x slower Avg: 0.873620 -> 0.905274: 1.04x slower Significant (t=-6.97) Stddev: 0.00313 -> 0.00966: 3.0912x larger

genshi

Min: 0.269615 -> 0.286348: 1.06x slower Avg: 0.272206 -> 0.290708: 1.07x slower Significant (t=-12.29) Stddev: 0.00253 -> 0.00526: 2.0793x larger

html5lib

12.279808 -> 11.862586: 1.04x faster -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.python.org/pipermail/python-dev/attachments/20120930/ea3e95d3/attachment.html>

Previous message: [Python-Dev] benchmarks: Force map to a list to guarantee the calculations are performed under
Next message: [Python-Dev] Benchmarking Python 3.3 against Python 2.7 (wide build)
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the Python-Dev mailing list