msg175503
Author: Philipp Hagemeister (phihag)
Date: 2012-11-13 16:19

On my system, {} has become significantly slower in 3.3:

$ python3.2 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0314 usec per loop
$ python3.3 -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0892 usec per loop
$ hg id -i
ee7b713fec71+
$ ./python -m timeit -n 1000000 '{}'
1000000 loops, best of 3: 0.0976 usec per loop

Is this because of the dict randomization?
|
|
msg175504
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-13 16:43

I confirm that.

$ ./python -m timeit -n 1000000 '{};{};{};{};{};{};{};{};{};{}'
2.6: 0.62 usec
2.7: 0.57 usec
3.1: 0.55 usec
3.2: 0.48 usec
3.3: 1.5 usec

Randomization does not affect it.
|
|
msg175507
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-13 17:03

Ah, this is an effect of PEP 412. The difference exists only when creating dicts of up to 5 items (inclusive).
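To see where that threshold lies on a given build, small dict literals of increasing size can be timed with the `timeit` module. This is a hypothetical measurement sketch, not from the thread; the particular literals and repeat counts are my own choices:

```python
# Hypothetical sketch for locating the small-dict threshold: time dict
# literals of increasing size. Literal choices and repeat counts are
# illustrative, not taken from the original report.
import timeit

literals = [
    "{}",                                                # 0 items
    "{'a': 1}",                                          # 1 item
    "{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}",          # 5 items
    "{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}",  # 6 items
]

def time_literal(src, number=100_000):
    # Take the best of 5 repeats, mirroring how `python -m timeit`
    # reduces scheduling noise.
    return min(timeit.repeat(src, number=number, repeat=5))

for src in literals:
    print(f"{len(eval(src)):2d} items: {time_literal(src):.4f}s")
```

Running this under two interpreters side by side should show the gap only for the smaller literals, if the PEP 412 explanation holds.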
|
|
msg175511
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-13 17:34

Maybe using a free list for keys will restore performance.
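For readers unfamiliar with the idea: CPython keeps free lists in C (e.g. in Objects/dictobject.c) so that freed objects are recycled instead of being reallocated. The following is a purely conceptual Python sketch of that pattern; the names and the size cap are illustrative assumptions, not CPython's actual implementation:

```python
# Conceptual free-list sketch (illustrative only; CPython's real free
# list is C code). Released dicts are cached on a small stack and the
# next allocation reuses one instead of calling the allocator.
_FREELIST = []
_MAXFREE = 80  # assumed cap; a real implementation picks a small bound

def acquire():
    """Hand out a cached dict if one is available, else a fresh one."""
    return _FREELIST.pop() if _FREELIST else {}

def release(d):
    """Return a dict to the cache unless the cache is already full."""
    if len(_FREELIST) < _MAXFREE:
        d.clear()
        _FREELIST.append(d)

d = acquire()
d["x"] = 1
release(d)
assert acquire() is d  # the same object is recycled, no new allocation
```

Serhiy's suggestion is to apply this same trick to the keys tables introduced by PEP 412, so that small-dict creation regains its pre-3.3 speed.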
|
|
msg175528
Author: Antoine Pitrou (pitrou)
Date: 2012-11-13 22:32

Does this regression impact any real-world program?
|
|
msg175531
Author: Raymond Hettinger (rhettinger)
Date: 2012-11-14 01:23

> Does this regression impact any real-world program?

That is a blow-off response. A huge swath of the language is affected by dictionary performance (keyword args, module lookups, attribute lookup, etc). Most programs will be affected to some degree -- computationally bound programs will notice more and I/O-bound programs won't notice at all.
|
|
msg175546
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 07:20

> > Does this regression impact any real-world program?
>
> That is a blow-off response. A huge swath of the language is affected
> by dictionary performance (keyword args, module lookups, attribute
> lookup, etc).

I was merely suggesting to report actual (non-micro) benchmark numbers. This issue is about dict creation, not dict lookups.
|
|
msg175551
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-14 08:46

As I understand it, a new dict is created on every call of a function with keyword arguments. This slows down every such call by about 0.1 µsec, which is about 10% of the cost of int('42', base=16). In sum, some programs can slow down by a few percent. A direct comparison of 3.2 and 3.3 is meaningless because of the influence of other factors (first of all, PEP 393).
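The per-call allocation Serhiy describes is easy to demonstrate: a function taking **kwargs receives a freshly built dict on each call. A minimal illustration (the function name is hypothetical):

```python
# A function that accepts **kwargs receives a freshly allocated dict on
# every call -- this per-call allocation is the cost being estimated.
def f(**kw):
    return kw

d1 = f(a=1)
d2 = f(a=1)
assert d1 == d2       # equal contents...
assert d1 is not d2   # ...but a distinct dict was built for each call
```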
|
|
msg175561
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 11:13

> As I understand it, a new dict is created on every call of a function
> with keyword arguments. This slows down every such call by about
> 0.1 µsec, which is about 10% of the cost of int('42', base=16).

Ok, but `int('42', base=16)` is about the fastest function call with keyword arguments one can think about :-) In other words, the overhead will probably not be noticeable for most function calls.
|
|
msg175563
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-14 11:20

> Ok, but `int('42', base=16)` is about the fastest function call with
> keyword arguments one can think about :-)

Not as fast as a call without keywords, `int('42', 16)`. :-( But this is a different issue.
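A hedged sketch of that comparison (timings are illustrative and vary by interpreter build; only the final equality is guaranteed):

```python
# Both spellings compute the same value, but the keyword form also
# builds a one-entry dict ({'base': 16}) per call, so it is typically
# somewhat slower. Exact ratios depend on the interpreter build.
import timeit

t_pos = timeit.timeit("int('42', 16)", number=100_000)
t_kw = timeit.timeit("int('42', base=16)", number=100_000)
print(f"positional: {t_pos:.4f}s  keyword: {t_kw:.4f}s")

assert int('42', 16) == int('42', base=16) == 0x42
```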
|
|
msg175567
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-14 13:13

Here is a really simple patch. It speeds up empty dict creation by up to 1.9x (still 1.6x slower than 3.2). Please make your measurements on real-world programs.
|
|
msg175581
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 18:56

Ok, here are some benchmark results with your patch:

### call_method ###
Min: 0.313816 -> 0.305150: 1.03x faster
Avg: 0.316292 -> 0.305678: 1.03x faster
Significant (t=21.07)
Stddev: 0.00188 -> 0.00050: 3.7398x smaller
Timeline: http://tinyurl.com/couzwbe

### call_method_unknown ###
Min: 0.323125 -> 0.312034: 1.04x faster
Avg: 0.325697 -> 0.313115: 1.04x faster
Significant (t=17.62)
Stddev: 0.00258 -> 0.00099: 2.6059x smaller
Timeline: http://tinyurl.com/clwacj3

### call_simple ###
Min: 0.215008 -> 0.220773: 1.03x slower
Avg: 0.215666 -> 0.221657: 1.03x slower
Significant (t=-23.28)
Stddev: 0.00062 -> 0.00078: 1.2721x larger
Timeline: http://tinyurl.com/cv6gxft

### fastpickle ###
Min: 0.554353 -> 0.569768: 1.03x slower
Avg: 0.558192 -> 0.574705: 1.03x slower
Significant (t=-6.90)
Stddev: 0.00363 -> 0.00393: 1.0829x larger
Timeline: http://tinyurl.com/c8s2dqk

### float ###
Min: 0.292259 -> 0.279563: 1.05x faster
Avg: 0.296011 -> 0.286154: 1.03x faster
Significant (t=2.99)
Stddev: 0.00417 -> 0.00609: 1.4585x larger
Timeline: http://tinyurl.com/c34ofe5

### iterative_count ###
Min: 0.115091 -> 0.118527: 1.03x slower
Avg: 0.116174 -> 0.119266: 1.03x slower
Significant (t=-6.82)
Stddev: 0.00068 -> 0.00075: 1.1148x larger
Timeline: http://tinyurl.com/bsmuuso

### json_dump_v2 ###
Min: 2.591838 -> 2.652666: 1.02x slower
Avg: 2.599335 -> 2.687225: 1.03x slower
Significant (t=-8.09)
Stddev: 0.00715 -> 0.02322: 3.2450x larger
Timeline: http://tinyurl.com/clxutkt

### json_load ###
Min: 0.378329 -> 0.370132: 1.02x faster
Avg: 0.379840 -> 0.371222: 1.02x faster
Significant (t=13.68)
Stddev: 0.00112 -> 0.00085: 1.3237x smaller
Timeline: http://tinyurl.com/c7ycte2

### nbody ###
Min: 0.259945 -> 0.251294: 1.03x faster
Avg: 0.261186 -> 0.251843: 1.04x faster
Significant (t=14.95)
Stddev: 0.00125 -> 0.00062: 2.0274x smaller
Timeline: http://tinyurl.com/cg8z6yg

### nqueens ###
Min: 0.295304 -> 0.275767: 1.07x faster
Avg: 0.298382 -> 0.278839: 1.07x faster
Significant (t=10.19)
Stddev: 0.00244 -> 0.00353: 1.4429x larger
Timeline: http://tinyurl.com/bo2dn4x

### pathlib ###
Min: 0.100494 -> 0.102371: 1.02x slower
Avg: 0.101787 -> 0.104876: 1.03x slower
Significant (t=-8.15)
Stddev: 0.00103 -> 0.00159: 1.5432x larger
Timeline: http://tinyurl.com/cb9w79t

### richards ###
Min: 0.166939 -> 0.172828: 1.04x slower
Avg: 0.167945 -> 0.174199: 1.04x slower
Significant (t=-9.65)
Stddev: 0.00092 -> 0.00112: 1.2241x larger
Timeline: http://tinyurl.com/cdros5x

### silent_logging ###
Min: 0.070485 -> 0.066857: 1.05x faster
Avg: 0.070538 -> 0.067041: 1.05x faster
Significant (t=44.64)
Stddev: 0.00003 -> 0.00017: 5.4036x larger
Timeline: http://tinyurl.com/buz5h9j

### 2to3 ###
0.490925 -> 0.478927: 1.03x faster

### chameleon ###
Min: 0.050694 -> 0.049364: 1.03x faster
Avg: 0.051030 -> 0.049759: 1.03x faster
Significant (t=8.45)
Stddev: 0.00041 -> 0.00041: 1.0036x larger
Timeline: http://tinyurl.com/d5czdjt

### mako ###
Min: 0.183295 -> 0.177191: 1.03x faster
Avg: 0.184944 -> 0.179300: 1.03x faster
Significant (t=15.54)
Stddev: 0.00119 -> 0.00137: 1.1555x larger
Timeline: http://tinyurl.com/dx6xrvx

In the end, it's mostly a wash.
|
|
msg175583
Author: Jesús Cea Avión (jcea)
Date: 2012-11-14 19:44

Antoine, I would consider this a performance regression to solve for 3.3.1. Small-dictionary creation is everywhere in CPython.
|
|
msg175586
Author: Antoine Pitrou (pitrou)
Date: 2012-11-14 20:13

> Antoine, I would consider this a performance regression to solve for
> 3.3.1. Small-dictionary creation is everywhere in CPython.

Again, feel free to provide real-world benchmark numbers proving the regression. I've already posted some figures.
|
|
msg175615
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2012-11-15 12:07

> In the end, it's mostly a wash.

Yes, that is predictable. The gain is too small and indistinguishable against the background of random noise. Maybe a longer-running benchmark would show more reliable results, but in any case the gain is small. My long-running, heavily optimized simulation is about 5% faster on 3.2 or a patched 3.4 compared to 3.3. But this is not a typical program.
|
|
msg183211
Author: Christian Heimes (christian.heimes)
Date: 2013-02-28 11:06

The patch gives a measurable speedup (./python is a patched 3.3.0+). IMO we should apply it. It's small, and I can see no harm either.

$ for PY in python2.7 python3.2 python3.3 ./python; do
>   cmd="$PY -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'"
>   echo $cmd; eval $cmd
> done
python2.7 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.162 usec per loop
python3.2 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.142 usec per loop
python3.3 -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.669 usec per loop
./python -R -m timeit -n 10000000 '{};{};{};{};{};{};{};{};{};{}'
10000000 loops, best of 3: 0.381 usec per loop

$ for PY in python2.7 python3.2 python3.3 ./python; do
>   cmd="$PY -R -m timeit -n 10000000 'int(\"1\", base=16)'"
>   echo $cmd; eval $cmd
> done
python2.7 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.268 usec per loop
python3.2 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.302 usec per loop
python3.3 -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.477 usec per loop
./python -R -m timeit -n 10000000 'int("1", base=16)'
10000000 loops, best of 3: 0.356 usec per loop
|
|
msg205362
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2013-12-06 11:44

So what should we do with this?
|
|
msg205363
Author: Antoine Pitrou (pitrou)
Date: 2013-12-06 11:48

You said it yourself: """The gain is too small and indistinguishable against the background of random noise.""" Closing would be reasonable, unless someone exhibits a reasonable benchmark where there is a significant difference. In any case, it's too late for perf improvements in 3.4.
|
|
msg205397
Author: Raymond Hettinger (rhettinger)
Date: 2013-12-06 19:35

This patch looks correct and like the right thing to do. I recommend applying it to 3.5.
|
|
msg224937
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2014-08-06 15:04

Antoine, are you still opposed to this patch?
|
|
msg224939
Author: Antoine Pitrou (pitrou)
Date: 2014-08-06 15:26

The patch adds complication to the already complicated memory management of dicts. It increases the maintenance burden in critical code. Have we found any case where it makes a tangible difference? (I'm not talking about timeit micro-benchmarks.)
|
|
msg261565
Author: Serhiy Storchaka (serhiy.storchaka)
Date: 2016-03-11 13:02

Closed in favor of .
|
|
msg261576
Author: STINNER Victor (vstinner)
Date: 2016-03-11 15:33

Serhiy Storchaka:
> Closed in favor of .

Cool!
|
|