msg273243 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2016-08-20 19:43 |
I noticed that quite some time during number formatting is spent parsing the format spec. The attached patch speeds up float formatting by 5-15% and integer formatting by 20-30% for me overall when using f-strings (Ubuntu 16.04, 64bit). |
|
|
msg273244 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-08-20 21:10 |
What benchmarks did you use? |
|
|
msg273246 - (view) |
Author: Stefan Behnel (scoder) *  |
Date: 2016-08-20 21:27 |
You can easily see it by running timeit on fstrings, e.g.

patched:

$ ./python -m timeit 'f"{34276394612:15}"'
1000000 loops, best of 3: 0.352 usec per loop
$ ./python -m timeit 'f"{34.276394612:8.6f}"'
1000000 loops, best of 3: 0.497 usec per loop

and original Py3.6 master:

$ ./python -m timeit 'f"{34276394612:15}"'
1000000 loops, best of 3: 0.435 usec per loop
$ ./python -m timeit 'f"{34.276394612:8.6f}"'
1000000 loops, best of 3: 0.589 usec per loop

It doesn't make much of a difference if you use constants or variables, BTW. |
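For anyone who wants to repeat the measurement from a script rather than the shell, here is a minimal sketch using the standard `timeit` module. The helper name `best_usec` and the loop counts are assumptions for illustration, not part of the patch or the original commands; absolute numbers will differ by machine and build.

```python
import timeit

# The two f-string expressions quoted in the thread.
STATEMENTS = [
    'f"{34276394612:15}"',
    'f"{34.276394612:8.6f}"',
]


def best_usec(stmt, number=200_000, repeat=3):
    """Return the best per-loop time in microseconds for `stmt`.

    Hypothetical helper: mirrors the "best of 3" figure that the
    timeit command line reports, but with a smaller loop count.
    """
    timings = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for stmt in STATEMENTS:
        print(f"{stmt}: {best_usec(stmt):.3f} usec per loop")
```

Run it once with the unpatched interpreter and once with the patched one, then compare the two outputs.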
|
|
msg273368 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-08-22 12:41 |
LGTM. I expect that with the patch for the effect of this optimization is even more significant. Years ago I wrote a much larger patch that included a similar but even more aggressive micro-optimization, but the effect was not very impressive for str.format. For f-strings it should be more significant. I'm going to revive my old patch and compare it with your short patch. |
|
|
msg273373 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-08-22 13:10 |
- while (pos<end && Py_ISDIGIT(PyUnicode_READ_CHAR(s, pos)))
+ while (pos<end && Py_ISDIGIT(PyUnicode_READ(ukind, udata, pos)))

Great change. It's really bad for performance to use such an inefficient macro in a loop: PyUnicode_READ_CHAR() uses 2 nested "if" :-/

faster_format.patch LGTM except for Serhiy's comment.

To get the best performance, it's even better to specialize the Unicode code into 4 versions: ascii, latin1, ucs2, ucs4. The "stringlib" does that using C "templates". |
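As a rough illustration of what the loop in that diff does, here is a pure-Python model of scanning a run of ASCII digits (a width or precision) out of a format spec. This is only a sketch: the function name `parse_leading_digits` is hypothetical, it is not the CPython code, and it omits the C implementation's overflow check.

```python
def parse_leading_digits(spec, pos=0):
    """Parse a run of ASCII digits in `spec` starting at `pos`.

    Returns (value, new_pos); value is None if no digit was consumed.
    Only '0'..'9' are accepted, like Py_ISDIGIT; str.isdigit() would
    also accept non-ASCII digits, so it is deliberately not used here.
    """
    end = len(spec)
    value = None
    while pos < end and "0" <= spec[pos] <= "9":
        value = (value or 0) * 10 + (ord(spec[pos]) - ord("0"))
        pos += 1
    return value, pos


# Width of "15", then width and precision of "8.6f".
print(parse_leading_digits("15"))
print(parse_leading_digits("8.6f"))
print(parse_leading_digits("8.6f", 2))
```

In the C version the per-character read is the hot spot, which is why reading through a precomputed kind and data pointer helps compared with re-deriving them on every iteration.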
|
|
msg273858 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2016-08-29 12:54 |
I couldn't find my old patch, so I tried to reimplement it, adding a new file in stringlib. The original patch speeds up microbenchmarks by only about 4% on my computer (32-bit Linux). This is small, but the patch is simple. Moving some functions to template file adds yet about 2%. This is too small for adding new file. Thus I'll commit the original patch, with small style changes. |
|
|
msg273860 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2016-08-29 13:00 |
New changeset 9bddf9e72c96 by Serhiy Storchaka in branch 'default':
Issue #27818: Speed up parsing width and precision in format() strings for
https://hg.python.org/cpython/rev/9bddf9e72c96 |
|
|
msg273917 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2016-08-30 14:15 |
> Moving some functions to template file adds yet about 2%. This is too small for adding new file.

I agree. I checked just after writing my previous comment, and in fact str%args doesn't use a C template either; it also uses PyUnicode_READ(), which is fine. The code is probably already fast enough. |
|
|