msg160103 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-06 18:00 |
I propose a complex patch that significantly speeds up UTF-8 decoding. The decoder is now faster than even the decoder in 3.2 (except in a few unrealistic pathological cases). The decoder code is also smaller and simpler (formerly the decoding code was repeated in at least three places). As a side effect, ASCII decoding is now faster on some platforms ().

Related issues:
[] Faster utf-8 decoding
[] faster utf-8 decoding
[] Faster ascii decoding
[] Faster utf-16 decoder
[] Faster utf-32 decoder
[] Faster utf-8 decoding

Here are the benchmark results (numbers are speeds in MB/s; each percentage in parentheses is the change of the patched build relative to that column).

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

                                  3.2           3.3(vanilla)  patched
utf-8 'A'*10000                   1199 (+69%)   1721 (+18%)   2032
utf-8 'A'*9999+'\x80'             1189 (+25%)   996 (+49%)    1488
utf-8 'A'*9999+'\u0100'           1192 (-25%)   887 (+1%)     894
utf-8 'A'*9999+'\u8000'           1178 (-24%)   888 (+0%)     890
utf-8 'A'*9999+'\U00010000'       1177 (-29%)   872 (-4%)     837
utf-8 '\x80'*10000                220 (+74%)    172 (+122%)   382
utf-8 '\x80'+'A'*9999             1192 (+5%)    376 (+232%)   1250
utf-8 '\x80'*9999+'\u0100'        220 (+54%)    160 (+112%)   339
utf-8 '\x80'*9999+'\u8000'        220 (+54%)    160 (+112%)   339
utf-8 '\x80'*9999+'\U00010000'    221 (+49%)    176 (+88%)    330
utf-8 '\u0100'*10000              220 (+74%)    163 (+134%)   382
utf-8 '\u0100'+'A'*9999           1177 (+4%)    382 (+219%)   1220
utf-8 '\u0100'+'\x80'*9999        220 (+74%)    163 (+134%)   382
utf-8 '\u0100'*9999+'\u8000'      220 (+74%)    163 (+134%)   382
utf-8 '\u0100'*9999+'\U00010000'  220 (+50%)    180 (+83%)    330
utf-8 '\u8000'*10000              261 (+66%)    191 (+126%)   432
utf-8 '\u8000'+'A'*9999           1197 (+1%)    384 (+216%)   1212
utf-8 '\u8000'+'\x80'*9999        216 (+77%)    163 (+134%)   382
utf-8 '\u8000'+'\u0100'*9999      215 (+77%)    164 (+132%)   381
utf-8 '\u8000'*9999+'\U00010000'  261 (+46%)    201 (+89%)    380
utf-8 '\U00010000'*10000          248 (+44%)    198 (+80%)    357
utf-8 '\U00010000'+'A'*9999       1192 (-5%)    383 (+196%)   1135
utf-8 '\U00010000'+'\x80'*9999    220 (+73%)    180 (+111%)   380
utf-8 '\U00010000'+'\u0100'*9999  220 (+73%)    180 (+111%)   380
utf-8 '\U00010000'+'\u8000'*9999  261 (+54%)    201 (+100%)   403
ascii 'A'*10000                   233 (+971%)   1876 (+33%)   2496

On 32-bit Linux, Intel Atom N570 @ 1.66GHz:

                                  3.2           3.3(vanilla)  patched
utf-8 'A'*10000                   345 (+81%)    596 (+5%)     623
utf-8 'A'*9999+'\x80'             335 (+41%)    303 (+56%)    474
utf-8 'A'*9999+'\u0100'           336 (-23%)    123 (+110%)   258
utf-8 'A'*9999+'\u8000'           337 (-24%)    123 (+108%)   256
utf-8 'A'*9999+'\U00010000'       336 (-24%)    261 (-3%)     254
utf-8 '\x80'*10000                88 (+66%)     65 (+125%)    146
utf-8 '\x80'+'A'*9999             334 (+8%)     124 (+190%)   360
utf-8 '\x80'*9999+'\u0100'        88 (+43%)     65 (+94%)     126
utf-8 '\x80'*9999+'\u8000'        88 (+43%)     65 (+94%)     126
utf-8 '\x80'*9999+'\U00010000'    89 (+40%)     65 (+92%)     125
utf-8 '\u0100'*10000              88 (+85%)     65 (+151%)    163
utf-8 '\u0100'+'A'*9999           336 (+2%)     77 (+345%)    343
utf-8 '\u0100'+'\x80'*9999        88 (+86%)     65 (+152%)    164
utf-8 '\u0100'*9999+'\u8000'      88 (+86%)     65 (+152%)    164
utf-8 '\u0100'*9999+'\U00010000'  88 (+57%)     65 (+112%)    138
utf-8 '\u8000'*10000              98 (+79%)     69 (+154%)    175
utf-8 '\u8000'+'A'*9999           339 (+3%)     77 (+353%)    349
utf-8 '\u8000'+'\x80'*9999        89 (+84%)     66 (+148%)    164
utf-8 '\u8000'+'\u0100'*9999      88 (+86%)     65 (+152%)    164
utf-8 '\u8000'*9999+'\U00010000'  98 (+58%)     69 (+125%)    155
utf-8 '\U00010000'*10000          104 (+46%)    79 (+92%)     152
utf-8 '\U00010000'+'A'*9999       339 (-5%)     124 (+160%)   323
utf-8 '\U00010000'+'\x80'*9999    88 (+84%)     68 (+138%)    162
utf-8 '\U00010000'+'\u0100'*9999  88 (+83%)     68 (+137%)    161
utf-8 '\U00010000'+'\u8000'*9999  98 (+63%)     72 (+122%)    160
ascii 'A'*10000                   132 (+499%)   758 (+4%)     791 |
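The benchmark tools used for these tables are not reproduced in this thread. As a rough illustration of how such a figure can be measured, here is a minimal sketch, assuming throughput is counted in encoded bytes per second and that 1 MB means 10**6 bytes. The function name `decode_speed_mb_s`, the `min_seconds` parameter, and the `CASES` selection are hypothetical and are not taken from the patch or its tooling.

```python
import time


def decode_speed_mb_s(text, encoding="utf-8", min_seconds=0.05):
    """Return an approximate decoding throughput in MB/s (assumed 1 MB = 10**6 bytes)."""
    data = text.encode(encoding)
    loops = 0
    elapsed = 0.0
    start = time.perf_counter()
    # Decode in batches until enough wall-clock time has passed to get a stable figure.
    while elapsed < min_seconds:
        for _ in range(100):
            data.decode(encoding)
        loops += 100
        elapsed = time.perf_counter() - start
    return len(data) * loops / elapsed / 1e6


# A few of the string shapes from the tables above (hypothetical selection).
CASES = {
    "'A'*10000": "A" * 10000,
    "'A'*9999+'\\x80'": "A" * 9999 + "\x80",
    "'\\u0100'*10000": "\u0100" * 10000,
    "'\\U00010000'+'A'*9999": "\U00010000" + "A" * 9999,
}

if __name__ == "__main__":
    for label, text in CASES.items():
        print(f"utf-8 {label:<28} {decode_speed_mb_s(text):8.0f} MB/s")
```

Absolute numbers from a sketch like this depend heavily on the machine and the Python build, so only ratios between builds measured on the same machine are meaningful.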
|
|
msg160107 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-05-06 20:01 |
64-bit Linux, Intel Core i5 2500K:

                                  3.2           3.3           patched
utf-8 'A'*10000                   2550 (+198%)  6828 (+11%)   7607
utf-8 'A'*9999+'\x80'             2501 (+118%)  2415 (+126%)  5456
utf-8 'A'*9999+'\u0100'           2501 (-20%)   2297 (-13%)   1996
utf-8 'A'*9999+'\u8000'           2494 (-14%)   2291 (-7%)    2133
utf-8 'A'*9999+'\U00010000'       2494 (-11%)   2293 (-3%)    2219
utf-8 '\x80'*10000                422 (+135%)   517 (+92%)    991
utf-8 '\x80'+'A'*9999             2513 (+12%)   860 (+228%)   2820
utf-8 '\x80'*9999+'\u0100'        426 (+102%)   525 (+64%)    862
utf-8 '\x80'*9999+'\u8000'        426 (+104%)   538 (+62%)    871
utf-8 '\x80'*9999+'\U00010000'    428 (+105%)   523 (+68%)    878
utf-8 '\u0100'*10000              425 (+140%)   517 (+97%)    1019
utf-8 '\u0100'+'A'*9999           2488 (+2%)    820 (+211%)   2549
utf-8 '\u0100'+'\x80'*9999        426 (+139%)   517 (+97%)    1019
utf-8 '\u0100'*9999+'\u8000'      426 (+139%)   529 (+93%)    1019
utf-8 '\u0100'*9999+'\U00010000'  426 (+106%)   509 (+72%)    876
utf-8 '\u8000'*10000              573 (+28%)    490 (+50%)    733
utf-8 '\u8000'+'A'*9999           2500 (+1%)    822 (+208%)   2528
utf-8 '\u8000'+'\x80'*9999        426 (+139%)   530 (+92%)    1018
utf-8 '\u8000'+'\u0100'*9999      428 (+138%)   509 (+100%)   1018
utf-8 '\u8000'*9999+'\U00010000'  573 (+17%)    447 (+51%)    673
utf-8 '\U00010000'*10000          562 (+24%)    552 (+26%)    696
utf-8 '\U00010000'+'A'*9999       2512 (+3%)    939 (+175%)   2584
utf-8 '\U00010000'+'\x80'*9999    423 (+140%)   553 (+84%)    1017
utf-8 '\U00010000'+'\u0100'*9999  426 (+139%)   549 (+85%)    1017
utf-8 '\U00010000'+'\u8000'*9999  572 (+18%)    479 (+41%)    674 |
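The percentages in parentheses can be reproduced from the raw speeds: each one appears to be the change of the patched build relative to the column it sits in. A small sketch of that arithmetic follows, checked against three rows of the Core i5 figures; the helper name `percent_change` is hypothetical, and the rounding used by the original tool is an assumption.

```python
def percent_change(baseline: float, patched: float) -> int:
    """Percentage by which `patched` is faster (+) or slower (-) than `baseline`."""
    return round((patched / baseline - 1) * 100)


# utf-8 'A'*10000: 3.2 = 2550, 3.3 = 6828, patched = 7607
print(percent_change(2550, 7607))  # 198, shown as (+198%) in the 3.2 column
print(percent_change(6828, 7607))  # 11, shown as (+11%) in the 3.3 column

# utf-8 'A'*9999+'\u0100': 3.2 = 2501, patched = 1996 (a regression)
print(percent_change(2501, 1996))  # -20, shown as (-20%)
```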
|
|
msg160110 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-06 21:48 |
Thank you, Antoine. Finally Intel Core is defeated! If someone wants to repeat the tests, see the benchmark tools in . |
|
|
msg160112 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-06 22:11 |
The patch has been updated in accordance with Antoine's cosmetic comments. |
|
|
msg160305 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-05-09 16:50 |
There's a Mac-specific portion in the patch; it would be nice if someone could check that it works. |
|
|
msg160306 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-09 18:05 |
It would be good if someone checked that command-line arguments, including invalid UTF-8, work on Macs. The difficulty is that this needs to be checked on Macs with both 16-bit and 32-bit wchar_t. |
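For readers unfamiliar with what "invalid UTF-8 in command-line arguments" means in practice, here is a minimal Python-level sketch. It assumes the arguments are decoded as UTF-8 with the `surrogateescape` error handler described in PEP 383; the patch itself changes C code, which this sketch does not reproduce, and the helper names `decode_argument` and `encode_argument` are hypothetical.

```python
def decode_argument(raw: bytes) -> str:
    """Decode raw argument bytes, mapping undecodable bytes to lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def encode_argument(text: str) -> bytes:
    """Reverse of decode_argument: lone surrogates become the original bytes again."""
    return text.encode("utf-8", "surrogateescape")


valid = "caf\u00e9".encode("utf-8")
invalid = b"abc\xff\x80def"  # 0xFF and a stray 0x80 are not valid UTF-8 here

print(decode_argument(valid))           # café
decoded = decode_argument(invalid)
print(ascii(decoded))                   # 'abc\udcff\udc80def'
print(encode_argument(decoded) == invalid)  # True: the round trip is lossless
```

Each undecodable byte 0xNN is represented as the lone surrogate U+DCNN, which is why the original bytes can be recovered exactly.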
|
|
msg160307 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-09 18:32 |
Issue4388 is related to this Mac-specific portion of the patch. |
|
|
msg160308 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-05-09 18:41 |
> It would be good if someone checked on Macs work with command line
> arguments, including non-valid utf8. The difficulty is that you need
> to check on both Macs with 16-bit and with 32-bit wchar_t.

Actually, it should be enough to run the test suite, since we should have tests for this. As for different wchar_t widths, that's the kind of thing we can leave to the buildbots (assuming our OS X buildbots come back alive some day :-)). |
|
|
msg160309 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-09 19:29 |
I hacked the code (commented out "#if __APPLE__" in Objects/unicodeobject.c and Modules/python.c) to enable this branch on Linux and ran the test (test_cmd_line) with the C locale. It passed. Then I broke the decoder and ran the test again to confirm that it fails. I can now confirm that the code works correctly on a platform with a 32-bit wchar_t. |
|
|
msg160311 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2012-05-09 20:13 |
> Actually, it should be enough to run the test suite, since we should
> have tests for this.

I just ran the test suite ("python -m test") on OS X 10.6.8 with 'decode_utf8_5.patch' applied. (64-bit --with-pydebug build of Python.) No test failures.

test header:

== CPython 3.3.0a3+ (default:840cb46d0395+, May 9 2012, 20:55:18) [GCC 4.2.1 (Apple Inc. build 5664)]
== Darwin-10.8.0-i386-64bit little-endian
== /Users/mdickinson/Python/cpython/build/test_python_39794

Fragment of configure output relevant to wchar looked like this:

checking wchar.h usability... yes
checking wchar.h presence... yes
checking for wchar.h... yes
checking size of wchar_t... 4
checking for UCS-4 tcl... no
checking whether wchar_t is signed... yes
no usable wchar_t found |
|
|
msg160312 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-05-09 20:18 |
> The difficulty is that you need to check on both Macs
> with 16-bit and with 32-bit wchar_t.

I don't think that the size of wchar_t is configurable: it should always be 32 bits on Mac OS X. |
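Anyone who wants to confirm the wchar_t width on their own machine without reading configure output can query it from Python. This is a small sketch using the standard `ctypes` module; the helper name `wchar_bits` is hypothetical. It reports the width used by the C compiler that built the interpreter.

```python
import ctypes


def wchar_bits() -> int:
    """Return the width of the platform's wchar_t in bits."""
    return ctypes.sizeof(ctypes.c_wchar) * 8


if __name__ == "__main__":
    print(f"wchar_t is {wchar_bits()} bits wide on this platform")
```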
|
|
msg160346 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-05-10 14:38 |
New changeset e08c3791f035 by Antoine Pitrou in branch 'default':
Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.
http://hg.python.org/cpython/rev/e08c3791f035 |
|
|
msg160347 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-05-10 14:38 |
The patch is now committed. Well done and thanks for your contribution. |
|
|
msg160447 - (view) |
Author: Serhiy Storchaka (serhiy.storchaka) *  |
Date: 2012-05-11 19:45 |
Thanks, Martin, for the review, which allowed me to produce a quality patch, and for encouraging further research. Thanks, Antoine, for the review, benchmarks, commit, and the original optimization, which served as the basis for my patch. |
|
|
msg160462 - (view) |
Author: STINNER Victor (vstinner) *  |
Date: 2012-05-12 07:09 |
If the commit makes Python 3.3 faster than Python 3.2, it is an optimisation that should be documented in the What's New in Python 3.3 document. |
|
|