Issue 25318: Add _PyBytesWriter API to optimize Unicode encoders

Created on 2015-10-05 12:01 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
bench_utf8_result.txt vstinner, 2015-10-05 12:02
bench_ucs1_result.txt vstinner, 2015-10-05 12:04
bytes_writer.patch vstinner, 2015-10-05 12:05
Messages (14)
msg252322 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:01
The attached patch is the first step toward optimizing Unicode encoders: it adds a _PyBytesWriter API. This API is responsible for using the most efficient buffer depending on the need:

* it's possible to use a small buffer directly allocated on the C stack
* otherwise a Python bytes object is allocated
* it's possible to overallocate the bytes object to reduce the number of calls to _PyBytes_Resize()

The patch only adds the new API, so don't expect any speedup. I just added a small optimization: overallocation is disabled in the UCS1 encoder (ASCII and Latin1) for the last write.
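The buffer strategy described above can be sketched in plain C. This is a simplified illustration, not the code from bytes_writer.patch: the names bw_t, bw_init, bw_start, bw_prepare, bw_dealloc and bw_demo are hypothetical, malloc()/realloc() stand in for the bytes object and _PyBytes_Resize(), and the 64-byte small buffer and the 1/4 growth factor are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified writer: NOT the CPython _PyBytesWriter API. */
typedef struct {
    char small[64];    /* small buffer embedded in the struct (on the C stack
                          when the struct is a local variable) */
    char *heap;        /* NULL while the small buffer is in use */
    size_t allocated;  /* capacity of the buffer currently in use */
    int overallocate;  /* if nonzero, grow by more than requested */
} bw_t;

void bw_init(bw_t *w) {
    w->heap = NULL;
    w->allocated = sizeof(w->small);
    w->overallocate = 0;
}

char *bw_start(bw_t *w) {
    return w->heap ? w->heap : w->small;
}

/* Make room for `extra` more bytes after position `p`.
   Returns the (possibly moved) write position, or NULL on failure. */
char *bw_prepare(bw_t *w, char *p, size_t extra) {
    size_t used = (size_t)(p - bw_start(w));
    assert(used <= w->allocated);
    size_t needed = used + extra;
    if (needed <= w->allocated)
        return p;                    /* enough room already */
    if (w->overallocate)
        needed += needed / 4;        /* assumed growth factor */
    char *buf;
    if (w->heap) {
        buf = realloc(w->heap, needed);
        if (buf == NULL)
            return NULL;
    } else {
        buf = malloc(needed);        /* first move off the small buffer */
        if (buf == NULL)
            return NULL;
        memcpy(buf, w->small, used);
    }
    w->heap = buf;
    w->allocated = needed;
    return buf + used;
}

void bw_dealloc(bw_t *w) {
    free(w->heap);
    w->heap = NULL;
}

/* Write n copies of 'x'; return the length written (0 on allocation failure)
   and report whether the heap buffer was needed. */
size_t bw_demo(size_t n, int overallocate, int *used_heap) {
    bw_t w;
    bw_init(&w);
    w.overallocate = overallocate;
    char *p = bw_start(&w);
    for (size_t i = 0; i < n; i++) {
        p = bw_prepare(&w, p, 1);
        if (p == NULL) {
            bw_dealloc(&w);
            return 0;
        }
        *p++ = 'x';
    }
    size_t len = (size_t)(p - bw_start(&w));
    *used_heap = (w.heap != NULL);
    bw_dealloc(&w);
    return len;
}
```

In this sketch, short outputs never touch the allocator, and long outputs pay for a heap buffer only once the small buffer overflows.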
msg252323 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:02
Result of bench.py attached to issue #25267: attached bench_utf8_result.txt.

------------------------------------------------------+-------------+---------------
Summary                                               | utf8_before | utf8_after
------------------------------------------------------+-------------+---------------
ignore: "\udcff" * length                             | 7.63 us (*) | 7.91 us
ignore: "a" * length + "\udcff"                       | 10.7 us (*) | 10.8 us
ignore: ("a" * 99 + "\udcff" * 99) * length           | 2.17 ms (*) | 2.16 ms
ignore: ("\udcff" * 99 + "a") * length                | 843 us (*)  | 866 us
ignore: "\udcff" + "a" * length                       | 10.7 us (*) | 11 us
replace: "\udcff" * length                            | 7.87 us (*) | 8.43 us (+7%)
replace: "a" * length + "\udcff"                      | 10.8 us (*) | 10.9 us
replace: ("a" * 99 + "\udcff" * 99) * length          | 2.46 ms (*) | 2.46 ms
replace: ("\udcff" * 99 + "a") * length               | 907 us (*)  | 939 us
replace: "\udcff" + "a" * length                      | 10.9 us (*) | 11 us
surrogateescape: "\udcff" * length                    | 14.2 us (*) | 17.2 us (+21%)
surrogateescape: "a" * length + "\udcff"              | 10.6 us (*) | 10.7 us
surrogateescape: ("a" * 99 + "\udcff" * 99) * length  | 3.19 ms (*) | 3.4 ms (+7%)
surrogateescape: ("\udcff" * 99 + "a") * length       | 1.64 ms (*) | 1.87 ms (+13%)
surrogateescape: "\udcff" + "a" * length              | 10.6 us (*) | 10.7 us
surrogatepass: "\udcff" * length                      | 23.1 us (*) | 23.5 us
surrogatepass: "a" * length + "\udcff"                | 10.7 us (*) | 10.8 us
surrogatepass: ("a" * 99 + "\udcff" * 99) * length    | 4.39 ms (*) | 4.44 ms
surrogatepass: ("\udcff" * 99 + "a") * length         | 2.43 ms (*) | 2.47 ms
surrogatepass: "\udcff" + "a" * length                | 10.6 us (*) | 10.8 us
backslashreplace: "\udcff" * length                   | 65.7 us (*) | 64.3 us
backslashreplace: "a" * length + "\udcff"             | 15.7 us (*) | 15 us
backslashreplace: ("a" * 99 + "\udcff" * 99) * length | 12 ms (*)   | 15.9 ms (+32%)
backslashreplace: ("\udcff" * 99 + "a") * length      | 11.1 ms (*) | 13.5 ms (+22%)
backslashreplace: "\udcff" + "a" * length             | 16.4 us (*) | 15.1 us (-8%)
------------------------------------------------------+-------------+---------------
Total                                                 | 41.4 ms (*) | 48.3 ms (+17%)
------------------------------------------------------+-------------+---------------
msg252324 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:04
Results of bench.py attached to issue #25227 (ASCII and Latin1 encoders): attached bench_ucs1_result.txt file.

--------+-------------+-----------
Summary | ucs1_before | ucs1_after
--------+-------------+-----------
ascii   | 1.69 ms (*) | 1.69 ms
latin1  | 1.7 ms (*)  | 1.69 ms
--------+-------------+-----------
Total   | 3.39 ms (*) | 3.39 ms
--------+-------------+-----------
msg252325 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:12
A few months ago, I wrote a previous implementation of the _PyBytesWriter API which embedded the "current pointer" inside _PyBytesWriter itself. The problem was that GCC produced less efficient code than expected for the hotspot of the encoder.

In the new implementation (attached patch), the "current pointer" is unchanged: it's still a variable local to the encoder function. Instead, the current pointer became a *parameter* to all _PyBytesWriter *functions*.

I expect not to affect the performance of encoders for validly encoded strings (when the code calling error handlers is not used), which is important since we have very good performance here.

_PyBytesWriter is not restricted to the code to allocate the buffer.

--

bytes_writer.patch:

+ char stackbuf[256];

Oh, I forgot to mention this other small optimization. I also added a small buffer allocated on the C stack for the UCS1 encoder (ASCII, Latin1). It may slightly speed up encoding when the output string is smaller than 256 bytes and the error handler is used.

The optimization comes from the very efficient UTF-8 encoder.
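A minimal sketch of the "current pointer as a parameter" design, under stated assumptions: pw_t, pw_prepare and encode_ascii_replace are hypothetical names, the "<?>" replacement marker is made up for the example, and malloc()/realloc() replace the real bytes object. The point it tries to show is that the hot path only touches a local pointer, and the writer is consulted only on the error path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal writer: it tracks the buffer, not the write position. */
typedef struct {
    char *start;
    size_t allocated;
} pw_t;

/* Grow the buffer so that `extra` bytes fit after `p`.
   The caller passes its local pointer in and gets the updated one back. */
char *pw_prepare(pw_t *w, char *p, size_t extra) {
    size_t used = (size_t)(p - w->start);
    assert(used <= w->allocated);
    if (used + extra <= w->allocated)
        return p;
    size_t size = used + extra;
    char *buf = realloc(w->start, size);
    if (buf == NULL)
        return NULL;
    w->start = buf;
    w->allocated = size;
    return buf + used;
}

/* Encode `n` code points as ASCII, writing "<?>" for anything >= 128.
   Returns a malloc'ed buffer of *out_len bytes, or NULL on failure. */
char *encode_ascii_replace(const unsigned *text, size_t n, size_t *out_len) {
    pw_t w;
    w.allocated = n ? n : 1;          /* common case: one byte per character */
    w.start = malloc(w.allocated);
    if (w.start == NULL)
        return NULL;

    char *p = w.start;                /* current pointer: a plain local */
    for (size_t i = 0; i < n; i++) {
        if (text[i] < 128) {
            *p++ = (char)text[i];     /* hot path: no writer call at all */
            continue;
        }
        /* Error path: 3 bytes now, plus 1 byte per remaining character. */
        char *q = pw_prepare(&w, p, 3 + (n - i - 1));
        if (q == NULL) {
            free(w.start);
            return NULL;
        }
        p = q;
        *p++ = '<';
        *p++ = '?';
        *p++ = '>';
    }
    *out_len = (size_t)(p - w.start);
    return w.start;
}
```

Because `p` never lives inside the writer struct, a compiler is free to keep it in a register across the loop; whether that reproduces the GCC behaviour described above is not something this sketch measures.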
msg252335 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 16:17
My previous abandoned attempt was issue #17742.

"Add _PyBytesWriter API to optimize Unicode encoders"

Oh, I forgot to mention that it may also be used to optimize bytes % args. More generally, it applies to any code generating a bytes object whose length is unknown in advance. Said differently: _PyBytesWriter can be used when precomputing the output length is more expensive than growing the buffer on demand.

str % args now uses _PyUnicodeWriter, but building a Unicode string is even more complex because of the different Unicode "kinds": 1, 2 or 4 bytes per character.
msg252570 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-08 22:59
New changeset 1a2175149c5e by Victor Stinner in branch 'default': Issue #25318: Add _PyBytesWriter API https://hg.python.org/cpython/rev/1a2175149c5e
msg252571 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-08 23:04
Oh, I was surprised to see the same or worse performance for UTF-8/backslashreplace. In fact, I forgot to enable overallocation. With overallocation, it is now faster ;-)

I modified the API to put the "stack buffer" directly inside _PyBytesWriter. I also reworked _PyBytesWriter_Alloc() to call _PyBytesWriter_Prepare(), so _PyBytesWriter_Alloc() now supports overallocation as well. Supporting overallocation at the first allocation (_PyBytesWriter_Alloc) was part of the _PyBytesWriter design; that's why we have _PyBytesWriter_Alloc() *and* _PyBytesWriter_Init(): it's possible to set overallocate=1 between init and alloc.

I pushed my change since it didn't kill performance. It's only a little bit slower, and only on very short encodes: less than 500 ns. In other cases, performance is the same or better.
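The init/alloc split and the effect of overallocation can be illustrated with a small bookkeeping model. This is a sketch under assumptions, not the committed code: ga_t, ga_init, ga_prepare, ga_alloc and ga_count_resizes are hypothetical names, no memory is actually allocated, and the 1/4 growth factor is chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizing state: only the bookkeeping is modelled. */
typedef struct {
    size_t allocated;
    int overallocate;
    unsigned resizes;   /* how many times the buffer had to grow */
} ga_t;

void ga_init(ga_t *w) {
    w->allocated = 0;
    w->overallocate = 0;
    w->resizes = 0;
}

/* Ensure capacity for `size` bytes in total. */
void ga_prepare(ga_t *w, size_t size) {
    if (size <= w->allocated)
        return;
    if (w->overallocate)
        size += size / 4;   /* assumed factor, for illustration only */
    w->allocated = size;
    w->resizes++;
}

/* The first allocation goes through the same path as later growth, so a
   flag set between init and alloc already applies to it. */
void ga_alloc(ga_t *w, size_t size) {
    assert(w->allocated == 0);
    ga_prepare(w, size);
}

/* Count resizes when the output grows from `initial` to `total` bytes,
   one byte at a time. */
unsigned ga_count_resizes(size_t initial, size_t total, int overallocate) {
    ga_t w;
    ga_init(&w);
    w.overallocate = overallocate;   /* set between init and alloc */
    ga_alloc(&w, initial);
    for (size_t used = initial; used < total; used++)
        ga_prepare(&w, used + 1);
    return w.resizes;
}
```

With these assumed numbers, growing from 100 to 200 bytes one byte at a time costs 101 resizes without overallocation and 4 with it, which is the kind of gap that a forgotten overallocate flag would produce.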
msg252573 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-08 23:46
New changeset 59f4806a5add by Victor Stinner in branch 'default': Optimize backslashreplace error handler https://hg.python.org/cpython/rev/59f4806a5add
msg252574 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-09 00:32
New changeset c134eddcb347 by Victor Stinner in branch 'default': Issue #25318: Move _PyBytesWriter to bytesobject.c https://hg.python.org/cpython/rev/c134eddcb347
msg252579 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 00:51
I created the issue #25349 "Use _PyBytesWriter for bytes%args".
msg252580 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-09 00:52
New changeset e9c1404d6bd9 by Victor Stinner in branch 'default': Issue #25318: Fix compilation error https://hg.python.org/cpython/rev/e9c1404d6bd9
msg252582 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 01:27
The FreeBSD 9.x buildbot is grumpy.

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.x/builds/3495/steps/test/logs/stdio

Assertion failed: (start[writer->allocated] == 0), function _PyBytesWriter_CheckConsistency, file Objects/bytesobject.c, line 3809.
Fatal Python error: Aborted

Current thread 0x0000000801807400 (most recent call first):
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/test/test_pep277.py", line 150 in test_listdir
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/case.py", line 600 in run
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/case.py", line 648 in __call__
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 122 in run
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 84 in __call__
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 122 in run
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 84 in __call__
  File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/runner.py", line 176 in run
  ...
msg252583 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-09 01:39
New changeset 9cf89366bbcb by Victor Stinner in branch 'default': Issue #25318: Avoid sprintf() in backslashreplace() https://hg.python.org/cpython/rev/9cf89366bbcb

New changeset 0a522f68d275 by Victor Stinner in branch 'default': Issue #25318: Fix backslashreplace() https://hg.python.org/cpython/rev/0a522f68d275

New changeset c53dcf1d6967 by Victor Stinner in branch 'default': Issue #25318: cleanup code _PyBytesWriter https://hg.python.org/cpython/rev/c53dcf1d6967
msg252602 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 12:18
Buildbots still like this new API :-) (no test failure recently)

I reworked the API a little bit to make it simpler to use in Unicode encoders.

I started to open new issues to use this new API in more functions producing byte strings.

I consider that this issue can now be closed. I'm happy: the API looks good to me and the modified code is faster.
History
Date User Action Args
2022-04-11 14:58:22 admin set github: 69505
2015-10-09 12:18:57 vstinner set status: open -> closed, resolution: fixed, messages: +
2015-10-09 01:39:00 python-dev set messages: +
2015-10-09 01:27:37 vstinner set messages: +
2015-10-09 00:52:49 python-dev set messages: +
2015-10-09 00:51:25 vstinner set messages: +
2015-10-09 00:32:57 python-dev set messages: +
2015-10-08 23:46:53 python-dev set messages: +
2015-10-08 23:04:15 vstinner set messages: +
2015-10-08 22:59:55 python-dev set nosy: + python-dev, messages: +
2015-10-05 16:17:29 vstinner set messages: +
2015-10-05 12:12:22 vstinner set messages: +
2015-10-05 12:05:32 vstinner set files: + bytes_writer.patch, keywords: + patch
2015-10-05 12:04:41 vstinner set files: + bench_ucs1_result.txt
2015-10-05 12:04:04 vstinner set messages: +
2015-10-05 12:02:50 vstinner set files: + bench_utf8_result.txt, messages: +
2015-10-05 12:01:28 vstinner create