[Python-Dev] Usage of += on strings in loops in stdlib

Victor Stinner victor.stinner at gmail.com
Thu Feb 14 01:21:40 CET 2013


Hi,

I wrote a quick hack to expose _PyUnicodeWriter as _string.UnicodeWriter: http://www.haypocalc.com/tmp/string_unicode_writer.patch

And I wrote a (micro-)benchmark: http://www.haypocalc.com/tmp/bench_join.py (The benchmark uses only ASCII strings; it would be interesting to test latin1, BMP and non-BMP characters too.)
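
For readers without access to the linked script, the comparison can be approximated with the standard library alone. This is a minimal sketch, not the bench_join.py script itself: the function names and the bench() helper are made up for illustration, and it covers only the three methods that do not need the patch (str += str, "".join and io.StringIO).

```python
import io
import timeit


def concat_plus(data):
    # Repeated str += str in a loop.
    result = ""
    for item in data:
        result += item
    return result


def concat_join(data):
    # Single "".join() call on an already-built list.
    return "".join(data)


def concat_stringio(data):
    # Write each piece into an in-memory text buffer.
    buf = io.StringIO()
    for item in data:
        buf.write(item)
    return buf.getvalue()


def bench(func, data, number=1000):
    # Best of 3 repeats, reported in microseconds per call.
    best = min(timeit.repeat(lambda: func(data), number=number, repeat=3))
    return best / number * 1e6


if __name__ == "__main__":
    data = ["a"] * 10**2  # ASCII only, like the benchmark in this message
    for func in (concat_plus, concat_join, concat_stringio):
        print(f"{bench(func, data):8.2f} us: {func.__name__}")
```

Swapping in latin1, BMP or non-BMP strings for data would cover the cases the benchmark in this message leaves out.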

UnicodeWriter (using the "writer += str" API) is the fastest method in most cases, except for data = ['a'*10**4] * 10**2 (in this case, it's 8x slower!). I guess that the overhead comes from the overallocation, which then requires shrinking the buffer (shrinking may copy the whole string). The overallocation factor could be adapted depending on the size.
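
The overallocate-then-shrink pattern can be modelled in pure Python. This is a toy sketch only, not the real _PyUnicodeWriter: the ToyWriter class, its growth factor of 1.25 and the bytearray storage are all assumptions chosen for illustration.

```python
class ToyWriter:
    """Toy model of an overallocating buffer (not the real _PyUnicodeWriter)."""

    def __init__(self, factor=1.25):
        self.factor = factor      # hypothetical overallocation factor
        self.buf = bytearray()    # allocated storage
        self.size = 0             # bytes actually used
        self.reallocs = 0         # number of times the storage was grown

    def write(self, text):
        data = text.encode("ascii")
        needed = self.size + len(data)
        if needed > len(self.buf):
            # Grow past what is needed so later writes fit without resizing.
            new_cap = int(needed * self.factor)
            self.buf.extend(bytes(new_cap - len(self.buf)))
            self.reallocs += 1
        self.buf[self.size:needed] = data
        self.size = needed

    def finish(self):
        # Shrink to the exact length; in C this is the step that may
        # copy the whole string.
        wasted = len(self.buf) - self.size
        del self.buf[self.size:]
        return self.buf.decode("ascii"), wasted


writer = ToyWriter()
for piece in ["a" * 10**4] * 10**2:
    writer.write(piece)
text, wasted = writer.finish()
print(len(text), wasted, writer.reallocs)
```

The wasted value is how much the final shrink has to give back; with large inputs that is where a copy of the whole buffer can become expensive.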

If computing the final length is cheap (e.g. if it's always the same), it's always faster to use UnicodeWriter with a preallocated buffer. The "UnicodeWriter +=; preallocate" test uses a precomputed length (ok, it's cheating!).

I also implemented a UnicodeWriter.append method to measure the overhead of a method lookup: it's expensive :-)
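
The "lookup attr once" variants in the results below hoist the bound method out of the loop. Here is a minimal sketch of the two forms using io.StringIO, since UnicodeWriter only exists with the patch applied; the function names are made up for illustration.

```python
import io


def write_with_lookup(data):
    buf = io.StringIO()
    for item in data:
        buf.write(item)   # attribute lookup on every iteration
    return buf.getvalue()


def write_lookup_once(data):
    buf = io.StringIO()
    write = buf.write     # bound method looked up a single time
    for item in data:
        write(item)
    return buf.getvalue()
```

Both functions build the same string; only the number of attribute lookups differs.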

--

Platform: Linux-3.6.10-2.fc16.x86_64-x86_64-with-fedora-16-Verne
Python unicode implementation: PEP 393
Date: 2013-02-14 01:00:06
CFLAGS: -Wno-unused-result -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
SCM: hg revision=659ef9d360ae+ tag=tip branch=default date="2013-02-13 15:25 +0000"
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Python version: 3.4.0a0 (default:659ef9d360ae+, Feb 14 2013, 00:35:19) [GCC 4.6.3 20120306 (Red Hat 4.6.3-2)]
Bits: int=32, long=64, long long=64, pointer=64

[ data = ['a'] * 10**2 ]

4.21 us: UnicodeWriter +=; preallocate
4.86 us (+15%): UnicodeWriter append; lookup attr once
4.99 us (+18%): UnicodeWriter +=

6.35 us (+51%): str += str
6.45 us (+53%): io.StringIO; lookup attr once
7.02 us (+67%): "".join(list)
7.46 us (+77%): UnicodeWriter append
8.77 us (+108%): io.StringIO

[ data = ['abc'] * 10**4 ]

356 us: UnicodeWriter append; lookup attr once
375 us (+5%): UnicodeWriter +=; preallocate
376 us (+6%): UnicodeWriter +=

495 us (+39%): io.StringIO; lookup attr once
614 us (+73%): "".join(list)
629 us (+77%): UnicodeWriter append
716 us (+101%): str += str
737 us (+107%): io.StringIO

[ data = ['a'*10**4] * 10**1 ]

3.67 us: str += str
3.76 us: UnicodeWriter +=; preallocate

3.95 us (+8%): UnicodeWriter +=
4.01 us (+9%): UnicodeWriter append; lookup attr once
4.06 us (+11%): "".join(list)
4.24 us (+15%): UnicodeWriter append
4.59 us (+25%): io.StringIO; lookup attr once
4.77 us (+30%): io.StringIO

[ data = ['a'*10**4] * 10**2 ]

41.2 us: UnicodeWriter +=; preallocate
43.8 us (+6%): str += str
45.4 us (+10%): "".join(list)
45.9 us (+11%): io.StringIO; lookup attr once
48.3 us (+17%): io.StringIO

370 us (+797%): UnicodeWriter +=
370 us (+798%): UnicodeWriter append; lookup attr once
377 us (+816%): UnicodeWriter append

[ data = ['a'*10**4] * 10**4 ]

38.9 ms: UnicodeWriter +=; preallocate
39 ms: "".join(list)
39.1 ms: io.StringIO; lookup attr once
39.4 ms: UnicodeWriter append; lookup attr once
39.5 ms: io.StringIO
39.6 ms: UnicodeWriter +=
40.1 ms: str += str
40.1 ms: UnicodeWriter append

Victor

2013/2/13 Antoine Pitrou <solipsis at pitrou.net>:

On Wed, 13 Feb 2013 09:02:07 +0100, Victor Stinner <victor.stinner at gmail.com> wrote:

I added a _PyUnicodeWriter internal API to optimize str%args and str.format(args). It uses a buffer which is overallocated, so it's basically like the CPython str += str optimization. I still don't know how efficient it is on Windows, since realloc() is slow on Windows (at least on old Windows versions).

We should add an official and public API to concatenate strings.

There's io.StringIO already.

Regards
Antoine.

