[Python-Dev] Optimize Unicode strings in Python 3.3

Victor Stinner victor.stinner at gmail.com
Wed May 30 00:44:05 CEST 2012

Hi,

* Use a Py_UCS4 buffer and then convert to the canonical form (ASCII, UCS1 or UCS2). Approach taken by io.StringIO. io.StringIO is not only used to write, but also to read, so a Py_UCS4 buffer is a good compromise.
* PyAccu API: optimized version of chunks=[]; for ...: ... chunks.append(text); return ''.join(chunks).
* Two steps: compute the length and maximum character of the output string, allocate the output string and then write characters. str%args was using it.
* Optimistic approach. Start with an ASCII buffer, enlarge and widen (to UCS2 and then UCS4) the buffer when new characters are written. Approach used by the UTF-8 decoder and by str%args since today.
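
For readers who do not know the pattern that the PyAccu API optimizes, here is a minimal pure-Python sketch of it. The function name concat_with_list is made up for this example; PyAccu itself is a C API inside CPython and is not shown here.

```python
def concat_with_list(parts):
    """Accumulate pieces in a list and join them once at the end."""
    chunks = []
    for text in parts:
        chunks.append(text)
    # A single join builds the final string in one pass over the chunks.
    return ''.join(chunks)


if __name__ == "__main__":
    print(concat_with_list(["x=", "1", ", y=", "2"]))
```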

I ran extensive benchmarks on these 4 methods for str%args and str.format(args).

The "two steps" method is not promising: parsing the format string twice is slower than other methods.
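
To make the cost of the "two steps" method concrete, here is a rough Python model of it. It is only a sketch: the function two_step_concat and the kind labels it returns are assumptions for illustration, and a Python list stands in for the preallocated output string. The point is that the input is walked twice, once to measure and once to write.

```python
def two_step_concat(parts):
    """Model of the two-step method: measure first, then allocate and write."""
    # Step 1: compute the output length and the maximum character.
    length = 0
    max_char = 0
    for part in parts:
        length += len(part)
        for ch in part:
            max_char = max(max_char, ord(ch))

    # The maximum character decides the narrowest storage (PEP 393 kinds).
    if max_char < 0x80:
        kind = "ASCII"
    elif max_char < 0x100:
        kind = "UCS1"
    elif max_char < 0x10000:
        kind = "UCS2"
    else:
        kind = "UCS4"

    # Step 2: allocate the output once and write the characters (second pass).
    buffer = [""] * length
    pos = 0
    for part in parts:
        for ch in part:
            buffer[pos] = ch
            pos += 1
    return "".join(buffer), kind
```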

The PyAccu API is faster than a Py_UCS4 buffer to concatenate a lot of strings, but it is slower in many other cases.

I implemented the last method as the new internal "_PyUnicodeWriter" API: resize / widen the string buffer when writing new characters. I implemented more optimizations:

* overallocate the buffer to limit the cost of realloc()
* write characters directly in the buffer, avoid temporary buffers when possible (it is possible in most cases)
* disable overallocation when formatting the last argument
* don't copy by value but copy by reference if the result is just a string (optimization already implemented indirectly in the PyAccu API)
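
The resize / widen idea can be modelled in a few lines of Python. This is a toy model, not the C implementation: the class WriterSketch, its attributes and the 25% overallocation factor are all assumptions chosen for illustration. It only counts how often the buffer would be reallocated or widened.

```python
class WriterSketch:
    """Toy model of an optimistic writer (hypothetical, not CPython's code)."""

    def __init__(self, overallocate=True):
        self.kind = 1            # bytes per character: 1 (ASCII/UCS1), 2 or 4
        self.capacity = 0        # characters the buffer can hold
        self.size = 0            # characters written so far
        self.codes = []          # stands in for the raw character buffer
        self.overallocate = overallocate
        self.reallocs = 0
        self.widenings = 0

    @staticmethod
    def _kind_for(max_char):
        if max_char < 0x100:
            return 1
        if max_char < 0x10000:
            return 2
        return 4

    def write(self, text):
        if not text:
            return
        needed = self.size + len(text)
        if needed > self.capacity:
            # Assumed growth policy: 25% extra room when overallocating.
            extra = needed // 4 if self.overallocate else 0
            self.capacity = needed + extra
            self.reallocs += 1
        kind = self._kind_for(max(map(ord, text)))
        if kind > self.kind:
            # A wider character forces the whole buffer to be widened.
            self.kind = kind
            self.widenings += 1
        self.codes.extend(map(ord, text))
        self.size = needed

    def finish(self):
        return "".join(map(chr, self.codes))
```

Setting overallocate to False before the last write() models the "disable overallocation when formatting the last argument" optimization: the final buffer is then sized exactly, so no trailing space needs to be trimmed.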

The _PyUnicodeWriter API is the fastest method: it gives a speedup of 30% over the Py_UCS4 and PyAccu methods in general, and from 60% to 100% in some specific cases!

I also compared str%args and str.format() with Python 2.7 (byte strings), 3.2 (UTF-16 or UCS-4) and 3.3 (PEP 393): Python 3.3 is as fast as Python 2.7 and sometimes faster! (Whereas Python 3.2 is 10 to 30% slower than Python 2 in general.)
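
A quick way to get a rough feeling for this kind of comparison on your own build is the standard timeit module. The snippet below is only a minimal sketch and is not the benchmark tool used for the numbers in this message; the helper name bench and the format strings are made up for the example.

```python
import timeit


def bench(stmt, setup, repeat=3, number=10000):
    """Return the best time in seconds over `repeat` runs of `number` loops."""
    return min(timeit.repeat(stmt, setup=setup, repeat=repeat, number=number))


setup = "args = ('spam', 42)"
percent_time = bench("'%s=%s' % args", setup)
format_time = bench("'{}={}'.format(*args)", setup)

print("str%%args:    %.6f s" % percent_time)
print("str.format(): %.6f s" % format_time)
```

Run the same script with each interpreter (2.7, 3.2, 3.3) and compare the printed times; taking the minimum of several runs reduces noise from other processes.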

I wrote a tool to run benchmarks and to compare results:
https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
https://bitbucket.org/haypo/misc/src/tip/python/bench_str.py

Run the benchmark:
./python benchmark.py --file=FILE script bench_str.py

Compare results:
./python benchmark.py compare_to FILE1 FILE2 FILE3 ...

Python 2.7 vs 3.2 vs 3.3:

http://bugs.python.org/file25685/REPORT_32BIT_2.7_3.2_writer
http://bugs.python.org/file25687/REPORT_64BIT_2.7_3.2_writer
http://bugs.python.org/file25757/report_windows7

Warning: For the Windows benchmark, Python 3.3 is compiled in 32 bits, whereas 2.7 and 3.2 are compiled in 64 bits (formatting integers is slower in 32 bits).

UCS4 vs PyAccu vs _PyUnicodeWriter:

http://bugs.python.org/file25686/REPORT_32BIT_3.3
http://bugs.python.org/file25688/REPORT_64BIT_3.3

Victor
