[Python-Dev] Usage of += on strings in loops in stdlib

Maciej Fijalkowski fijall at gmail.com
Wed Feb 13 09:12:24 CET 2013


On Wed, Feb 13, 2013 at 10:02 AM, Victor Stinner <victor.stinner at gmail.com> wrote:

I added a PyUnicodeWriter internal API to optimize str%args and str.format(args). It uses a buffer which is overallocated, so it's basically like CPython str += str optimization. I still don't know how efficient it is on Windows, since realloc() is slow on Windows (at least on old Windows versions).

We should add an official and public API to concatenate strings. I know that PyPy already has its own API. Example:

    writer = UnicodeWriter()
    for item in data:
        writer += item   # I guess that it's faster than writer.append(item)
    return str(writer)   # or writer.getvalue() ?

I don't care about the exact implementation of UnicodeWriter, it just has to be as fast as or faster than ''.join(data).

I don't remember if PyUnicodeWriter is faster than StringIO or slower. I created an issue for that: http://bugs.python.org/issue15612

Victor
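To make the proposed interface concrete, here is a minimal pure-Python sketch. The class is hypothetical: it only mirrors the `writer += item` / `str(writer)` usage from the example above, it is not CPython's internal PyUnicodeWriter, and because it simply collects the pieces and joins them once it cannot be faster than ''.join(data).

```python
class UnicodeWriter:
    """Hypothetical writer; illustrates the proposed interface only."""

    def __init__(self):
        # Pieces are collected here and joined once at the end.
        self._parts = []

    def append(self, text):
        self._parts.append(text)

    def __iadd__(self, text):
        # Supports "writer += item"; returning self keeps the name bound
        # to the same writer object.
        self._parts.append(text)
        return self

    def getvalue(self):
        return "".join(self._parts)

    # str(writer) returns the same result as writer.getvalue()
    __str__ = getvalue


def concat(data):
    writer = UnicodeWriter()
    for item in data:
        writer += item
    return str(writer)
```

A real implementation would presumably write into an overallocated buffer instead of keeping a list, which is where any speedup over ''.join(data) would have to come from.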

It's in __pypy__.builders (StringBuilder and UnicodeBuilder). The API does not really matter, as long as there is a way to preallocate a certain size (which I don't think there is in StringIO, for example). bytearray comes close, but it has a relatively inconvenient API, and any pure-Python bytearray wrapper will not be fast on CPython.
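As a rough illustration of the bytearray point, a pure-Python wrapper with preallocation could look like the sketch below. The names BytesBuilder, size_hint, append and build are made up for this example and are not claimed to match the PyPy builders; as noted above, a wrapper like this adds Python-level overhead on every call, so it shows the shape of the API rather than a fast implementation.

```python
class BytesBuilder:
    """Hypothetical bytearray wrapper with a preallocated buffer."""

    def __init__(self, size_hint=0):
        # bytearray(n) creates n zero bytes; they serve as reserved space.
        self._buf = bytearray(size_hint)
        self._len = 0  # number of bytes actually written so far

    def append(self, data):
        end = self._len + len(data)
        if end > len(self._buf):
            # Grow by at least the missing amount, and at least double,
            # so repeated appends do not resize every time.
            grow = max(end - len(self._buf), len(self._buf))
            self._buf.extend(bytes(grow))
        # Same-length slice assignment overwrites the reserved bytes in place.
        self._buf[self._len:end] = data
        self._len = end

    def build(self):
        # Return only the written part, not the unused reserved space.
        return bytes(self._buf[:self._len])


builder = BytesBuilder(size_hint=4)
builder.append(b"ab")
builder.append(b"cdef")  # exceeds the hint, so the buffer grows
result = builder.build()
```

The bookkeeping of a separate write position next to the buffer is the kind of inconvenience the bytearray API forces on the caller.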


