[Python-Dev] RFC: Add a new builtin strarray type to Python?
Victor Stinner victor.stinner at haypocalc.com
Sat Oct 1 22:06:11 CEST 2011
Since the integration of PEP 393, str += str is no longer super-fast (just fast).
Oh oh. str += str is now about 1450x slower than the ''.join() pattern. Here is a benchmark (see the attached script, bench_build_str.py):
Python 3.3
str += str     : 14548 ms
''.join()      :    10 ms
StringIO.write :    12 ms
StringBuilder  :    30 ms
array('u')     :    67 ms
Python 3.2
str += str     :     9 ms
''.join()      :     9 ms
StringIO.write :     9 ms
StringBuilder  :    30 ms
array('u')     :    77 ms
(FYI results are very different in Python 2)
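Since the attachment is not available in the archive, here is a minimal sketch of how three of the patterns above could be timed. It is not the original bench_build_str.py: the function names, the chunk size, and the iteration count are assumptions chosen for illustration, and the numbers it prints will differ from the ones quoted above.

```python
import io
import timeit


def build_concat(chunks):
    # Repeated in-place concatenation: str += str
    result = ''
    for chunk in chunks:
        result += chunk
    return result


def build_join(chunks):
    # Collect the pieces in a list, then join once at the end
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return ''.join(parts)


def build_stringio(chunks):
    # Write the pieces into an in-memory text buffer
    buf = io.StringIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()


def bench_ms(func, chunks, repeat=3):
    # Best of `repeat` single runs, in milliseconds
    timings = timeit.repeat(lambda: func(chunks), number=1, repeat=repeat)
    return min(timings) * 1000.0


if __name__ == "__main__":
    chunks = ['abc'] * 10000  # assumed workload, not the original one
    for name, func in [('str += str', build_concat),
                       ("''.join()", build_join),
                       ('StringIO.write', build_stringio)]:
        print('%-15s: %.1f ms' % (name, bench_ms(func, chunks)))
```

All three functions return the same string, so only the construction strategy differs between the timed runs.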
I expect performance similar to StringIO.write if strarray is implemented using a Py_UCS4 buffer, like io.StringIO.
PyPy has a UnicodeBuilder class (in __pypy__.builders): it has append(), append_slice() and build() methods. In PyPy, it is the fastest way to build a string:
PyPy 1.6
''.join()      :    16 ms
StringIO.join  :    24 ms
StringBuilder  :     9 ms
array('u')     :    66 ms
It is even faster if you specify the size to the constructor: 3 ms.
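As a rough illustration of the builder interface described above, the sketch below uses the three methods named in this message. The import only works on PyPy versions that expose UnicodeBuilder; the fallback class is a hypothetical pure-Python stand-in (not part of any library) so that the example also runs elsewhere, and it ignores the size hint.

```python
try:
    # Only available on PyPy; module and class name as given in this message
    from __pypy__.builders import UnicodeBuilder
except ImportError:
    class UnicodeBuilder(object):
        """Hypothetical stand-in with the same three methods."""

        def __init__(self, size=0):
            # The size hint is accepted but unused in this stand-in
            self._parts = []

        def append(self, s):
            self._parts.append(s)

        def append_slice(self, s, start, end):
            self._parts.append(s[start:end])

        def build(self):
            return u''.join(self._parts)


def build_greeting():
    builder = UnicodeBuilder(32)  # optional size hint
    builder.append(u'Hello, ')
    builder.append_slice(u'xxworldxx', 2, 7)  # appends u'world'
    builder.append(u'!')
    return builder.build()


if __name__ == "__main__":
    print(build_greeting())
```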
I'm writing this email to ask whether this type solves a real issue, or whether we can just stick with the super-fast str.join(list of str).
Hum, it looks like "What is the most efficient string concatenation method in python?" is a frequently asked question. There is a recent thread on the python-ideas mailing list:
"Create a StringBuilder class and use it everywhere" http://code.activestate.com/lists/python-ideas/11147/ (I just subscribed to this list.)
Another alternative is a "string-join" object. It is discussed (and implemented) in the following issue, and PyPy has also an optional implementation:
http://bugs.python.org/issue1569040
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-join-objects
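To give an idea of what a "string-join" object does, here is a small pure-Python sketch of the general technique: concatenation only records the pieces, and the real string is built when it is first needed. This is not the implementation from the issue above or from PyPy (those work inside the interpreter's own str type); the class name LazyConcat and its methods are made up for illustration.

```python
class LazyConcat(object):
    """Hypothetical lazy concatenation object (illustration only)."""

    def __init__(self, initial=''):
        self._pieces = [initial] if initial else []

    def __iadd__(self, other):
        # Recording a piece is cheap: no characters are copied here
        self._pieces.append(str(other))
        return self

    def __len__(self):
        return sum(len(piece) for piece in self._pieces)

    def __str__(self):
        # Join on demand and keep the result as the single remaining piece
        if len(self._pieces) > 1:
            self._pieces = [''.join(self._pieces)]
        return self._pieces[0] if self._pieces else ''


if __name__ == "__main__":
    s = LazyConcat('a')
    for chunk in ('b', 'c', 'd'):
        s += chunk
    print(str(s))
```

The cost of copying characters is paid once per join instead of once per +=, which is the same idea as the ''.join() pattern hidden behind the += syntax.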
Note: Python 2 has UserString.MutableString (and Python 3 has collections.UserString).
Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench_build_str.py
Type: text/x-python
Size: 1566 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20111001/7c57e2f9/attachment-0001.py>