[Python-Dev] efficient string concatenation (yep, from 2004)

Christian Tismer tismer at stackless.com
Thu Feb 14 00:49:19 CET 2013


Hi Lennart,

Sent from my Ei4Steve

On Feb 13, 2013, at 8:42, Lennart Regebro <regebro at gmail.com> wrote:

>> Something is needed - a patch for PyPy or for the documentation I guess.
>
> Not arguing that it wouldn't be good, but I disagree that it is needed.
>
> This is only an issue when you, as in your proof, have a loop that does
> concatenation. This is usually when looping over a list of strings that
> should be concatenated together. Doing so in a loop with concatenation
> may be the natural way for people new to Python, but the "natural" way
> to do it in Python is with a ''.join() call.
>
> This:
>
>     s = ''.join(('X' for x in xrange(x)))
>
> Is more than twice as fast in Python 2.7 than your example. It is in
> fact also slower in PyPy 1.9 than Python 2.7, but only with a factor of
> two:
>
>     Python 2.7: time for 10000000 concats = 0.887
>     Pypy 1.9:   time for 10000000 concats = 1.600
>
> (And of course s = 'X' * x takes only about a hundredth of the time, but
> that's cheating. ;-)
>
> //Lennart

None of this really concerns me, as long as performance stays within roughly the same order of magnitude or, better, the same big-O. I'm not concerned by a constant factor. I'm concerned by a machine that freezes because code suddenly gets 10000 times slower, since the algorithms never explicitly state their algorithmic complexity. (I think I have said this too often today?)
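To illustrate the complexity point, here is a minimal sketch; it is not the benchmark from this thread. The function names are made up for illustration, and copies_without_optimisation() is only a cost model: it assumes that every += allocates a fresh string and copies everything built so far, which is roughly what happens when an interpreter cannot resize the string in place.

```python
def concat_loop(pieces):
    """Build a string with repeated +=.

    Linear only where the interpreter can resize the string in place;
    otherwise every step copies the whole string built so far.
    """
    s = ''
    for p in pieces:
        s += p
    return s


def concat_join(pieces):
    """Build the same string with a single join call."""
    return ''.join(pieces)


def copies_without_optimisation(pieces):
    """Characters copied if each += creates a new string (a model, not a measurement)."""
    total = 0
    length = 0
    for p in pieces:
        length += len(p)   # size of the new string after this step
        total += length    # the new string is written out in full
    return total


pieces = ['X'] * 1000
assert concat_loop(pieces) == concat_join(pieces)
print(len(concat_join(pieces)))             # 1000 characters in the result
print(copies_without_optimisation(pieces))  # 500500 characters copied in the model
```

Under this model the loop copies n(n+1)/2 characters for n one-character pieces, while the result itself is only n characters long. That gap is a change in big-O, not a constant factor.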

As a side note: something similar happened to me when somebody used "range" in Python 3.3. He then ran the same code on Python 2.7, with the crazy effect of having to reboot: range() on 2.7, with arguments taken from some arbitrary input file. A newbie error that was hard to understand, because he had been taught to think 'xrange' when writing 'range'. It was hard for me to understand because I am no longer able to make these errors at all, or even to expect them.
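A small sketch of the difference, written for Python 3 and assuming CPython (sys.getsizeof is not supported on every implementation). The value of n is arbitrary, and list(range(n)) stands in for what range(n) builds on Python 2.x.

```python
import sys

n = 10**6

lazy = range(n)         # Python 3: small fixed-size object, like xrange on 2.x
eager = list(range(n))  # stand-in for Python 2.x range(n): one list slot per element

# getsizeof reports only the container itself, not the integers it refers to,
# so the real footprint of the list is larger still.
print(sys.getsizeof(lazy))
print(sys.getsizeof(eager))
```

The lazy object stays the same size however large n gets, while the list grows with n. With n read from an arbitrary input file, the list version can exhaust memory.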

Cheers - Chris


