[Python-Dev] efficient string concatenation (yep, from 2004)
Christian Tismer tismer at stackless.com
Wed Feb 13 13:39:58 CET 2013
On 13.02.13 13:10, Steven D'Aprano wrote:
> On 13/02/13 10:53, Christian Tismer wrote:
>> Hi friends,
>> efficient string concatenation has been a topic in 2004. Armin Rigo proposed a patch with the name of the subject, more precisely: "[Patches] [ python-Patches-980695 ] efficient string concatenation" on sourceforge.net, on 2004-06-28.
>> This patch was finally added to Python 2.4 on 2004-11-30. Some people might remember the larger discussion if such a patch should be accepted at all, because it changes the programming style for many of us from "don't do that, stupid" to "well, you may do it in CPython", which has quite some impact on other implementations (is it fast on Jython, now?).
> I disagree. If you look at the archives on the python-list@ and tutor at python.org mailing lists, you will see that whenever string concatenation comes up, the common advice given is to use join. The documentation for strings is also clear that you should not rely on this optimization: http://docs.python.org/2/library/stdtypes.html#typesseq
> And quadratic performance for repeated concatenation is not unique to Python: it applies to pretty much any language with immutable strings, including Java, C++, Lua and Javascript.
>> It changed for instance my programming and teaching style a lot, of course!
> Why do you say, "Of course"? It should not have changed anything.
You are right, I was actually over the top with my rant; I never recommend string concatenation when working with real amounts of data. The surprise was just so big.
I tend to use whatever fits best for small initialization of some modules, where the fact that concat is cheap lets me stop thinking about big O. Although it probably does not matter much, it makes me feel uncomfortable to do something with potentially bad asymptotics.
> Best practice remains the same:
> - we should still use join for repeated concatenations;
> - we should still avoid + except for small cases which are not performance critical;
> - we should still teach beginners to use join;
> - while this optimization is nice to have, we cannot rely on it being there when it matters.
I agree that CPython does say this clearly. Actually, I was complaining about the PyPy documentation, which does not mention this, and that matters because PyPy is already so very compatible.
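For readers of the archive, the best-practice points quoted above can be illustrated with a minimal sketch. The function names build_with_join and build_with_concat are hypothetical and chosen only for this example; they are not from the patch or the thread. Both produce the same string, but only the join version has linear cost guaranteed on every implementation, while the += loop depends on an optimization that may not be there.

```python
def build_with_join(parts):
    """Concatenate all pieces in a single pass with str.join.

    The cost is linear in the total length on any implementation.
    """
    return "".join(parts)


def build_with_concat(parts):
    """Concatenate with repeated +=.

    Each step may copy the whole intermediate string, so the total
    cost can be quadratic where the in-place optimization is absent.
    """
    result = ""
    for part in parts:
        result += part
    return result


# Hypothetical sample data for illustration only.
parts = [str(i) for i in range(1000)]

# Both approaches give the same result; they differ only in cost.
assert build_with_join(parts) == build_with_concat(parts)
```

For a handful of short strings at module initialization either form is fine; the difference only shows up as the number and size of the pieces grow.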
In 2004, when this stuff came up, PyPy was already quite active, but the Psyco mindset was still around, too. Maybe my slightly shocked reaction originates from there, and my implicit assumption was never corrected ;-)
cheers - chris
--
Christian Tismer             :^)   <mailto:tismer at stackless.com>
Software Consulting          :     Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121     :     Starship http://starship.python.net/
14482 Potsdam                :     PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776  fax +49 (30) 700143-0023
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/