[Python-Dev] RFD: how to build strings from lots of slices?

Fredrik Lundh <effbot@telia.com>
Sun, 27 Feb 2000 13:41:06 +0100


Ka-Ping Yee wrote:

> > 1. introduce a PySliceListObject, which behaves like a
> >    simple sequence of strings, but stores them as slices.
>
> It occurred to me when i read this that all slices could be references
> within the original string, since strings are immutable.  That is,
>
>     s[x:y]
>
> could return a PyStringRefObject that behaves just like a string,
> but contains a length y - x and a pointer to &(s->ob_sval) + x
> instead of the character data itself.  The creation of this
> PyStringRefObject would increment the reference count of s by 1.
>
> Perhaps this has been suggested before.

as an experiment, I actually implemented this for the original unicode string type (where "split" and "slice" returned slice references, not string copies).
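
for illustration only, here is a minimal Python sketch of the slice-reference idea quoted above. the class name StringRef and its attributes (base, offset, length) are hypothetical; this is not the experimental unicode code, just one way the "pointer plus length" scheme could look:

```python
class StringRef:
    """Hypothetical slice reference: shares the base string instead of copying it."""

    def __init__(self, base, start, stop):
        # clamp the bounds the way ordinary slicing does
        start, stop, _ = slice(start, stop).indices(len(base))
        self.base = base                      # keeps the original string alive
        self.offset = start                   # plays the role of the pointer offset x
        self.length = max(0, stop - start)    # plays the role of y - x

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slicing a reference yields another reference into the same base
            start, stop, step = index.indices(self.length)
            if step != 1:
                raise ValueError("this sketch supports only step 1")
            return StringRef(self.base, self.offset + start, self.offset + stop)
        if index < 0:
            index += self.length
        if not 0 <= index < self.length:
            raise IndexError("StringRef index out of range")
        return self.base[self.offset + index]

    def __str__(self):
        # materialising the characters is the only place a copy is made
        return self.base[self.offset:self.offset + self.length]


s = "hello world"
ref = StringRef(s, 6, 11)
print(str(ref), len(ref), ref.base is s)
```

holding `base` as an attribute stands in for the reference count increment on s: the original string cannot go away while any reference into it exists.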

here are some arguments against it:

a) bad memory behaviour if you slice small strings out of huge input strings -- which may surprise newbies.

b) harder to interface to underlying C libraries -- the current string implementation guarantees that a Python string is also a C string (with a trailing null).

personally, I don't care much about (a) (e.g. match objects already keep references to the input string, and if this is a real problem, you can always use a more elaborate data structure...).
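
to make (a) concrete, a small sketch in present-day Python. memoryview is used here only as a stand-in for a slice reference (ordinary string and bytes slices copy their data in CPython), and the variable names are made up for the example:

```python
import re

big = b"x" * 1_000_000 + b"needle"

# a buffer view of the last 6 bytes: no copy, but it references all of `big`
view = memoryview(big)[-6:]

# a match object likewise holds on to the whole input it was run against
match = re.search(rb"needle", big)

print(view.nbytes, len(view.obj), match.string is big)

# copying the bytes out is what lets the large input be released
small = bytes(view)
```

as long as `view` or `match` is alive, so is the full megabyte; only the explicit copy in `small` is independent of it.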

(b) is a bit harder to ignore, though.
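
a rough way to see the problem in (b), using ctypes to read memory the way C code would. the "slice reference" here is just a hypothetical (pointer, length) pair set up for the example, not an actual Python object type:

```python
import ctypes

# a NUL-terminated C buffer holding "hello world" (12 bytes including the NUL)
buf = ctypes.create_string_buffer(b"hello world")
base = ctypes.addressof(buf)

# hypothetical slice reference for s[0:5]: just a pointer and a length
ptr, length = base, 5

# read with an explicit length, the way slice-aware code would
as_slice = ctypes.string_at(ptr, length)

# read as a C string, the way a char*-based library would: it runs to the NUL
as_c_string = ctypes.string_at(ptr)

# a copy with its own terminator is what such a library actually needs
copy = ctypes.create_string_buffer(as_slice)

print(as_slice, as_c_string, copy.value)
```

code that ignores the length sees the rest of the parent string, so every call into a library expecting a C string would need a copy like the one above, which gives back much of what the reference saved.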