[Python-Dev] String views

Jim Jewett jimjjewett at gmail.com
Fri Sep 2 01:33:36 CEST 2005


Tim Delaney writes:

One of the big disadvantages of string views is that they need to keep the original object around, no matter how big it is. But in the case of partition, much of the time the original string survives for at least a similar period to the partitions.

Michael Chermside writes:

Didn't several of Raymond's examples use the idiom:

    part1, _, s = s.partition(firstsep)
    part2, _, s = s.partition(secondsep)
    part3, _, s = s.partition(thirdsep)

Yes, but in those cases, generally the entire original string was being kept by at least some part_#, so there really wasn't any wasted space. The problem only really shows up when a single 5-byte string keeps a 10K buffer alive. If it supports 2000 such strings, then everything is fine.
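To make the 5-byte versus 10K case concrete, here is a small sketch using `memoryview` over `bytes` as a stand-in for a string view. It is not the proposed string-view type, only an existing Python object that shares a buffer in a similar way. The names `big`, `small`, `retained` and `own_copy` are made up for the illustration.

```python
# memoryview is used here only as an analogy for a string view.
big = bytes(10 * 1024)        # a 10K buffer
small = memoryview(big)[:5]   # a 5-byte view; no bytes are copied
del big                       # drop our own name for the buffer

# The view still references the whole original object, so the
# 10K buffer cannot be freed while the view is alive.
retained = len(small.obj)

# Copying the 5 bytes out gives an object that owns only what it needs.
own_copy = bytes(small)
small.release()               # now the 10K buffer can be reclaimed
```

Until the view is released or copied, the full buffer stays allocated on behalf of five bytes of useful data.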

Skip writes:

I'm skeptical about performance as well, but not for that reason. A string object can have a referent field. If not NULL, it refers to another string object which is INCREFed in the usual way. At string deallocation, if the referent is not NULL, the referent is DECREFed. If the referent is NULL, ob_sval is freed.
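The referent scheme described above could be modeled roughly as follows. This is a toy model in Python with hand-maintained reference counts, not CPython code; the class `Str`, its fields, and the `decref` helper are hypothetical names chosen for the illustration.

```python
class Str:
    """Toy model of a string object with an optional `referent` field."""

    def __init__(self, chars=None, referent=None, start=0, length=0):
        self.refcount = 1
        self.referent = referent
        if referent is None:
            # Owns its characters (plays the role of ob_sval).
            self.chars = chars
            self.start, self.length = 0, len(chars)
        else:
            # Borrows characters from another string: INCREF it.
            referent.refcount += 1
            self.chars = None
            self.start, self.length = start, length


def decref(s):
    """Drop one reference; on deallocation follow the rule in the text."""
    s.refcount -= 1
    if s.refcount == 0:
        if s.referent is not None:
            decref(s.referent)   # referent is not NULL: DECREF it
        else:
            s.chars = None       # referent is NULL: free the characters


base = Str("hello world")
sub = Str(referent=base, start=0, length=5)
decref(base)                      # the original name goes away
alive_after_base_drop = base.chars is not None
decref(sub)                       # last user of the buffer goes away
freed_after_sub_drop = base.chars is None
```

In this model each string has at most one referent, while one base string may be the referent of many substrings.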

Michael Chermside writes:

Won't work. A string may have multiple referents, so a single referent field isn't sufficient.

I think you're looking at it backwards. A string would use a reference to a (series of characters) instead of ob_sval, just as dictionaries point to a table instead of small_table.

The catch (as Tim mentioned) is that the underlying series of characters might be much larger than this string needs. If it isn't shared, then the extra is wasted.

One way to deal with this might be to have the strings clean up when they're called. If the string's length multiplied by the number of references to the buffer is much less than the size of the buffer, then the string should make its own small copy. Whether the complication is worth it, I don't know.
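A minimal sketch of that cleanup rule, under some assumptions: the buffer keeps an explicit count of the strings sharing it, and "much less" is taken to mean "less by a factor of `slack`", where the default of 4 is an arbitrary choice. `SharedBuffer`, `StringView` and `compact` are hypothetical names, not an existing API.

```python
class SharedBuffer:
    """Hypothetical shared series of characters with a share count."""

    def __init__(self, data):
        self.data = data
        self.refs = 0


class StringView:
    """Hypothetical string that points into a SharedBuffer."""

    def __init__(self, buf, start, length):
        self.buf, self.start, self.length = buf, start, length
        buf.refs += 1

    def value(self):
        return self.buf.data[self.start:self.start + self.length]

    def compact(self, slack=4):
        """Make a private copy if the shared buffer is mostly wasted.

        `slack` is an assumed threshold standing in for "much less".
        """
        if self.length * self.buf.refs * slack < len(self.buf.data):
            small = SharedBuffer(self.value())
            small.refs = 1
            self.buf.refs -= 1        # release our share of the big buffer
            self.buf, self.start = small, 0
            return True
        return False


# A single 5-character string holding a 10K buffer: worth copying.
lonely_buf = SharedBuffer("x" * 10240)
lonely = StringView(lonely_buf, 0, 5)
lonely_copied = lonely.compact()

# 2000 such strings sharing the buffer: the buffer is well used.
busy_buf = SharedBuffer("x" * 10240)
views = [StringView(busy_buf, i * 5, 5) for i in range(2000)]
busy_copied = views[0].compact()
```

The estimate treats every sharer as being about the same length as the current string, which is cheap to compute but only approximate.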

-jJ


