[Python-Dev] String views

skip@pobox.com skip at pobox.com
Fri Sep 2 05:14:52 CEST 2005


>> I'm skeptical about performance as well, but not for that reason.  A
>> string object can have a referent field.  If not NULL, it refers to
>> another string object which is INCREFed in the usual way.  At string
>> deallocation, if the referent is not NULL, the referent is DECREFed.
>> If the referent is NULL, ob_sval is freed.

Michael> Won't work. A string may have multiple referents, so a single
Michael> referent field isn't sufficient.

Hmmm... I implemented it last night (though it has yet to be tested). I suspect it will work. Here's my PyStringObject struct:

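typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;           /* cached hash value */
    int ob_sstate;           /* interning state */
    PyObject *ob_referent;   /* NULL unless this string is a view onto another */
    char *ob_sval;           /* own buffer, or a pointer into the referent's */
} PyStringObject;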

(minus the invariants which I have yet to check). Suppose url is a string object whose value is "http://www.python.org/", and that it has a reference count of 1 and isn't a view onto another string. Its ob_referent field would be NULL. (Maybe it would be better named "ob_target".) If we then execute

before, sep, after = url.partition(":")

upon return, before, sep and after would each be a string object whose ob_referent field refers to url, and url's reference count would be 4. Their ob_sval fields would point to the start of their respective pieces of url's buffer. When the reference counts of before, sep and after reach zero, they are reclaimed. Since they each have a non-NULL ob_referent field, the target object is DECREFed, but the ob_sval field is not freed. In the case of url, when its reference count reaches zero, since its ob_referent field is NULL, its ob_sval field is freed.
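
As a rough illustration of the lifecycle just described, here is a toy model in plain C. It is not the actual patch and does not use the CPython headers: Str, str_new, str_view, str_decref and live_buffers are all made-up names standing in for PyStringObject, its allocator, and Py_DECREF, and the view offsets would in practice come from partition().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the PyStringObject above (hypothetical, not CPython). */
typedef struct Str {
    int refcnt;            /* plays the role of ob_refcnt */
    size_t size;           /* plays the role of ob_size */
    struct Str *referent;  /* NULL unless this string is a view */
    char *sval;            /* owned buffer, or pointer into referent's buffer */
} Str;

static int live_buffers = 0;   /* number of owned buffers not yet freed */

/* Create a string that owns a NUL-terminated copy of text. */
static Str *str_new(const char *text)
{
    Str *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->size = strlen(text);
    s->sval = malloc(s->size + 1);   /* the second malloc */
    if (s->sval == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->sval, text, s->size + 1);
    s->refcnt = 1;
    s->referent = NULL;
    live_buffers++;
    return s;
}

/* Create a view of len bytes starting at offset start within base. */
static Str *str_view(Str *base, size_t start, size_t len)
{
    Str *v;
    assert(start + len <= base->size);
    v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    v->refcnt = 1;
    v->size = len;
    v->referent = base;
    base->refcnt++;                  /* INCREF the referent */
    v->sval = base->sval + start;    /* no copy; not NUL-terminated */
    return v;
}

/* Drop one reference; reclaim the object when the count reaches zero. */
static void str_decref(Str *s)
{
    if (--s->refcnt > 0)
        return;
    if (s->referent != NULL) {
        str_decref(s->referent);     /* view: release the target only */
    } else {
        free(s->sval);               /* owner: free the buffer */
        live_buffers--;
    }
    free(s);
}
```

With url created by str_new and three views made onto it, url's count is 4, and its buffer is only freed once url and all three views have been released, in whatever order that happens.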

The only tricky business was PyString_AsString. If the argument object is a view you have to "un-view" it by copying the interesting bits and DECREFing the ob_referent. This is because of the NUL termination guarantee.
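
A minimal sketch of that "un-view" step, again as a toy model rather than the real PyString_AsString: Str, str_decref and str_as_string are hypothetical names, and error handling is reduced to returning NULL when malloc fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for the modified PyStringObject (hypothetical). */
typedef struct Str {
    int refcnt;
    size_t size;
    struct Str *referent;  /* NULL unless this string is a view */
    char *sval;            /* owned buffer, or pointer into referent's buffer */
} Str;

/* Drop one reference; owners free their buffer, views release their target. */
static void str_decref(Str *s)
{
    assert(s->refcnt > 0);
    if (--s->refcnt > 0)
        return;
    if (s->referent != NULL)
        str_decref(s->referent);
    else
        free(s->sval);
    free(s);
}

/* Return a NUL-terminated buffer for s. A view's bytes are followed by
 * whatever comes next in the target's buffer, so a view is first turned
 * into an ordinary string that owns a private, terminated copy. */
static char *str_as_string(Str *s)
{
    if (s->referent != NULL) {
        Str *target = s->referent;
        char *copy = malloc(s->size + 1);
        if (copy == NULL)
            return NULL;
        memcpy(copy, s->sval, s->size);   /* copy the interesting bits */
        copy[s->size] = '\0';             /* restore the NUL guarantee */
        s->sval = copy;
        s->referent = NULL;               /* s now owns its buffer */
        str_decref(target);               /* DECREF the old referent */
    }
    return s->sval;
}
```

After the call the object is indistinguishable from a string that was never a view, so its later deallocation takes the "free ob_sval" path.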

I wonder if the use of views would offset the overhead of returning to a double-malloc allocation.

Skip


