[Python-Dev] PEP 393: Special-casing ASCII-only strings
"Martin v. Löwis" martin at v.loewis.de
Thu Sep 15 23:39:13 CEST 2011
> I like it. If we start with such an optimization, we can also remove the data pointer from strings allocated by the new API (it can be computed: object pointer + size of the structure). See my email for my proposed structures: Re: [Python-Dev] PEP 393 review, Thu Aug 25 00:29:19 2011
I agree it is tempting to drop the data pointer. However, I'm not sure how many different structures we would end up with, and how the aliasing rules would defeat this (you cannot interpret a struct X* as a struct Y*, unless either X is the first field of Y or vice versa).
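The first-field rule mentioned above can be illustrated with a minimal sketch. The types `Base` and `Derived` and the function `base_length` are hypothetical names made up for this example; they are not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, for illustration only. */
typedef struct {
    size_t length;
    long hash;
} Base;

typedef struct {
    Base base;          /* Base is the first field of Derived */
    size_t extra_len;
} Derived;

/* Because Base is the first field, a Derived* may be converted to a Base*
   and dereferenced: both pointers refer to the same address. */
static size_t base_length(Derived *d)
{
    Base *b = (Base *)d;
    assert(offsetof(Derived, base) == 0);
    return b->length;
}
```

Two unrelated structs that merely happen to start with the same fields do not get this guarantee, which is why nesting one struct inside the other matters here.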
Thinking about this, the following may work:
- ASCIIObject: state, length, hash, wstr*, data follow
- SingleBlockUnicode: ASCIIObject, wstr_len, utf8*, utf8_len, data follow
- UnicodeObject: SingleBlockUnicode, data pointer, no data follow
This is essentially your proposal, except that the wstr_len is dropped for ASCII strings, and that it uses nested structs.
The single-block variants would always be "ready"; the full unicode object is ready only if the data pointer is set.
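As a rough sketch of how the three layouts listed above might nest, assuming simplified field types and leaving out the usual object header: the field types, the helpers `ascii_data`, `ascii_new` and `unicode_is_ready`, and the use of `malloc` are all assumptions made for illustration, not the actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical layouts; field types are assumptions and the
   object header (refcount, type pointer) is omitted. */
typedef struct {
    unsigned int state;
    size_t length;
    long hash;
    wchar_t *wstr;
    /* ASCII data follows the struct in the same block */
} ASCIIObject;

typedef struct {
    ASCIIObject base;   /* first field, so the pointer conversion is valid */
    size_t wstr_len;
    char *utf8;
    size_t utf8_len;
    /* character data follows the struct in the same block */
} SingleBlockUnicode;

typedef struct {
    SingleBlockUnicode base;
    void *data;         /* separate allocation; NULL until the object is ready */
} UnicodeObject;

/* No stored data pointer: object pointer + size of the structure. */
static char *ascii_data(ASCIIObject *op)
{
    return (char *)(op + 1);
}

/* Allocate header and data as a single block (hypothetical helper). */
static ASCIIObject *ascii_new(const char *s)
{
    size_t n = strlen(s);
    ASCIIObject *op = malloc(sizeof(ASCIIObject) + n + 1);
    if (op == NULL)
        return NULL;
    op->state = 0;
    op->length = n;
    op->hash = -1;
    op->wstr = NULL;
    memcpy(ascii_data(op), s, n + 1);
    return op;
}

/* The full object is ready only once its data pointer is set. */
static int unicode_is_ready(const UnicodeObject *op)
{
    return op->data != NULL;
}
```

In this sketch a single-block object is usable as soon as it is allocated, since its data lives right behind the header, while a `UnicodeObject` has to be checked with `unicode_is_ready` first.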
I'll try it out, unless somebody can punch a hole into this proposal :-)
Regards, Martin