[Python-Dev] Question about the current implementation of str
Nick Coghlan ncoghlan at gmail.com
Sat Apr 9 03:18:10 EDT 2016
On 9 April 2016 at 10:56, Larry Hastings <larry at hastings.org> wrote:
> I have a straightforward question about the str object, specifically the PyUnicodeObject. I've tried reading the source to answer the question myself but it's nearly impenetrable. So I was hoping someone here who understands the current implementation could answer it for me.
>
> Although the str object is immutable from Python's perspective, the C object itself is mutable. For example, for dynamically-created strings the hash field may be lazy-computed and cached inside the object. I was wondering if there were other fields like this. For example, are there similar lazy-computed cached objects for the different encoded versions (utf8, utf16) of the str?
>
> What would really help is an exhaustive list of the fields of a str object that may ever change after the object's initial creation.
https://www.python.org/dev/peps/pep-0393/#specification should have most of the relevant details.
Aside from the hash and the interned-or-not flag in the state, most things should be locked once the string is ready, except that generating the utf-8 and wchar_t forms is deferred until they're needed if they're not the same as the canonical form.
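One rough way to watch the deferred utf-8 form appear from Python is sketched below. It assumes CPython, where `ctypes.pythonapi` exposes the C-API function `PyUnicode_AsUTF8AndSize` and where `sys.getsizeof` counts a separately cached utf-8 buffer; the helper name `utf8_cache_sizes` is made up for illustration. Treat it as an experiment under those assumptions, not a description of guaranteed behaviour.

```python
import ctypes
import sys


def utf8_cache_sizes(text):
    """Return (size_before, size_after, utf8_length) for a str object.

    Hypothetical helper: it asks CPython for the UTF-8 form of `text`
    through the C API, which may cache that form inside the str object.
    """
    as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8AndSize
    as_utf8.argtypes = [ctypes.py_object, ctypes.POINTER(ctypes.c_ssize_t)]
    as_utf8.restype = ctypes.c_char_p

    size_before = sys.getsizeof(text)
    utf8_length = ctypes.c_ssize_t()
    # Requesting the UTF-8 form is the step that can populate the cache.
    as_utf8(text, ctypes.byref(utf8_length))
    size_after = sys.getsizeof(text)
    return size_before, size_after, utf8_length.value


if __name__ == "__main__":
    # Built at run time so the object is freshly created, and non-ASCII
    # so its UTF-8 form differs from the canonical representation.
    sample = "".join(["caf", chr(0xE9)])
    before, after, n = utf8_cache_sizes(sample)
    print("getsizeof before:", before)
    print("getsizeof after: ", after)
    print("utf-8 length:    ", n)
```

If the assumptions hold, the reported size of a non-ASCII string is expected to grow after the call, while the string's value, length, and hash as seen from Python stay the same. A pure-ASCII string should show no growth, since its utf-8 form is the same as the canonical form.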
Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia