[Python-Dev] Byte string class hierarchy
"Martin v. Löwis" martin at v.loewis.de
Thu Aug 19 00:38:31 CEST 2004
Jack Jansen wrote:
> genericbytes
>     mutablebytes
>     bytes
>         genericstring
>             string
>             unicode
I think this hierarchy is wrong. unicode is not a specialization of genericbytes: a unicode string is made out of characters, not out of bytes.
> The basic type for all bytes, buffers and strings is genericbytes. This abstract base type is neither mutable nor immutable, and has the interface that all of the types would share. Mutablebytes adds slice assignment and such. Bytes, on the other hand, adds hashing.
There is a debate on whether such a type is really useful. Why do you need hashing on bytes?
> genericstring is the magic stuff that's there already that makes unicode and string interoperable for hashing and dict keys and such.
Interoperability, in Python, does not necessarily involve a common base type.
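As a minimal sketch of that point, int and float already interoperate as dict keys without any dedicated common base type. The variable names are illustrative only, and the example uses int and float rather than the string types under discussion:

```python
# Illustration only: int and float share no base class other than object,
# yet equal values hash alike and act as the same dict key.
shared_bases = set(int.__mro__) & set(float.__mro__)
only_object_shared = shared_bases == {object}

# Equal numeric values are required to have equal hashes.
same_hash = hash(1) == hash(1.0)

# A key stored as an int can be looked up with the equal float.
table = {1: "one"}
lookup = table[1.0]

print(only_object_shared, same_hash, lookup)
```

The interoperability here comes from the types agreeing on equality and hashing, not from inheritance.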
> Casting to a basetype is always free and doesn't copy anything.
And, of course, there is no casting at all in Python.
> Operations like concatenation return the most specialised class.
Assuming the hierarchy at the top of your message, what does that mean? Suppose I want to concatenate unicode and string: which of them is more specialized?
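To make the ambiguity concrete, here is a hedged sketch using the bytes and bytearray types of present-day Python 3. That is an assumption for illustration: these are not the types in the proposed hierarchy. Neither type is a subclass of the other, so "most specialised" gives no answer, and the result type is simply decided by the left operand:

```python
# Illustration only: bytes and bytearray are siblings, not parent and child.
neither_is_subclass = (not issubclass(bytes, bytearray)
                       and not issubclass(bytearray, bytes))

# Concatenating mixed operands: the left operand's type wins.
left_bytes = b"ab" + bytearray(b"cd")
left_array = bytearray(b"ab") + b"cd"

result_types = (type(left_bytes).__name__, type(left_array).__name__)
print(neither_is_subclass, result_types)
```

Any rule of this kind has to be spelled out per operation; it does not follow from the class hierarchy alone.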
> Read() is guaranteed only to return genericbytes, but if you open a file in text mode it returns strings, and we should add the ability to open files for unicode and probably mutablebytes too.
I think Guido's proposal is that read(), in text mode, returns Unicode strings, and (probably) that there is no string type in Python anymore. read() on binary files would return a mutable byte array.
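For comparison, a small sketch of how the text/binary split behaves in present-day Python 3. This is an assumption relative to this thread, which predates that design, and the file contents are made up for the example. Text mode yields Unicode strings, while binary mode yields an immutable bytes object rather than the mutable array described above:

```python
import os
import tempfile

# Write a few UTF-8 encoded bytes to a temporary file.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write("Löwis".encode("utf-8"))

    # Text mode decodes the bytes and returns a Unicode string.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

    # Binary mode returns the raw bytes, undecoded.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

# A mutable copy has to be requested explicitly.
mutable = bytearray(raw)
print(type(text).__name__, type(raw).__name__, len(text), len(raw))
```

The length difference (5 characters, 6 bytes) is the characters-versus-bytes distinction in miniature.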
Regards, Martin