[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Terry Reedy tjreedy at udel.edu
Wed Apr 29 23:03:30 CEST 2009
Thomas Breuel wrote:
> > Sure. However, that requires you to provide meaningful, reproducible counter-examples, rather than a stenographic formulation that might hint at some problem you apparently see (which I believe is just not there).
>
> Well, here's another one: PEP 383 would disallow UTF-8 encodings of half surrogates.

By my reading, the current Unicode 5.1 definition of 'UTF-8' disallows that.

> But such encodings are currently supported by Python, and they are used as part of CESU-8 coding. That's, in fact, a common way of converting UTF-16 to UTF-8. How are you going to deal with existing code that relies on being able to code half surrogates as UTF-8?
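
For readers who want to see the byte sequences at issue, here is a minimal sketch. It assumes a current Python 3 interpreter (3.1 or later), not the interpreters in use when this message was written, and the variable names are purely illustrative. It shows that the strict 'utf-8' codec rejects a lone surrogate, that the 'surrogatepass' error handler produces the three-byte form, how a CESU-8-style encoding of a non-BMP character can be built from its UTF-16 code units, and how the 'surrogateescape' handler proposed in PEP 383 round-trips an undecodable byte.

```python
# Illustrative only: assumes Python 3.1+; all names here are hypothetical.

lone = "\ud800"  # a lone high ("half") surrogate

# The strict UTF-8 codec refuses to encode a lone surrogate.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The 'surrogatepass' error handler emits the three-byte form anyway.
passed = lone.encode("utf-8", "surrogatepass")

# CESU-8 style: split a non-BMP character into its UTF-16 code units,
# then encode each surrogate separately as a three-byte sequence.
astral = "\U00010000"
utf16 = astral.encode("utf-16-be")
units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
cesu8 = b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)

# PEP 383's 'surrogateescape': an undecodable byte 0xNN becomes U+DCNN
# on decoding and is restored to the original byte on encoding.
escaped = b"\x80".decode("utf-8", "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")

print(strict_ok, passed, cesu8, astral.encode("utf-8"), ascii(escaped), roundtrip)
```

Standard UTF-8 encodes U+10000 in four bytes, while the CESU-8-style form above uses six bytes (two encoded surrogates), which is the kind of data the question about existing code concerns.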