[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Hrvoje Niksic hrvoje.niksic at avl.com
Tue Apr 28 14:46:11 CEST 2009
- Previous message: [Python-Dev] PEP 383 (again)
- Next message: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Thomas Breuel wrote:
> But the biggest problem with the proposal is that it isn't needed: if you want to be able to turn arbitrary byte sequences into unicode strings and back, just set your encoding to iso8859-15. That already works and it doesn't require any changes.
Are you proposing to unconditionally decode file names as iso8859-15, or to do so only when undecodable bytes are encountered?
If you unconditionally set encoding to iso8859-15, then you are effectively reverting to treating file names as bytes, regardless of the locale. You're also angering a lot of European users who expect iso8859-2, etc.
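A minimal sketch of the unconditional case, in Python 3 syntax (the sample file name is made up for illustration). Every byte value has a meaning in iso8859-15, so decoding never fails and the bytes round-trip, but a name stored as UTF-8 no longer reads as the text the user typed:

```python
# Hypothetical file name, as a UTF-8 locale would store it on disk.
original = "žluťoučký.txt"
raw = original.encode("utf-8")

# Decoding as iso8859-15 never raises: each byte maps to exactly one character.
as_latin9 = raw.decode("iso8859-15")

# The bytes survive the round trip...
assert as_latin9.encode("iso8859-15") == raw
# ...but the resulting string is not the name the user sees in a UTF-8 locale.
assert as_latin9 != original

# The same round trip holds for every possible byte value.
all_bytes = bytes(range(256))
assert all_bytes.decode("iso8859-15").encode("iso8859-15") == all_bytes
```

In other words, the string is only an opaque carrier for the bytes; its characters are meaningful only to users whose locale really is iso8859-15.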
If you switch to iso8859-15 only in the presence of undecodable UTF-8, then you have the same round-trip problem as the PEP: both b'\xff' and b'\xc3\xbf' will be converted to u'\u00ff' without a way to unambiguously recover the original file name.
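The collision can be reproduced with a short sketch in Python 3 syntax. The helper name `decode_with_fallback` is hypothetical and only models the "switch to iso8859-15 when UTF-8 fails" strategy described above; it is not part of any proposal:

```python
def decode_with_fallback(raw: bytes) -> str:
    """Decode as UTF-8, falling back to iso8859-15 on invalid input."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("iso8859-15")


# b'\xff' is not valid UTF-8, so the fallback path is taken.
a = decode_with_fallback(b"\xff")
# b'\xc3\xbf' is the valid UTF-8 encoding of U+00FF.
b = decode_with_fallback(b"\xc3\xbf")

# Two different file names on disk collapse into the same string.
assert a == b == "\u00ff"

# Encoding back can reproduce at most one of the two originals.
assert a.encode("utf-8") == b"\xc3\xbf"
assert a.encode("iso8859-15") == b"\xff"
```

Given only the string `'\u00ff'`, nothing indicates which of the two byte sequences it came from, which is the ambiguity in question.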