[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Aahz aahz at pythoncraft.com
Thu Apr 30 04:50:50 CEST 2009
On Thu, Apr 30, 2009, Cameron Simpson wrote:
> The lengthy discussion mostly revolves around:
>
>   - Glenn points out that strings that did not come from listdir, and
>     that are not well-formed Unicode (== "have bare surrogates in
>     them") but that were intended for use as filenames, will conflict
>     with the PEP's scheme: programs must know that these strings came
>     from outside and must translate them into the PEP's funny-encoding
>     before use in the os.* functions. Before the PEP they would be
>     used directly; after the PEP they encode differently, thus
>     producing different POSIX filenames. Breakage.
>
>   - Glenn would like the encoding to use Unicode scalar values only,
>     using a rare-in-filenames character. That would avoid the issue
>     with "outside" strings that contain surrogates. To my mind it just
>     moves the punning from rare illegal strings to merely uncommon but
>     legal characters.
>
>   - Some parties think it would be better to return from os.listdir
>     not strings but a subclass of string (or at least a duck-type of
>     string) that knows where it came from and is also handily
>     recognisable as not-really-a-string, so that whether it is
>     PEP-funny-encoded can be decided by direct inspection.
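The three points above can be illustrated with a short sketch. It assumes the scheme in the form it later shipped in Python 3.1 and newer, as the "surrogateescape" error handler; the thread itself predates that implementation. The "surrogatepass" handler is used here only as a stand-in for "encode the bare surrogate literally", and the class name OSName is hypothetical, not something proposed in the thread.

```python
# Sketch only: assumes the PEP's scheme as exposed by the "surrogateescape"
# error handler (Python 3.1+). OSName is a hypothetical name for illustration.

# A POSIX filename whose bytes are Latin-1, hence not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
name = raw.decode("utf-8", "surrogateescape")
assert name == "caf\udce9.txt"

# Encoding with the same handler restores the original bytes exactly.
assert name.encode("utf-8", "surrogateescape") == raw

# Point 1: an "outside" string that already contains U+DCE9 is
# indistinguishable from one produced by the decoding step above.
outside = "caf\udce9.txt"
assert outside == name

# Encoded literally (stand-in: "surrogatepass"), the surrogate becomes
# three bytes; under the PEP's scheme it becomes the single byte 0xE9.
# The same string therefore names two different POSIX files.
literal = outside.encode("utf-8", "surrogatepass")
escaped = outside.encode("utf-8", "surrogateescape")
assert literal == b"caf\xed\xb3\xa9.txt"
assert escaped == b"caf\xe9.txt"
assert literal != escaped

# Point 3: a str subclass that records its origin, so code can tell by
# inspection whether a value came from the OS and may be funny-encoded.
class OSName(str):
    """Hypothetical marker type for names obtained from the OS."""

tagged = OSName(name)
assert isinstance(tagged, str) and isinstance(tagged, OSName)
assert not isinstance(outside, OSName)
```

The second bullet's alternative (escaping with a rare but legal character) would remove the lone surrogate from the decoded string, at the cost of making some legal filenames ambiguous instead.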
Assuming people agree that this is an accurate summary, it should be incorporated into the PEP.
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/
"If you think it's expensive to hire a professional to do the job, wait until you hire an amateur." --Red Adair