[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Cameron Simpson cs at zip.com.au
Fri Apr 24 01:47:24 CEST 2009


On 22Apr2009 21:17, Martin v. Löwis <martin at v.loewis.de> wrote:
| > -1. On UNIX, character data is not sufficient to represent paths. We
| > must, must, must continue to have a simple bytes interface to these
| > APIs.
|
| I'd like to respond to this concern in three ways:
|
| 1. The PEP doesn't remove any of the existing interfaces. So if the
|    interfaces for byte-oriented file names in 3.0 work fine for you,
|    feel free to continue to use them.

Ok. I think I had read things as supplanting byte-oriented interfaces with this exciting new strings-can-do-it-all approach.

| 2. Even if they were taken away (which the PEP does not propose to do),
|    it would be easy to emulate them for applications that want them.
|    For example, listdir could be wrapped as
|
|    def listdirb(bytestring):
|        fse = sys.getfilesystemencoding()

Alas, no, because there is no sys.getfilesystemencoding() at the POSIX level. It's only the user's current locale stuff on a UNIX system, and has nothing to do with the filesystem because UNIX filesystems don't have encodings.

In particular, because the "best" (or to my mind "misleading") you can do for this is report what the current user thinks:
  http://docs.python.org/library/sys.html#sys.getfilesystemencoding
then there's no guarantee that what is chosen has any relationship to what was in use when the files being consulted were made.
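To make that concrete, here is a minimal sketch (illustration only, not part of the PEP or of anyone's proposal in this thread). The file name bytes and the helper name decode_candidates are hypothetical; the point is just that the same stored bytes give different answers, or no answer at all, depending on which encoding the reader's locale happens to name.

```python
def decode_candidates(name, encodings):
    """Try to decode the same raw file name under several encodings.

    Returns a dict mapping each encoding to the decoded string, or to
    None when the bytes are not valid in that encoding.
    """
    results = {}
    for enc in encodings:
        try:
            results[enc] = name.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


if __name__ == "__main__":
    # Hypothetical name, written long ago by a user in a Latin-1 locale.
    raw = b"caf\xe9"
    for enc, text in decode_candidates(raw, ["latin-1", "koi8-r", "utf-8"]).items():
        print(enc, repr(text))
```

Nothing on disk records which of those readings the file's creator intended, which is why a locale-derived answer is only a guess.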

Now, if I were writing listdir_b() I'd want to be able to do something along these lines:

Then I'd have some confidence that I had got hold of the bytes as they had come from the underlying UNIX system call, and a way to get those bytes back to a UNIX system call intact.
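For reference, the round-trip property at issue can be sketched as follows. This is only an illustration under stated assumptions: it is not the scheme I have in mind above, it follows the shape of the wrapper Martin quotes, it uses the "surrogateescape" error handler that current Python 3 ships (which I am assuming plays the role of the draft's "python-escape"), and it calls str.encode() where the quoted code says .encoded(). The helper name roundtrip is hypothetical.

```python
import os
import sys
import tempfile


def listdirb(bytestring):
    """Yield directory entries as bytes, going through str and back.

    Assumption: "surrogateescape" stands in for the draft's "python-escape".
    """
    fse = sys.getfilesystemencoding()
    string = bytestring.decode(fse, "surrogateescape")
    for fn in os.listdir(string):
        yield fn.encode(fse, "surrogateescape")


def roundtrip(raw, encoding):
    """Decode raw bytes to str, then encode back with the same handler."""
    text = raw.decode(encoding, "surrogateescape")
    return text.encode(encoding, "surrogateescape")


if __name__ == "__main__":
    # Bytes that are not valid UTF-8 still come back unchanged.
    print(roundtrip(b"caf\xe9", "utf-8") == b"caf\xe9")
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "plain.txt"), "w").close()
        print(list(listdirb(os.fsencode(d))))
```

Note that the round trip only holds when the same encoding is used in both directions; it says nothing about whether the intermediate string is what the file's creator meant.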

|        string = bytestring.decode(fse, "python-escape")
|        for fn in os.listdir(string):
|            yield fn.encoded(fse, "python-escape")
|
| 3. I still disagree that we must, must, must continue to provide these
|    interfaces. I don't understand from the rest of your message what
|    would actually break if people would use the proposed interfaces.

My other longer message describes what would break, if I understand your proposal.

Cameron Simpson <cs at zip.com.au> DoD#743
http://www.cskk.ezoshosting.com/cs/


