[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

"Martin v. Löwis" martin at v.loewis.de
Wed Apr 29 07:52:23 CEST 2009


C. File on disk with the invalid surrogate code, accessed via the str interface, no decoding happens, matches in memory the file on disk with the byte that translates to the same surrogate, accessed via the bytes interface. Ambiguity.

Is that an alternative to A and B? I guess it is an adjunct to case B, the current PEP. It is what happens when using the PEP on a system that provides both bytes and str interfaces, and both get used.

Your formulation is a bit too stenographic for me, but please trust me that there is no ambiguity in the case you construct.

By "accessed via the str interface", I assume you do something like

  fn = "some string"
  open(fn)

You are wrong in assuming "no decoding happens", and that "matches in memory the file on disk" (whatever that means - how do I match a file on disk in memory?). What happens instead is that fn gets encoded with the file system encoding and the python-escape error handler. This will not produce an ambiguity.
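
To make the encoding step concrete, here is a minimal sketch. It rests on two assumptions that are not in the message above: that the file system encoding is UTF-8, and that the handler is spelled "surrogateescape", which is the name it carries in released Python 3 (the message calls it python-escape). The byte strings are made-up examples.

```python
# Sketch only: "surrogateescape" is the name the handler carries in released
# Python 3; the message above calls it "python-escape".
HANDLER = "surrogateescape"

# A hypothetical file name as stored on disk, with a byte that is not valid
# UTF-8 (UTF-8 is assumed as the file system encoding here).
raw_name = b"caf\xe9.txt"

# bytes -> str: the undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
str_name = raw_name.decode("utf-8", HANDLER)

# str -> bytes: the kind of encoding applied when a str name is handed to open().
round_tripped = str_name.encode("utf-8", HANDLER)

# A name on disk holding the UTF-8-style byte sequence for U+DCE9 itself.
# Those three bytes are not valid UTF-8 either, so each one is escaped
# separately and the result differs from the single surrogate above.
literal_surrogate = b"\xed\xb3\xa9"
decoded_literal = literal_surrogate.decode("utf-8", HANDLER)

print(ascii(str_name))
print(round_tripped == raw_name)
print(ascii(decoded_literal))
```

Under these assumptions the two on-disk names map to different strings, and each string encodes back to exactly the bytes it came from.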

If you think there is an ambiguity in that you can use both the bytes interface and the str interface to access the same file, that would be a ridiculous interpretation. Of course you can access /etc/passwd both as "/etc/passwd" and b"/etc/passwd"; there is nothing ambiguous about that.
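
A small sketch of that last point, using a temporary file rather than /etc/passwd so it runs anywhere. The file name "passwd-demo" and the variable names are made up for illustration; os.fsencode is the standard-library helper that applies the file system encoding to a str path.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a file and refer to it by a str path.
    str_path = os.path.join(tmp_dir, "passwd-demo")
    with open(str_path, "w") as f:
        f.write("root:x:0:0\n")

    # The same path as bytes, produced with the file system encoding.
    bytes_path = os.fsencode(str_path)

    # Open the file through both interfaces and compare what comes back.
    with open(str_path, "rb") as via_str, open(bytes_path, "rb") as via_bytes:
        same_content = via_str.read() == via_bytes.read()

    # Ask the OS whether both spellings name the same file.
    same_file = os.path.samefile(str_path, bytes_path)

print(same_content, same_file)
```

Both spellings are just two ways of naming one file; which interface is used does not change which file is opened.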

Regards, Martin


