[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Terry Reedy tjreedy at udel.edu
Thu Apr 30 23:39:10 CEST 2009


James Y Knight wrote:

On Apr 30, 2009, at 5:42 AM, Martin v. Löwis wrote:

I think you are right. I have now excluded ASCII bytes from being mapped, effectively not supporting any encodings that are not ASCII compatible. Does that sound ok?

Yes. The practical upshot of this is that users who brokenly use "ja_JP.SJIS" as their locale (which, note, first requires editing some files in /var/lib/locales manually to enable its use..) may still have python not work with invalid-in-shift-jis filenames. Since that locale is widely recognized as a bad idea to use, and is not supported by any distros, it certainly doesn't bother me that it isn't 100% supported in python. It seems like the most common reason why people want to use SJIS is to make old pre-unicode apps work right in WINE -- in which case it doesn't actually affect unix python at all.

I'd personally be fine with python just declaring that the filesystem-encoding will always be utf-8b and ignore the locale...but I expect some other people might complain about that. Of course, application authors can decide to do that themselves by calling sys.setfilesystemencoding('utf-8b') at the start of their program.

It seems to me that the 3.1+ doc set (or wiki) could be usefully extended with a How-to on working with filenames. I am not sure that everything useful fits anywhere in particular in the ref manuals.
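As a rough idea of what one entry in such a How-to might show, here is a minimal sketch of the byte mapping discussed above. It assumes the error handler as it later shipped in Python 3.1 under the name surrogateescape (the thread calls the scheme "utf-8b"); the helper names decode_filename and encode_filename are hypothetical, made up for illustration only.

```python
# Sketch only: assumes the "surrogateescape" error handler (Python 3.1+).
# decode_filename / encode_filename are hypothetical helper names.

def decode_filename(raw: bytes, encoding: str = "utf-8") -> str:
    # Undecodable bytes 0x80..0xFF become lone surrogates U+DC80..U+DCFF.
    return raw.decode(encoding, errors="surrogateescape")

def encode_filename(name: str, encoding: str = "utf-8") -> bytes:
    # Lone surrogates U+DC80..U+DCFF are turned back into the original bytes.
    return name.encode(encoding, errors="surrogateescape")

raw = b"caf\xe9.txt"              # Latin-1 bytes, not valid UTF-8
name = decode_filename(raw)
print(ascii(name))                # shows the escaped byte as a lone surrogate
print(encode_filename(name) == raw)

# ASCII bytes are excluded from the mapping: U+DC41 does not stand for b"A",
# so encoding it is an error rather than a silent conversion.
try:
    encode_filename("\udc41")
except UnicodeEncodeError:
    print("U+DC41 is rejected")
```

The round trip only holds for ASCII-compatible encodings, which is the restriction Martin describes: bytes below 0x80 always decode as themselves and are never escaped.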


