[Python-Dev] Support of UTF-16 and UTF-32 source encodings

Chris Angelico rosuav at gmail.com
Sat Nov 14 20:15:28 EST 2015

On Sun, Nov 15, 2015 at 12:06 PM, Steve Dower <steve.dower at python.org> wrote:

> The native encoding on Windows has been UTF-16 since Windows NT. Obviously we've survived without Python tokenization support for a long time, but every API uses it.
>
> I've hit a few cases where it would have been handy for Python to be able to detect it, though nothing I couldn't work around. Saying it is rarely used is rather exposing your own unawareness though - it could arguably be the most commonly used encoding (depending on how you define "used").

What matters here is: How likely is it that an arbitrary Python script (or, say, "arbitrary text file") is encoded UTF-16 rather than something ASCII-compatible? I think even Notepad defaults to UTF-8 for files, now. The fact that it's sending text to the GUI subsystem in UTF-16 is immaterial here.
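
For illustration only, here is a minimal sketch of how a tool might tell a UTF-16 or UTF-32 file apart from an ASCII-compatible one by looking for a byte order mark. The helper name `sniff_bom` is hypothetical, and this is not how CPython's tokenizer decides the source encoding; files without a BOM are simply reported as unknown.

```python
import codecs

# Check the longer BOMs first: the UTF-32-LE BOM (FF FE 00 00) begins
# with the same two bytes as the UTF-16-LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None.

    Hypothetical helper for illustration; the returned codecs consume
    the BOM themselves when used with bytes.decode().
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A caller could then fall back to an ASCII-compatible default, for example `data.decode(sniff_bom(data) or "utf-8")`.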

Can the py.exe launcher handle a UTF-16 shebang? (I'm pretty sure Unix program loaders won't.) That alone might be a reason for strongly encouraging ASCII-compat encodings.
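
To make the shebang concern concrete, the sketch below compares the leading bytes of the same shebang line in UTF-8 and UTF-16. It assumes a loader that recognizes a script by the literal two bytes `#!` at offset zero; the helper `starts_with_shebang_bytes` is hypothetical, and nothing here says how py.exe actually behaves.

```python
SHEBANG = "#!/usr/bin/env python3\n"


def starts_with_shebang_bytes(data: bytes) -> bool:
    """True if the first two bytes are the literal ASCII '#!'."""
    return data[:2] == b"#!"


utf8 = SHEBANG.encode("utf-8")          # ASCII-compatible: starts with 23 21
utf16 = SHEBANG.encode("utf-16")        # starts with a BOM, then 2 bytes per char
utf16le = SHEBANG.encode("utf-16-le")   # no BOM, but '#' is 23 00, not 23 21

for label, data in [("utf-8", utf8), ("utf-16", utf16), ("utf-16-le", utf16le)]:
    print(label, data[:4].hex(" "), starts_with_shebang_bytes(data))
```

Under that assumption, only the ASCII-compatible encoding passes the check: even without a BOM, UTF-16 places a zero byte between `#` and `!`.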

ChrisA