[Python-Dev] Support of UTF-16 and UTF-32 source encodings

Laura Creighton lac at openend.se
Sun Nov 15 09:43:58 EST 2015


In a message of Sun, 15 Nov 2015 12:56:18 +0000, Paul Moore writes:

On 15 November 2015 at 07:23, Stephen J. Turnbull <stephen at xemacs.org> wrote:

I don't see any good reason for allowing non-ASCII-compatible encodings in the reference CPython interpreter.

From PEP 263:

Any encoding which allows processing the first two lines in the way indicated above is allowed as source code encoding, this includes ASCII compatible encodings as well as certain multi-byte encodings such as ShiftJIS. It does not include encodings which use two or more bytes for all characters like e.g. UTF-16. The reason for this is to keep the encoding detection algorithm in the tokenizer simple.

So this pretty much confirms that double-byte encodings are not valid for Python source files.

Paul
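The restriction quoted from PEP 263 can be illustrated with a rough sketch. It is not the CPython tokenizer (which is written in C and also handles a UTF-8 BOM); it only applies the cookie pattern given in PEP 263 to the raw bytes of the first two lines. The helper name find_cookie and the sample source string are made up for this illustration.

```python
import re

# Cookie pattern as given in PEP 263, applied here to raw bytes.
# Simplified sketch: no BOM handling, no validation of the encoding name.
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_cookie(raw: bytes):
    """Return the encoding named in the first two lines, or None."""
    for line in raw.split(b"\n")[:2]:
        match = COOKIE_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    return None


# Hypothetical two-line source file with a coding cookie on line 1.
source = "# -*- coding: {} -*-\nprint('hello')\n"

for name in ("utf-8", "shift_jis", "utf-16-le"):
    raw = source.format(name).encode(name)
    print(name, "->", find_cookie(raw))
```

With UTF-8 and Shift_JIS the ASCII characters of the cookie are stored as single ASCII bytes, so a byte-level search finds the declaration. In UTF-16 every character takes at least two bytes, so the letters of "coding" are never adjacent in the byte stream and the same search finds nothing unless the tokenizer decodes the file first, which is the complication the PEP avoids.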

Steve Turnbull, who lives in Japan and speaks and writes Japanese, is saying that he cannot see any good reason for allowing non-ASCII-compatible encodings in CPython.

This makes me wonder.

Is this along the lines of 'even in Japan we do not want such things', or along the lines of 'when in Japan we want such things, we want to do so much more, and so brutally, that you should keep the reference implementation simple and not try to help us with seems-like-a-good-idea-but-isn't-in-practice ideas like this one', or ....

Laura


