[Python-Dev] PEP 263 -- Python Source Code Encoding

Guido van Rossum guido@python.org
Tue, 26 Feb 2002 16:53:55 -0500


In phase 2, the encoding will apply to all strings. So it will not be possible to put arbitrary byte sequences in a string literal, at least if the encoding disallows certain byte sequences (as UTF-8 and ASCII do). Since this is currently possible, we have a backwards compatibility problem.

I would say that any program that currently uses non-ASCII in string literals (whether Unicode or 8-bit literals) is, strictly speaking, undefined. For cases where a specific encoding is used, the solution is easy: add an explicit encoding declaration. Other cases are simply garbage and should use \xDD escapes instead.
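
To make the two remedies concrete, here is a minimal sketch. It assumes a current Python 3 interpreter, where source without a declaration is decoded as UTF-8; that is not the behavior of the interpreter being discussed in this message. The helper name compiles and the sample sources are hypothetical, chosen only for illustration.

```python
# A string literal containing a raw 0xE9 byte (e-acute in Latin-1).
# With no declared encoding, the byte sequence is not valid UTF-8.
raw_source = b's = "caf\xe9"\n'

# Remedy 1: the same bytes, preceded by an explicit encoding declaration.
declared_source = b"# -*- coding: latin-1 -*-\n" + raw_source

# Remedy 2: pure-ASCII source that spells the byte as a \xDD escape.
escaped_source = b's = "caf\\xe9"\n'


def compiles(source_bytes):
    """Return True if the interpreter accepts these source bytes."""
    try:
        compile(source_bytes, "<example>", "exec")
        return True
    except (SyntaxError, UnicodeDecodeError):
        return False


if __name__ == "__main__":
    for name, src in [("raw", raw_source),
                      ("declared", declared_source),
                      ("escaped", escaped_source)]:
        print(name, compiles(src))
```

Under that assumption, only the undeclared raw bytes are rejected; the declared and the escaped variants both compile.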

Maybe an implementation phase 1a should be introduced that warns about the occurrence of non-ASCII characters anywhere in the source code when no encoding is specified.
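
A rough sketch of what such a phase 1a check might look like, written as a standalone function rather than as part of the parser. The names CODING_RE, has_declared_encoding and check_source are hypothetical, the pattern is only modeled on the declaration format the PEP describes, and the use of SyntaxWarning is an assumption, not something this message specifies.

```python
import re
import warnings

# Assumed pattern for an encoding declaration comment, modeled on PEP 263.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def has_declared_encoding(source_bytes):
    """Look for an encoding declaration on the first or second line."""
    for line in source_bytes.splitlines()[:2]:
        if CODING_RE.match(line):
            return True
    return False


def check_source(source_bytes, filename="<source>"):
    """Warn about non-ASCII bytes when no encoding is declared.

    Returns the 1-based numbers of the offending lines.
    """
    if has_declared_encoding(source_bytes):
        return []
    offending = []
    for lineno, line in enumerate(source_bytes.splitlines(), start=1):
        # Iterating over bytes yields integers; anything above 127 is non-ASCII.
        if any(byte > 127 for byte in line):
            offending.append(lineno)
            warnings.warn(
                f"{filename}:{lineno}: non-ASCII byte found, "
                "but no encoding is declared",
                SyntaxWarning,
            )
    return offending
```

The check works on raw bytes on purpose: without a declaration there is no reliable way to decode the file, so the only safe question to ask is whether every byte is ASCII.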

--Guido van Rossum (home page: http://www.python.org/~guido/)
