[Python-3000] Support for PEP 3131

Guido van Rossum guido at python.org
Fri May 25 05:09:33 CEST 2007


On 5/24/07, Ka-Ping Yee <python at zesty.ca> wrote:

> To pit this as "ascii lovers vs. non-ascii lovers" is a pretty large oversimplification. You could name them "people who want to be able to know what the code says" and "people who don't mind not being able to know what the code says". Or you could name them "people who want Python's lexical syntax to be something they fully understand" and "people who don't mind the extra complexity". Or "people who don't want Python's lexical syntax to be tied to a changing external standard" and "people who don't mind the extra variability."
>
> However you characterize them, keep in mind that those in the former group are asking for default behaviour that 100% of Python users already use and understand. There's no cost to keeping identifiers ASCII-only because that's what Python already does. I think that's a pretty strong reason for making the new, more complex behaviour optional.

If there's a security argument to be made for restricting the alphabet used by code contributions (even by co-workers at the same company), I don't see why ASCII-only projects should have it easier than projects in other cultures.

It doesn't look like any kind of global flag passed to the interpreter would scale -- once I am using a known trusted contribution that uses a different character set than mine, I would have to change the global setting to be more lenient, and the leniency would affect all code I'm using.

A more useful approach would seem to be a set of auditing tools that can be applied routinely to all new contributions (e.g. as a pre-commit hook when using a source control system), or to all code in a given directory, download, etc. I don't see this as all that different from using e.g. PyChecker or PyLint.

While I routinely perform visual code inspections (code review is the law at Google, and I wrote the tool used internally to do these), I certainly don't see this as a security audit -- I use it as a mentoring activity and to reach agreement over issues as diverse as coding style, architecture and implementation techniques between trusting colleagues. Scanning for stray non-ASCII characters is best left to automated tools.
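For illustration, here is a minimal sketch of what such an auditing tool might look like. It is not an existing tool; the names find_non_ascii_identifiers and main are hypothetical. It assumes a Python version that has str.isascii() (3.7 or later), and it reports only identifiers, not non-ASCII characters in strings or comments, which is a design choice rather than anything the discussion settles.

```python
import io
import tokenize


def find_non_ascii_identifiers(source):
    """Return (line, column, name) for each identifier containing non-ASCII characters."""
    hits = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        # NAME tokens cover identifiers and keywords; keywords are always ASCII.
        if tok.type == tokenize.NAME and not tok.string.isascii():
            hits.append((tok.start[0], tok.start[1], tok.string))
    return hits


def main(paths):
    """Audit the given files; return 1 if any non-ASCII identifier was found, else 0."""
    status = 0
    for path in paths:
        # tokenize.open honours the file's encoding declaration (PEP 263).
        with tokenize.open(path) as f:
            source = f.read()
        for line, col, name in find_non_ascii_identifiers(source):
            print(f"{path}:{line}:{col}: non-ASCII identifier {name!r}")
            status = 1
    return status
```

A pre-commit hook could pass the changed files to main() and reject the commit on a non-zero result. Because the check runs per contribution or per directory, it avoids the scaling problem of a single interpreter-wide setting.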

--Guido van Rossum (home page: http://www.python.org/~guido/)


