[Python-3000] PEP 3131 accepted

Ka-Ping Yee python at zesty.ca
Wed May 23 00:08:05 CEST 2007


On Thu, 17 May 2007, Guido van Rossum wrote:

> I have accepted PEP 3131.

I'm surprised that this happened so quickly. I oppose this proposal quite strongly.

Currently Python has the property that the character set is a fully known quantity. There currently exists a choice of keyboard, a choice of editor, and a set of literacy skills that is sufficient for any Python code in the world.

Adopting PEP 3131 destroys this property. It is not just that particular communities (e.g. English speakers) will be unable to understand code by other particular communities (e.g. Japanese speakers); that is relatively minor and arguably already the case. The real problem is that it will be impossible for anyone, no matter what their background, to acquire the resources necessary to handle all Python code. There will exist no keyboard that enables one to edit any Python program, and probably no editor. There will not be a single human being alive who can know or recognize the whole character set. Using APIs in a few different languages would yield a program that no one could understand.

Today, if a non-English speaker asks you how to learn Python, you can answer that question. You can explain Python's syntax and semantics, and tell them they need to know the 26 letters of the Roman alphabet. After PEP 3131, you won't be able to answer their question -- because it will be impossible for any human being to enumerate, let alone possess, the knowledge required to read an arbitrary piece of Python code.

PEP 3131 will also cause problems for code review. Because many characters have indistinguishable appearances, there will be no mapping between what you see when you look at code and what the code actually says. So it will no longer be possible to look at a piece of Python code on your screen or on paper and be sure you know what it means, or even know that it is valid Python syntax. It will be much easier to write programs that look right but do the wrong thing, which is particularly bad if you are concerned with security.
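To make the review concern concrete, here is a minimal sketch, assuming an interpreter that accepts non-ASCII identifiers as PEP 3131 specifies. The names `latin_name`, `cyrillic_name`, and `describe` are only illustrative. It binds two identifiers that render almost identically in many fonts and shows that they are nonetheless different names:

```python
import unicodedata

latin_name = "a"           # U+0061 LATIN SMALL LETTER A
cyrillic_name = "\u0430"   # U+0430 CYRILLIC SMALL LETTER A


def describe(identifier):
    """Return the Unicode name of each character in an identifier."""
    return [unicodedata.name(ch) for ch in identifier]


# Both strings are valid identifiers, so both assignments compile,
# but they bind two separate variables.
namespace = {}
exec(f"{latin_name} = 1\n{cyrillic_name} = 2", namespace)

print(describe(latin_name))      # ['LATIN SMALL LETTER A']
print(describe(cyrillic_name))   # ['CYRILLIC SMALL LETTER A']
print(namespace[latin_name], namespace[cyrillic_name])
```

A reviewer reading the two assignments on screen or on paper would likely see the same name assigned twice, while the interpreter sees two distinct variables.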

I like the idea that, after studying and working with Python for a modest amount of time, one can acquire a complete understanding of the language that affords confidence in the ability to read arbitrary programs written in Python, make changes to anything written in Python, and reuse any libraries or modules written in Python. (Python should have a small character set for the same reason that it has a small and limited set of keywords.) I don't like how PEP 3131 would not only take such abilities away from me, but remove them from the realm of possibility altogether.

Of course, nothing stops one from creating a new language (say, "UniPython") that consists of Python with Unicode identifiers. One could even write a translator from UniPython to Python, thus making it straightforward to run UniPython programs. But it would be much better for this to be a separate language that no one is expected to fully understand, so that Python can remain a language that one can fully understand.
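Such a translator could be quite small. The following is only a rough sketch under stated assumptions: "UniPython" is the hypothetical language named above, the `_uXXXX_` escaping scheme and the function names `mangle` and `to_ascii_identifiers` are invented for illustration, and the standard `tokenize` module is used to find identifiers. It ignores problems a real tool would have to solve, such as name collisions and attribute names looked up through strings.

```python
import io
import tokenize


def mangle(name):
    """Replace each non-ASCII character with an ASCII escape (hypothetical scheme)."""
    return "".join(ch if ch.isascii() else f"_u{ord(ch):04x}_" for ch in name)


def to_ascii_identifiers(source):
    """Rewrite every non-ASCII identifier in source code as an ASCII-only name."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        # Only NAME tokens are touched; strings and comments are left alone.
        if tok.type == tokenize.NAME and not text.isascii():
            text = mangle(text)
        tokens.append((tok.type, text))
    # With (type, string) pairs, untokenize emits valid code whose spacing
    # may differ from the input.
    return tokenize.untokenize(tokens)


unipython_source = "gr\u00f6\u00dfe = 3\ntotal = gr\u00f6\u00dfe + 1\n"
python_source = to_ascii_identifiers(unipython_source)
```

Here `python_source` contains only ASCII identifiers and can be executed like any other Python code, while string literals and comments in the original are left unchanged.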

-- ?!ng


