[Python-3000] Support for PEP 3131

Collin Winter collinw at gmail.com
Sun May 13 17:22:03 CEST 2007


On 5/12/07, Guillaume Proux <gproux+py3000 at gmail.com> wrote: [snip]

In this respect, I strongly believe that supporting non-ASCII identifiers as proposed by PEP 3131 would improve a number of things:
- discussion and uptake of Python in "non-ASCII" countries
- ability for children to learn programming in their own language (I started programming at 7 years old and would have been very disturbed if I could not use my own language to type in programs)
- increase in the number of new "interesting" packages from non-ASCII countries
- ability for local programmers and local companies to provide "bridges" between international (English) APIs and local APIs
- increase in the number of Python users (from 7 to 77 years old)

Says you. So far, all I've seen from PEP 3131's supporters is a lot of hollow assertions and idle theorizing: "Python will be easier to use for people using non-ASCII character sets", "Python will be easier to learn for those raised with non-Roman-influenced languages", etc, etc. Until I see some kind of evidence, something to back up these claims, I'm going to assume you're wrong.

Have there been studies on this kind of thing? Has there been any research into whether a mixture of English keywords and, say, Japanese and English identifiers makes a given programming language easier to learn and use? If so, why aren't they referenced in the PEP or linked in any emails? Given the lack of evidence presented so far, my operating assumption is that the PEP's supporters -- including you -- are making things up to support a conclusion that they might wish to be true.

In my humble opinion, now that UTF-8 is accepted as the standard source code encoding, it is very difficult to understand why we should start putting restrictions on the kind of identifiers that are used (which would force people to comment line by line as they do now!).

When I am programming in Python, I am VERY DISTURBED when the code I write contains many comments. It needs to be readable just by glancing at it. However, most of the core Python developers should ask what the typical reading speed for "ASCII" characters is for, e.g., a standard Japanese pupil. You would be very surprised how slow that is. In my opinion (after living in Japan for quite a bit), people are very slow to read ASCII characters and this definitely restrains their programming productivity and expressiveness.

See, that's the thing I have yet to see addressed: there's been a lot of stress on "being able to write variable/class/method names in Arabic/Mandarin/Hindi will make it easier for native speakers to understand", but as far as I know, no-one has yet addressed how these non-English identifiers will mesh with the existing English keywords and English standard library functions. You say that being able to write identifiers in Cyrillic will make Python easier for Russian natives to read, to make Python code, as you say, "readable just by glancing at it". But the fact is that any native-language identifiers will be surrounded by a sea of English: keywords, the standard library, almost all open-source packages, etc. How does that impact your readability guesses?

Also, method/function names are traditionally expressed in English as verb phrases (e.g., "isElementVisible()") which dovetail nicely with Anglo-centric keywords like "if" and "for ... in ...". How do identifiers in languages with dramatically different grammars like Japanese -- or worse, different reading orders like Farsi and Hebrew -- interact with "if", "while" and the new "x if y else z" expression, which are deeply rooted in English grammar? My suspicion is, at least for right-to-left languages like Arabic, not well, if at all.
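To make the question concrete, here is a minimal sketch of the kind of mixture being discussed, assuming non-ASCII identifiers are accepted as PEP 3131 proposes. The Japanese names are hypothetical, invented purely for illustration (roughly "is element visible", "count of visible elements", "status"); they are not taken from the PEP or from any real codebase, and the snippet is not evidence for either side of the readability argument.

```python
# Hypothetical identifiers, chosen only to illustrate the mix of
# native-language names with English keywords and builtins.

def 要素は表示中か(要素):
    """Return True if the element (a plain dict here) is marked visible."""
    return 要素.get("visible", False)


def 表示中の要素数(要素一覧):
    """Count the visible elements using ordinary for/if statements."""
    数 = 0
    for 要素 in 要素一覧:
        if 要素は表示中か(要素):
            数 += 1
    return 数


# The "x if y else z" conditional expression with the same kind of names.
状態 = "あり" if 表示中の要素数([{"visible": True}, {}]) else "なし"
```

Every keyword ("def", "for", "in", "if", "else", "return") and the dict method "get" stay in English regardless of which script the identifiers use.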

Lastly, I take issue with one of the PEP's guidelines under the "Policy Specification" section: "All identifiers in the Python standard library...SHOULD use English words wherever feasible" (emphasis in the original). Are we now going to admit the possibility that part of the standard library will be written in English, some parts will be written in Spanish and this one module over there will be written in Czech? Absolutely ludicrous.

Come-on-tell-us-how-you-really-feel-ly, Collin Winter


