[Python-3000] PEP 3131 accepted

Josiah Carlson jcarlson at uci.edu
Wed May 23 18:23:28 CEST 2007


"Stephen J. Turnbull" <stephen at xemacs.org> wrote:

Josiah Carlson writes:

> From identical character glyph issues (which have been discussed
> off and on for at least a year),

In my experience, this is not a show-stopping problem.

I never claimed that this, by itself, was a showstopper.

And my post should not be seen as a "these are all the problems that I have seen with PEP 3131". Those are merely the issues that have been discussed over and over, and about which I (and seemingly others) am still concerned, regardless of the hundreds of posts here and in comp.lang.python seeking to convince us that "they are not a problem".

Emacs/MULE has had it for 20 years because of the (horrible) design decision to attach charset information to each character in the representation of text. Thus, MULE distinguishes between NO-BREAK SPACE and NO-BREAK SPACE (the same!) depending on whether the containing text "is" ISO 8859-15 or "is" ISO 8859-1. (Semantically this is different from the identical glyph, different character problem, since according to ISO 8859 those characters are identical. However, as a practical matter, the problem of detecting and dealing with the situation is the same as in MULE the character codes are different.)

How does Emacs deal with this? Simple. We provide facilities to identify identical characters (not relevant to PEP 3131, probably), to highlight suspicious characters (proposed, not actually implemented AFAIK, since identification does what almost all users want), and to provide information on characters in the editing buffer. The remaining problems with coding confusion are due to deficient implementation (mea maxima culpa). I consider this to be an editor/presentation problem, not a language definition issue.
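To make the "provide information on characters" facility concrete, here is a minimal sketch in Python (an illustration only, not code from this thread and not how Emacs does it). It assumes the source has already been decoded to a `str`; the function name `describe_non_ascii_identifiers` is hypothetical. It reports every identifier that contains non-ASCII characters, with the code point and Unicode name of each such character.

```python
import io
import tokenize
import unicodedata


def describe_non_ascii_identifiers(source):
    """Return {identifier: [(char, code point, Unicode name), ...]}.

    Only identifiers (NAME tokens) are inspected; strings and comments
    are ignored because the tokenizer classifies them separately.
    """
    report = {}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            report[tok.string] = [
                (c, "U+{:04X}".format(ord(c)), unicodedata.name(c, "<unnamed>"))
                for c in tok.string
                if not c.isascii()
            ]
    return report
```

A report like this names the characters, but it does not by itself say which of them are visually confusable with others; that would need a separate confusables table.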

This particular excuse pisses me off the most. "If you can't differentiate, then your font or editor sucks." Thank you for passing judgement on my choice of font or editor, but Ka-Ping already stated why this argument is bullshit: there does not currently exist a font where one can differentiate all the glyphs, and further, even if one could visually differentiate similar glyphs, remembering the 64,000+ glyphs that are available in just the primary unicode plane in order to differentiate them is a herculean task.

Never mind that people use dozens, perhaps hundreds, of different editors to write and maintain Python code, which makes the 'Emacs works' argument poor at best. Heck, Thomas Bushnell made the same argument when I spoke with him 2 1/2 years ago (though he also included Vim as an alternative to Emacs); it smelled like bullshit then, and it smells like bullshit now.

Note that Ka-Ping's worry about the infinite extensibility of Unicode relative to any human being's capacity is technically not a problem. You simply have your editor substitute machine-generated identifiers for each identifier that contains characters outside of the user's preferred set (eg, using hex codes to restrict to ASCII), then review the code. When you discover what an identifier's semantics are, you give it a mnemonic name according to the local style guide. Expensive, yes. But cost is a management problem, not the kind of conceptual problem Ka-Ping claims is presented by multilingual identifiers. Python is still, in this sense, a finitely generated language.
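The substitution described above can be sketched roughly as follows. This is an illustration under stated assumptions, not an existing editor feature: the helper names `hex_name` and `asciify_identifiers` and the `_uXXXX_` naming scheme are made up for the example. Identifiers are NFKC-normalized first, since Python 3 normalizes identifiers that way, and each remaining non-ASCII character is replaced by its hex code point.

```python
import io
import tokenize
import unicodedata


def hex_name(name):
    """Map an identifier to an ASCII-only one (hypothetical scheme).

    The name is NFKC-normalized, then every non-ASCII character
    is replaced by "_u<hex code point>_".
    """
    normalized = unicodedata.normalize("NFKC", name)
    return "".join(
        c if c.isascii() else "_u{:04x}_".format(ord(c)) for c in normalized
    )


def asciify_identifiers(source):
    """Rewrite non-ASCII identifiers in source; leave strings and comments alone."""
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Walk the tokens backwards so that a replacement never shifts the
    # column offsets of tokens that still have to be processed.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            row, col = tok.start
            end = col + len(tok.string)
            line = lines[row - 1]
            lines[row - 1] = line[:col] + hex_name(tok.string) + line[end:]
    return "".join(lines)
```

The sketch does not check whether a generated name collides with an identifier already present in the file; a real tool would have to.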

That's a bullshit argument, and you know it. "Just use hex escapes"? Modulo unicode comments and strings, all Python programs are easily read in default fonts available on every platform on the planet today. But with 3131, people accepting 3rd party code need to break 15+ years of "what you see is what is actually there" by verifying the character content of every identifier? That's a silly and unnecessary workload addition for anyone who wants to accept patches from 3rd parties, and relies on the same "your tools suck" argument to invalidate concerns over unicode glyph similarity.

Speaking of which, do you know of a fixed-width font that is able to allow for the visual distinction of all unicode glyphs in the primary plane, or even the portion that Martin is proposing we support? This also "is not a show-stopper", but it certainly reduces audience satisfaction by a large margin.

> to editing issues (being that I write and maintain a Python editor)

Multilingual editing (except for non-LTR scripts) is pretty much a solved problem, in theory, although adding it to any given implementation can be painful. However, since there are many programmer's editors that can handle multilingual text already, that is not a strong argument against PEP 3131.

Another "your tools suck" argument. While my editor has been able to handle unicode content for a couple years now (supporting all encodings available to Python), properly supporting the entry of unicode text in any locale will necessitate the creation of charmap-like interfaces in basically every editor.

But really, I'm glad that Emacs works for you and has solved this problem for you. I honestly tried to use it 4 years ago, spent a couple weeks with it. But it didn't work for me, and I've spent the last 4 years writing an editor because it and the other 35 editors I tried at the time didn't work for me (as have the dozens of others for the exact same reason). But of course, our tools suck, and because we can't use Emacs, we are already placed in a 2nd tier ghettoized part of the Python community of "people with tools that suck".

Thank you for hitting home that unless people use Emacs, their tools suck. I still don't believe that my concerns have been addressed. And I certainly don't believe that those Ka-Ping brought up (which are better than mine) have been addressed. But hey, my tools suck, so obviously my concerns regarding using my tools to edit Python in the future don't matter. Thank you for the vote of confidence.

Josiah
