[Python-Dev] len(chr(i)) = 2?

Nick Coghlan ncoghlan at gmail.com
Mon Nov 22 16:37:21 CET 2010


On Mon, Nov 22, 2010 at 10:47 PM, M.-A. Lemburg <mal at egenix.com> wrote:

> Please also note that we have used the terms UCS-2 and UCS-4 in Python2 for 9+ years now and users are just starting to learn the difference and get acquainted with the fact that Python uses these two forms.
>
> Confronting them with "narrow" and "wide" builds is only going to cause more confusion, not less, and adding those strings to Python package files isn't going to help much either, since the terms don't convey any relationship to Unicode:

I was personally surprised to learn in this discussion that there had even been an attempt to change the names of the two build variants to anything other than UCS2/UCS4. The concrete API implementations certainly still use those two terms to prevent inadvertent linkage with the wrong version of the C API.
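
As a rough illustration of what separates the two build variants (a sketch, not something taken from this thread): `sys.maxunicode` is 0xFFFF on a UCS2 build and 0x10FFFF on a UCS4 build, and a UCS2 build stores any code point above 0xFFFF as a surrogate pair, which is where `len(chr(i)) == 2` comes from. The helper names `build_variant` and `utf16_code_units` are hypothetical, chosen only for this example.

```python
import sys


def build_variant():
    """Guess the build variant from sys.maxunicode.

    Assumption: an interpreter that still has the two build variants.
    Python 3.3 and later always report 0x10FFFF here.
    """
    return "UCS2" if sys.maxunicode == 0xFFFF else "UCS4"


def utf16_code_units(code_point):
    """Return how many 16-bit code units a code point needs in UTF-16.

    On a UCS2 build this is also what len(chr(code_point)) reports.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point: %r" % (code_point,))
    # Code points beyond the Basic Multilingual Plane need a surrogate pair.
    return 1 if code_point < 0x10000 else 2


if __name__ == "__main__":
    print(build_variant())
    for cp in (0x41, 0xFFFF, 0x10000, 0x10FFFF):
        print(hex(cp), utf16_code_units(cp))
```

On a UCS4 build `len(chr(0x10000))` is 1, so the count from `utf16_code_units` only matches `len()` on a UCS2 build.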

For practical purposes, UCS2/UCS4 convey far more inherent information than narrow/wide:

*(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article)

All that just armed with Google, without even looking at the Python docs specifically.

So don't just think about "what will developers know?", also think about "what will developers know, and what will a quick trip to a search engine tell them?". And once you take that stance, the overly generic narrow/wide terms fail, badly.

+1 for MAL's suggested tweaks to the Py3k configure options.

Cheers, Nick.

-- Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


