[Python-Dev] len(chr(i)) = 2?
"Martin v. Löwis" martin at v.loewis.de
Fri Nov 19 23:46:08 CET 2010
> It's rather common to confuse a transfer encoding with a storage format. UCS2 and UCS4 refer to code units (the storage format).
Actually, they don't. Instead, they refer to "coded character sets" in W3C terminology: mappings of characters to natural numbers. See
http://unicode.org/faq/basic_q.html#14
The term "UCS-2" denotes a character set that can encode only 65536 characters; it thus refers to Unicode 1.1. According to the Unicode Consortium's FAQ, the term UCS-2 should be avoided these days.
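To make the distinction concrete, here is a rough sketch of how a character outside the first 65536 code points behaves. The helper names (utf16_code_units, fits_in_ucs2) are made up for illustration, and the sketch assumes an interpreter where each str element is a full code point (Python 3.3 or later, or an older wide build); on a narrow build the second check would see two surrogates instead.

```python
def utf16_code_units(s):
    """Number of 16-bit code units needed to store s as UTF-16."""
    # utf-16-le writes no byte order mark, so bytes // 2 is the unit count.
    return len(s.encode("utf-16-le")) // 2


def fits_in_ucs2(s):
    """True if every code point in s is below 0x10000 (65536).

    Assumes each element of s is a full code point, not a surrogate half.
    """
    return all(ord(c) < 0x10000 for c in s)


bmp = "\u20ac"         # EURO SIGN, within the first 65536 code points
astral = "\U00010400"  # DESERET CAPITAL LETTER LONG I, beyond them

print(utf16_code_units(bmp), fits_in_ucs2(bmp))        # 1 True
print(utf16_code_units(astral), fits_in_ucs2(astral))  # 2 False
```

The second character needs two 16-bit code units in UTF-16 (a surrogate pair), and a character set limited to 65536 characters has no number for it at all.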
> IMO, we should go back to the Python2 terms UCS2 and UCS4 which are correct and provide a clear description of what Python uses internally for code units.
No, we shouldn't. The term UCS-2 is deprecated, see above.
Regards, Martin