[Python-Dev] Python and the Unicode Character Database
"Martin v. Löwis" martin at v.loewis.de
Fri Dec 3 00:19:20 CET 2010
On 02.12.2010 23:43, M.-A. Lemburg wrote:
> Eric Smith wrote:
>>> The current behavior should go nowhere; it is not useful. Something very similar to the current behavior (but done correctly) should go into the locale module.
>>
>> I agree with everything Martin says here. I think the basic premise is: you won't find strings "in the wild" that use non-ASCII digits but do use the ASCII dot as a decimal point. And that's what float() is looking for. (And that doesn't even begin to address what it expects for an exponent 'e'.)
>
> http://en.wikipedia.org/wiki/Decimal_mark
>
> "In China, comma and space are used to mark digit groups because dot is used as decimal mark."
I may be misinterpreting that, but I think that refers to the case of writing numbers using Arabic digits.
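For reference, the float() behaviour under discussion can be reproduced with a short sketch. It assumes a Python 3 interpreter; the helper name try_float and the sample strings are made up for illustration only. It shows float() accepting Arabic-Indic digits combined with the ASCII dot and the ASCII 'e', while rejecting the same digits combined with the Arabic decimal separator (U+066B).

```python
def try_float(text):
    """Return float(text), or None if float() rejects the string.

    Hypothetical helper, used only to make the comparison readable.
    """
    try:
        return float(text)
    except ValueError:
        return None


# Arabic-Indic digits 1 2 3 . 4 5 with an ASCII dot as the decimal point.
mixed = "\u0661\u0662\u0663.\u0664\u0665"

# The same digits with the Arabic decimal separator (U+066B) instead.
native = "\u0661\u0662\u0663\u066b\u0664\u0665"

# Arabic-Indic digits around an ASCII exponent marker 'e'.
exponent = "\u0661e\u0662"

print(try_float(mixed))     # the mixed form is accepted
print(try_float(native))    # the native separator is rejected
print(try_float(exponent))  # the ASCII 'e' is accepted as well
```

So the only non-ASCII input that is accepted is exactly the mixed form that is unlikely to occur in real text.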
"Chinese" digits are, e.g., used in the Suzhou numerals
http://en.wikipedia.org/wiki/Suzhou_numerals
This system doesn't have a decimal point at all. Instead, the second line (below or to the left of the actual digits) describes the power of ten and the unit of measurement (i.e. similar to scientific notation, but with ideographs for the powers of ten).
In another writing system, they use 点 (U+70B9) as the decimal separator, see
http://en.wikipedia.org/wiki/Chinese_numerals#Fractional_values
In the same system, the integral part uses multipliers, i.e. 12345 is [1][10000][2][1000][3][100][4][10][5]; the fractional part uses regular digits.
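To make the structure described above concrete, here is a minimal sketch of a parser for that notation. Everything in it is an assumption for illustration: the function name parse_chinese_number, the digit and multiplier tables, and the restriction to the multipliers 十, 百, 千 and 万 applied to a single preceding digit (compound groups such as 十二万 are not handled). It is not something Python's float() or the locale module provides.

```python
from decimal import Decimal

# Illustrative tables; only a small subset of the numerals in real use.
DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
MULTIPLIERS = {"十": 10, "百": 100, "千": 1000, "万": 10000}
DECIMAL_SEPARATOR = "\u70b9"  # 点


def parse_chinese_number(text):
    """Parse digit/multiplier pairs, then plain digits after U+70B9."""
    integral, _, fractional = text.partition(DECIMAL_SEPARATOR)

    total = 0
    pending = None  # digit still waiting for its multiplier
    for ch in integral:
        if ch in DIGITS:
            if pending:  # a nonzero digit is already waiting
                raise ValueError("two digits in a row: %r" % text)
            pending = DIGITS[ch]
        elif ch in MULTIPLIERS:
            # A bare multiplier such as 十 is read as 1 * 10.
            factor = 1 if pending is None else pending
            total += factor * MULTIPLIERS[ch]
            pending = None
        else:
            raise ValueError("unexpected character %r" % ch)
    if pending is not None:
        total += pending  # trailing units digit

    # The fractional part is a plain run of digits, no multipliers.
    for ch in fractional:
        if ch not in DIGITS:
            raise ValueError("unexpected character %r" % ch)
    frac = "".join(str(DIGITS[ch]) for ch in fractional)

    return Decimal("%d.%s" % (total, frac)) if frac else Decimal(total)


print(parse_chinese_number("一万二千三百四十五"))
print(parse_chinese_number("一万二千三百四十五点六七"))
```

The result is a Decimal rather than a float so that the fractional digits are kept exactly as written.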
Regards, Martin