[Python-Dev] Unicode locale values in 2.7

Mark Dickinson dickinsm at gmail.com
Thu Dec 3 12:55:11 CET 2009


On Thu, Dec 3, 2009 at 11:33 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:

> Eric Smith <eric at trueblade.com> writes:
>
> > But in trunk, the value is just used as-is. So when formatting a decimal,
> > for example, '\xc2\xa0' is just inserted into the result, such as:
> >
> > >>> format(Decimal('1000'), 'n')
> > '1\xc2\xa0000'
> >
> > This doesn't make much sense,
>
> Why doesn't it make sense? It's normal UTF-8. The same thing happens when
> the monetary sign is non-ASCII, see Lib/test/test_locale.py for an example.

Well, one problem is that it messes up character counts. Suppose you're aware that the thousands separator might be a single multibyte character, and you want to produce a unicode result that's zero-padded to a width of 6. There's currently no sensible way of doing this that I can see:

format(Decimal('1000'), '06n').decode('utf-8') gives a string of length 5

format(Decimal('1000'), u'06n') fails with a UnicodeDecodeError.
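To make the mismatch concrete, here is a minimal sketch written for Python 3, where a bytes object stands in for a 2.7 str. It does not call the locale machinery at all: group_thousands is a hypothetical helper, the separator is hard-coded to a UTF-8 encoded no-break space, and rjust is only a rough stand-in for the zero padding that format() performs. The final ASCII decode approximates the implicit conversion that 2.7 attempts when a unicode format spec meets a non-ASCII byte string.

```python
# Python 3 sketch; bytes objects stand in for Python 2.7 str values.
NBSP_UTF8 = b"\xc2\xa0"  # U+00A0 NO-BREAK SPACE encoded as UTF-8 (two bytes)


def group_thousands(digits, sep):
    """Hypothetical helper: join groups of three digits (from the right) with sep."""
    groups = []
    while digits:
        groups.append(digits[-3:])
        digits = digits[:-3]
    return sep.join(reversed(groups))


# Byte-level formatting: the width of 6 is measured in bytes, so the
# two-byte separator already fills the field and no zero is added.
byte_result = group_thousands(b"1000", NBSP_UTF8).rjust(6, b"0")
decoded = byte_result.decode("utf-8")

# Character-level formatting: the width of 6 is measured in code points,
# so one zero of padding is added.
text_result = group_thousands("1000", "\u00a0").rjust(6, "0")

# Approximation of 2.7's implicit ASCII decode when str and unicode mix.
try:
    byte_result.decode("ascii")
    implicit_decode_ok = True
except UnicodeDecodeError:
    implicit_decode_ok = False

print(len(byte_result), len(decoded), len(text_result), implicit_decode_ok)
```

The byte-level result fills the field when counted in bytes but comes up one character short once decoded, while the character-level result has the intended width.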

Mark


