[Python-Dev] Python and the Unicode Character Database

Alexander Belopolsky alexander.belopolsky at gmail.com
Mon Nov 29 20:38:46 CET 2010


On Mon, Nov 29, 2010 at 1:33 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:

On Mon, 29 Nov 2010 08:22:46 +0100 "Martin v. Löwis" <martin at v.loewis.de> wrote:

> The former ensures that literals in code are always readable; the latter
> allows users to enter numbers in their own number system. How could that
> be a bad thing?

It's YAGNI, feature bloat. It gives the illusion of supporting something that actually isn't supported very well (namely, parsing local number strings). I claim that there is no meaningful application of this feature. Still, if it's not detrimental and it's not difficult to support, then why do you care?

It is difficult to support. A fix for issue10557 would be much simpler if we did not support non-European digits. I have now added a patch that handles non-ASCII digits, so you can see what's involved. Note that when the Unicode Consortium inevitably adds more Nd characters to the non-BMP planes, we will have to add surrogate-pair support to this code.
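For readers who have not run into this behavior, here is a minimal sketch of what "supporting non-ASCII digits" means at the Python level. It is not the issue10557 patch (that is C code); the helper name to_ascii_digits is made up for illustration, and the sketch assumes a current Python 3 interpreter.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace each Unicode decimal digit (category Nd) with its ASCII digit.

    Hypothetical helper for illustration only; other characters pass through.
    """
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.decimal() returns the digit value as an int (0..9).
            out.append(str(unicodedata.decimal(ch)))
        else:
            out.append(ch)
    return "".join(out)


# ARABIC-INDIC DIGITS ONE, TWO, THREE (inside the BMP)
arabic_indic = "\u0661\u0662\u0663"
# MATHEMATICAL BOLD DIGITS ONE, TWO (outside the BMP)
math_bold = "\U0001D7CF\U0001D7D0"

# The built-in int() accepts Nd characters directly.
print(int(arabic_indic))

# The helper maps both BMP and non-BMP digits to ASCII.
print(to_ascii_digits(arabic_indic))
print(to_ascii_digits(math_bold))
```

On an interpreter where strings iterate by code point, the non-BMP case needs no special handling. On a narrow build, each non-BMP digit would appear as two surrogates, which I believe is exactly the extra work the C code would have to take on.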

In any case, there is little we can do about it in 3.2 other than fix bugs like issue10557 without breaking currently valid code, so I created a separate issue to continue this debate in the context of 3.3. [issue10581]

Now, I would like to bring this thread back to its subject. Given that the UCD is now affecting the language definition and the standard library behavior, how should changes to the UCD be handled?

The current documentation refers to old UCD versions. Should the version be updated, or removed to imply the latest?
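Whatever the documentation ends up saying, the UCD version an interpreter was actually built with can be checked at run time. A small sketch using the unicodedata module; the helper name ucd_version_tuple is made up for illustration, and the value printed depends on the Python release.

```python
import unicodedata


def ucd_version_tuple(version=unicodedata.unidata_version):
    """Turn a UCD version string such as '6.0.0' into a tuple of ints.

    Hypothetical helper for illustration only.
    """
    return tuple(int(part) for part in version.split("."))


# Version of the Unicode Character Database compiled into this interpreter.
print(unicodedata.unidata_version)
print(ucd_version_tuple())

# The module also keeps a frozen 3.2.0 database for applications that need it.
print(unicodedata.ucd_3_2_0.unidata_version)
```

A check like this lets code depend on a minimum UCD version explicitly instead of relying on whatever the documentation mentions.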

During the PEP 3003 discussion, it was suggested to handle it on a case-by-case basis, but I don't see any discussion of the upgrade to 6.0.0 in PEP 3003. Should this upgrade be backported to 2.7?

[issue10581] http://bugs.python.org/issue10581


