[Python-Dev] Python and the Unicode Character Database

Stephen J. Turnbull stephen at xemacs.org
Sat Dec 4 09:13:45 CET 2010


Antoine Pitrou writes:

On Friday, 03 December 2010 at 13:58 +0900, Stephen J. Turnbull wrote:

Antoine Pitrou writes:

The legacy format argument looks like a red herring to me. When converting from one format to another it is the programmer's job to do his/her job right.

Uhmmmmmm, the argument for this "feature" proposed by several people is that Python's numeric constructors do it (right) so that the programmer doesn't have to.

As far as I understand, Alexander was talking about a legacy pre-unicode text format. We don't have to support this.

I didn't say we should support it. I'm saying that others' argument for not restricting the formats accepted by string-to-number converters to something well-defined and AFAIK universally understood by users (developers of Python programs and end-users) is that we already support this.

Alexander, Martin, and I are basically just pointing out that no, the "support" we have via the built-in numeric constructors is incomplete and nonconforming. We feel that is a bug to be fixed by (1) implementing the definition as currently found in the documents, and (2) moving the non-ASCII support to another module (or, as a compromise, supporting non-ASCII digits via an argument to the built-ins -- that was my proposal, I don't know if Alexander or Martin would find it acceptable).
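To make the compromise in (2) concrete, here is a minimal sketch of what "ASCII by default, non-ASCII digits only on request" could look like. The name `parse_int` and its `allow_non_ascii` flag are hypothetical, invented for this illustration; they are not an existing or proposed API, and the sketch wraps the built-in `int()` rather than changing it.

```python
import re

# Explicit [0-9]: in a str pattern, \d would also match non-ASCII decimal digits.
_ASCII_INT = re.compile(r"[+-]?[0-9]+")


def parse_int(text, allow_non_ascii=False):
    """Hypothetical helper: ASCII-only by default, wider input on request."""
    s = text.strip()
    if not allow_non_ascii and not _ASCII_INT.fullmatch(s):
        raise ValueError("not an ASCII decimal integer: %r" % (text,))
    # With allow_non_ascii=True we simply defer to the built-in constructor.
    return int(s)


# "\u0661\u0662\u0663" is ARABIC-INDIC DIGIT ONE, TWO, THREE.
arabic_indic = "\u0661\u0662\u0663"
print(parse_int("42"))
print(parse_int(arabic_indic, allow_non_ascii=True))
try:
    parse_int(arabic_indic)
except ValueError as exc:
    print("rejected:", exc)
```

The point of the design is only that the caller has to ask for the wider behaviour explicitly; where such a switch should live (argument, separate module) is exactly what is under discussion.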

Given that some committers (MAL, you?) don't even consider that accepting and converting a string containing digits from multiple scripts as a single number is a bug, I'd really rather that this bug/feature not be embedded in the interpreter. I suppose that as a built-in rather than syntax, technically it doesn't fall under the moratorium, but it makes me nervous....
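For anyone who wants to see the mixed-script case for themselves, the following sketch shows the built-in accepting such a string and one possible way to detect it. The helper names `digit_zeros` and `has_mixed_digits` are hypothetical. The check assumes that each set of Unicode decimal digits occupies a contiguous run of code points from 0 to 9, so "code point minus decimal value" identifies the run; it distinguishes digit sets, not scripts in the full Unicode sense.

```python
import unicodedata


def digit_zeros(text):
    """Return the code points of the 'zero' of every decimal digit set used in text."""
    zeros = set()
    for ch in text:
        value = unicodedata.decimal(ch, None)  # None for non-decimal characters
        if value is not None:
            zeros.add(ord(ch) - value)
    return zeros


def has_mixed_digits(text):
    """True if text draws decimal digits from more than one digit set."""
    return len(digit_zeros(text)) > 1


# ASCII '1', ARABIC-INDIC DIGIT TWO (U+0662), ASCII '3'.
mixed = "1\u06623"
print(int(mixed))               # the built-in converts this without complaint
print(has_mixed_digits(mixed))
print(has_mixed_digits("123"))
```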


