[Python-3000] PEP 3131 - the details
James Y Knight foom at fuhm.net
Thu May 17 20:03:54 CEST 2007
I mentioned this in another thread as an aside in the middle of the
email, but I thought I'd put it out here at the top:
We should consider whether formatting characters in identifiers should
be ignored and, if so, which Unicode property should define that set.
I notice that the excerpt from the C# standard says:
* 4 Any formatting-characters are removed.
I don't know exactly what they mean by that, but my guess is
characters in the Cf general category.
However, UAX #31 says:
2.2 Layout and Format Control Characters
Certain Unicode characters are used to control joining behavior, bidirectional ordering control, and alternative formats for display. These have the GeneralCategory value of Cf. Unlike space characters or other delimiters, they do not indicate word, line, or other unit boundaries. While it is possible to ignore these characters in determining identifiers, the recommendation is to not ignore them and to not permit them in identifiers except in special cases. This is because of the possibility for confusion between two visually identical strings; see [UTR36]. Some possible exceptions are the ZWJ and ZWNJ in certain contexts, such as between certain characters in Indic words.
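To make the "two visually identical strings" concern concrete, here is a minimal Python sketch. It assumes "formatting characters" means code points whose General_Category is Cf, which is only a guess at what the C# text intends. The helper name strip_format_chars and the sample identifiers are hypothetical.

```python
import unicodedata

# ZERO WIDTH NON-JOINER: an invisible character with General_Category Cf.
ZWNJ = "\u200c"

def strip_format_chars(name):
    """Drop every code point whose General_Category is Cf.

    Hypothetical helper: this is one reading of "formatting characters
    are removed", not a rule taken from any language specification.
    """
    return "".join(ch for ch in name if unicodedata.category(ch) != "Cf")

plain = "total"
hidden = "to" + ZWNJ + "tal"   # usually renders the same as "total"

# Compared as written, the two spellings are different strings.
print(plain == hidden)                      # False

# Under the "ignore Cf" policy they collapse to the same identifier.
print(strip_format_chars(hidden) == plain)  # True
```

Under an "ignore Cf" policy the two spellings name the same variable. If Cf characters were instead permitted and kept, they would name two different variables that look alike on screen.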
It doesn't seem to me that an attack vector here is particularly
relevant, so perhaps going along with C# and ignoring Cf characters
in the source code might be a good idea. But I do notice that Unicode
4.0.1 and earlier used to recommend ignoring formatting characters in
identifiers (Ch 5 of the book), so that might be where C# got it from.
So, maybe it's better to keep the status quo, and not allow Cf
characters, unless someone comes up with a particular need for doing
so. Hm, I think I've convinced myself of that now. :)
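
The alternative policy, rejecting Cf characters rather than silently dropping them, could be sketched as below. Again this assumes the Cf general category is the right test, and the function names are hypothetical. It is an illustration only, not a description of how any tokenizer is implemented.

```python
import unicodedata

def find_format_chars(name):
    """Return (index, Unicode name) for each Cf character in name."""
    return [
        (i, unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(name)
        if unicodedata.category(ch) == "Cf"
    ]

def check_identifier(name):
    """Raise ValueError if name contains any Cf character.

    Hypothetical check illustrating the "do not permit" policy.
    """
    offenders = find_format_chars(name)
    if offenders:
        details = ", ".join(f"{label} at index {i}" for i, label in offenders)
        raise ValueError(f"format characters not allowed in identifier: {details}")
    return name

check_identifier("total")            # passes
try:
    check_identifier("to\u200dtal")  # contains ZERO WIDTH JOINER
except ValueError as exc:
    print(exc)
```

Reporting the offending character by name and position matters here, because the character itself is invisible in most editors.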
James