[Python-Dev] \u and \U escapes in raw unicode string literals
"Martin v. Löwis" martin at v.loewis.de
Sun May 13 18:04:44 CEST 2007
* without the Unicode escapes, the only way to put non-ASCII code points into a raw Unicode string is via a source code encoding of, say, UTF-8 or UTF-16, pretty much defeating the original requirement of writing ASCII code only
That's no problem, though - just don't put the Unicode character into a raw string. Use plain strings if you have a need to include Unicode characters, and are not willing to leave ASCII.
For Python 3, the default source encoding is UTF-8, so it is much easier to use non-ASCII characters in the source code. The original requirement may no longer be as strong as it used to be.
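A minimal sketch of that advice, assuming Python 3 semantics (where a raw literal leaves \u untouched, unlike the ur"..." literals under discussion). The names raw_part, euro, raw_escape and pattern are only illustrative:

```python
import re

# Raw literal: backslashes are kept, which suits regex syntax such as \d and \s.
raw_part = r"\d+\s*"

# Plain literal: the \u escape is processed, giving one non-ASCII character
# while the source file itself stays pure ASCII.
euro = "\u20ac"

# In Python 3 a raw literal does not process \u, so this is six characters.
raw_escape = r"\u20ac"

# Adjacent literals are concatenated, so raw and plain parts can be mixed.
pattern = re.compile(r"\d+\s*" "\u20ac")

print(len(euro), len(raw_escape))
print(bool(pattern.fullmatch("42 " + euro)))
```

Splitting the literal this way keeps the regex part readable as a raw string and confines the escape to a plain string.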
* non-ASCII code points in text are not uncommon, they occur in most European scripts, all Asian scripts, many scientific texts and also in texts meant for the web (just have a look at the HTML entities, or think of Word exports using quotes)
And you are seriously telling me that people who commonly use non-ASCII code points in their source code are willing to refer to them by Unicode ordinal number (which, of course, they all know by heart, from 1 to 65536)?
* adding Unicode escapes to the re module will break code already using "...\u..." in the regular expressions for other purposes; writing conversion tools that detect this usage is going to be hard
It's unlikely to occur in code today - \u just means the same as u (so \u1234 matches u1234); if you want a backslash followed by u in your regular expression, you should write \\u.
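As a rough illustration of the doubled-backslash form, here is a sketch that should run on Python 3; the names text and literal are made up for the example. Note that the re module in Python 3.3 and later interprets \uXXXX itself, so the single-backslash behaviour described above applies to the re module as it was at the time of this thread, not to current releases.

```python
import re

# The text to search: a raw literal, so it holds a backslash, 'u', and four digits.
text = r"see \u1234 here"

# Doubling the backslash makes the regex engine look for a literal backslash
# followed by the characters u1234.
literal = re.compile(r"\\u1234")

match = literal.search(text)
print(match.group(0) if match else None)
```

The doubled form means the same thing whether or not the regex engine gives \u a special meaning, which is why it is the safe spelling.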
It would be possible to future-warn about \u in 2.6, catching these cases. Authors would then either have to remove the backslash, or duplicate it, depending on what they want to express.
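What such a warning might look like can be sketched as below. This is not how any Python release implements it: warn_on_backslash_u is a hypothetical helper, and the scan is deliberately simple (it only distinguishes a single backslash before u or U from a doubled one).

```python
import warnings


def warn_on_backslash_u(pattern):
    """Hypothetical check: warn about each single backslash before u or U.

    Returns the list of offsets at which such a backslash was found.
    """
    found = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            if pattern[i + 1] in "uU":
                found.append(i)
            # Skip the escaped character too, so a doubled backslash
            # followed by u is not reported.
            i += 2
        else:
            i += 1
    for pos in found:
        warnings.warn(
            "backslash before u at offset %d may change meaning; "
            "remove the backslash or double it" % pos,
            FutureWarning,
            stacklevel=2,
        )
    return found
```

A pattern author seeing the warning would pick one of the two fixes named above: drop the backslash to keep matching a plain u, or double it to match a literal backslash.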
Regards, Martin