[Python-Dev] \u and \U escapes in raw unicode string literals
Guido van Rossum guido at python.org
Thu May 10 20:45:57 CEST 2007
I just discovered that, in all versions of Python as far back as I have access to (2.0), \uXXXX escapes are interpreted inside raw unicode strings. Thus:
>>> a = ur"\u1234"
>>> len(a)
1
Contrast this with:
>>> a = ur"\x12"
>>> len(a)
4
The \U escape has the same behavior, in versions that support it.
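For anyone who wants to reproduce the contrast above without a 2.x interpreter, here is a rough sketch for Python 3, where the ur prefix no longer exists. It assumes that the standard "raw_unicode_escape" codec follows the same rules as the ur"" literal; the helper name decode_raw_literal_body is made up for illustration.

```python
# Sketch only: emulates a Python 2 ur"" literal on Python 3.
# Assumption: the "raw_unicode_escape" codec applies the same rules as the literal.

def decode_raw_literal_body(body):
    """Decode the text between the quotes of a hypothetical ur"" literal."""
    # The examples are pure ASCII, so latin-1 maps each character to one byte.
    return body.encode("latin-1").decode("raw_unicode_escape")

# \uXXXX is interpreted: a single character comes out.
print(len(decode_raw_literal_body(r"\u1234")))      # 1

# \x is not interpreted: backslash, 'x', '1', '2' all survive.
print(len(decode_raw_literal_body(r"\x12")))        # 4

# \UXXXXXXXX is interpreted as well.
print(len(decode_raw_literal_body(r"\U00001234")))  # 1
```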
Does anyone remember why it is done this way? The reference manual describes this behavior, but doesn't give an explanation:
"""
When an "r" or "R" prefix is used in conjunction with a "u" or "U"
prefix, then the \uXXXX and \UXXXXXXXX escape sequences are processed
while all other backslashes are left in the string. For example, the
string literal ur"\u0062\n" consists of three Unicode characters:
`LATIN SMALL LETTER B', `REVERSE SOLIDUS', and `LATIN SMALL LETTER N'.
Backslashes can be escaped with a preceding backslash; however, both
remain in the string. As a result, \uXXXX escape sequences are only
recognized when there are an odd number of backslashes.
"""
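The manual's two rules (the three-character example and the odd/even backslash count) can be checked with a short Python 3 sketch. As before, this leans on the "raw_unicode_escape" codec as a stand-in for the ur"" literal, which is an assumption; the helper name raw_unicode_body is hypothetical.

```python
# Sketch only: checks the quoted manual text against the
# "raw_unicode_escape" codec (assumed to mirror the ur"" literal rules).
import unicodedata

def raw_unicode_body(body):
    """Decode the text between the quotes of a hypothetical ur"" literal."""
    return body.encode("latin-1").decode("raw_unicode_escape")

# ur"\u0062\n": \u0062 is processed, the backslash before 'n' is kept.
example = raw_unicode_body(r"\u0062\n")
print([unicodedata.name(ch) for ch in example])
# ['LATIN SMALL LETTER B', 'REVERSE SOLIDUS', 'LATIN SMALL LETTER N']

# Two backslashes (even): the escape is NOT recognized, all 7 characters remain.
even = raw_unicode_body(r"\\u0062")
print(len(even))  # 7

# Three backslashes (odd): the first two remain, the third starts the escape.
odd = raw_unicode_body(r"\\\u0062")
print(len(odd))   # 3
```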
--
--Guido van Rossum (home page: http://www.python.org/~guido/)