[Python-Dev] \u and \U escapes in raw unicode string literals

Guido van Rossum guido at python.org
Fri May 11 04:11:48 CEST 2007


On 5/10/07, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Martin v. Löwis wrote:
> > why should you be able to get a non-ASCII character
> > into a raw Unicode string?

> The analogous question would be why you can't get a non-Unicode
> character into a raw Unicode string. That wouldn't make sense, since
> Unicode strings can't even hold non-Unicode characters (or at least
> they're not meant to). But it doesn't seem unreasonable to want to put
> Unicode characters into a raw Unicode string. After all, if it only
> contains ASCII characters, there's no need for it to be a Unicode
> string in the first place.
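
For instance, the usual motivating case (a minimal sketch in Python 2
syntax, assuming a UTF-8 coding declaration) is a regex that mixes raw
backslashes with non-ASCII text:

    # -*- coding: utf-8 -*-
    import re
    # raw, so \d+ survives as a two-character regex escape;
    # the literal é is what makes the pattern need to be unicode
    pattern = ur"café\d+"
    print re.match(pattern, u"café42")  # matches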

This is what prompted my question, actually: in Py3k's str/unicode unification branch, r"\u1234" changes meaning. Before the unification it was an 8-bit string in which \u was not special; now it is a unicode string in which \u is special.
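
Concretely (an illustrative interactive sketch of the two behaviours
described above):

    # Python 2, before unification: \u is not special in a raw 8-bit string
    >>> r"\u1234"
    '\\u1234'            # six characters
    # Python 2's raw *unicode* literals, by contrast, do process \u escapes
    >>> ur"\u1234"
    u'\u1234'            # one character, U+1234
    # In the unification branch all string literals are unicode, so
    # r"\u1234" now follows the ur"" behaviour: a single character, U+1234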

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


