[Python-3000] Raw strings containing \u or \U
Georg Brandl g.brandl at gmx.net
Thu May 17 07:45:17 CEST 2007
Ron Adam wrote:
> Guido van Rossum wrote:
>> That would be great! This will automatically turn \u1234 into 6 characters, right?
>
> I'm not exactly clear when the '\uxxxx' characters get converted. There isn't any conversion done in tokenizer.c that I can see. At that point it's primarily concerned with finding the beginning and end of the string. It looks like everything between the beginning and end is just passed along "as is" and gets translated later in the chain.
Look at Python/ast.c, which has the functions parsestr() and decode_unicode(). The latter calls PyUnicode_DecodeRawUnicodeEscape(), which I think is the function you're looking for.
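
For illustration, the effect of that decoding step can be observed from Python itself through the raw_unicode_escape codec, which is the Python-level counterpart of PyUnicode_DecodeRawUnicodeEscape(). This is only a sketch, assuming a Python 3 interpreter in which raw literals leave \u untouched; decode_raw_escapes() is a hypothetical helper name, not something in ast.c.

```python
# Illustration only: apply the raw_unicode_escape codec by hand to see
# what the decoding step does to a \uXXXX sequence.

def decode_raw_escapes(text: str) -> str:
    """Hypothetical helper: expand \\uXXXX and \\UXXXXXXXX sequences,
    leaving every other backslash sequence as it is.

    Assumes `text` contains only Latin-1 characters, so that it can be
    turned into bytes before decoding.
    """
    return text.encode("latin-1").decode("raw_unicode_escape")


source = r"\u1234"                  # raw literal: backslash, 'u', four digits
decoded = decode_raw_escapes(source)

print(len(source))                  # 6 characters before decoding
print(len(decoded))                 # 1 character after decoding
print(decode_raw_escapes(r"\n") == r"\n")  # True: \n is not interpreted
```

So if the parser skips this step for raw strings, \u1234 stays as the six characters that were typed.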
Georg
-- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.