[Python-3000] Raw strings containing \u or \U

Steven Bethard steven.bethard at gmail.com
Wed May 16 20:32:37 CEST 2007


On 5/16/07, Guido van Rossum <guido at python.org> wrote:

On 5/16/07, Steven Bethard <steven.bethard at gmail.com> wrote:
> +1 for no escaping of quotes in raw strings. Python provides so many
> different ways to quote a string, the cases in which you can't just
> switch to another quoting style are vanishingly small. Examples from
> the stdlib and their translations::
>
>     '\'' --> "'"
>     '("|\')' --> '''("|')'''
>     'Can\'t stat' --> "Can't stat"
>     '(\'[^\']*\'|"[^"]*")?' --> '''('[^']*'|"[^"]*")?'''
>
> Note that allowing trailing backslashes could also clean up stuff in
> modules like ntpath::
>
>     path[-1] in "/\\" --> path[-1] in r"/\"
>     firstTwo == '\\\\' --> firstTwo == r'\\'

Can you also search for how often this feature is used (i.e. a raw string that has to be raw for other reasons also contains an escaped quote)? If that's rare or we can agree on easy fixes it would ease my mind about this part of the proposal.

Well, remembering that when you escape a quote in a raw string, the backslash is left in regardless of the enclosing quote type, e.g.::

    r"\"" == r'\"' == r"""\"""" == r'''\"''' == '\\"'
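
For anyone who wants to check that equivalence in an interpreter, here is a minimal sketch (the name `variants` is only for illustration and is not from the original message):

```python
# Every raw spelling keeps the backslash, whatever the enclosing quotes are.
# The last entry is the equivalent non-raw spelling.
variants = [r"\"", r'\"', r"""\"""", r'''\"''', '\\"']

# All five literals denote the same two-character string.
assert all(v == variants[0] for v in variants)
assert len(variants[0]) == 2  # one backslash, one double quote
```

The point is that the backslash is never removed from a raw string; it only stops the following quote from ending the literal.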

the question is then whether there are any situations where you can't just switch the quote type. The only things in the stdlib that I could find[1] where the string quotes and the escaped quote were of the same type were:

    r"^\s*=\s*\"([^\"\\]*(?:\\.[^\"\\]*)*)\""
    r"([\"\\])"
    r'[^\\\'\"%s ]*'
    r'#\s*doctest:\s*([^\n\'"]*)$',
    r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~@]*))?'
    r"([^.'\"\\#]\b|^)"
    r'(\'[^\']*\'|"[^"]*")\s*'
    r'((\\[\\abfnrtv\'"]|\\[0-9]..|\\x..|\\u....)+)',
    r'(\'[^\']*\'|"[^"]*"|[][\-a-zA-Z0-9./,:;+*%?!&$\(\)_#=~\'"@]*))?'
    r'(?<=[\w\!\"\'\&\.\,\?])-{2,}(?=\w))'
    r'[\"\']?'
    r'[ \(\)<>@,;:\\"/\[\]\?=]'
    r"[&<>\"\x80-\xff]+"
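
A search of this kind could be sketched with the standard `tokenize` module. This is not the method actually used to build the list above. The helper name `raw_strings_with_escaped_delimiter` and the `sample` source are hypothetical, made up for illustration:

```python
import io
import tokenize


def raw_strings_with_escaped_delimiter(source):
    """Yield single-quoted raw string literals that escape their own quote."""
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        text = tok.string
        # Split the prefix (r, R, br, ...) from the quoted body.
        quote_pos = min(i for i, ch in enumerate(text) if ch in "'\"")
        prefix, body = text[:quote_pos].lower(), text[quote_pos:]
        if "r" not in prefix:
            continue  # not a raw string
        quote = body[0]
        if body.startswith(quote * 3):
            continue  # triple-quoted literals are not the case in question
        # Inside a raw literal, backslash + delimiter can only be an escaped quote.
        if "\\" + quote in body[1:-1]:
            yield text


# Hypothetical sample source; only the first two assignments should be reported.
sample = r'''
a = r"([\"\\])"
b = r'[\"\']?'
c = r'plain'
d = "not \"raw\""
e = r"""\"triple\""""
'''

found = list(raw_strings_with_escaped_delimiter(sample))
for literal in found:
    print(literal)
```

Running something like this over each file in the stdlib would give a candidate list; the results would still need a manual look, as with the list above.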

I believe every one of these would continue to work if you simply replaced r'...' or r"..." with r'''...''', that is, if you used the triple-quoted version. Even some much nastier ones than what's in the stdlib (e.g. where the string starts and ends with different quote types) seem to work out okay when you switch to the appropriate triple quotes::

    r'\'\"' == r'''\'\"'''
    r'"\'' == r""""\'"""
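
As a sanity check, a few of the patterns listed above can be compared against their triple-quoted spellings and compiled. This is only a sketch covering four of the entries, and the name `pairs` is illustrative. It shows that the two spellings are the same string under the current rules, so switching to triple quotes now is safe:

```python
import re

# (original spelling, triple-quoted spelling) for some of the stdlib patterns.
pairs = [
    (r"([\"\\])", r'''([\"\\])'''),
    (r'[\"\']?', r'''[\"\']?'''),
    (r'(\'[^\']*\'|"[^"]*")\s*', r'''(\'[^\']*\'|"[^"]*")\s*'''),
    (r"[&<>\"\x80-\xff]+", r'''[&<>\"\x80-\xff]+'''),
]

for original, triple in pairs:
    # Same string value either way, so the regex is unchanged.
    assert original == triple
    re.compile(triple)  # raises re.error if the pattern were invalid
```

In the triple-quoted form the backslash before each quote is no longer needed to keep the literal from ending, which is what would let these patterns survive if quotes could not be escaped in raw strings.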

I actually wasn't able to find something I couldn't translate. It would be helpful to have another set of eyes if anyone has the time.

[1] I skipped the tests dir because I'm lazy. ;-)

STeVe

I'm not in-sane. Indeed, I am so far out of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy


