[Python-Dev] one last SRE headache

Ka-Ping Yee ping@lfw.org
Thu, 31 Aug 2000 16:04:26 -0500 (CDT)


On Thu, 31 Aug 2000, Fredrik Lundh wrote:

> I had to add one rule:
>
> If it starts with a zero, it's always an octal number. Up to two more octal digits are accepted after the leading zero.

Fewer rules are better. Let's not arbitrarily rule out the possibility of more than 100 groups.

The octal escapes are a different kind of animal than the backreferences: for a backreference, there is actually a backslash followed by a number in the regular expression, whereas an octal escape is only a way to spell a character, and we already have a reasonable way to put funny characters into regular expressions.

That is, i propose removing the translation of octal escapes from the regular expression engine. That's the job of the string literal:

r'\011'    is a backreference to group 11

'\\011'    is a backreference to group 11

'\011'     is a tab character
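
The string-literal half of this can be checked directly in the interpreter. A minimal sketch follows; it only shows what the Python string literal hands to the regex compiler, and says nothing about how a given engine then treats a backslash followed by digits (that part is what is being proposed here). The variable names are illustrative only.

```python
import re

raw = r'\011'        # four characters: backslash, '0', '1', '1'
escaped = '\\011'    # the same four characters, written with a doubled backslash
cooked = '\011'      # one character: the literal already turned octal 011 into a tab

assert raw == escaped
assert len(raw) == 4
assert cooked == '\t' and len(cooked) == 1

# In the third case the pattern compiler never sees a backslash at all;
# it receives an actual tab character and matches it literally.
assert re.match(cooked, '\t') is not None
```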

This makes automatic construction of regular expressions a tractable problem. We don't want to introduce so many exceptional cases that an attempt to automatically build regular expressions will turn into a nightmare of special cases.
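
To make the concern about generated patterns concrete, here is a small sketch of a pattern builder. The helper names `backref` and `doubled` are hypothetical, invented for this illustration, and the example stays with a single-digit group number, where the backreference reading is not in question.

```python
import re

def backref(n):
    """Return the pattern text for a backreference to group n (hypothetical helper)."""
    return '\\%d' % n

def doubled(inner):
    """Build a pattern matching `inner` twice in a row, separated by one space."""
    return '(%s) %s' % (inner, backref(1))

pattern = doubled(r'\w+')
assert pattern == r'(\w+) \1'
assert re.match(pattern, 'hello hello') is not None
assert re.match(pattern, 'hello world') is None
```

A generator like this only stays simple if `backref(n)` means the same thing for every n; each extra rule about leading zeros or digit counts turns into another branch in the helper.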

-- ?!ng