[Python-Dev] one last SRE headache

Fredrik Lundh <effbot@telia.com>
Fri, 1 Sep 2000 00:28:40 +0200


tim peters:

The PRE documentation expresses the true intent:

\number Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, (.+) \1 matches 'the the' or '55 55', but not 'the end' (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of number is 0, or number is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value number.

yeah, I've read that. clear as coffee.

but looking at it again, I suppose that the right way to implement this is (doing the tests in the given order):

if it starts with zero, it's an octal escape
(1 or 2 octal digits may follow)

if it starts with an octal digit, AND is followed
by two other octal digits, it's an octal escape

if it starts with any digit, it's a reference
(1 extra decimal digit may follow)

oh well. too bad my scanner only provides a one-character lookahead...
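
for illustration, the three tests above could be sketched in Python roughly as follows. this is only a sketch, not the actual SRE parser code: the function name classify_escape, its return shape, and the two constants are hypothetical, and it assumes the whole pattern is available as a string, so it can peek two characters past the first digit.

```python
OCTDIGITS = "01234567"
DIGITS = "0123456789"

def classify_escape(pattern, pos):
    """Classify the numeric escape whose first digit is at pattern[pos],
    i.e. the character right after the backslash.

    Returns (kind, value, end): kind is "octal" or "group", value is the
    character code or group number, end is the index just past the escape.
    (Hypothetical helper, written for illustration only.)
    """
    first = pattern[pos]
    if first not in DIGITS:
        raise ValueError("not a numeric escape")

    # Test 1: starts with zero -> octal escape, 1 or 2 octal digits may follow.
    if first == "0":
        end = pos + 1
        while end < len(pattern) and end < pos + 3 and pattern[end] in OCTDIGITS:
            end += 1
        return "octal", int(pattern[pos:end], 8), end

    # Test 2: an octal digit followed by two more octal digits -> octal escape.
    three = pattern[pos:pos + 3]
    if len(three) == 3 and all(c in OCTDIGITS for c in three):
        return "octal", int(three, 8), pos + 3

    # Test 3: any other digit -> group reference, 1 extra decimal digit may follow.
    end = pos + 1
    if end < len(pattern) and pattern[end] in DIGITS:
        end += 1
    return "group", int(pattern[pos:end]), end


print(classify_escape("012", 0))  # octal escape, value 10
print(classify_escape("123", 0))  # octal escape, value 83
print(classify_escape("12", 0))   # reference to group 12
print(classify_escape("128", 0))  # reference to group 12, the "8" is left over
```

note that the second test has to look at two characters beyond the first digit before it can decide, which is exactly what a one-character lookahead can't do directly; the sketch sidesteps that by slicing the pattern string.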