[Python-Dev] one last SRE headache

Tim Peters tim_one@email.msn.com
Thu, 31 Aug 2000 17:55:56 -0400


The PRE documentation expresses the true intent:

\number
Matches the contents of the group of the same number. Groups
are numbered starting from 1. For example, (.+) \1 matches 'the the'
or '55 55', but not 'the end' (note the space after the group). This
special sequence can only be used to match one of the first 99 groups.
If the first digit of number is 0, or number is 3 octal digits long,
it will not be interpreted as a group match, but as the character with
octal value number. Inside the "[" and "]" of a character class, all
numeric escapes are treated as characters.
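
For anyone who wants to try the quoted rules, here is a minimal sketch. It assumes a current CPython `re` module, which as far as I know still follows this documented rule; the engines under discussion in this thread (PRE and the SRE of the time) may behave differently. The variable names are made up for illustration.

```python
import re

# Backreference: \1 must repeat exactly what group 1 matched.
backref = re.compile(r"(.+) \1")
matches_the_the = backref.fullmatch("the the") is not None
matches_55_55 = backref.fullmatch("55 55") is not None
matches_the_end = backref.fullmatch("the end") is not None

# A leading 0 makes the escape an octal character code: \041 is "!".
octal_leading_zero = re.fullmatch(r"a\041", "a!") is not None

# Inside a character class, numeric escapes are always characters.
octal_in_class = re.fullmatch(r"a[\41]", "a!") is not None

# \41 outside a class is read as a reference to group 41. With only
# ten groups defined, compiling is expected to fail (assumption: the
# installed re module rejects references to undefined groups).
try:
    re.compile(r"((((((((((a))))))))))\41")
    group_41_error = None
except re.error as exc:
    group_41_error = str(exc)

print(matches_the_the, matches_55_55, matches_the_end)
print(octal_leading_zero, octal_in_class)
print(group_41_error)
```

The point of the last case is that the meaning of `\41` is decided by its spelling alone, so the pattern is rejected rather than silently reinterpreted.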

This was discussed at length when we decided to go the Perl-compatible route, and Perl's rules for backreferences were agreed to be just too ugly to emulate. The meaning of \oo in Perl depends on how many groups precede it! In this case, there are fewer than 41 groups, so Perl says "octal escape"; but if 41 or more groups had preceded, it would mean "backreference" instead(!). Simply unbearably ugly and error-prone.
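
To make the contrast concrete, here is a rough sketch of the two rules as plain functions. `pre_rule` follows the PRE documentation quoted above; `perl_rule_as_described` models only what this message says about Perl for a multi-digit escape without a leading zero, and is not Perl's actual implementation. Both function names are hypothetical.

```python
OCTAL_DIGITS = set("01234567")


def pre_rule(digits: str) -> str:
    """Classify the digits after a backslash per the quoted PRE docs."""
    three_octal = len(digits) == 3 and set(digits) <= OCTAL_DIGITS
    if digits.startswith("0") or three_octal:
        return "octal escape"
    # Otherwise it is a group reference, valid or not.
    return "group reference"


def perl_rule_as_described(digits: str, groups_before: int) -> str:
    """Simplified model of the Perl behaviour described in this message.

    Only multi-digit escapes without a leading zero are modelled;
    anything else is out of scope for this sketch.
    """
    if len(digits) < 2 or digits.startswith("0"):
        raise ValueError("case not covered by this sketch")
    if groups_before >= int(digits):
        return "group reference"
    return "octal escape"


# The answer under the PRE rule depends only on the spelling ...
print(pre_rule("41"), pre_rule("041"))

# ... while under the described Perl rule it depends on the group count.
print(perl_rule_as_described("41", 10))
print(perl_rule_as_described("41", 41))
```

Under the first function the reader can classify `\41` without looking at the rest of the pattern; under the second, adding or removing a group elsewhere changes what the escape means.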

-----Original Message-----
From: python-dev-admin@python.org [mailto:python-dev-admin@python.org] On Behalf Of Fredrik Lundh
Sent: Thursday, August 31, 2000 3:47 PM
To: python-dev@python.org
Subject: [Python-Dev] one last SRE headache

can anyone tell me how Perl treats this pattern?

    r'((((((((((a))))))))))\41'

in SRE, this is currently a couple of nested groups, surrounding a single literal, followed by a back reference to the fourth group, followed by a literal "1" (since there are less than 41 groups)

in PRE, it turns out that this is a syntax error; there's no group 41.

however, this test appears in the test suite under the section "all test from perl", but they're commented out:

    # Python does not have the same rules for \41 so this is a syntax error
    #    ('((((((((((a))))))))))\41', 'aa', FAIL),
    #    ('((((((((((a))))))))))\41', 'a!', SUCCEED, 'found', 'a!'),

if I understand this correctly, Perl treats this as an octal escape (chr(041) == "!").

now, should I emulate PRE, Perl, or leave it as it is...

PS. in case anyone wondered why I haven't seen this before, it's because I just discovered that the test suite masks syntax errors under some circumstances...


Python-Dev mailing list
Python-Dev@python.org
http://www.python.org/mailman/listinfo/python-dev