[Python-Dev] one last SRE headache

Guido van Rossum guido@beopen.com
Thu, 31 Aug 2000 16:12:29 -0500


amk wrote:
> outside a character class it's a character if there are exactly
> 3 octal digits; otherwise it's a backref. So \41 is a backref
> to group 41, but \041 is the literal character ASCII 33.

so what's the right way to parse this? read up to three digits, check if they're a valid octal number, and treat them as a decimal group number if not?

Suggestion:

If there are fewer than 3 digits, it's a group.

If there are exactly 3 digits and you have 100 or more groups, it's a group -- too bad, you lose octal number support. Use \x. :-)

If there are exactly 3 digits and you have at most 99 groups, it's an octal escape.

(Can you even have more than 99 groups in SRE?)
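A minimal sketch of the three rules above, in Python. This is an illustration only, not SRE's actual parser: the function name classify_escape and its group_count parameter are hypothetical, and raising an error for three digits that are not valid octal (such as \089) is an assumption, since the suggestion does not cover that case.

```python
def classify_escape(digits, group_count):
    """Classify the digits that follow a backslash, outside a character class.

    digits      -- the one to three decimal digits after the backslash
    group_count -- number of groups in the pattern (hypothetical parameter)

    Returns ("group", number) or ("octal", character_code).
    """
    if not 1 <= len(digits) <= 3 or not all(c in "0123456789" for c in digits):
        raise ValueError("expected one to three decimal digits")

    # Rule 1: fewer than 3 digits is always a group reference.
    if len(digits) < 3:
        return ("group", int(digits))

    # Rule 2: exactly 3 digits with 100 or more groups is a group reference,
    # so octal escapes are unavailable in such patterns.
    if group_count >= 100:
        return ("group", int(digits))

    # Rule 3: exactly 3 digits with at most 99 groups is an octal escape.
    # Assumption: digits 8 or 9 here are rejected; the rules do not say.
    try:
        return ("octal", int(digits, 8))
    except ValueError:
        raise ValueError("not a valid octal escape: \\" + digits)


print(classify_escape("41", 50))    # ('group', 41)
print(classify_escape("041", 50))   # ('octal', 33)
print(classify_escape("041", 120))  # ('group', 41)
```

With at most 99 groups this agrees with amk's examples: \41 refers to group 41 and \041 is character 33. Only a pattern with 100 or more groups gives up three-digit octal escapes.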

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)