[Python-Dev] one last SRE headache
Andrew Kuchling akuchlin@mems-exchange.org
Thu, 31 Aug 2000 15:46:03 -0400
On Thu, Aug 31, 2000 at 09:46:54PM +0200, Fredrik Lundh wrote:
> can anyone tell me how Perl treats this pattern?
>     r'((((((((((a))))))))))\41'
> if I understand this correctly, Perl treats it as an octal escape
> (chr(041) == "!").
Correct. From perlre:
You may have as many parentheses as you wish. If you have more
than 9 substrings, the variables $10, $11, ... refer to the
corresponding substring. Within the pattern, \10, \11,
etc. refer back to substrings if there have been at least that
many left parentheses before the backreference. Otherwise (for
backward compatibility) \10 is the same as \010, a backspace,
and \11 the same as \011, a tab. And so on. (\1 through \9
are always backreferences.)
In other words, if there are 41 groups, \41 is a backref to group 41; if there aren't, it's an octal escape.

This magical behaviour was deemed not Pythonic, so pre uses a different rule: inside a character class a numeric escape is always a character ([\41] isn't a syntax error), and outside a character class it's a character if there are exactly 3 octal digits; otherwise it's a backref. So \41 is a backref to group 41, but \041 is the literal character ASCII 33 ("!").
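A minimal sketch of that rule, run against today's standard `re` module rather than the `pre` module discussed here (the assumption being that current `re` still follows the same convention, which its documentation suggests). The `try_match` helper and the `ten_groups` name are hypothetical, introduced only for this illustration.

```python
import re


def try_match(pattern, text):
    """Hypothetical helper: return the matched text, None if the pattern
    does not match, or the string 'error' if the pattern is rejected."""
    try:
        m = re.fullmatch(pattern, text)
    except re.error:
        return "error"
    return m.group(0) if m else None


# The ten nested groups from the pattern in the question.
ten_groups = "(" * 10 + "a" + ")" * 10

# Exactly three octal digits outside a class: the character "!" (octal 041).
print(try_match(ten_groups + r"\041", "a!"))

# Two digits outside a class: a backref to group 41, which does not exist
# here, so the pattern is rejected instead of falling back to octal.
print(try_match(ten_groups + r"\41", "a!"))

# Two digits naming a group that does exist: a backref to group 10.
print(try_match(ten_groups + r"\10", "aa"))

# Inside a character class a numeric escape is always a character.
print(try_match(r"[\41]", "!"))
```

The second case is where this differs from the Perl behaviour quoted above: the meaning of \41 does not depend on how many groups happen to precede it.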
--amk