Issue 32067: Deprecate accepting unrecognized braces in regular expressions

Currently {m}, {m,n}, {m,} and {,n}, where m and n are non-negative decimal numbers, are accepted in regular expressions as quantifiers that repeat the previous RE from m (0 by default) to n (infinity by default) times.
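To make the four quantifier forms concrete, here is a small sketch using the standard re module. The helper name fully_matches is only for illustration and is not part of re.

```python
import re

def fully_matches(pattern, text):
    """Return True if the whole of text matches pattern (illustrative helper)."""
    return re.fullmatch(pattern, text) is not None

# {m}: exactly m repetitions
print(fully_matches(r'a{3}', 'aaa'))      # True
print(fully_matches(r'a{3}', 'aa'))       # False

# {m,n}: from m to n repetitions
print(fully_matches(r'a{2,4}', 'aaaa'))   # True
print(fully_matches(r'a{2,4}', 'aaaaa'))  # False

# {m,}: at least m repetitions, no upper bound
print(fully_matches(r'a{2,}', 'a' * 50))  # True

# {,n}: from 0 to n repetitions
print(fully_matches(r'a{,2}', ''))        # True
print(fully_matches(r'a{,2}', 'aaa'))     # False
```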

But if the opening brace '{' is not followed by one of the above patterns, it is treated as just the literal '{'.

    >>> import re
    >>> re.search('(foobar){e}', 'xirefoabralfobarxie')
    >>> re.search('(foobar){e}', 'foobar{e}')
    <re.Match object; span=(0, 9), match='foobar{e}'>

This conflicts with the regex module, which uses braces to define "fuzzy" matching.

    >>> import regex
    >>> regex.search('(foobar){e}', 'xirefoabralfobarxie')
    <regex.Match object; span=(0, 6), match='xirefo', fuzzy_counts=(6, 0, 0)>
    >>> regex.search('(foobar){e}', 'foobar{e}')
    <regex.Match object; span=(0, 6), match='foobar'>

I don't think it is worth adding support for fuzzy matching to the re module, but for compatibility it would be better to raise an error or emit a warning when '{' is not followed by one of the recognized patterns. This could also help to catch typos and errors in regular expressions, e.g. '-{1.2}' or '-{1, 2}' instead of '-{1,2}'.
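As a rough illustration of what such a check might look for, the sketch below scans a pattern for '{' characters that do not start one of the four quantifier forms listed above. The function name find_unrecognized_braces is hypothetical, and this is not how the re parser is implemented: it only skips backslash escapes and character classes, and it does not model verbose mode or parser edge cases such as '{,}'.

```python
import re

# The four quantifier forms named above: {m}, {m,n}, {m,} and {,n}.
# Assumption: parser edge cases beyond these forms are not modeled.
_QUANTIFIER = re.compile(r'\{(\d+(,\d*)?|,\d+)\}')

def find_unrecognized_braces(pattern):
    """Return indexes of '{' that would not start a quantifier (hypothetical helper)."""
    positions = []
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == '\\':
            i += 2  # skip the backslash and the escaped character
            continue
        if in_class:
            if ch == ']':
                in_class = False
        elif ch == '[':
            in_class = True
            # A ']' directly after '[' or '[^' is a literal member of the set.
            if pattern[i + 1:i + 2] == '^':
                i += 1
            if pattern[i + 1:i + 2] == ']':
                i += 1
        elif ch == '{':
            m = _QUANTIFIER.match(pattern, i)
            if m:
                i = m.end()  # a recognized quantifier; jump past it
                continue
            positions.append(i)
        i += 1
    return positions

print(find_unrecognized_braces('(foobar){e}'))  # [8]
print(find_unrecognized_braces('-{1.2}'))       # [1]
print(find_unrecognized_braces('-{1, 2}'))      # [1]
print(find_unrecognized_braces('-{1,2}'))       # []
```

A caller could pass any non-empty result to warnings.warn() or raise re.error, depending on which of the behaviours discussed here is chosen.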

Possible variants:

  1. Emit a DeprecationWarning in 3.7 (and 2.7.15 with the -3 option), raise a re.error in 3.8 or 3.9.

  2. Emit a PendingDeprecationWarning in 3.7, a DeprecationWarning in 3.8, and raise a re.error in 3.9 or 3.10.

  3. Emit a RuntimeWarning or SyntaxWarning in 3.7 and forever.

  4. Emit a FutureWarning in 3.7, and implement fuzzy matching or replace re with regex sometime in the future. Unlikely.