[Python-Dev] Should we move to replace re with regex?

Guido van Rossum guido at python.org
Fri Aug 26 23:45:17 CEST 2011


I just made a pass of all the Unicode-related bugs filed by Tom Christiansen, and found that in several, the response was "this is fixed in the regex module [by Matthew Barnett]". I started replying that I thought that we should fix the bugs in the re module (i.e., really in _sre.c) but on second thought I wonder if maybe regex is mature enough to replace re in Python 3.3. It would mean that we won't fix any of these bugs in earlier Python versions, but I could live with that.

However, I don't know much about regex -- how compatible is it, how fast is it (including extreme cases where the backtracking goes crazy), how bug-free is it, and so on. Plus, how much work would it be to actually incorporate it into CPython as a complete drop-in replacement for the re package (such that nobody needs to change their imports or the flags they pass to the re module)?
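
To make the "backtracking goes crazy" question concrete, here is a rough sketch of how one might time a pathological pattern. The helper names (time_match, pathological_input) and the choice of pattern are illustrative assumptions, not something taken from either module's test suite. The harness only relies on the module passed in exposing compile() and match() the way re does.

```python
import re
import time


def time_match(module, pattern, text):
    """Return (matched, seconds) for module.compile(pattern).match(text).

    `module` is any object with an re-style compile() function; the
    stdlib re module is used below.
    """
    compiled = module.compile(pattern)
    start = time.perf_counter()
    matched = compiled.match(text) is not None
    return matched, time.perf_counter() - start


# Nested quantifiers: a backtracking engine may try many ways of splitting
# the run of 'a's between the inner and outer '+' before reporting failure.
PATTERN = r"(a+)+$"


def pathological_input(n):
    """n copies of 'a' followed by a 'b' that makes the overall match fail."""
    return "a" * n + "b"


if __name__ == "__main__":
    # Keep n small: the time can grow very quickly with each extra 'a'.
    for n in (10, 14, 18):
        matched, seconds = time_match(re, PATTERN, pathological_input(n))
        print(n, matched, f"{seconds:.4f}s")
```

Passing a different module object as the first argument (for example the regex module, if it is installed) would give a side-by-side comparison on the same inputs. A single pattern like this says nothing about compatibility or general speed; it only probes one extreme case.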

We'd also probably have to train some core developers to be familiar enough with the code to maintain and evolve it -- I assume we can't just volunteer Matthew to do so forever... :-)

What's the alternative? Is adding the requested bug fixes and new features to _sre.c really that hard?

--Guido van Rossum (python.org/~guido)


