[Python-Dev] Should we move to replace re with regex?

Nick Coghlan ncoghlan at gmail.com
Sat Aug 27 10:02:49 CEST 2011

On Sat, Aug 27, 2011 at 4:01 PM, Dan Stromberg <drsalists at gmail.com> wrote:

> You're talking technically, which is important, but wasn't what I was suggesting would be helped.
>
> Politically, and from a marketing standpoint, it's easier to withdraw a feature you've given with a "Play with this, see if it works for you" warning.

The standard library isn't for playing. "pip install regex" is for playing. If we aren't sure we want to make the transition, then it doesn't go in.

However, to my mind, reviewing and incorporating regex is a far more feasible model than trying to enhance the existing re module with a comparable feature set. At the moment, there's already an obvious way to get enhanced regex support in Python: install regex and use it instead of the standard library's re module. That's enough to pretty much kill any motivation anyone might have to make major changes to re itself.
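As a rough illustration of what "install regex and use it instead" tends to look like in practice, here is a minimal sketch. It assumes the third-party regex package may or may not be installed and falls back to the standard library re module; the alias re_engine is a made-up name for this example, and it only exercises a pattern that both modules are expected to handle the same way.

```python
# Prefer the third-party "regex" package when it is available,
# otherwise fall back to the standard library "re" module.
# "re_engine" is a hypothetical alias used only for this sketch.
try:
    import regex as re_engine  # third-party, e.g. via "pip install regex"
except ImportError:
    import re as re_engine  # standard library fallback

# A simple pattern that does not rely on any regex-only features.
pattern = re_engine.compile(r"(\d{4})-(\d{2})-(\d{2})")
match = pattern.search("posted on 2011-08-27")

print(re_engine.__name__, match.groups())
```

Code written this way only works unchanged while it sticks to the common subset of the two APIs, which is exactly the compatibility question a PEP would need to address.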

We're at least getting one thing right this time that we got wrong with multiprocessing, though - we're much, much further out from the 3.3 release than we were from the 2.6 release when multiprocessing was added to the standard library :)

The next step needed is for someone to volunteer to write and champion a PEP that:

- articulates the deficiencies in the current re module (the regex docs already cover some of this, as do Tom Christiansen's notes on the issue tracker)
- explains why upgrading re in place is not feasible (e.g. noting that the availability of regex really limits the desire for anyone to reinvent that particular wheel, so even things that are theoretically possible may be highly unlikely in practice)
- proposes a transition plan (personally, I'd be fine with an optparse -> argparse style transition where re remains around indefinitely to support legacy code, but new users are pointed towards regex. But depending on compatibility details, merging the two APIs in the existing re namespace may also be feasible)
- proposes a maintenance strategy (I don't know how much Matthew has written regarding internal design details, but that kind of thing could really help. Matthew agreeing to continue maintenance as part of the standard library would also help a great deal, but wouldn't be enough on its own - while it's good for modules to have active maintainers to make the final call on associated design decisions, it's potentially problematic when other core developers don't understand what the code is doing well enough to fix bugs in it)
- confirms that the regex test suite can be incorporated cleanly into the standard library regression test suite (the difficulty of this was something that was underestimated for the inclusion of multiprocessing. Test suite integration is also the final sticking point holding up the PEP 380 'yield from' patch, although that's close to being resolved following the PyConAU sprints)
- documents the tests conducted (e.g. micro-benchmark results, fusil results)
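For the micro-benchmark item in the list above, something along these lines would be a starting point. This is only a sketch: the pattern, the sample text and the helper name time_engine are all invented for illustration, the regex package is treated as optional, and a real PEP would want a much broader and more representative workload.

```python
import re
import timeit

try:
    import regex  # third-party; assumed to be installed separately
except ImportError:
    regex = None

# Hypothetical workload, chosen only for illustration.
PATTERN = r"\b\w+@\w+\.\w+\b"
TEXT = "contact: someone@example.com, other@example.org " * 200


def time_engine(module, repeats=3, number=20):
    """Return the best observed time in seconds for one findall() call."""
    compiled = module.compile(PATTERN)
    timings = timeit.repeat(
        lambda: compiled.findall(TEXT), repeat=repeats, number=number
    )
    # The minimum is the measurement least disturbed by other system activity.
    return min(timings) / number


results = {"re": time_engine(re)}
if regex is not None:
    results["regex"] = time_engine(regex)

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} microseconds per findall")
```

Numbers from a single pattern like this say very little on their own; the point is only that the comparison is cheap to script and easy to include in a PEP.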

PEP 371 (addition of multiprocessing), PEP 389 (addition of argparse) and Jesse's reflections on the way multiprocessing was added (http://jessenoller.com/2009/01/28/multiprocessing-in-hindsight/) are well worth reading for anyone considering stepping up to write a PEP. That last also highlights why even Matthew's support, however capably he has handled maintenance of regex as an independent project, wouldn't be enough - we had Richard Oudkerk's support and agreement to continue maintenance as the original author of multiprocessing, but he became unavailable early in the integration process. If Jesse hadn't been able to take up most of that slack, the likely result would have been reversion of the changes and removal of multiprocessing from the 2.6 release.

Writing PEPs can be quite a frustrating experience (since a lot of feedback will be negative as people try to poke holes in the idea to see if it stands up to close scrutiny), but it's also really satisfying and rewarding if they end up getting accepted and incorporated :)

> Have there been any __future__ features that were added provisionally? I can't either, but ISTR hearing that "from __future__ import" was started with such an intent. Irrespective, it's hard to import something from "__future__" without at least suspecting that you're on the bleeding edge.

No, we make an explicit guarantee that __future__ imports will never go away once they've been added. They may become redundant, but they won't break. There's no provision in the __future__ mechanism for changes that are added and then later removed (see http://docs.python.org/dev/library/__future__).

They're strictly for cases where backwards incompatibilities (usually, but not always, new keywords) may break existing code.

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
