[Python-Dev] Missing arguments in RE functions
Noam Raphael noamr at myrealbox.com
Tue Sep 7 21:34:11 CEST 2004
Hello,
I've just finished teaching Python to a group of people, and regular expressions were part of the course. I ran into a few missing features (that is, optional arguments) in the RE functions. I've checked, and it seems to me that they can be added very easily.
The first missing feature is the "flags" argument in the findall and finditer functions. Searching for all occurrences of an RE is, of course, a legitimate action, and I had to use (?s) in my RE instead of passing re.DOTALL, which, in my opinion, is a lot clearer. The solution is simple: the functions sub, subn, split, findall and finditer all first compile the given RE with the flags argument set to 0, and then call the appropriate method. As far as I can see, they could all take an additional optional argument, flags=0, and compile the RE with it.
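A minimal sketch of the two workarounds and of the shape of the proposed change. The sample text and the helper name findall_with_flags are hypothetical, chosen only for illustration; the helper is not part of the re module.

```python
import re

# Sample input with a newline inside the first element (illustrative only).
text = "<a>first\nline</a><a>second</a>"

# Workaround 1: embed the DOTALL flag in the pattern itself with (?s).
inline = re.findall(r"(?s)<a>(.*?)</a>", text)

# Workaround 2: compile with re.DOTALL first, then call the method.
compiled = re.compile(r"<a>(.*?)</a>", re.DOTALL).findall(text)


def findall_with_flags(pattern, string, flags=0):
    """Hypothetical wrapper showing the proposed signature:
    compile with the given flags, then delegate to the method."""
    return re.compile(pattern, flags).findall(string)


proposed = findall_with_flags(r"<a>(.*?)</a>", text, re.DOTALL)

# Without DOTALL, "." does not match the newline, so the first
# element is skipped.
without_flag = findall_with_flags(r"<a>(.*?)</a>", text)
```

All three DOTALL variants return the same list; only the call without the flag misses the element that spans two lines.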
The second missing feature is the ability to specify start and end indices when doing matches and searches. This feature is available when using a compiled RE, but isn't mentioned at all for the module-level functions. (That's why I didn't even know it was possible until I checked just now - I naturally assumed that all the functionality is available when using the functions.) I think these arguments should be added to the functions match, search, findall and finditer. This feature isn't documented for the findall and finditer methods, but I checked, and it seems to work fine. (In case you are interested in the use case: the exercise was to parse an XML file. It was done by first matching the beginning of a tag, then trying to match attributes, and so on - each match starts from where the previous successful match ended. Since I didn't know of this feature, it was done by replacing the original string with a substring after every match, which is terribly inefficient.)
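A rough sketch of the use case, using the pos argument that compiled pattern objects already accept. The patterns, the function name parse_open_tag and the sample tag are assumptions made for illustration; this is not a real XML parser.

```python
import re

# Simplified, hypothetical patterns for an opening tag and its attributes.
TAG_OPEN = re.compile(r"<(\w+)")
ATTRIBUTE = re.compile(r'\s+(\w+)="([^"]*)"')


def parse_open_tag(text, pos=0):
    """Match a tag name at pos, then attributes one after another.

    Each match starts where the previous successful match ended,
    so the input string is never sliced or copied.
    Returns (name, attrs, end_pos), or None if no tag starts at pos.
    """
    m = TAG_OPEN.match(text, pos)
    if m is None:
        return None
    name = m.group(1)
    pos = m.end()

    attrs = {}
    while True:
        m = ATTRIBUTE.match(text, pos)
        if m is None:
            break
        attrs[m.group(1)] = m.group(2)
        pos = m.end()
    return name, attrs, pos


sample = '<img src="a.png" alt="x">'
name, attrs, end = parse_open_tag(sample)
```

Passing pos keeps the original string intact, whereas slicing with text[pos:] would copy the remainder of the string after every match.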
If you approve, I can create a patch in a few minutes and send it.
Have a good day,
Noam Raphael