[Python-Dev] Zero-width matching in regexes
Terry Reedy tjreedy at udel.edu
Tue Dec 5 15:26:04 EST 2017
On 12/4/2017 6:21 PM, MRAB wrote:
> I've finally come to a conclusion as to what the "correct" behaviour of zero-width matches should be: """always return the first match, but never a zero-width match that is joined to a previous zero-width match""".

Is this different from current re or regex?

> If it's about to return a zero-width match that's joined to a previous zero-width match, then backtrack and keep on looking for a match.
>
> Example:
>
> >>> print([m.span() for m in re.finditer(r'|.', 'a')])
> [(0, 0), (0, 1), (1, 1)]
>
> re.findall, re.split and re.sub should work accordingly.
>
> If re.finditer finds n matches, then re.split should return a list of n+1 strings and re.sub should make n replacements (excepting maxsplit, etc.).
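One way to check the n / n+1 relationship from the quoted proposal is a small sketch like the one below. It uses only the standard re module; the names PATTERN and TEXT are just illustrative, and the results depend on which Python version runs it (as far as I know, 3.7 and later follow the quoted rule, while earlier versions handle empty matches differently).

```python
import re

PATTERN = r'|.'   # empty alternative first, then any single character
TEXT = 'a'

# Spans of every match that finditer reports.
spans = [m.span() for m in re.finditer(PATTERN, TEXT)]

# Splitting on the same pattern: the proposal expects len(spans) + 1 pieces.
pieces = re.split(PATTERN, TEXT)

# subn returns the new string together with the number of replacements made;
# the proposal expects that number to equal len(spans).
replaced, count = re.subn(PATTERN, '-', TEXT)

print(spans)
print(pieces, len(pieces) == len(spans) + 1)
print(replaced, count == len(spans))
```

If either comparison prints False, the interpreter in use does not follow the proposed rule for that function.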
-- Terry Jan Reedy