[Python-Dev] Re: re.split on empty patterns

Mike Coleman mkc at mathdogs.com
Mon Aug 23 00:53:34 CEST 2004


"Brett C." <bac at OCF.Berkeley.EDU> writes:

Mike Coleman wrote:

[SNIP]

> # alternative 2:
> re.structmatch(r'xxx|(?=abc)', 'zzxxxabczz') --> ['zz', 'bbczz']
>                                      ^
> re.structmatch(r'xxx|(?=abc)', 'zzxxxbbczz') --> ['zz', 'bbczz']
> # alternative 3:
> re.structmatch(r'xxx|(?=abc)', 'zzxxxabczz') --> ['zz', '', 'bbczz']
>                                      ^
> re.structmatch(r'xxx|(?=abc)', 'zzxxxbbczz') --> ['zz', 'bbczz']

I take it the first 'b' in both of the first examples for each alternative were supposed to be 'a'?

Yes, that's correct. Oops.

And as for which version, I actually like Mike's version more than the one AMK and Tim like. The reason is that the '' in the middle of the example in question in the OP tells you where the split would have occurred had split0 (I like that or 'splitempty') not been turned on. That way there is no real loss of data between the two, but a gain with the new feature being used.
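To make the difference between the two alternatives concrete, here is a minimal sketch in Python. The helper name `split_sketch` and its `keep_adjacent_empty` flag are hypothetical, invented purely for illustration; they are not the API proposed in this thread. The sketch also assumes that "alternative 2" means "ignore an empty match that sits immediately after the previous match", which is one reading of the examples above (with the 'a' typo corrected).

```python
import re

def split_sketch(pattern, string, keep_adjacent_empty):
    """Hypothetical helper: split `string` on every match of `pattern`,
    including empty matches.

    keep_adjacent_empty=False approximates "alternative 2";
    keep_adjacent_empty=True approximates "alternative 3".
    """
    pieces = []
    last = 0  # index just past the previous separator
    for m in re.finditer(pattern, string):
        start, end = m.span()
        # An empty match located exactly where the previous match ended.
        adjacent_empty = (start == end == last) and last > 0
        if adjacent_empty and not keep_adjacent_empty:
            continue  # alternative 2 (as assumed here): drop this split point
        pieces.append(string[last:start])
        last = end
    pieces.append(string[last:])
    return pieces

# Alternative 2: no marker for the empty match after 'xxx'.
print(split_sketch(r'xxx|(?=abc)', 'zzxxxabczz', False))  # ['zz', 'abczz']
# Alternative 3: the '' records that an empty match occurred at that point.
print(split_sketch(r'xxx|(?=abc)', 'zzxxxabczz', True))   # ['zz', '', 'abczz']
# With no 'abc' lookahead match, both alternatives agree.
print(split_sketch(r'xxx|(?=abc)', 'zzxxxbbczz', True))   # ['zz', 'bbczz']
```

Under these assumptions, the extra '' in the alternative 3 output is exactly the "no loss of data" marker described above: it shows where the empty match fired, while alternative 2 yields the same list whether or not it fired.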

Is there something we can do to move this forward? It seems like a couple of people like one option and a couple the other, but I think at least we all agree that the general feature would be a good idea. So, should we take a vote? Or just go with the more conservative option, in order to get something in the tree for 2.4?

Mike


