Issue 32211: Document the bug in re.findall() and re.finditer() in 2.7 and 3.6
>>> re.findall(r'^|\w+', 'two words')
['', 'wo', 'words']
Seems the current behavior was documented incorrectly in .
It will be fixed in 3.7 (see , ), but I hesitate to backport the fix to 3.6 and 2.7 because it could break user code. For example:
In Python 3.6:
>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<_sre.SRE_Match object; span=(4, 4), match=''>, <_sre.SRE_Match object; span=(5, 5), match=''>]
In Python 3.7:
>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<re.Match object; span=(4, 4), match=''>, <re.Match object; span=(4, 5), match='\n'>, <re.Match object; span=(5, 5), match=''>]
(This is a real pattern used in the docstring module, but with re.sub()).
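For anyone who wants to check which behavior their interpreter has, here is a minimal sketch that reruns the two examples above and compares the results by version. The helper name match_spans and the FIXED flag are my own, not part of the re module or the proposed PR. The 3.7 findall() result is my expectation based on the fix described here, not output quoted in this issue.

```python
import re
import sys


def match_spans(pattern, text):
    """Return the (start, end) span of every match found by re.finditer()."""
    return [m.span() for m in re.finditer(pattern, text)]


# Assumption: the fixed behavior is present from 3.7 onward, as stated in this issue.
FIXED = sys.version_info >= (3, 7)

# The findall() example from the top of this issue.
words = re.findall(r'^|\w+', 'two words')

# The finditer() example, reduced to spans so the result is easy to compare.
blank_lines = match_spans(r'(?m)^\s*?$', 'foo\n\n\nbar')

if FIXED:
    # A non-empty match may start right after an empty match.
    assert words == ['', 'two', 'words']
    assert blank_lines == [(4, 4), (4, 5), (5, 5)]
else:
    # Old behavior: the character after an empty match is skipped.
    assert words == ['', 'wo', 'words']
    assert blank_lines == [(4, 4), (5, 5)]
```

Code that needs identical results on both sides of the change probably has to avoid patterns that can match the empty string, since those are the cases where the two behaviors differ.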
The proposed PR documents the current weird behavior in 2.7 and 3.6.