gh-73123: Add a keepempty argument to string, bytes and bytearray split methods by MarkCBell · Pull Request #26222 · python/cpython
This PR adds an optional `keepempty` argument to `str.split` (and similarly to `bytes.split`, `bytearray.split` and `UserString.split`). As described in issue bpo-28937:
- If `keepempty` is true, empty strings are never stripped out of the result list.
- If `keepempty` is false, empty strings are always stripped out of the result list.
- If `keepempty` is `None` (the default), the current behaviour is followed: empty strings are stripped out of the result list if and only if the separator is `None`.
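The three modes can be illustrated without the patch. The sketch below uses a hypothetical helper, `split_keepempty`, that emulates the semantics described above on top of the existing `str.split` and `re.split`. It ignores `maxsplit`, and its use of `\s` to treat each single whitespace character as a separator when `sep` is `None` is an assumption for illustration, not the PR's implementation.

```python
import re

def split_keepempty(string, sep=None, keepempty=None):
    """Hypothetical emulation of the proposed keepempty semantics (no maxsplit)."""
    if keepempty is None:
        # Default: unchanged behaviour of today's str.split.
        return string.split(sep)
    if sep is None:
        # Assumption: every single whitespace character acts as a separator.
        parts = re.split(r"\s", string)
    else:
        parts = string.split(sep)
    # keepempty=True keeps every piece; keepempty=False drops the empty ones.
    return parts if keepempty else [p for p in parts if p]

print(split_keepempty("a,,b,", ","))                   # ['a', '', 'b', '']
print(split_keepempty("a,,b,", ",", keepempty=False))  # ['a', 'b']
print(split_keepempty(" a  b "))                       # ['a', 'b']
print(split_keepempty(" a  b ", keepempty=True))       # ['', 'a', '', 'b', '']
```

The first and third calls show today's defaults; the second and fourth show the two combinations that currently need a manual filter or a regular expression.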
To do this, it uses a new splitting algorithm that has been designed to be compatible with the existing `maxsplit` argument. This is roughly:
    def split(string, sep=None, maxsplit=None, keepempty=None):
        prune = sep is None if keepempty is None else not keepempty
        if sep is None: sep = ' '
        # Ok, the real implementation actually matches on any whitespace,
        # but matching on ' ' is good enough for this toy example.
        results = []
        count = 0
        i = 0
        j = string.find(sep, i)
        while j >= 0:
            if j > i or not prune:
                if maxsplit is not None and count >= maxsplit:
                    break
                results.append(string[i:j])
                count += 1
            i = j + len(sep)
            j = string.find(sep, i)
        if i < len(string) or not prune:
            results.append(string[i:])
        return results
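One way to check that the toy algorithm is compatible with the existing behaviour is to compare it against today's `str.split`. The snippet below repeats the toy function under the hypothetical name `toy_split` so that it runs on its own, then checks a handful of inputs. It assumes that `maxsplit=None` in the toy corresponds to the built-in default of `-1`, and it only uses `' '` as whitespace, as the toy does.

```python
def toy_split(string, sep=None, maxsplit=None, keepempty=None):
    # Same logic as the toy example above, repeated so this snippet is self-contained.
    prune = sep is None if keepempty is None else not keepempty
    if sep is None:
        sep = ' '
    results, count, i = [], 0, 0
    j = string.find(sep, i)
    while j >= 0:
        if j > i or not prune:
            if maxsplit is not None and count >= maxsplit:
                break
            results.append(string[i:j])
            count += 1
        i = j + len(sep)
        j = string.find(sep, i)
    if i < len(string) or not prune:
        results.append(string[i:])
    return results

# With keepempty=None the toy should agree with the built-in method.
for text in ["", "a,b,,c", ",a,", " a  b c "]:
    for sep in [",", None]:
        for maxsplit in [None, 0, 1, 2]:
            builtin = text.split(sep, -1 if maxsplit is None else maxsplit)
            assert toy_split(text, sep, maxsplit) == builtin, (text, sep, maxsplit)

# The new modes, including their interaction with maxsplit.
print(toy_split("a,,b", ",", keepempty=False))                # ['a', 'b']
print(toy_split(" a  b", keepempty=True))                     # ['', 'a', '', 'b']
print(toy_split("a,,b,c", ",", maxsplit=1, keepempty=False))  # ['a', 'b,c']
```

In the last call the empty piece between the two commas is pruned before the `maxsplit` limit is checked, so it does not count as a split and does not end up in the unsplit remainder.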
A number of tests have been added to check the correct behaviour.