Issue 13346: re.split() should behave like string.split() for maxsplit=0 and maxsplit=-1

Created on 2011-11-05 02:46 by acg, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (5)

msg147066

Author: Alan Grow (acg)

Date: 2011-11-05 02:46

If you split a string in a maximum of zero places, you should get the original string back. "".split(s,0) behaves this way. But re.split(r,s,0) performs an unlimited number of splits in this case.

To get an unlimited number of splits, "".split(s,-1) is a sensible choice. But in this case re.split(r,s,-1) performs zero splits.

Where's the sense in this?

>>> import string, re
>>> string.split("foo bar baz"," ",0)
['foo bar baz']
>>> re.split("\s+","foo bar baz",0)
['foo', 'bar', 'baz']
>>> string.split("foo bar baz"," ",-1)
['foo', 'bar', 'baz']
>>> re.split("\s+","foo bar baz",-1)
['foo bar baz']
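For readers on Python 3, where the `string.split()` function no longer exists, the documented part of this comparison can be reproduced roughly as follows. This is only a sketch using the `str.split()` method and `re.split()`; the variable names are illustrative, and the negative-`maxsplit` case for `re.split()` is deliberately left out because it is undocumented behavior.

```python
import re

text = "foo bar baz"

# str.split(): maxsplit counts splits, so 0 means "no splits"
# and -1 (the default) means "no limit".
str_zero = text.split(" ", 0)
str_one = text.split(" ", 1)
str_all = text.split(" ", -1)

# re.split(): maxsplit=0 is the default and means "no limit";
# only nonzero values cap the number of splits.
re_zero = re.split(r"\s+", text, maxsplit=0)
re_one = re.split(r"\s+", text, maxsplit=1)

print(str_zero)  # ['foo bar baz']
print(str_one)   # ['foo', 'bar baz']
print(str_all)   # ['foo', 'bar', 'baz']
print(re_zero)   # ['foo', 'bar', 'baz']
print(re_one)    # ['foo', 'bar baz']
```

The two calls agree for positive values and differ only in what 0 means.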

msg147067

Author: Ezio Melotti (ezio.melotti) * (Python committer)

Date: 2011-11-05 03:03

This is a known issue, but I don't think it can be fixed without breaking backward compatibility. The behavior with negative values is not explicitly documented, so I would consider it an implementation detail. The behavior with positive values is documented for both functions. Also, even if it's inconsistent, I would expect people to request at least 1 split; otherwise they are basically asking for a no-op. I suggest closing this as wontfix.

msg147542

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2011-11-13 03:04

The two methods are defined differently, and act as defined, so this is a feature request, not a bug report.

str.split([sep[, maxsplit]]) ... If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements). If maxsplit is not specified, then there is no limit on the number of splits (all possible splits are made).

re.split(pattern, string, maxsplit=0, flags=0) ...If maxsplit is nonzero, at most maxsplit splits occur,

Clearly, if maxsplit for re.split is the default of 0, it must do all splits. There is a difference between being optional with no default (possible with C-coded functions) and with a default.

Logically, both should have a default of None, meaning no limit. But I agree with Ezio and do not see that happening for Python 3.

As for negative values, I would have maxsplit treated as a count and make negative values a ValueError.
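The semantics described above (a default of None meaning no limit, maxsplit treated as a count, negative values rejected) can be approximated in user code today. The following is a minimal sketch only: `regex_split` is a hypothetical helper name, not part of the `re` module, and it simply maps the proposed convention onto the existing `re.split()` call.

```python
import re


def regex_split(pattern, string, maxsplit=None, flags=0):
    """Hypothetical wrapper around re.split() with count-style maxsplit.

    maxsplit=None -> no limit (all possible splits)
    maxsplit=0    -> no splits, the original string is returned in a list
    maxsplit<0    -> ValueError
    """
    if maxsplit is None:
        # re.split() uses 0 to mean "no limit".
        return re.split(pattern, string, maxsplit=0, flags=flags)
    if maxsplit < 0:
        raise ValueError("maxsplit must be non-negative")
    if maxsplit == 0:
        # Zero splits requested: hand back the input unchanged.
        return [string]
    return re.split(pattern, string, maxsplit=maxsplit, flags=flags)


print(regex_split(r"\s+", "foo bar baz"))     # ['foo', 'bar', 'baz']
print(regex_split(r"\s+", "foo bar baz", 0))  # ['foo bar baz']
print(regex_split(r"\s+", "foo bar baz", 1))  # ['foo', 'bar baz']
```

Keeping this in a wrapper leaves the existing `re.split()` contract untouched, which is the backward-compatibility concern raised earlier in the thread.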

msg147545

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2011-11-13 03:08

Terry, thanks for closing this. The API for str.split() has been set in stone for a very long time.

msg150281

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2011-12-28 05:15

I concur with closing this one.

History

Date | User | Action | Args
2022-04-11 14:57:23 | admin | set | github: 57555
2011-12-28 05:15:57 | rhettinger | set | messages: +
2011-11-13 03:08:59 | rhettinger | set | messages: +
2011-11-13 03:04:51 | terry.reedy | set | status: open -> closed; versions: - Python 2.7, Python 3.2; nosy: + terry.reedy; messages: +; type: behavior -> enhancement
2011-11-05 03:03:22 | ezio.melotti | set | versions: + Python 2.7, Python 3.2, Python 3.3, - Python 2.6; nosy: + rhettinger, ezio.melotti; messages: +; resolution: wont fix
2011-11-05 02:46:35 | acg | create |