Issue 911080: string str.split() behaviour inconsistency
The str.split() method behaves differently depending on whether it is called with the default separator (no arguments) or with a separator you provide. There is no way to reproduce the behaviour of the default separator if you supply your own.
>>> s = "a b  c"
>>> s.split()
['a', 'b', 'c']
>>> s.split(" ")
['a', 'b', '', 'c']
The default split uses a different algorithm: it treats a run of consecutive separators (whitespace) as a single separator. With a custom separator, split treats every individual occurrence as a boundary, so adjacent separators produce empty strings.
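The difference also shows up at the ends of the string. A small illustration, using a sample string chosen here for demonstration (it is not from the original report):

```python
# Sample input with leading, trailing, and doubled spaces.
s = " a b  c "

# No argument: runs of whitespace act as one separator, and
# leading/trailing whitespace produces no empty strings.
default_parts = s.split()

# Explicit separator: every single space is a boundary, so adjacent
# or leading/trailing spaces yield empty strings.
explicit_parts = s.split(" ")

print(default_parts)   # ['a', 'b', 'c']
print(explicit_parts)  # ['', 'a', 'b', '', 'c', '']
```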
Obviously there are good reasons for producing a separate entry between each pair of separators. With simple comma- or colon-separated records, you want to know if an entry is blank.
The problem is that there is no way to reproduce the default behaviour with a separator of your own. This alternate behaviour is also not documented, so it is confusing that split behaves differently once you supply your own separator.
Fixing this could be a problem. Changing the actual split() method would break many programs. But adding a different split is a potentially nice solution.
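As a rough sketch of what such a separate function might look like, the helper below gets the "collapsing" behaviour for an arbitrary separator by dropping the empty strings that split(sep) produces. The name split_collapsing is hypothetical and is not part of Python.

```python
def split_collapsing(text, sep):
    """Hypothetical helper, not a real str method.

    Splits text on sep, treating runs of sep as a single separator and
    ignoring leading/trailing sep, similar to no-argument split().
    """
    # split(sep) emits '' for adjacent or edge separators; filter them out.
    return [part for part in text.split(sep) if part]


print(split_collapsing("a,b,,c,", ","))  # ['a', 'b', 'c']
print(split_collapsing("a b  c", " "))   # ['a', 'b', 'c']
```

Note that this discards genuinely blank fields, so it is unsuitable for comma- or colon-separated records where an empty entry is meaningful.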
The other option would be to "re-use" the current splitfields() function and have it work like the current split with an explicit separator, and to change split() so that it always behaves as it does when called with no separator. This would unfortunately still "break stuff".
The easiest fix may just be documentation and letting people know of this difference.
I've been helping some newbies through Python. When this came up I was a little surprised, and we were left to conclude that it was just a little "magic and scary".