Issue 13012: Allow keyword argument in str.splitlines() (original) (raw)

Created on 2011-09-20 13:44 by mark.dickinson, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue13012.patch mark.dickinson,2011-09-20 17:56 review
issue13012-repl.diff ezio.melotti,2011-09-23 21:03 review
Messages (18)
msg144326 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-20 13:44
The docstring of str.splitlines says: splitlines(...) S.splitlines([keepends]) -> list of strings Return a list of the lines in S, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and true. Currently the optional argument can only be specified positionally. What do people think about also allowing this argument to be specified by keyword? (And applying the same change to bytes.splitlines.) The main motivation here is readability: in long_string.splitlines(keepends=True) it's fairly clear what the boolean argument is doing. But in long_string.splitlines(True) it's much less clear. (This came up during a Python training class that I was running recently.) I'm not advocating making the argument keyword-only; just allowing those who want to specify it by keyword to do so. There are many other str methods that don't accept keyword arguments, but it's only this one where I think readability would really benefit.
msg144327 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-20 13:50
> (And applying the same change to bytes.splitlines.) Oh, and bytearray.splitlines, too.
msg144328 - (view) Author: Alex Gaynor (alex) * (Python committer) Date: 2011-09-20 13:51
Personally, I regard every C function which, for obscure internal details, doesn't take keyword arguments as a sad bug, which should of course be fixed :)
msg144329 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-20 13:57
+1; the keyword arg version is much more readable.
msg144343 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-20 17:56
Here's a patch.
msg144358 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-09-21 00:14
LGTM
msg144361 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-21 01:32
Patch looks good. I noticed a change in the conventional type for 'keepends' from 'int' to 'bool'. Several unit tests were updated to match this change. Perhaps other call sites should be updated too? A little greping shows: $ grep -Rl 'splitlines(0)' * --include='*.py' Doc/tools/docutils/writers/newlatex2e/__init__.py $ grep -Rl 'splitlines(1)' * --include='*.py' Doc/tools/sphinx/pycode/pgen2/tokenize.py Doc/tools/docutils/readers/python/moduleparser.py Lib/test/test_tokenize.py Lib/difflib.py Lib/lib2to3/pgen2/tokenize.py Lib/codecs.py
msg144470 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-09-23 19:22
I agree with Alex. The poorly documented fact that *some* C-coded functions cannot accept arguments identified by keyword rather than position is a bit hole in the function abstraction. +1 to the patch (and the int to bool change)
msg144474 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-09-23 21:03
The attached patch adds 'keepends=' to a few calls and replaces 0/1 with False/True. The patch can be applied after Mark's patch. Doing two separate commits is probably better.
msg144482 - (view) Author: Meador Inge (meador.inge) * (Python committer) Date: 2011-09-24 02:20
> Doing two separate commits is probably better. Out of curiosity, what is typically the convention on that? The devguide seems to suggest one changeset per issue: """ Just please make sure that you generate a single, condensed patch rather than a series of several changesets. """ I think for this case two patches is better. In general, I am OK with the git-style series and hg-style patchbombs, but the devguide seems to say otherwise. Hmmm, that makes me wonder if we can patchbomb the tracker :-)
msg144483 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-09-24 02:36
One changeset per issue is a general goal. So is the ability to review patches. Sometime people forget to add a News or Acks entry and have to followup with an addendum. (Mark's patch still lack that, for instance.) Sometimes a patch is so large that a reviewer asks or requires splitting. In this case, each patch is large enough and touches enough files (up to 10 I think) that two is plausible. What would not be appreciated by either a reviewer or commit-list recipients would be one patch file per patched file. That would be a patch bomb indeed ;-)
msg144488 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-09-24 08:15
New changeset 0b2660384d4e by Mark Dickinson in branch 'default': Issue #13012: Allow 'keepends' to be passed as a keyword argument in str.splitlines, bytes.splitlines and bytearray.splitlines. http://hg.python.org/cpython/rev/0b2660384d4e
msg144489 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-24 08:17
Thanks for the reviews. Applied the first patch, plus Misc/NEWS entry, and minus the change to Lib/collections/__init__.py; I think that change belongs more with the second patch.
msg144497 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2011-09-24 10:52
Ezio's patch looks fine to me, though as Meador points out in the Rietveld review, there are a couple of omissions that could be fixed. Personally, I'm not so convinced of the value of upgrading all the existing uses of splitlines to use 'keepends'; I just wanted to be able to do this in *new* code. :-) I guess I'm +0 on the int -> bool changes (replacing .splitlines(1) with .splitlines(True)), and -0 on adding the extra 'splitlines' keywords throughout existing source.
msg144498 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-09-24 13:21
> Out of curiosity, what is typically the convention on that? In theory every issue should be about a single problem and therefore have a single patch to fix it. Sometimes a secondary "problem" that is not strictly related to the first one arises, for example when a cleanup is necessary before applying the patch, or when a new feature is added and a few places get changed to use it. In these cases is not worth opening another issue, so both the patches are attached to the same issue, but committed separately because they address two different "problems". > Sometime people forget to add a News or Acks entry and have to > followup with an addendum. (Mark's patch still lack that, for > instance.) Avoiding to include the NEWS entry in patches is common, because the file gets updated often and it will most likely cause conflicts. `make patchcheck` remind us to add it -- assuming you don't forget to use it. > Ezio's patch looks fine to me, though as Meador points out in the > Rietveld review, there are a couple of omissions that could be fixed. Adding keepends there would have made the line too long, required some splitting and probably made the code less readable overall, so I prefer to keep it out. > Personally, I'm not so convinced of the value of upgrading all the > existing uses of splitlines to use 'keepends'; I just wanted to be > able to do this in *new* code. :-) If we can make old code more readable too and we already have a patch to do it, why not? (one instance also had the comment # True == keep line ends)
msg144506 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2011-09-24 20:22
Given that someone is willing to write a patch and someone else to review it, I am +1 on changing 0/1 to False/True anywhere appropriate in the stdlib. Ditto for using a new feature. The "# True == keep line ends" comment illustrates why the original patch is a good idea.
msg144565 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011-09-28 14:38
New changeset 83f43b58c988 by Ezio Melotti in branch 'default': #13012: use splitlines(keepends=True/False) instead of splitlines(0/1). http://hg.python.org/cpython/rev/83f43b58c988
msg144566 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2011-09-28 14:40
I applied my patch, including the changes in Lib/collections/__init__.py, the issue can now be closed.
History
Date User Action Args
2022-04-11 14:57:21 admin set github: 57221
2011-09-28 14:40:53 ezio.melotti set status: open -> closed
2011-09-28 14:40:33 ezio.melotti set assignee: mark.dickinsonresolution: fixedmessages: + stage: patch review -> resolved
2011-09-28 14:38:08 python-dev set messages: +
2011-09-24 20:22:34 terry.reedy set messages: +
2011-09-24 13:21:46 ezio.melotti set messages: +
2011-09-24 10:52:31 mark.dickinson set messages: +
2011-09-24 08:17:55 mark.dickinson set messages: +
2011-09-24 08:15:44 python-dev set nosy: + python-devmessages: +
2011-09-24 02:36:44 terry.reedy set messages: +
2011-09-24 02:20:58 meador.inge set messages: +
2011-09-23 21:04:00 ezio.melotti set files: + issue13012-repl.diffmessages: +
2011-09-23 19:22:20 terry.reedy set nosy: + terry.reedymessages: +
2011-09-21 01:32:40 meador.inge set messages: +
2011-09-21 00:14:58 ezio.melotti set messages: +
2011-09-20 17:56:59 mark.dickinson set stage: needs patch -> patch review
2011-09-20 17:56:45 mark.dickinson set files: + issue13012.patchkeywords: + patchmessages: +
2011-09-20 13:58:21 ezio.melotti set nosy: + ezio.melotti
2011-09-20 13:57:03 meador.inge set nosy: + meador.ingemessages: +
2011-09-20 13:51:46 alex set nosy: + alexmessages: +
2011-09-20 13:50:40 mark.dickinson set messages: +
2011-09-20 13:44:06 mark.dickinson create