Issue 801847: Adding rsplit() to string and unicode objects.

Created on 2003-09-07 00:52 by jafo, last changed 2022-04-10 16:11 by admin. This issue is now closed.

Files
File name                    Uploaded                  Description
Python-2.3-rsplit.diff       jafo, 2003-09-07 00:52    Code patch against 2.3 release.
Python-CVS-docs-rsplit.diff  jafo, 2003-09-07 00:55    Patch against CVS for the documentation.
Messages (24)
msg53994 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-07 00:52
I'm attaching patches to the library and documentation for implementing rsplit() on string and unicode objects. This works like split(), but working from the right.

    ./python -c 'print u"foo, bar, baz".rsplit(None, 1)'
    [u'foo, bar,', u'baz']

This was supposed to be against the CVS code, but I've had a heck of a time getting it checked out -- my checkout has been hung for half an hour now. The code patch is against the 2.3 release, the docs patch is against the CVS. My checkout got to docs, but I didn't have the code to a point where I could build and test it.

Sean
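A minimal sketch of the behaviour described above, written in Python 3 syntax (an assumption; the original example uses the Python 2 print statement and u"" literals). It contrasts split() and rsplit() when maxsplit limits the number of splits.

```python
text = "foo, bar, baz"

# With maxsplit=1, split() makes its single cut at the leftmost whitespace run,
# while rsplit() makes it at the rightmost one.
left = text.split(None, 1)
right = text.rsplit(None, 1)

print(left)   # ['foo,', 'bar, baz']
print(right)  # ['foo, bar,', 'baz']

# Without a maxsplit limit the two methods return the same list.
same_without_limit = text.split() == text.rsplit()
```

The difference only shows up when maxsplit is smaller than the number of separators present.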
msg53995 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-09-07 19:49
Logged In: YES user_id=21627 Why is this function useful?
msg53996 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-08 00:56
Logged In: YES user_id=81797 Can you provide more details about why the usefulness of this function is in question? First I would like to tell you the story of how it came to be, then I will answer your incomplete question with a (probably) incomplete answer.

I had a device which sent me comma-separated fields, but one of the fields in the middle could contain a comma. The answer that seemed obvious to me was to use split with a maxsplit to get the fields up to that field, and then rsplit with a maxsplit on the remainder. When I mentioned on #python that I was implementing rsplit, 4 other python users replied right away that they had been wanting it.

To answer your question, it's useful because people using strings are used to having r*() functions like rfind and rstrip. The lack of rsplit is kind of glaring in this context. Really, though, it's useful because otherwise people have to implement it themselves -- often badly.

Sean
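The device-record use case can be sketched roughly as follows. The function name parse_record, its parameters, and the sample record are hypothetical illustrations, not taken from the patch; the sketch assumes exactly one free-form field sits between a known number of well-behaved fields.

```python
def parse_record(line, fields_before, fields_after):
    """Split a comma-separated record whose one middle field may contain commas.

    fields_before / fields_after (hypothetical parameters) give the number of
    comma-free fields to the left and right of the free-form field.
    """
    # Peel the leading fields off from the left; the last item is the remainder.
    head = line.split(",", fields_before)
    remainder = head.pop()
    # Peel the trailing fields off from the right of the remainder.
    tail = remainder.rsplit(",", fields_after)
    return head + tail


# Hypothetical record: the third field contains a comma of its own.
record = "42,OK,temp high, fan slow,2003-09-07,7f"
fields = parse_record(record, 2, 2)
print(fields)  # ['42', 'OK', 'temp high, fan slow', '2003-09-07', '7f']
```

As noted later in the thread, this approach only works when at most one field can contain the separator.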
msg53997 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-09-10 17:08
Logged In: YES user_id=21627 I questioned the usefulness because I could not think of a meaningful application. Now I see what a potential application could be, but I doubt its generality, because that approach would break if there could be two fields that have commas in them. I also disagree that symmetry can motivate usefulness: I also doubt that all of the r* functions are useful, but they cannot be removed for backwards compatibility. The fact that rsplit would fit together with the other r* functions indicates that adding rsplit would provide symmetry, not that it would provide usefulness.
msg53998 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-10 19:15
Logged In: YES user_id=81797 os.path.basename/os.path.dirname is an example of where you could use rsplit. One of the other #python folks said he had recently wanted rsplit for an application where he was getting the domain name and user part from a list of e-mail addresses, but found that some entries contained an "@" in the user part. Sean
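Both examples mentioned above can be sketched with a single right-hand split. The address and path below are made-up sample values, and the path case is only a rough stand-in for os.path.dirname/os.path.basename, which handle edge cases (such as files directly under the root) differently.

```python
# Splitting once from the right keeps any extra "@" in the user part.
address = "odd@user@example.com"
user, domain = address.rsplit("@", 1)
print(user, domain)  # odd@user example.com

# Splitting a POSIX-style path once from the right separates directory and file name.
path = "/usr/local/lib/python"
dirname, basename = path.rsplit("/", 1)
print(dirname, basename)  # /usr/local/lib python
```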
msg53999 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-10 19:35
Logged In: YES user_id=80475 I would classify this more as a technique than a fundamental string operation implemented by all stringlike objects (including UserString). Accordingly, I recommend that the patch be closed and a recipe posted in the ASPN cookbook - something along the lines of:

    >>> def rsplit(s, sep=None, maxsplit=-1):
    ...     return [chunk[::-1] for chunk in s[::-1].split(sep, maxsplit)[::-1]]
    >>> rsplit(u"foo, bar, baz", None, 1)
    [u'foo, bar,', u'baz']
msg54000 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-10 20:40
Logged In: YES user_id=81797 I realize that rsplit() can be implemented, because, well, I implemented it. The standard library is there to provide ready-to-use functionality so that users of python can concentrate on their program instead of on re-inventing the wheel.

find() can be implemented with a short loop, split() can be implemented with find(), join() can be implemented with a short loop. Many things can be implemented with a little additional effort on the part of the user to develop or locate the code they're wanting. These little things can add up quickly and can have quite a dramatic impact on the programming experience in Python. Having to find or implement these functions will cause distraction from the code at hand, and time lost while finding, implementing, testing, and maintaining the code in question.

One of Python's strengths is a rich standard library. So, what are the guidelines for determining when it's rich enough? Why is it ok to suggest that users should get distracted from their code to go implement something else? Is there a policy that I'm not aware of that new functionality should be put in the cookbook instead of the standard library? Why is it being ignored that some programmers would find implementing rsplit() challenging?

I'm not trying to be difficult here, I honestly can't understand the apparent change from having a rich library to a "batteries not included" stance. The response I got from #python when I mentioned having submitted the patch indicates to me that other experienced Python developers expect there to be an rsplit(). So, why is there so much resistance to adding something to the library? What are the guidelines for determining if something should be in the library?

Sean
msg54001 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-09-11 06:07
Logged In: YES user_id=21627 There is PEP 2, which suggests writing a library PEP for proposals to extend the library. Now, this probably would be overkill for a single string method. However, I feel that there are already too many string methods, so I won't accept that patch. I'm not rejecting it, either, because I see that other maintainers might have a different opinion. In short, you should propose your change to python-dev, finding out what "a majority" of the maintainers thinks; you might also propose it on python-list, trying to collect reactions from users. It would then be good to summarize these discussions here (instead of carrying them out here).
msg54002 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-11 23:55
Logged In: YES user_id=80475 Guido, do you care to pronounce on this one?
msg54003 - (view) Author: Jeremy Fincher (jemfinch) Date: 2003-09-22 13:10
Logged In: YES user_id=99508 As a comment on the ease with which a programmer can get rsplit wrong, note that rhettinger's rsplit implementation is not correct: compare rsplit('foobarbaz', 'bar') with 'foobarbaz'.split('bar'). He forgot to reverse the separator if it's not None.
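The omission pointed out here can be illustrated with a sketch of the reversal recipe that also reverses the separator. The name rsplit_by_reversal is hypothetical, and this is neither the code from the attached patch nor the built-in implementation.

```python
def rsplit_by_reversal(s, sep=None, maxsplit=-1):
    """Right-to-left split built on str.split() by reversing the input.

    The separator is reversed as well, which the recipe quoted in the
    thread leaves out.
    """
    reversed_sep = sep[::-1] if sep is not None else None
    chunks = s[::-1].split(reversed_sep, maxsplit)
    # Restore the original order of the chunks and of the characters in each.
    return [chunk[::-1] for chunk in reversed(chunks)]


print(rsplit_by_reversal("foobarbaz", "bar"))         # ['foo', 'baz']
print(rsplit_by_reversal("foo, bar, baz", None, 1))   # ['foo, bar,', 'baz']

# Without reversing the separator, the reversed text contains no "bar",
# so nothing is split at all.
unreversed = [c[::-1] for c in "foobarbaz"[::-1].split("bar")[::-1]]
print(unreversed)  # ['foobarbaz']
```

Any separator that is not a palindrome triggers the difference, which is why a single-character separator such as "," hides the bug.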
msg54004 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-09-22 18:17
Logged In: YES user_id=80475 I'll review your patch when I get a chance.
msg54005 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-09-29 05:38
Logged In: YES user_id=81797 This seems to have generated nothing but positive comment from the folks on python-dev. Thoughts?
msg54006 - (view) Author: Bob Ippolito (bob.ippolito) * (Python committer) Date: 2003-11-24 22:45
Logged In: YES user_id=139309 I'd have to say me too on this one, wake up please :)
msg54007 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-11-25 02:25
Logged In: YES user_id=80475 Get Guido to approve the API change and I will be happy to complete the implementation, documentation, testing, etc.

Advice: He will want *compelling* non-toy use cases and reasons why a person wouldn't just implement it in pure python (saying that they aren't smart enough is not a reason). He is rarely ever persuaded by symmetry/completeness arguments along the lines of "we have an l-this so we have to have an r-that". If that were the case, tuples would have gotten index() and count() long ago. Language growth rate is one issue, the total number of string methods is another, and more importantly he seeks to minimize the number of small API incompatibilities between versions which make it difficult to write code for Py2.4 that runs on Py2.2.

Also, there are a number of strong competitors vying to be added as string methods. If we only get one new method, why is this one to be preferred over str.cook() for supporting Barry's simplified string substitutions? Given only one new str API change, I would personally prefer to add an optional fillchar argument to str.center().
msg54008 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-11-25 16:34
Logged In: YES user_id=81797 Raymond, you've asked Guido about it on September 11, and he (apparently) explicitly stayed out of the discussion. I assumed that you had let him know you wanted his judgement on this and that his response was that he didn't want to be involved, leaving it up to the library "elite guard" instead. Did you actually copy Guido on your earlier request? Personally, I don't see the logic in "if we get only one string method". Python isn't for the Python core developers, it's for the users. If the users have several things that they want added, why the artificial limit on how many to accept?
msg54009 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2003-11-25 19:16
Logged In: YES user_id=21627 It's very easy: find somebody with commit privileges to approve and commit the change. Failing to do so, write a library PEP, and ask for pronouncement.
msg54010 - (view) Author: Sean Reifschneider (jafo) * (Python committer) Date: 2003-11-25 19:35
Logged In: YES user_id=81797 If you are reading this and are interested in having this functionality in the standard Python library, please step forward and champion the effort. Obviously, I believe this is useful, or I wouldn't have spent the better part of a day building and testing it. However, I simply don't have the time to go through the politics of it. What needs to be done is to build a stronger case for presentation to the Python development team. See Raymond's message below for a good list of what's needed there. Also see the thread on the python developers mailing list that I started in relation to this back in September. I will be happy to help out on this, but I just don't have the time to champion the adoption process. Thanks, Sean
msg54011 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-11-25 20:47
Logged In: YES user_id=6380 OK, I'm in a generous mood today. I approve the idea. (I'm not going to review the code, that's up to Raymond and others). And Raymond can have a fillchar option to center() as well. I don't know what cook() was supposed to do, but if it's $ substitution, I recommend keeping that in a separate module for now.
msg54012 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-11-25 21:03
Logged In: YES user_id=80475 Okay, I've got it from here!
msg54013 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2003-12-01 14:18
Logged In: YES user_id=80475 Alex said he would take the patch from here.
msg54014 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2003-12-13 19:45
Logged In: YES user_id=55188 In my review, I found a few bugs.

    >>> u'x\x00y\x00z'.rsplit(u'\x00', 1)
    zsh: bus error (core dumped)  ./python
    >>> u'abcd'.rsplit(u'abcd')
    [u'abcd']
    >>> 'a,b,c,d'.rsplit(u',', 2)
    [u'a', u'b', u'c,d']

Also, the unit tests in Lib/test/test_strop.py should be moved to strings_test.py.

My revision of jafo's patch is available at http://people.freebsd.org/~perky/rsplit-perkyrev.diff
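For comparison, the three cases from this review can be checked against the str.rsplit() that ships with a current Python 3 interpreter. This is an assumption about the reader's environment: Python 3 has a single str type, so the u'' prefixes are dropped, and the results below are not taken from the patch itself.

```python
# A NUL character is an ordinary separator; one split from the right.
nul_case = "x\x00y\x00z".rsplit("\x00", 1)
print(nul_case)  # ['x\x00y', 'z']

# A separator equal to the whole string leaves an empty string on each side.
whole_case = "abcd".rsplit("abcd")
print(whole_case)  # ['', '']

# With maxsplit=2 the two rightmost commas are used, not the two leftmost.
limit_case = "a,b,c,d".rsplit(",", 2)
print(limit_case)  # ['a,b', 'c', 'd']
```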
msg54015 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-15 02:45
Logged In: YES user_id=6380 Perky, feel free to check it in!
msg54016 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2003-12-15 18:58
Logged In: YES user_id=55188 Yo. Just checked in. I don't have permission to close this entry. Can anybody help? :-)
msg54017 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2003-12-15 19:38
Logged In: YES user_id=6380 I could just close it for you, but I'll make it an exercise for you & Raymond to figure out how to set your developer perms properly. :-)
History
Date User Action Args
2022-04-10 16:11:03 admin set github: 39195
2003-09-07 00:52:55 jafo create