Issue 1388489: bug in rstrip & lstrip

quick detail:
Python 2.4.2 (#1, Dec 9 2005, 22:48:42)
[GCC 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> "net.tpl".rstrip('.tpl')
'ne'
>>> "foo.tpl".rstrip('.tpl')
'foo'

I ran the following code to test this:

26 - [jwhitlark@Snowflake]: ~/pythonBugTest
0> cat testForRStripBug.py
#! /usr/bin/python

for word in open('/opt/openoffice/share/dict/ooo/en_US.dic', 'r'):
    word = word.split('/')[0]
    testWord = (word + '.tpl').rstrip('.tpl')
    if word != testWord:
        print word, testWord

And came up with the attached file of incorrect matches. Out of 62075 words in en_US.dic, 6864 do not match. Here is the frequency count of the last letter of the original word, the only pattern I could discern so far:

0> ./freqCount.py < run1
{'p': 566, 'l': 2437, 't': 3861}

No other letters seem to be clipped. Why this should be so, I have no idea. I would guess that the error was in function do_xstrip in python/trunk/Objects/stringobject.c, but C is not my strong suit. I will be looking at it further when I have time, but if anyone knows how to fix this, please help.

Logged In: YES user_id=89016

This is not a bug. The documentation (http://docs.python.org/lib/string-methods.html) says: "The chars argument is a string specifying the set of characters to be removed". That is, "net.tpl".rstrip(".tpl") strips every ".", "t", "p" and "l" character from the right end of the string, not every occurrence of the character sequence ".tpl". That is also why only words ending in "t", "p" or "l" were clipped in the test above. This seems to be a frequent misunderstanding, so if you can suggest improvements to the docstring or the documentation, please do so.
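
To make the difference concrete, here is a minimal sketch. rstrip_chars is only a pure-Python model of the documented behaviour, not the actual C code in stringobject.c, and strip_suffix is a hypothetical helper (not part of the standard library) for what the report seems to have wanted: removing ".tpl" as a whole sequence.

```python
def rstrip_chars(s, chars):
    """Model of s.rstrip(chars): drop trailing characters
    for as long as each one belongs to the set `chars`."""
    end = len(s)
    while end > 0 and s[end - 1] in chars:
        end -= 1
    return s[:end]


def strip_suffix(s, suffix):
    """Hypothetical helper: remove `suffix` once, as a whole
    sequence, and only if `s` actually ends with it."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(rstrip_chars("net.tpl", ".tpl"))  # ne  (the trailing "t" of "net" is in the set)
print(strip_suffix("net.tpl", ".tpl"))  # net
```

For file names specifically, os.path.splitext may be a better fit. Much later Python versions (3.9 and up) also provide str.removesuffix, which does the same job as the helper above.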