ENH: support str translate for StringMethods by mortada · Pull Request #10052 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation31 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
as a part of #9111
note that this handles both py2 and py3 as the signature for str.translate
has changed from py2 to py3
I don't know if I find it a good idea to have different behaviour for python 2 vs 3. Why not just going with the python 3 signature?
the thing is it's more than the signature that changed from python 2 to 3. The meaning of table
in str.translate
also changed. If we only have the python 3 signature, users of python 2 would lose functionality with this method. In particular, users of python 2 would not be able to delete characters with StringMethods.translate
, because there's no way to specify that in the python 2 structure of table
.
Correct, but I think single internal func is better for API doc on the website to show both python 2 and 3 info. How about:
- Merge to single function to accept both
table
anddeletechar
. - Ignore
deletechar
in python 3 with warning, and describe it in docstring.
If the change between python 2 and 3 is that big, I think even more reason for us to choose one behaviour (and thus, python 3 behaviour is of course the most logical). Users of python 2 will eventually have to change their code for python 3, so if they start using the pandas string method, they can already adapt (and otherwise they can always use the apply approach). We should follow the behaviour of the standard string methods as much as possible, but here I would personally deviate.
I would just put a warning in the docstring that it follows python 3 behaviour.
Do you know if there is a reason the behaviour changed? Did the functionality goes elwewhere?
so first of all, don't get me wrong - I think the python 3 signature is better and I'm a big advocate for using python 3. It's just that sticking with the python 3 signature in our implementation means that python 2 users will lose some functionality. And I'm open to making that tradeoff.
@jorisvandenbossche just to be clear, no functionality of the standard str.translate()
has gone somewhere else from python 2 to python 3. The deletechars
kwarg gets dropped in python 3 because the python 3 table
format is flexible enough to let users specify deletions.
so, python 2 users won't really lose some functionality, but just have to adapt their code somewhat?
I also understand the reasons to go for python 2/3 dependent behaviour, but as this is a 'new' feature for pandas, I would lean towards making a choice for 'one way', as we have the choice to do that
@jorisvandenbossche if we make the signature StringMethods.translate(table)
for both python 2 and python 3, then users of python 2 will lose the ability to pass deletechars
. Users of python 2 can't specify deletions in the python 2 table
format.
Ah, sorry, I didn't really look in detail to the str.translate
method. I thought the table
was just a normal argument, instead of a special string you can generate with a function. OK, no problem then!
I think the suggestion of @sinhrks is good then, having one docstring that has information for both python 2 and 3 (as we don't have separate docs)
I haven't used str.translate
before, so I don't have a strong opinion here. I think we agree that we want to be forwards compatible with Python 3 as much as possible while retaining functionality. In this case, it seems like have two function signatures is a pretty reasonable solution -- there's not too much in the way of duplicated code and it is relatively straightforward to document. But @sinhrks's solution would also be fine by me.
Sounds good thank you all for the feedback! Will update shortly.
updated, please take a look
def str_translate(arr, table, deletechars=None): |
---|
""" |
Map all characters in the string through the given mapping table. |
Equivalent to standard ``str.translate``. Note that the optional argument |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls use ":meth:
str.translate" (and "`:func:`string.maketrans
" for another) like other docstring.
Can you update the docstring here as well?
yes definitely :) the only thing is that this would be on the same line as the PR for index(), rindex()
, but I'll update here anyway
@mortada this was passing before yes? and you just changed docs right?
yes this was passing before
@mortada ok, this needs a rebase now that I merged the other. pls ping when ready.
ok rebase is done, updated
jreback added a commit that referenced this pull request
ENH: support str translate for StringMethods
Labels
String extension data type and string data