ENH: support str translate for StringMethods by mortada · Pull Request #10052 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation31 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

mortada

as a part of #9111

note that this handles both py2 and py3 as the signature for str.translate has changed from py2 to py3

@sinhrks @jreback

@jorisvandenbossche

I don't know if I find it a good idea to have different behaviour for python 2 vs 3. Why not just going with the python 3 signature?

@mortada

the thing is it's more than the signature that changed from python 2 to 3. The meaning of table in str.translate also changed. If we only have the python 3 signature, users of python 2 would lose functionality with this method. In particular, users of python 2 would not be able to delete characters with StringMethods.translate, because there's no way to specify that in the python 2 structure of table.

@sinhrks

Correct, but I think single internal func is better for API doc on the website to show both python 2 and 3 info. How about:

@jorisvandenbossche

If the change between python 2 and 3 is that big, I think even more reason for us to choose one behaviour (and thus, python 3 behaviour is of course the most logical). Users of python 2 will eventually have to change their code for python 3, so if they start using the pandas string method, they can already adapt (and otherwise they can always use the apply approach). We should follow the behaviour of the standard string methods as much as possible, but here I would personally deviate.
I would just put a warning in the docstring that it follows python 3 behaviour.

Do you know if there is a reason the behaviour changed? Did the functionality goes elwewhere?

@mortada

so first of all, don't get me wrong - I think the python 3 signature is better and I'm a big advocate for using python 3. It's just that sticking with the python 3 signature in our implementation means that python 2 users will lose some functionality. And I'm open to making that tradeoff.

@jorisvandenbossche just to be clear, no functionality of the standard str.translate() has gone somewhere else from python 2 to python 3. The deletechars kwarg gets dropped in python 3 because the python 3 table format is flexible enough to let users specify deletions.

@jorisvandenbossche

so, python 2 users won't really lose some functionality, but just have to adapt their code somewhat?

I also understand the reasons to go for python 2/3 dependent behaviour, but as this is a 'new' feature for pandas, I would lean towards making a choice for 'one way', as we have the choice to do that

@jorisvandenbossche

@mortada

@jorisvandenbossche if we make the signature StringMethods.translate(table) for both python 2 and python 3, then users of python 2 will lose the ability to pass deletechars. Users of python 2 can't specify deletions in the python 2 table format.

@jorisvandenbossche

Ah, sorry, I didn't really look in detail to the str.translate method. I thought the table was just a normal argument, instead of a special string you can generate with a function. OK, no problem then!

@jorisvandenbossche

I think the suggestion of @sinhrks is good then, having one docstring that has information for both python 2 and 3 (as we don't have separate docs)

@shoyer

I haven't used str.translate before, so I don't have a strong opinion here. I think we agree that we want to be forwards compatible with Python 3 as much as possible while retaining functionality. In this case, it seems like have two function signatures is a pretty reasonable solution -- there's not too much in the way of duplicated code and it is relatively straightforward to document. But @sinhrks's solution would also be fine by me.

@jreback

@mortada

Sounds good thank you all for the feedback! Will update shortly.

@mortada

updated, please take a look

sinhrks

def str_translate(arr, table, deletechars=None):
"""
Map all characters in the string through the given mapping table.
Equivalent to standard ``str.translate``. Note that the optional argument

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls use ":meth:str.translate" (and "`:func:`string.maketrans" for another) like other docstring.

@jorisvandenbossche

Can you update the docstring here as well?

@mortada

@jorisvandenbossche

@mortada

yes definitely :) the only thing is that this would be on the same line as the PR for index(), rindex(), but I'll update here anyway

@jreback

@mortada this was passing before yes? and you just changed docs right?

@mortada

yes this was passing before

@jreback

@mortada ok, this needs a rebase now that I merged the other. pls ping when ready.

@mortada

@mortada

@mortada

ok rebase is done, updated

jreback added a commit that referenced this pull request

May 8, 2015

@jreback

ENH: support str translate for StringMethods

@jreback

@mortada

Labels

Strings

String extension data type and string data