ENH: Add StringMethods.partition and rpartition by sinhrks · Pull Request #9773 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation25 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

sinhrks

@jorisvandenbossche

What do you think of having return_type defaults to 'frame'? (I know in str.split the default is 'series', but that is for back compatibility)

@shoyer

+1 for defaulting to return a frame.

@sinhrks

OK, changed default to 'frame'.

@jreback

lgtm, but not sure when would use partition over split

@jorisvandenbossche

partition will always give three items back (only split on first or last occurence of separator) I think?

@sinhrks maybe add some examples to the docstring, and put a 'see also' to split explaining the difference. (+ a see also the the other (r)partition)

@sinhrks

Added docstrings (examples and see also). partition is more specific than split, may useful when want to retain separators.

jorisvandenbossche

pat : string, default whitespace
String to split on.
return_type : {'series', 'frame'}, default 'frame'
If frame, returns a DataFrame (elements are strings)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small detail, but can you put a point . at the end of this sentence (as in html, the line breaks are not preserved, so the following If series, ... comes directly behind this, so without punctuation it looks strange

@sinhrks

@jorisvandenbossche Thanks, fixed your points.

About return type, I'll prepare separate PR for existing funcs which is also incorrect.

@sinhrks

@jorisvandenbossche

Yes, there is no real good solution to have it correct for both (unless starting with templating, but then the purpose of having a visual docstring you can correctly see in the source code for the module level functions is a bit lost). So I would say: most important that user sees the correct info, so I would indeed go with the first one.

@jorisvandenbossche

@sinhrks or, maybe what we could also do is this: replace the array with Series, and leave the module level function docstrings as is.
Eg for the example of str_findall: have a .replace('matches : array', 'matches : Series') on the docstring

BTW: it is not that there is a different policy in the docstrings apparently. Eg the str_extract you linked to is really returning already a Series/DataFrame. So it is just that the str_ functions have different return types.

@sinhrks

Separate docstring issue to #9843 as we have to fix other methods, and also consider #9667.

@jorisvandenbossche

@sinhrks can you update this with expand=True/false instead of return_type? (see discussion in #9870)

@sinhrks

OK, fixed this. And posting the behavior of expand in #9870.

@jreback

looks fine. needs a rebase.

jorisvandenbossche

pat : string, default whitespace
String to split on.
expand : bool, default True
If True, return DataFrame/MultiIndex expanding dimensionality

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a . at the end of the sentence.

Or make it a list:

* If true ..
* If false ..

@sinhrks

@sinhrks

@jorisvandenbossche

jorisvandenbossche added a commit that referenced this pull request

May 7, 2015

@jorisvandenbossche

ENH: Add StringMethods.partition and rpartition