ENH: Add StringMethods.partition and rpartition by sinhrks · Pull Request #9773 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation25 Commits1 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
What do you think of having return_type
defaults to 'frame'? (I know in str.split
the default is 'series', but that is for back compatibility)
+1 for defaulting to return a frame.
OK, changed default to 'frame'.
lgtm, but not sure when would use partition
over split
partition
will always give three items back (only split on first or last occurence of separator) I think?
@sinhrks maybe add some examples to the docstring, and put a 'see also' to split explaining the difference. (+ a see also the the other (r)partition)
Added docstrings (examples and see also). partition
is more specific than split
, may useful when want to retain separators.
pat : string, default whitespace |
---|
String to split on. |
return_type : {'series', 'frame'}, default 'frame' |
If frame, returns a DataFrame (elements are strings) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small detail, but can you put a point .
at the end of this sentence (as in html, the line breaks are not preserved, so the following If series, ...
comes directly behind this, so without punctuation it looks strange
@jorisvandenbossche Thanks, fixed your points.
About return type, I'll prepare separate PR for existing funcs which is also incorrect.
Yes, there is no real good solution to have it correct for both (unless starting with templating, but then the purpose of having a visual docstring you can correctly see in the source code for the module level functions is a bit lost). So I would say: most important that user sees the correct info, so I would indeed go with the first one.
@sinhrks or, maybe what we could also do is this: replace the array with Series, and leave the module level function docstrings as is.
Eg for the example of str_findall
: have a .replace('matches : array', 'matches : Series')
on the docstring
BTW: it is not that there is a different policy in the docstrings apparently. Eg the str_extract
you linked to is really returning already a Series/DataFrame. So it is just that the str_
functions have different return types.
Separate docstring issue to #9843 as we have to fix other methods, and also consider #9667.
@sinhrks can you update this with expand=True/false
instead of return_type? (see discussion in #9870)
OK, fixed this. And posting the behavior of expand
in #9870.
looks fine. needs a rebase.
pat : string, default whitespace |
---|
String to split on. |
expand : bool, default True |
If True, return DataFrame/MultiIndex expanding dimensionality |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a .
at the end of the sentence.
Or make it a list:
* If true ..
* If false ..
jorisvandenbossche added a commit that referenced this pull request
ENH: Add StringMethods.partition and rpartition