API: Make most arguments for read_html and read_json keyword-ony by alexitkes · Pull Request #27573 · pandas-dev/pandas (original) (raw)
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
As mentioned in #27544 it is better to use keyword-only arguments for functions with too many arguments. A deprecation warning will be displayed if read_html get more than 2 positional arguments (io and match) or read_json get more than 1 positional argument (path_or_buf).
- tests added / passed
Three groups of tests are actually needed- Tests dof the
deprecate_nonkeyword_argumentsdecoratorr - Check whether the
read_jsonfunction emits aFutureWarningwhenever necessary and does not emit it whenever not. - Check whether the
read_htmlfunction emits aFutureWarningwhenever necessary and does not emit it whenever not.
- Tests dof the
- passes
black pandas - passes
git diff upstream/master -u -- "*.py" | flake8 --diff - whatsnew entry
alexitkes changed the title
WIP: Make most arguments for read_html and read_json keyword-ony Make most arguments for read_html and read_json keyword-ony
We need to deprecate the old behavior first. As written, this is an api breaking change.
And we shouldn’t do this until 1.0 or later.
We need to deprecate the old behavior first. As written, this is an api breaking change.
And we shouldn’t do this until 1.0 or later.
and how exactly would you deprecate ?
Should be able to use a decorator to inspect the calm, right? And if any positional or keyword args are passed in args then we warn?
On Jul 25, 2019, at 07:44, Jeff Reback ***@***.***> wrote: We need to deprecate the old behavior first. As written, this is an api breaking change. And we shouldn’t do this until 1.0 or later. and how exactly would you deprecate ? — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or mute the thread.
Thank you. I also wanted to propose somewhat like this.
def Deprecated(*, n_args: int):
def _deprecated(f):
def _f(*args, **kwargs):
if len(args) > n_args:
warnings.warn("%s will only accept %i positional arguments in version X.Y.Z" % (f.__name__,n_args))
return f(*args, **kwargs)
return _f
return _deprecated
Or do such a decorator already exist?
Hello @alexitkes! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻
Comment last updated at 2020-03-31 18:09:11 UTC
alexitkes changed the title
Make most arguments for read_html and read_json keyword-ony [WIPMake most arguments for read_html and read_json keyword-ony
alexitkes changed the title
[WIPMake most arguments for read_html and read_json keyword-ony [WIP] Make most arguments for read_html and read_json keyword-ony
alexitkes changed the title
[WIP] Make most arguments for read_html and read_json keyword-ony Make most arguments for read_html and read_json keyword-ony
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests to read_html / read_json to ensure that warnings are raised there?
Lifting #27573 (comment) to the top level
I'd like not to repeat the arguments here. That seems fragile. Is it possible to have the decorator inspect the signature and warn on any positional_or_keyword arg passed positionally?
What do you think about this @alexitkes?
@TomAugspurger Thank you for review. Yes, writing argument list twice will be quite annoying, but I will need some time to find a way to avoid it, and I would like to try. However, I think it would be better to preserve the allowed_args argument for the decorator and default it to list of required arguments if None given.
However, I think it would be better to preserve the allowed_args argument for the decorator and default it to list of required arguments if None given.
Why do you think we'll need that? That sounds more complicated than just detecting kwargs passed positionally. If we aren't going to need it then I'd rather not implement it.
@TomAugspurger I have just written such a decorator and it does not look too complicated. However it is my first experience with inspect module so possibly more tests are needed though some tests are written and they all run successfully.
It will not be difficult to remove allowed_args argument and always set it to list of required arguments, if you think eventually all arguments having the default values should be keyword-only arguments. Should I do it?
I'm not sure without looking more closely. My thought is that all keyword arguments being keyword-only is easier to explain to users.
On Tue, Jul 30, 2019 at 3:10 AM Alex Itkes ***@***.***> wrote:@TomAugspurger <https://github.com/TomAugspurger> I have just written such a decorator and it does not look too complicated. However it is my first experience with inspect module so possibly more tests are needed though some tests are written and they all run successfully. It will not be difficult to remove allowed_args argument and always set it to list of required arguments, if you think eventually all arguments having the default values should be keyword-only arguments. Should I do it? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#27573?email_source=notifications&email_token=AAKAOIWDW7YBWLIQNA6SISTQB7ZQNA5CNFSM4IGSQB62YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD3DFE6A#issuecomment-516313720>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAKAOIQDBCCUJD5VIZIZ5VLQB7ZQNANCNFSM4IGSQB6Q> .
I'm not sure without looking more closely. My thought is that all keyword
arguments being keyword-only is easier to explain to users.
I have almost forgotten one thing - the path_or_buf argument for read_json has the default value None. So if we decorate the function in that way all arguments for read_json will be keyword-only. But I think most users are used to writing read_json(filename) and will not like to write read_json(path_or_buf=filename). For this reason I still think there should be a way to pass the list of arguments allowed to be positional to the decorator. However, the default decorator behavior (optional arguments are keyword only) should be used wherever there is no really significant reason to change it. The read_html function is decorated in this way.
I have almost forgotten one thing - the path_or_buf argument for read_json has the default value None. So if we decorate the function in that way all arguments for read_json will be keyword-only.
Oh sorry. In that case we'll definitely need to have the ability to have some keyword-or-positional arguments.
Well, it seems to work fine though I am a bit worried because it's my first experience with inspect module. I also don't know the proper number of version in which it will become impossible to use positionl arguments (currently written 1.1). What should it be?
| """ |
|---|
| df = pd.DataFrame({"A": [2, 4, 6], "B": [3, 6, 9]}, index=[0, 1, 2]) |
| with tm.assert_produces_warning(FutureWarning): |
| assert_frame_equal(df, read_json(df.to_json(orient="split"), "split")) |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you minimize the amount of code in the context manager? I think just read_json(buf, "split").
Also, do we need to test all three of these? What differs between them?
| def wrapper(*args, **kwargs): |
|---|
| if isinstance(allow_args, int) and len(args) > allow_args: |
| msg = ( |
| "After version %s all arguments of %s " |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
version -> pandas version.
Use {}-style string templates.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, done it. I also added a separate function to form a better looking warning message depending on the number of arguments, but so far I only tested it manually. Is there a proper way to test how exactly does the produced warning look?
@WillAyd @jreback @jorisvandenbossche do you have thoughts on this (the overall goal, the implementation seems fine).
- Do we expect to expand this to more readers? Should these be done in the same release?
- What about other places in the library? Same release?
- What's a good general policy on whether an argument should be keyword only?
| def test_index(self): |
|---|
| df1 = self.read_html(self.spam_data, ".*Water.*", index_col=0) |
| df2 = self.read_html(self.spam_data, "Unit", index_col=0) |
| with tm.assert_produces_warning(FutureWarning): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment for non-duplicated tests; would rather not use the deprecated behavior
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, such amount of duplicated tests looks quite ugly, so I removed all tests with deprecated behavior except for one. Would it be enough?
Co-Authored-By: Joris Van den Bossche jorisvandenbossche@gmail.com
Co-Authored-By: Joris Van den Bossche jorisvandenbossche@gmail.com
Co-Authored-By: Joris Van den Bossche jorisvandenbossche@gmail.com
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a general policy for which arguments are keyword only?
lgtm. on anything with lots of args (pretty much read_*), happy to only have the first arg positional (path_or_buf).
Yeah
On Apr 6, 2020, at 18:25, Jeff Reback ***@***.***> wrote: looks good, @TomAugspurger ok here? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.
Labels
read_html, to_html, Styler.apply, Styler.applymap
read_json, to_json, json_normalize