DEPR: deprecate raise_on_error in .where/.mask in favor of errors= by jreback · Pull Request #17744 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation17 Commits1 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

jreback

jorisvandenbossche

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR itself looks good, added some minor comments.

But more in general, can you actually give a use case ? I mean a small example where you see the effect of the keyword?
I haven't been able to make one, and it also seems you didn't have to change any test (the one test you changed, there the keyword has no effect at all). Just wondering, probably I am missing something, but if not I would rather deprecate completely without adding alternative.

return the results
errors : str, {'raise', 'ignore'}, default 'ignore'
'raise' : pass the error to the higher level
'ignore' : evaluate the op with and return the results

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was like this before, but for me the phrase "evaluate the op with" is not really clear what it means

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah a bunch of copy-paste

if raise_on_error is not None:
warnings.warn(
"raise_on_error is deprecated in favor of errors='raise'",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 'raise' is only for setting it to True, not False, so I would mention both options

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right

assert_series_equal(result, expected)
result = s.where([True, False],
Timestamp('20130101', tz='US/Eastern'),
raise_on_error=True)
errors='ignore')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between both of the above cases ?

(also , the errors has no effect)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right the kw is pretty much ignored in .where (that's part of the problem); its used in other routines. This is pretty much a mess internally, though making progress.

@jreback

But more in general, can you actually give a use case ? I mean a small example where you see the effect of the keyword?
I haven't been able to make one, and it also seems you didn't have to change any test (the one test you changed, there the keyword has no effect at all). Just wondering, probably I am missing something, but if not I would rather deprecate completely without adding alternative.

this really doesn't change anything; just had to fix some actual calls that causes the deprecation warning. The point of this is make things consistent. In the future I actually do want to change the defaults I think (and make 'coerce') possible. This is just conforming things a bit.

@jorisvandenbossche

I know this PR is not changing things, it is just converting the previous situation to a new more consistent one. That is fine. But if it is really a useless keyword, it is also an opportunity to clean things up, and to not just translate the mistakes (?) of the previous keyword to the new one.

Maybe I am missing something, but so my question still is: is there a case errors='ignore' actually does something?
If it currently has no effect, we shouldn't add the new errors keyword, but just deprecate the old raise_on_errors. Or if adding the new one, have errors='raise' as the only option.

@jreback

Maybe I am missing something, but so my question still is: is there a case errors='ignore' actually does something?
If it currently has no effect, we shouldn't add the new errors keyword, but just deprecate the old raise_on_errors. Or if adding the new one, have errors='raise' as the only option.

We DO need this, it is just not fully working ATM; it was turned off a while back. We need the option to raise, or ignore where the dtype is actually changing (rather than just do it, or in some cases coerce), this is a very common choice which we do in lots of places.

I don't think its a big deal to add the kw. Sure I can prob only allow 'raise' for now. Though we DO need it in a couple of places.

@codecov

@jreback

TomAugspurger

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for doing this now, if it makes things easier in the future.

- ``raise`` : allow exceptions to be raised
- ``ignore`` : suppress exceptions. On error return original object
values :

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like an unfinished docstring here.

@jorisvandenbossche

I am fine with having the keyword, I would just update the actual docstring (of the user facing where and mask) to not be misleading (as the current explanation is not correct).

I didn't follow the earlier discussion (#16821), but just to clarify: what is the idea with errors keyword in the future? You say you would like to change this? But to what?
Because the linked PR actually changed it to not error anymore if a non-matching dtype value is inserted, but to convert to object.

@jreback

I didn't follow the earlier discussion (#16821), but just to clarify: what is the idea with errors keyword in the future? You say you would like to change this? But to what?
Because the linked PR actually changed it to not error anymore if a non-matching dtype value is inserted, but to convert to object.

well the problem is we are effectively always errors='coerce' now (even though we don't call it that), and never errors='raise```, eg. say you do a .where()(or lots of other ops like.fillna, .setitemetc.) on something which would change the dtype, e.g. assign aTimestamp`` to an integer array.

Now we just up-convert to object, but providing an option to coerce or raise is palable, and in fact you really want the default to be raise (and actually do it rather than up-converting), which is almost always an accident)

@jreback

@jreback

ghost pushed a commit to reef-technologies/pandas that referenced this pull request

Oct 16, 2017

@jreback

alanbato pushed a commit to alanbato/pandas that referenced this pull request

Nov 10, 2017

@jreback @alanbato

No-Stream pushed a commit to No-Stream/pandas that referenced this pull request

Nov 28, 2017

@jreback @No-Stream