DEPR: deprecate Index.holds_integer by topper-123 · Pull Request #50243 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation23 Commits14 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

topper-123

jbrockmendel

"""
Whether the type is an integer type.
"""
return lib.infer_dtype(value) in ["integer", "mixed-integer"]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@topper-123 relying on this at all for internal purposes is what i dislike. whether its a method vs function is totally irrelevant

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, can you explan what's the issue with this approach? This method was only used in the plotting module, so a very little used function? Should we replace it there with normal dtype checking, i.e. just drop the "mixed-integer" checks?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is "values-dependent behavior" which in general we are trying to move away from. it makes it much harder to reason about code. one of the reasons this is little-used is i have been trying to remove usages of it.

i expect most of the relevant cases could just be is_integer_dtype(self.dtype). there would likely be some straggler object-dtype-of-integer cases that would need some deprecation cycle

@topper-123

Ok, I understand. I'll look into it tomorrow. I've never worked on this part of the code base, hopefully it's not too complex.

topper-123

elif is_object_dtype(self):
return self.inferred_type not in ["integer", "mixed-integer"]
else:
return True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to just use return self.inferred_type not in ["integer", "mixed-integer"] here instead of the multiliner?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like all our tests pass if we change this whole thing to return self.dtype.kind in ["i", "u"]. I think the thing to do is drop this for now and deprecate it in 2.x and change it to this simpler check in 3.0.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow how could deprecate this, as this is an internal attribute, and I don't think this changes behavior (though I'm not 100 % sure there isn't an edge case).

Are you saying only use return self.inferred_type not in ["integer", "mixed-integer"] here and drop the if/elif/else?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though I'm not 100 % sure there isn't an edge case

Under my proposal, pd.Index([1, 2], dtype=object).holds_integer() would change in the future.

The only places where we use holds_integer are in _should_fallback_to_positional (which I expect to get rid of in 3.0) and plotting code. It would require some actual effort to determine if the suggested change could affect users via the plotting code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concretely in this PR, I git difficulty understanding what you want. Can you say what code you want changed, e.g. should the proposed deprecation of holds_integer be dropped? IMO it does seems like a good idea to deprecate the method...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this PR i would suggest

def _holds_integer(self) -> bool:
     return self.inferred_type not in ["integer", "mixed-integer"] 

def holds_integer(self) -> bool:
    warngs.warn("holds_integer is deprecated [...]", FutureWarning)
    return self._holds_integer()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nop problem, I've just updated the PR.

@topper-123

jbrockmendel

@@ -939,15 +939,15 @@ def __call__(self, *args, **kwargs):
f"{kind} requires either y column or 'subplots=True'"
)
if y is not None:
if is_integer(y) and not data.columns.holds_integer():
if is_integer(y) and data.columns._should_fallback_to_positional:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think these are exactly the same. _should_fallback_to_positional is overriden by Index subclasses

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, my instinct is that the old code easily give wrong result by giving an integer to an index that has _should_fallback_to_positional set to False and it should instead have been crystal clear if x/y are meant for .loc/.iloc instead of this in thiscode section. But this is difficult to prove and I'm not very familiar with the plotting code, so I've changed to use _holds_integer.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree the current code is sketchy. hopefully we'll figure out what the original intent was and can change it to something more like is_integer_dtype(data.columns.dtype)

@topper-123

The ARM failure looks unrelated.

@topper-123

I've changed this to use Index._holds_integer.

@topper-123

jbrockmendel

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Holding off on merging until the existing deprecations are all enforced for 2.0.

@topper-123

Terji Petersen added 12 commits

January 14, 2023 11:39

@topper-123

@topper-123

I've updated this PR after it was decided that it is ok now to merge deprecation PRs.

The code checks errors are unrelated to this PR.

@phofl

@topper-123

I’ve rerun the failed tests and everything is green now.

phofl

@phofl

Labels

Deprecate

Functionality to remove in pandas

Index

Related to the Index class or subclasses