Implement integer array add/sub for datetimelike indexes by jbrockmendel · Pull Request #19959 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation28 Commits16 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

jbrockmendel

AFAICT this is the next-to-last non-raising case for these operations. After this is just object-dtype, and for that we already have the correct implementation in addsub_offset_array, just need to make the condition less strict and add appropriate tests.

@jbrockmendel

@jbrockmendel

@codecov

@jbrockmendel

jreback

td = Timedelta(self.freq)
return op(self, td * other)
else:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for an else

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you address this comment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fell through the cracks; just pushed an update.

@@ -810,6 +851,11 @@ def __rsub__(self, other):
# we need to wrap in DatetimeIndex and flip the operation
from pandas import DatetimeIndex
return DatetimeIndex(other) - self
elif (is_datetime64_any_dtype(self) and hasattr(other, 'dtype') and
not is_datetime64_any_dtype(other)):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment on why this is

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed, but GitHub UI not surfacing.

@pytest.mark.parametrize('freq', ['H', 'D'])
@pytest.mark.parametrize('box', [np.array, pd.Index])
def test_dti_add_intarray_tick(self, box, freq):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we must have some tests that do index arithmetic with integers, either move them here if they are not redundant or remove them.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are tests for integers, but nothing ATM for integer-arrays.

# ---------------------------------------------------------------
# __add__/__sub__ with integer arrays
@pytest.mark.parametrize('box', [np.array, pd.Index])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above

@jbrockmendel

Comments above acknowledged.

Let's stick a pin in this for a little bit. I think there may be a problem with integer ops, need to double-check before moving forward on this.

@jbrockmendel

@jbrockmendel

Let's stick a pin in this for a little bit. I think there may be a problem with integer ops, need to double-check before moving forward on this.

I've convinced myself this is OK. Or more accurately, that weird behavior (not affected by this PR, but in integer add/sub) can only come up if a user specifically passes verify_integrity=False to DatetimeIndex or TimedeltaIndex.

@jbrockmendel

@jbrockmendel

@jbrockmendel

Needs rebasing since its gotten a bit stale, but I think this is otherwise ready. Will push shortly.

@jbrockmendel

@jbrockmendel

@jreback

@jbrockmendel

@jbrockmendel

@jbrockmendel

@jreback gentle ping. I've got some time this weekend I can put towards finishing this off.

@jbrockmendel

@jbrockmendel

@gfyoung any idea if everything/everyone is OK?

@gfyoung

@jbrockmendel : I see a bunch of comments from @jreback that you haven't explicitly addressed. That might be why no one is saying anything.

@jbrockmendel

@gfyoung Thanks. I think there's just the one where I haven't un-indented an else: clause as requested; might as well do that now. I think the others are addressed, though some are indirect.

@jbrockmendel

@gfyoung

I think the others are addressed, though some are indirect.

Did you address the commenting request above?

EDIT: I see the comment now! 😄

@gfyoung

@jreback : ping. I think all comments have been addressed.

@jbrockmendel

@jbrockmendel

@jbrockmendel

@jreback gentle ping. I'd like to finish this off and get back to unifying Index/Series/DataFrame arithmetic implementations in core.ops.

@jreback

will have to wait until after the rc

@jbrockmendel

@jbrockmendel

jreback

@@ -951,7 +951,7 @@ Datetimelike API Changes
rather than ``NotImplementedError`` when index is not a :class:`DatetimeIndex` (:issue:`20725`).
- Restricted ``DateOffset`` keyword arguments. Previously, ``DateOffset`` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior. Now, only valid arguments will be accepted. (:issue:`17176`, :issue:`18226`).
- :func:`pandas.merge` provides a more informative error message when trying to merge on timezone-aware and timezone-naive columns (:issue:`15800`)
- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with ``freq=None``, addition or subtraction of integer-dtyped array or ``Index`` will raise ``NullFrequencyError`` instead of ``TypeError`` (:issue:`19895`)
- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with ``freq=None``, addition or subtraction of integer-dtyped array or ``Index`` will raise ``NullFrequencyError`` instead of ``TypeError`` if the index ``freq`` attribute is ``None``, and will return an object of the same class otherwise (:issue:`19895`, :issue:`19959`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.24 (this is an api change right?)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, just changed.

@jbrockmendel

@jbrockmendel

@jbrockmendel

@jreback if we can get this through then DatetimeIndex + IntArray will be meaningful and we can flesh out the tests in #21160.

jreback

@jreback

thanks @jbrockmendel

it seems there might be some code you can remove from PeriodIndex though? (obviously in a followup)

@jbrockmendel

it seems there might be some code you can remove from PeriodIndex though?

Sounds plausible; I'll have to refresh my memory. IIRC the next dtype to handle is Period/PeriodIndex subtraction, with the complication that we need to fix the current scalar subtraction behavior.

Labels

Compat

pandas objects compatability with Numpy or Python functions

Datetime

Datetime data dtype

Period

Period data type

Timedelta

Timedelta data type