API: PeriodIndex subtraction to return object Index of DateOffsets by jbrockmendel · Pull Request #20049 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation25 Commits15 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

@jbrockmendel

@jbrockmendel

@jbrockmendel

jschendel

cls=type(self).__name__))
if not len(self) == len(other):
raise ValueError("cannot add indices of unequal length")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed it, but is there a test that hits this? And should it say "subtract" instead of "add"?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catches on both counts. Will fix.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed with fix to add-->subtract. test_pi_sub_pi, test_pi_sub_pi_with_nat, test_pi_sub_isub_pi each go through this method, though I don't think any go through this particular line

jschendel

- ``Categorical.fillna`` now validates its ``value`` and ``method`` keyword arguments. It now raises when both or none are specified, matching the behavior of :meth:`Series.fillna` (:issue:`19682`)
- ``pd.to_datetime('today')`` now returns a datetime, consistent with ``pd.Timestamp('today')``; previously ``pd.to_datetime('today')`` returned a ``.normalized()`` datetime (:issue:`19935`)
- :func:`Series.str.replace` now takes an optional `regex` keyword which, when set to ``False``, uses literal string replacement rather than regex replacement (:issue:`16808`)
- :class:`PeriodIndex` subtraction of another ``PeriodIndex`` will now return a numeric ``Index`` instead of raising a ``TypeErrorr`` (:issue:`20049`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra "r" at the end of "TypeErrorr"

@jreback

I think we should be returning an array (Index?) of offsets rather than the integers which are untyped.

we don't really support this so an object Index would be ok I think.

@jbrockmendel

I think we should be returning an array (Index?) of offsets rather than the integers which are untyped.

I more-or-less agree, but ATM Period - Period returns an integer and PeriodIndex - Period returns an integer array. These three operations should all be consistent.

@jreback

I more-or-less agree, but ATM Period - Period returns an integer and PeriodIndex - Period returns an integer array. These three operations should all be consistent.

and these should changed (both). yes its an API breaking change but much more intuitve / better. it will just have to be an object array but that's ok.

@jbrockmendel

@jbrockmendel

and these should changed (both). yes its an API breaking change but much more intuitve / better. it will just have to be an object array but that's ok.

Implementing this, the non-obvious issue that comes up is what to do with NaTs in a PeriodIndex when subtracting Period. Never mind, this is easy to handle.

@jbrockmendel

@jbrockmendel

@codecov

jreback

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good. rebase, small comments. @jschendel can you give a once-over

- :func:`DataFrame.to_dict` with ``orient='index'`` no longer casts int columns to float for a DataFrame with only int and float columns (:issue:`18580`)
- A user-defined-function that is passed to :func:`Series.rolling().aggregate() <pandas.core.window.Rolling.aggregate>`, :func:`DataFrame.rolling().aggregate() <pandas.core.window.Rolling.aggregate>`, or its expanding cousins, will now *always* be passed a ``Series``, rather than a ``np.array``; ``.apply()`` only has the ``raw`` keyword, see :ref:`here <whatsnew_0230.enhancements.window_raw>`. This is consistent with the signatures of ``.aggregate()`` across pandas (:issue:`20584`)
- Rolling and Expanding types raise ``NotImplementedError`` upon iteration (:issue:`11704`).
>>>>>>> cbec58eacd8e9cd94b7f42351b8de4559c250909

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover

with pytest.raises(TypeError):
p - idx
@pytest.mark.parametrize('op', [operator.add, ops.radd,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add (maybe another test) add/sub with a pi and another dtype (I don't think we have this test?)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking, you mean to add cases over in the period test_arithmetic file, not here, right?

rng = pd.period_range('1/1/2000', freq='D', periods=5)
other = pd.period_range('1/6/2000', freq='D', periods=5)
# previously performed setop union, now raises TypeError (GH14164)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a pretty old comment, can you improve

class TestPeriodIndexArithmetic(object):
# ---------------------------------------------------------------
# __add__/__sub__ with PeriodIndex

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you give a 1-2 line on what are allowed ops on PI

# TODO needs to wait on #13077 for decision on result type
def test_pi_sub_isub_pi(self):
# GH#20049
# previously raised TypeError (GH#14164), before that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you refresh comment

@pep8speaks

Hello @jbrockmendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on June 22, 2018 at 03:51 Hours UTC

@jbrockmendel

Removing test_pi_sub_pi because it is entirely redundant with test_pi_sub_isub_pi. Moving the latter to where it should be in the test file.

@jbrockmendel

@jorisvandenbossche jorisvandenbossche changed the titleimplement add/sub for period_dtype other API: PeriodIndex subtraction to return object Index of DateOffsets

Jun 12, 2018

jorisvandenbossche

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't first the Period scalar substraction be changed as well?

@jbrockmendel

@jbrockmendel

@jbrockmendel

jschendel

- For :class:`DatetimeIndex` and :class:`TimedeltaIndex` with non-``None`` ``freq`` attribute, addition or subtraction of integer-dtyped array or ``Index`` will return an object of the same class (:issue:`19959`)
- :class:`DateOffset` objects are now immutable. Attempting to alter one of these will now raise ``AttributeError`` (:issue:`21341`)
- :class:`PeriodIndex` subtraction of another ``PeriodIndex`` will now return an object-dtype :class:`Index` of :class:`DateOffset` objects instead of raising a ``TypeErrorr`` (:issue:`20049`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra "r" at the end of "TypeErrorr"

@jbrockmendel

jschendel

def test_pi_sub_pi_with_nat(self):
rng = pd.period_range('1/1/2000', freq='D', periods=5)
other = rng[1:].insert(0, pd.NaT)
assert other[1:].equals(rng[1:])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assert necessary? I don't immediately see why it's needed, but could be missing something. If it is needed, can assert_index_equal be used instead?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is checking that I haven't already screwed up, i.e. that the test below is indeed testing what I think it is testing.

@jbrockmendel

gentle ping (this should be tandem with #21314)

jreback

@jreback

thanks @jbrockmendel

also pls have a look at the period docs to see if we need updating anywhere.

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request

Oct 1, 2018

@jbrockmendel

Labels