Fix name setting in DTI/TDI add and sub by jbrockmendel · Pull Request #19744 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation22 Commits12 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

@@ -358,17 +358,13 @@ def _maybe_update_attributes(self, attrs):
def _add_delta(self, delta):
if isinstance(delta, (Tick, timedelta, np.timedelta64)):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prob add a doc-string to these also making the guarantee that the name should not be set

@@ -726,6 +736,11 @@ def __sub__(self, other):
else: # pragma: no cover
return NotImplemented

if result is not NotImplemented:
res_name = ops._get_series_op_result_name(self, other)
result = result.rename(name=res_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this copies again, just set the name

@@ -726,6 +736,11 @@ def __sub__(self, other):
else: # pragma: no cover
return NotImplemented

if result is not NotImplemented:
res_name = ops._get_series_op_result_name(self, other)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe rename this to

get_op_result_name


if isinstance(delta, (Tick, timedelta, np.timedelta64)):
new_values = self._add_delta_td(delta)
elif is_timedelta64_dtype(delta):
if not isinstance(delta, TimedeltaIndex):
delta = TimedeltaIndex(delta)
else:
# update name when delta is Index
name = com._maybe_match_name(self, delta)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move _maybe_match_name to ops and put next to get_op_result_name (and should remove it in favor of get_op_result_name, but that might be slightly tricky).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. While I'm at it I'm going to move get_op_result_name with the utility functions near the top of the file instead of in the arithmetic-specific spot it is now.

# GH#18824
dti = pd.date_range('2017-01-01', periods=2, tz=tz)
other = box([pd.offsets.MonthEnd(), pd.offsets.Day(n=2)])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you removing all of the boxing? do these not work?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

box is parametrized over [np.array, pd.Index], but the pd.Index case is being separated out into its own test that also checks names.

return name


def _maybe_match_name(a, b):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this as a standalone? (IOW can you replace the current usage with get_op_result_name)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not can you doc-string this

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the uses can be replaced without affecting any existing behavior. I'll see if I can get them all in a way that doesn't necessitate new tests. If not I'll kill it off in the next pass.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, all non-test usages can be replaced. I've done that and then also added a docstring. Getting rid of it and updating the tests to test get_op_result_name directly will wait for another round with narrower scope.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks good. next pass let's try to remove this (I think you did, just need to transfer the tests)

@@ -1814,7 +1814,7 @@ def combine_first(self, other):
this = self.reindex(new_index, copy=False)
other = other.reindex(new_index, copy=False)
# TODO: do we need name?
name = com._maybe_match_name(self, other) # noqa

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyidea why the noqa is here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd speculate that it's because name is defined and then for some reason not used, so flake8 should complain.

@@ -726,6 +736,11 @@ def __sub__(self, other):
else: # pragma: no cover
return NotImplemented

if result is not NotImplemented:
res_name = ops.get_op_result_name(self, other)
result.rename(name=res_name, inplace=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simply set .name (and same above)

return self + other[0]
else:
warnings.warn("Adding/subtracting array of DateOffsets to "
"{} not vectorized".format(type(self)),
PerformanceWarning)
return self.astype('O') + np.array(other)
# TODO: pass freq='infer' like we do in _sub_offset_array?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the purpose of these comments (esp here, after the return)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) to make sure it gets seen during review, b) as a note to myself or the next pass

Series([1], name='x'), Series(
[2], name='x'))
assert (matched == 'x')

matched = com._maybe_match_name(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next pass, move these to test_ops.py (new)

Once this goes through the remaining cases on todo list are: array[offsets] for PeriodIndex, array[int], and array[timedelta64], and last some object-dtype corner cases.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple items for next pass. merging.

return name


def _maybe_match_name(a, b):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks good. next pass let's try to remove this (I think you did, just need to transfer the tests)

@@ -1814,7 +1814,7 @@ def combine_first(self, other):
this = self.reindex(new_index, copy=False)
other = other.reindex(new_index, copy=False)
# TODO: do we need name?
name = com._maybe_match_name(self, other) # noqa
name = ops.get_op_result_name(self, other) # noqa

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clearly this is not used, so on next pass can you remove

harisbal pushed a commit to harisbal/pandas that referenced this pull request

Labels

pandas objects compatability with Numpy or Python functions

Datetime data dtype

Timedelta data type

2 participants