implement+test mean for datetimelike EA/Index/Series by jbrockmendel · Pull Request #24757 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation109 Commits44 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

jbrockmendel

This takes over from #23890, implementing+testing reductions one at a time.

@jbrockmendel

gfyoung

@gfyoung gfyoung added Algos

Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff

ExtensionArray

Extending pandas with custom dtypes or arrays.

and removed ExtensionArray

Extending pandas with custom dtypes or arrays.

labels

Jan 13, 2019

@jreback

we are still treating datetimes as non-numeric yes? meaning that by default they will be exlcuded from say .mean() and you have to pass numeric_only=False?

jreback

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you see if we have coverage of this on mixed dtype frames as well? (see my comment about numeric_only). Also need to assert that other ops automatically exclude (e.g. std).

@jbrockmendel

@jbrockmendel

@codecov

@codecov

@jbrockmendel

jreback

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need the tests from the OP (e.g. a mixed DataFrame)

@TomAugspurger

we are still treating datetimes as non-numeric yes? meaning that by default they will be exlcuded from say .mean() and you have to pass numeric_only=False?

I think that makes the most sense.

@TomAugspurger

@jbrockmendel

Does this close #24752?

I don't think so, will need to double-check.

@jbrockmendel

jreback

@@ -47,6 +47,27 @@ def test_ops_properties_basic(self):
assert s.day == 10
pytest.raises(AttributeError, lambda: s.weekday)
# TODO: figure out where in tests.reductions this belongs
def test_mean(self, tz_naive_fixture):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems very duplicative of the test below, is there a reason?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bc at the time when this was written Index reduction tests had not been collected in tests.reductions. Will move/de-duplicate.

@jbrockmendel

I guess these have to be uniform periods so mean is just another period, though I think this could easily be a non-sensical return values, e.g. what is the mean of pd.date_range('2012', freq='M', periods=4)

Eyeballing it at about 2012-03-14. Why would that be nonsensical?

@jbrockmendel

@jreback

it’s no longer a M freq Period. my point is the mean if a span is an odd thing

jorisvandenbossche

@jbrockmendel

@jbrockmendel

@jorisvandenbossche

@jbrockmendel instead of keeping rebasing this, can you actually answer to my request for changes? (unless it is just lack of time that you didn't make any changes yet, that's no problem)

@jreback

can you merge master and fix the conflicts.

@jorisvandenbossche this has been outstanding a while, any remaining issues?

@jbrockmendel

@jreback

jreback

@@ -1435,6 +1435,50 @@ def max(self, axis=None, skipna=True, *args, **kwargs):
# Don't have to worry about NA `result`, since no NA went in.
return self._box_func(result)
def mean(self, axis=None, skipna=True):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don'e we also need out=? meaning we should just accept kwargs like we do for Series.mean

Parameters
----------
axis : None
Dummy parameter to match NumPy signature

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would just add kwargs for this purpose; we don't add them explicity anywhere else

@jreback

can you merge master; this looks fine otherwise, though I think @jorisvandenbossche had a question.

@jbrockmendel

TomAugspurger

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good for now. There's still an outstanding discussion about accepting additional arguments so that methods like np.mean work correctly. These are

but I'm fine with leaving that as a followup when someone wants it.

@jorisvandenbossche

@jorisvandenbossche

Given the lack of response of @jbrockmendel, I added a commit that removes the axis keyword (single commit, so can be easily removed again if that is a problem).
Rationale: since we don't do any compat-handling of numpy keyword arguments, let's also not do it for axis. We can always add that later.

Fine with merging it like this.

jreback

@jreback

@jbrockmendel

Given the lack of response of @jbrockmendel,

My bad. Starting next month I'm working FT on pandas, so until then I'm focusing on statsmodels and spending time outdoors. Will rebase momentarily.

@jbrockmendel

@jbrockmendel

@jreback

I think this will pass if you merge master?

@jorisvandenbossche

until then I'm focusing on statsmodels and spending time outdoors.

Enjoy your time off!


Failures were actually real ones: pushed fixes for a problem with the axis (that I caused) and a docstring lint error

@jreback

@jbrockmendel

@jbrockmendel

@jreback

This was referenced

Apr 24, 2020

Labels

Algos

Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff

Datetime

Datetime data dtype

ExtensionArray

Extending pandas with custom dtypes or arrays.

Testing

pandas testing functions or related to the test suite