implement+test mean for datetimelike EA/Index/Series by jbrockmendel · Pull Request #24757 · pandas-dev/pandas (original) (raw)
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation109 Commits44 Checks0 Files changed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters
[ Show hidden characters]({{ revealButtonHref }})
This takes over from #23890, implementing+testing reductions one at a time.
- closes #xxxx
- tests added / passed
- passes
git diff upstream/master -u -- "*.py" | flake8 --diff
- whatsnew entry
Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff
Extending pandas with custom dtypes or arrays.
and removed ExtensionArray
Extending pandas with custom dtypes or arrays.
labels
we are still treating datetimes as non-numeric yes? meaning that by default they will be exlcuded from say .mean() and you have to pass numeric_only=False
?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you see if we have coverage of this on mixed dtype frames as well? (see my comment about numeric_only). Also need to assert that other ops automatically exclude (e.g. std).
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need the tests from the OP (e.g. a mixed DataFrame)
we are still treating datetimes as non-numeric yes? meaning that by default they will be exlcuded from say .mean() and you have to pass numeric_only=False?
I think that makes the most sense.
Does this close #24752?
I don't think so, will need to double-check.
@@ -47,6 +47,27 @@ def test_ops_properties_basic(self): |
---|
assert s.day == 10 |
pytest.raises(AttributeError, lambda: s.weekday) |
# TODO: figure out where in tests.reductions this belongs |
def test_mean(self, tz_naive_fixture): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems very duplicative of the test below, is there a reason?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bc at the time when this was written Index reduction tests had not been collected in tests.reductions. Will move/de-duplicate.
I guess these have to be uniform periods so mean is just another period, though I think this could easily be a non-sensical return values, e.g. what is the mean of
pd.date_range('2012', freq='M', periods=4)
Eyeballing it at about 2012-03-14. Why would that be nonsensical?
it’s no longer a M freq Period. my point is the mean if a span is an odd thing
@jbrockmendel instead of keeping rebasing this, can you actually answer to my request for changes? (unless it is just lack of time that you didn't make any changes yet, that's no problem)
can you merge master and fix the conflicts.
@jorisvandenbossche this has been outstanding a while, any remaining issues?
@@ -1435,6 +1435,50 @@ def max(self, axis=None, skipna=True, *args, **kwargs): |
---|
# Don't have to worry about NA `result`, since no NA went in. |
return self._box_func(result) |
def mean(self, axis=None, skipna=True): |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don'e we also need out=? meaning we should just accept kwargs like we do for Series.mean
Parameters |
---|
---------- |
axis : None |
Dummy parameter to match NumPy signature |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would just add kwargs for this purpose; we don't add them explicity anywhere else
can you merge master; this looks fine otherwise, though I think @jorisvandenbossche had a question.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems good for now. There's still an outstanding discussion about accepting additional arguments so that methods like np.mean
work correctly. These are
- axis
- dtype
- out
- keepdims (optional)
but I'm fine with leaving that as a followup when someone wants it.
Given the lack of response of @jbrockmendel, I added a commit that removes the axis
keyword (single commit, so can be easily removed again if that is a problem).
Rationale: since we don't do any compat-handling of numpy keyword arguments, let's also not do it for axis. We can always add that later.
Fine with merging it like this.
Given the lack of response of @jbrockmendel,
My bad. Starting next month I'm working FT on pandas, so until then I'm focusing on statsmodels and spending time outdoors. Will rebase momentarily.
I think this will pass if you merge master?
until then I'm focusing on statsmodels and spending time outdoors.
Enjoy your time off!
Failures were actually real ones: pushed fixes for a problem with the axis (that I caused) and a docstring lint error
This was referenced
Apr 24, 2020
Labels
Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff
Datetime data dtype
Extending pandas with custom dtypes or arrays.
pandas testing functions or related to the test suite