pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation109 Commits44 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

This takes over from #23890, implementing+testing reductions one at a time.

closes #xxxx
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

gfyoung added Algos

Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff

ExtensionArray

Extending pandas with custom dtypes or arrays.

and removed ExtensionArray

Extending pandas with custom dtypes or arrays.

labels

Jan 13, 2019

we are still treating datetimes as non-numeric yes? meaning that by default they will be exlcuded from say .mean() and you have to pass numeric_only=False?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you see if we have coverage of this on mixed dtype frames as well? (see my comment about numeric_only). Also need to assert that other ops automatically exclude (e.g. std).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need the tests from the OP (e.g. a mixed DataFrame)

we are still treating datetimes as non-numeric yes? meaning that by default they will be exlcuded from say .mean() and you have to pass numeric_only=False?

I think that makes the most sense.

Does this close #24752?

I don't think so, will need to double-check.

@@ -47,6 +47,27 @@ def test_ops_properties_basic(self):
assert s.day == 10
pytest.raises(AttributeError, lambda: s.weekday)

# TODO: figure out where in tests.reductions this belongs
def test_mean(self, tz_naive_fixture):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems very duplicative of the test below, is there a reason?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bc at the time when this was written Index reduction tests had not been collected in tests.reductions. Will move/de-duplicate.

I guess these have to be uniform periods so mean is just another period, though I think this could easily be a non-sensical return values, e.g. what is the mean of pd.date_range('2012', freq='M', periods=4)

Eyeballing it at about 2012-03-14. Why would that be nonsensical?

it’s no longer a M freq Period. my point is the mean if a span is an odd thing

@jbrockmendel instead of keeping rebasing this, can you actually answer to my request for changes? (unless it is just lack of time that you didn't make any changes yet, that's no problem)

can you merge master and fix the conflicts.

@jorisvandenbossche this has been outstanding a while, any remaining issues?

@@ -1435,6 +1435,50 @@ def max(self, axis=None, skipna=True, args, *kwargs):
# Don't have to worry about NA `result`, since no NA went in.
return self._box_func(result)

def mean(self, axis=None, skipna=True):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don'e we also need out=? meaning we should just accept kwargs like we do for Series.mean

Parameters
----------
axis : None
Dummy parameter to match NumPy signature

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would just add kwargs for this purpose; we don't add them explicity anywhere else

can you merge master; this looks fine otherwise, though I think @jorisvandenbossche had a question.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good for now. There's still an outstanding discussion about accepting additional arguments so that methods like np.mean work correctly. These are

axis
dtype
out
keepdims (optional)

but I'm fine with leaving that as a followup when someone wants it.

Given the lack of response of @jbrockmendel, I added a commit that removes the axis keyword (single commit, so can be easily removed again if that is a problem).
Rationale: since we don't do any compat-handling of numpy keyword arguments, let's also not do it for axis. We can always add that later.

Fine with merging it like this.

Given the lack of response of @jbrockmendel,

My bad. Starting next month I'm working FT on pandas, so until then I'm focusing on statsmodels and spending time outdoors. Will rebase momentarily.

I think this will pass if you merge master?

until then I'm focusing on statsmodels and spending time outdoors.

Enjoy your time off!

Failures were actually real ones: pushed fixes for a problem with the axis (that I caused) and a docstring lint error

This was referenced

Apr 24, 2020

Labels

Algos

Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff

Datetime

Datetime data dtype

ExtensionArray

Extending pandas with custom dtypes or arrays.

Testing

pandas testing functions or related to the test suite