DEPR/CLN: Remove pd.rolling_*, pd.expanding* and pd.ewm* by topper-123 · Pull Request #18723 · pandas-dev/pandas
Conversation
you are removing a lot of tests that should remain
@@ -216,7 +216,9 @@ Removal of prior version deprecations/changes
 - The ``pandas.io.wb`` and ``pandas.io.data`` stub modules have been removed (:issue:`13735`)
 - ``Categorical.from_array`` has been removed (:issue:`13854`)
 - The ``freq`` and ``how`` parameters have been removed from the ``rolling``/``expanding``/``ewm`` methods of DataFrame
-  and Series (deprecated since v0.18). Instead, resample before calling the methods. (:issue:`18601` & :issue:`18668`)
+  and Series (deprecated since v0.18). Instead, resample before calling the methods. (:issue:`18601`)
+- The top-level functions ``pd.rolling_*``, ``pd.expanding_*`` and ``pd.ewm*`` have been removed (deprecated since v0.18).
don’t remove any issue numbers
Hmm, this and some other stuff that’s unrelated to the pd.rolling stuff seems to be a rebase issue. I’ll fix it tomorrow.
If you want, you can limit the review to only the latest commit, the other commits are because of the rebase issue.
@@ -331,5 +333,5 @@ Categorical
 Other
 ^^^^^
-- Improved error message when attempting to use a Python keyword as an identifier in a ``numexpr`` backed query (:issue:`18221`)
+- Improved error message when attempting to use a Python keyword as an identifier in a numexpr query (:issue:`18221`)
don’t change other things
                          12.952, np.nan, np.nan])

        with catch_warnings(record=True):
            rs = mom.rolling_window(vals, 5, 'boxcar', center=True)
don’t remove tests just because we are catching the warning
you may need to rewrite them to use the new syntax
unless they duplicate current tests
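For illustration, a rewrite of the snippet above to the new syntax might look roughly like this (a sketch only, not the code that ended up in the PR; `vals` stands in for the fixture array of the original test, and `win_type='boxcar'` needs scipy installed):

```python
import numpy as np
import pandas as pd

# Hypothetical rewrite of the removed mom.rolling_window(...) call using the
# method-based API.
vals = np.random.randn(10)

result = pd.Series(vals).rolling(window=5, win_type='boxcar', center=True).mean()

# With center=True, window=5 and the default min_periods (== window), the two
# positions at each end have incomplete windows and come back as NaN.
assert result.iloc[:2].isna().all()
assert result.iloc[-2:].isna().all()
```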
Codecov Report

Merging #18723 into master will increase coverage by 0.07%. The diff coverage is n/a.

| | master | #18723 | +/- |
|---|---|---|---|
| Coverage | 91.62% | 91.69% | +0.07% |
| Files | 150 | 148 | -2 |
| Lines | 48728 | 48542 | -186 |
| Hits | 44645 | 44513 | -132 |
| Misses | 4083 | 4029 | -54 |

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | 90.06% <ø> (+0.07%) | ⬆️ |
| #single | 41.71% <ø> (-0.03%) | ⬇️ |
The rebase issue has been solved now. I'll look at your comments now. If you've got additional comments, I'd be thankful and will look at them as well.
Hello @topper-123! Thanks for updating the PR.
Cheers ! There are no PEP8 issues in this Pull Request. 🍻
Comment last updated on February 01, 2018 at 06:53 Hours UTC
Hi @jreback. An issue is that dask apparently still uses `pd.rolling_*`. This means that `pandas/tests/test_downstream.py` currently fails. Should this PR be put on hold, or would notifying the dask guys about this removal be enough? This stuff has been deprecated since 0.18, so for quite a long time.
This PR doesn't touch too many files, so it would be ok to wait a bit, but given the simplification to the pandas API that this PR offers, it should IMO be included when 0.22 is released.
I've updated the PR, so I'd be grateful for a review, so it'll be ready for an eventual merge.
The dask versions are deprecated, just like the pandas ones. I believe the best solution would be to:
- Use this PR to update all the tests to the new syntax
- Update dask to remove the `dd.rolling`, etc. methods (@topper-123 could I convince you to work on that? :) It'd be nice to have you as a contributor if you haven't already.)
- Remove them from pandas once dask has a release
Depending on the timing of pandas 0.22, dask may not have a release out.
@TomAugspurger, the `test_dask` function doesn't explicitly call `pd.rolling_*`, so changes must be made in the dask code base, correct?
I've never used dask, and I have a feeling I'd have to learn some intricacies of that library to make such a PR, and I could easily go down a rabbit hole that I never intended to enter... Could someone familiar with dask take it upon themselves to do those changes? I can sit on this PR until dask is updated, and rebase when this is ready to merge.
TomAugspurger added a commit to TomAugspurger/dask that referenced this pull request
@topper-123 dask was importing them at the top level of `dask.dataframe`, which caused the failure. Next time dask wants to mirror a pandas deprecation, we'll want to put it in a `try` / `except` block.
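Something along these lines (a sketch, not the actual dask code) would keep the downstream import working once pandas drops the function:

```python
# Defensively re-export a pandas function that is slated for removal, so the
# downstream package still imports cleanly after pandas finally drops it.
try:
    from pandas import rolling_mean  # removed from the pandas top level by this PR
except ImportError:
    rolling_mean = None  # gone; callers should switch to Series.rolling(...).mean()
```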
@TomAugspurger, thanks for picking up on this. I'll wait a while with my PR until dask is updated.
My PR lowers the number of top-level functions in pandas by some 30, so IMO it would be very beneficial to pandas' users to have this merged by 0.22, as it would make pandas look much more focused. Could dask have a release before pandas 0.22, possibly just a minor release?
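For context, the removed top-level functions map directly onto the method-based API, roughly like this (an illustrative sketch, not taken from the PR):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(100))

# Old (removed) top-level calls and their method-based equivalents:
#   pd.rolling_mean(s, 10)  ->  s.rolling(10).mean()
#   pd.expanding_sum(s)     ->  s.expanding().sum()
#   pd.ewma(s, span=20)     ->  s.ewm(span=20).mean()
rolled = s.rolling(10).mean()
expanded = s.expanding().sum()
smoothed = s.ewm(span=20).mean()
```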
jcrist pushed a commit to dask/dask that referenced this pull request
dask 0.16.1 has now been released, and all tests now pass locally with dask 0.16.1. The Travis failure has to do with IO/S3, which I didn't touch; that's the only failure on Travis.
So I think everything is ok with my PR. Is there anything I can do to get past that Travis error, or is the failure unrelated to my PR?
Question about one of the tests
        assert result2.dtype == np.float_

-    def _check_ew_structures(self, func, name):
+    def _check_ew(self, name=None):
Previously this function consisted of `_check_ew_ndarray` and `_check_ew_structures` subfunctions to test both interfaces. You removed `_check_ew_ndarray` and only kept the content of `_check_ew_structures`, as this is testing the new interface.
But, as far as I can see, the actual testing was done in `_check_ew_ndarray` (`_check_ew_structures` is only testing the output class type). Shouldn't this testing be kept (changed to the new interface)?
(It might be that I am missing something, or that it is covered in other tests.)
I think you are right: so rather than testing this function call, test the `ewm` method call on a Series. I'll look into it tomorrow.
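As a rough illustration, such a method-based check could look something like this (a sketch that only asserts return type, shape and the first value; it is not the code that ended up in the PR):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(50, dtype=float))

result = s.ewm(com=10).mean()

# The method should hand back a Series of the same shape, and with the default
# adjust=True the first value of the EW mean equals the first observation.
assert isinstance(result, pd.Series)
assert result.shape == s.shape
assert result.iloc[0] == s.iloc[0]
```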
@jorisvandenbossche, I've now replicated the code from `_check_ew_ndarray` into `_check_ew`. Could you review it?
@topper-123 that specific one looks good now I think.
Did you double check the other tests to see there were not similar problems?
I've added `test_expanding_func` (parametrized, was called `test_parametrized_max` previously) and `test_expanding_apply`. Everything is now included AFAICT.
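For readers following along, a parametrized expanding test of that general shape might look like the sketch below (the parameter list and names are illustrative, not the PR's actual code):

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.parametrize("method, np_func", [
    ("sum", np.sum),
    ("mean", np.mean),
    ("max", np.max),
])
def test_expanding_func(method, np_func):
    s = pd.Series(np.random.randn(30))
    result = getattr(s.expanding(), method)()
    # each entry of an expanding aggregation reduces over everything seen so far
    expected = pd.Series([np_func(s.values[:i + 1]) for i in range(len(s))])
    pd.testing.assert_series_equal(result, expected)
```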
The tests have been running for 9 hours now and the Travis-CI tests have not yet finished. The same thing happened yesterday, where after 20 hours one Travis test suite failed without console output (my guess is it timed out). Is anything wrong with the Travis suite?
Everything passes locally, and the errors that Travis reported two days ago have been fixed (but the fixes aren't confirmed by Travis, because I can't get Travis to run through them...).
What should I do? Could a core dev run the tests locally, confirm they're ok, and merge?
Ok, thanks. I'll wait a bit more then.
sorry @topper-123, one more question
(it's a very complex file / diff (not your fault!) to review)
        result = get_result(arr, 50, min_periods=30)
        tm.assert_almost_equal(result[-1], static_comp(arr[10:-10]))

        # min_periods is working correctly
Do we need to move parts of `_check_ndarray`, like this part on checking `min_periods`, to `_check_moments_func`?
From a superficial look, it seems that `_check_ndarray` did more than `_check_structures` (e.g. in `_check_structures` I don't see any assertion of NaNs when `min_periods` is specified, like here).
I also don't know if `test_stable` and `test_window` make sense for the new interface.
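For the `min_periods` point, the kind of assertion being discussed would look something like this under the new interface (illustrative sketch only):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(50))

result = s.rolling(window=20, min_periods=15).mean()

# The first 14 positions have fewer than min_periods observations and must be
# NaN; from the 15th observation onwards a value is always produced.
assert result.iloc[:14].isna().all()
assert result.iloc[14:].notna().all()
```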
        r = obj.rolling(window=window, min_periods=min_periods,
                        center=center)
        return getattr(r, name)(**kwargs)

        self._check_series(name, static_comp=static_comp,
I am not sure of the point of adding `_check_series`; it seems that you are already checking the Series & frame results (line 1344), yes?
if there IS a difference, IOW you are checking something different, then certainly don't make a new function, just in-line it here.
tp added 2 commits
Hi @jreback. I uploaded a new PR 12 days ago.
There were a few issues that were checked in `_check_ndarray` but were not in `_check_moments` previously:
- tests for whether NaNs were excluded correctly
- tests for whether `center` worked correctly
- tests for what happens when the window is larger than the series

All of the above are inlined in `_check_moments` now (see 37abd25).
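As an illustration of the last point (a sketch, not the PR's code): when the window is larger than the series, a rolling computation should coincide with an expanding one that uses the same `min_periods`.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(10))

# Window larger than the series: every window simply covers all observations
# seen so far, so the rolling result should match the expanding one.
result = s.rolling(window=25, min_periods=5).mean()
expected = s.expanding(min_periods=5).mean()

pd.testing.assert_series_equal(result, expected)
```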
I've rebased a moment ago, without making other code changes, so everything should turn up green & rebased when the tests are finished.
looks fine. just that 1 question.
@@ -863,72 +848,55 @@ def test_centered_axis_validation(self):
             .rolling(window=3, center=True, axis=2).mean())

    def test_rolling_sum(self):
        self._check_moment_func(mom.rolling_sum, np.nansum, name='sum',
hmm, shouldn't this be nansum?
Yes. fixed.
harisbal pushed a commit to harisbal/pandas that referenced this pull request
remove pd.rolling_*, pd.expanding_* and pd.ewm* and related code
added test_expanding_func and test_expanding_apply
recreate _check_ndarray inline in _check_moment_func