PERF: Use Indexers to implement groupby rolling by mroeschke · Pull Request #34052 · pandas-dev/pandas (original) (raw)

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation24 Commits40 Checks0 Files changed

Conversation

This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.Learn more about bidirectional Unicode characters

[ Show hidden characters]({{ revealButtonHref }})

tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff

Currently, grouby.rolling is implemented essentially as groupby.apply(lambda x: x.rolling()) which can be potentially slow.

This PR implements groupby.rolling by calculating bounds with a GroupbyRollingIndxer and using the rolling aggregations in cython to compute the results.

Matt Roeschke added 14 commits

May 6, 2020 19:40

Here are preliminary benchmarks. The performance so far is fairly similar. I suspect that the fact that I have to reconstruct the resulting index is killing the performance

With this benchmark:

In [1]: n = 1000
   ...: df = pd.DataFrame({"A": [str(i) for i in range(n)] * 10, "B": list(range(n)) * 10})
   ...: g = df.groupby("A").rolling(window=2)
   ...: %timeit g.sum()
1.5 s ± 5.77 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) <-- PR
1.52 s ± 49 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) <-- master

try with a longer window as well
also profile this should be way faster

I was accidentally still dispatching to the old implementation in this PR, here are the performance results with that removed

-- PR
In [2]: %timeit g.sum()
61.2 ms ± 4.61 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit g.sum()
71.9 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit g.sum()
57.9 ms ± 714 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


-- master
In [2]: %timeit g.sum()
1.42 s ± 9.08 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit g.sum()
1.44 s ± 26.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit g.sum()
1.48 s ± 73.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you enhance the benchamarks for this (if we don't have?)

also if you can add a whatsnew note.

Matt Roeschke added 3 commits

May 13, 2020 21:15

Hello @mroeschke! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-05-21 04:47:41 UTC

Matt Roeschke added 3 commits

May 13, 2020 23:14

df = pd.DataFrame(
{"A": [str(i) for i in range(N)] * 10, "B": list(range(N)) * 10}
)
self.groupby_roll = df.groupby("A").rolling(window=2)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a timebased one as well

@@ -611,7 +611,7 @@ Performance improvements
and :meth:`~pandas.core.groupby.groupby.Groupby.last` (:issue:`34178`)
- Performance improvement in :func:`factorize` for nullable (integer and boolean) dtypes (:issue:`33064`).
- Performance improvement in reductions (sum, prod, min, max) for nullable (integer and boolean) dtypes (:issue:`30982`, :issue:`33261`, :issue:`33442`).

- Performance improvement in ``groupby(..).rolling(..)`` (:issue:`34052`)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a way of hitting the api here?

np.concatenate(list(self._groupby.grouper.indices.values()))
)

# filter out the on from the object

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe better to call super()._create_blocks(obj) (e.g. add obj as an optional arg that defaults to self._selected_obj)

Matt Roeschke added 7 commits

May 18, 2020 20:34

Final benchmarks:

# rolling over 1000 groups
In [1]: In [1]: n = 1000
   ...:    ...: df = pd.DataFrame({"A": [str(i) for i in range(n)] * 10, "B": list(range(n)) * 10})
   ...:    ...: g = df.groupby("A").rolling(window=2)

-- master
In [2]: %timeit g.sum()
1.38 s ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit g.sum()
1.4 s ± 13.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit g.sum()
1.37 s ± 6.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

--- PR
In [2]: %timeit g.sum()
63.1 ms ± 396 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit g.sum()
70.1 ms ± 8.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit g.sum()
63.7 ms ± 471 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, some doc comment requests. ping on green.

center: Optional[bool] = None,
closed: Optional[str] = None,
) -> Tuple[np.ndarray, np.ndarray]:
start_arrays = []

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some comments here on what you are doing

@@ -147,12 +148,10 @@ def _validate_get_window_bounds_signature(window: BaseIndexer) -> None:
f"get_window_bounds"
)

def _create_blocks(self):
def _create_blocks(self, obj):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can type: Union[Series, DataFrame] (I think we have an annoation for that).

groupby_keys = [grouping.name for grouping in self._groupby.grouper._groupings]
result_index_names = groupby_keys + grouped_index_name

result_index_data = []

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can add some commments here on what you are doing

@property
def _constructor(self):
return Rolling

def _gotitem(self, key, ndim, subset=None):
def _create_blocks(self, obj):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can type

Matt Roeschke added 2 commits

May 20, 2020 21:46

Does this affect also "Expanding"?

No this currently doesn't apply to expanding. PR's to make it apply to expanding welcome!