API: Add pipe method to GroupBy objects · Issue #10353 · pandas-dev/pandas

Extend the new "pipe" protocol to GroupBy objects to allow for piping of a wider class of functions. Currently, one can only create pipes that chain together objects inheriting from NDFrame. But the concept of piping is general and could be extended to other pandas objects, specifically anything inheriting from GroupBy.

The use case is to write pipes that allow one to move freely back and forth between NDFrames and GroupBy objects. Example:

from pandas import DataFrame

df = DataFrame({'A': [...], 'B': [...]})

def f(dfgb):
    return dfgb['B'].value_counts()

def g(srs):
    return srs * 2

grouped = df.groupby('A')

grouped.pipe(f).pipe(g)

Note that these transformations are not all of the same type: f maps a GroupBy object to a Series, while g maps a Series to a Series.

There are a few ways to implement this. A simple way is to break out the core functionality of "pipe" into a pure function and then call that function from each class's pipe method. Another way is to treat piping as a mix-in trait: put pipe as a method on a base class, and mix that base class into any class that wants to be pipe-able. I have no strong preference between these options, and I'm open to other implementations that may be more in line with pandas' design goals or the long-term vision of the "pipe" concept.
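To make the two options concrete, here is a rough sketch in plain Python. It is not pandas code and not the strawman branch: `pipe_func`, `PipeMixin`, `ToyFrame`, `ToyGroupBy`, `count_rows`, and `double` are all hypothetical names made up for illustration, and the toy classes only stand in for NDFrame and GroupBy. The `(callable, keyword)` tuple form is included on the assumption that a GroupBy pipe should mirror what NDFrame's pipe already accepts.

```python
def pipe_func(obj, func, *args, **kwargs):
    """Option 1: the core of "pipe" as a pure function (hypothetical name)."""
    if isinstance(func, tuple):
        # (callable, keyword) form: pass obj under the given keyword
        # instead of as the first positional argument.
        func, target = func
        if target in kwargs:
            raise ValueError(f"{target!r} is both the pipe target and a keyword argument")
        kwargs[target] = obj
        return func(*args, **kwargs)
    return func(obj, *args, **kwargs)


class PipeMixin:
    """Option 2: pipe-ability as a mix-in that delegates to the pure function."""

    def pipe(self, func, *args, **kwargs):
        return pipe_func(self, func, *args, **kwargs)


class ToyFrame(PipeMixin):
    """Stand-in for an NDFrame: just a list of row dicts."""

    def __init__(self, rows):
        self.rows = list(rows)

    def groupby(self, key):
        return ToyGroupBy(self.rows, key)


class ToyGroupBy(PipeMixin):
    """Stand-in for a GroupBy: rows bucketed by the value of one key."""

    def __init__(self, rows, key):
        self.groups = {}
        for row in rows:
            self.groups.setdefault(row[key], []).append(row)


def count_rows(gb):
    # GroupBy -> Frame: one row per group with its size.
    return ToyFrame({"A": k, "n": len(v)} for k, v in gb.groups.items())


def double(frame, column):
    # Frame -> Frame: double one column.
    return ToyFrame({**row, column: row[column] * 2} for row in frame.rows)


frame = ToyFrame([{"A": "x", "B": 1}, {"A": "x", "B": 2}, {"A": "y", "B": 3}])
result = frame.groupby("A").pipe(count_rows).pipe(double, column="n")
```

In this sketch the mix-in is just a thin wrapper around the pure function, so the two options are not mutually exclusive; the real choice is whether each class defines its own pipe method or inherits one.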

A strawman implementation of the first suggestion can be found here:
master...ghl3:groupby-pipe

CC
@TomAugspurger
@shoyer