ENH/API: clarify groupby by to handle columns/index names · Issue #5677 · pandas-dev/pandas

Referenced briefly in the OP at #3275

In [11]: idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)])

In [12]: idx.names = ['outer', 'inner']

In [13]: df = pd.DataFrame({"A": np.arange(6), 'B': ['one', 'one', 'two', 'two', 'one', 'one']}, index=idx)

So the idea is to be able to call

df.groupby('B', level='inner')

instead of

In [15]: df.reset_index().groupby(['B', 'inner']).mean()
Out[15]: 
             A
B   inner     
one 1      0.0
    2      2.5
    3      5.0
two 1      3.0
    3      2.0

[5 rows x 1 columns]

Currently, df.groupby('B', level='inner') raises TypeError: 'numpy.ndarray' object is not callable. This is mostly just syntactic sugar, but I've been having to do a lot of this lately and all the reset_index calls are getting annoying. Thoughts?
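
For comparison, here is a minimal sketch of grouping by a column and an index level together without reset_index. It assumes a pandas release that provides pd.Grouper and, for the last variant, one that accepts index level names as by keys; neither is available in the session above. The names via_reset, via_grouper and via_names are only illustrative.

```python
import numpy as np
import pandas as pd

# Same frame as in the session above.
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("b", 3)],
    names=["outer", "inner"],
)
df = pd.DataFrame(
    {"A": np.arange(6), "B": ["one", "one", "two", "two", "one", "one"]},
    index=idx,
)

# Workaround from the issue: move the index levels into columns first.
# Selecting [["A"]] keeps the string column "outer" out of the mean.
via_reset = df.reset_index().groupby(["B", "inner"])[["A"]].mean()

# Assumption: pd.Grouper is available. It lets a column name and an
# index level be combined in a single list of keys.
via_grouper = df.groupby(["B", pd.Grouper(level="inner")])[["A"]].mean()

# Assumption: this pandas release resolves strings in `by` against
# index level names as well as column names.
via_names = df.groupby(["B", "inner"])[["A"]].mean()

print(via_grouper)
```

All three variants should produce the same grouped means as Out[15], so the two later forms are drop-in replacements for the reset_index chain wherever they are supported.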