ENH/API: clarify groupby by to handle columns/index names · Issue #5677 · pandas-dev/pandas
Referenced briefly in the OP at #3275
In [11]: idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)])
In [12]: idx.names = ['outer', 'inner']
In [13]: df = pd.DataFrame({"A": np.arange(6), 'B': ['one', 'one', 'two', 'two', 'one', 'one']}, index=idx)
So the idea is to be able to call
df.groupby('B', level='inner')
instead of
In [15]: df.reset_index().groupby(['B', 'inner']).mean()
Out[15]:
             A
B   inner
one 1      0.0
    2      2.5
    3      5.0
two 1      3.0
    3      2.0

[5 rows x 1 columns]
Currently this raises TypeError: 'numpy.ndarray' object is not callable. Mostly just syntactic sugar, but I've been having to do a lot of this lately and all the reset_index calls are getting annoying. Thoughts?
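
For comparison, here is a rough sketch of grouping on a column together with an index level without resetting the index. It assumes a pandas release newer than the one this issue was filed against, where pd.Grouper(level=...) can be mixed with column names and where strings passed to groupby may name index levels. The variable names (via_reset, via_grouper, via_names) are only illustrative, and the ['A'] selection is added so the string column 'outer' is not averaged.

```python
import numpy as np
import pandas as pd

# Same frame as in the session above.
idx = pd.MultiIndex.from_tuples(
    [('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)],
    names=['outer', 'inner'],
)
df = pd.DataFrame(
    {'A': np.arange(6), 'B': ['one', 'one', 'two', 'two', 'one', 'one']},
    index=idx,
)

# Workaround from this issue: turn the index levels into columns first.
via_reset = df.reset_index().groupby(['B', 'inner'])['A'].mean()

# Keep the index and refer to the level explicitly with pd.Grouper.
via_grouper = df.groupby(['B', pd.Grouper(level='inner')])['A'].mean()

# Assumption: newer pandas also resolves plain strings against
# index level names when no column of that name exists.
via_names = df.groupby(['B', 'inner'])['A'].mean()

print(via_grouper)
```

All three spellings should give the same grouped means as Out[15]; the last two avoid the reset_index round trip.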