API: DataFrameGroupBy column subset selection with single list? · Issue #23566 · pandas-dev/pandas
I wouldn't be surprised if there is already an issue about this, but I couldn't directly find one.
When selecting a subset of columns on a DataFrameGroupBy object, both a plain comma-separated list of names (so a tuple within the __getitem__ [] brackets) and double square brackets (a list inside the __getitem__ [] brackets) seem to work:
In [6]: df = pd.DataFrame(np.random.randint(10, size=(10, 4)), columns=['a', 'b', 'c', 'd'])
In [8]: df.groupby('a').sum()
Out[8]:
    b   c   d
a
0   0   5   7
3  18   6  12
4  16   6   9
6  10  11  11
9   3   3   0
In [9]: df.groupby('a')['b', 'c'].sum()
Out[9]:
    b   c
a
0   0   5
3  18   6
4  16   6
6  10  11
9   3   3
In [10]: df.groupby('a')[['b', 'c']].sum()
Out[10]:
    b   c
a
0   0   5
3  18   6
4  16   6
6  10  11
9   3   3
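To make the difference between the two spellings concrete, here is a minimal sketch of what Python hands to __getitem__ in each case. The Probe class is hypothetical and not part of pandas; it only echoes back the key it receives.

```python
class Probe:
    """Hypothetical helper (not part of pandas): echoes the key passed to []."""

    def __getitem__(self, key):
        return key


probe = Probe()

# Comma-separated names inside [] arrive as a single tuple.
tuple_key = probe['b', 'c']

# Double square brackets pass a list.
list_key = probe[['b', 'c']]

print(type(tuple_key).__name__, tuple_key)
print(type(list_key).__name__, list_key)
```

So any object that accepts both spellings has to decide for itself how to interpret a tuple key.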
Personally, I find df.groupby('a')['b', 'c'].sum() a bit strange and inconsistent with how DataFrame indexing works.
Of course, on a DataFrameGroupBy you don't have the possible confusion with indexing multiple dimensions (rows, columns), but still.
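For comparison, a small sketch of how a plain DataFrame treats the two forms, assuming flat (non-MultiIndex) columns like in the example above. The seeded generator is only there to make the sketch reproducible and is not part of the original session.

```python
import numpy as np
import pandas as pd

# Assumption: same shape and column names as the example above, seeded data.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(10, size=(10, 4)), columns=['a', 'b', 'c', 'd'])

# A list inside the brackets selects several columns on a plain DataFrame.
subset = df[['b', 'c']]

# A bare tuple is looked up as one column label, which does not exist here.
try:
    df['b', 'c']
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# The list form also works on the DataFrameGroupBy object.
grouped = df.groupby('a')[['b', 'c']].sum()

print(list(subset.columns), tuple_lookup_failed, list(grouped.columns))
```

The double-bracket form is therefore the one spelling that means the same thing on both DataFrame and DataFrameGroupBy.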