BUG: GroupBy.apply iterates over first group twice · Issue #12155 · pandas-dev/pandas

When GroupBy.apply() is called with a custom function, the function is invoked twice on the first group. Explicitly looping over the groups and calling the function on each one directly does not show this behavior.

Error reproduction

    >>> import pandas as pd
    >>> df = pd.DataFrame({'one':[1,2,3,1,2,3],'two':[4,5,6,7,8,9]})
    >>> df
       one  two
    0    1    4
    1    2    5
    2    3    6
    3    1    7
    4    2    8
    5    3    9

    >>> gp = df.groupby('one')
    >>> def foo(group):
    ...     print(group.iloc[[0]].index[0])
    ...
    >>> gp.apply(foo)
    0
    0
    1
    2
    Empty DataFrame
    Columns: []
    Index: []

    >>> len(gp)
    3

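The example prints the index label of the first row in each group. There are three groups, so three lines of output are expected (0, 1, 2), but the label of the first group, 0, is printed twice.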
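For comparison, the explicit loop mentioned at the top can be sketched as follows. This is only an illustration under stated assumptions: `first_index` is a hypothetical variant of `foo` that returns the label instead of printing it, so the calls can be collected and counted.

```python
import pandas as pd

df = pd.DataFrame({'one': [1, 2, 3, 1, 2, 3], 'two': [4, 5, 6, 7, 8, 9]})
gp = df.groupby('one')

def first_index(group):
    # Same lookup as foo in the reproduction, but returns the label
    # (converted to a plain int) instead of printing it.
    return int(group.iloc[[0]].index[0])

# Iterating over a GroupBy object yields (key, sub-DataFrame) pairs,
# so the function is called exactly once per group.
seen = [first_index(group) for _, group in gp]
print(seen)
```

Here `seen` holds one entry per group, so its length matches `len(gp)`, unlike the four lines printed by `gp.apply(foo)` in the reproduction.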

Versions:

    >>> pd.show_versions()

    INSTALLED VERSIONS

    commit: None
    python: 3.5.1.final.0
    python-bits: 64
    OS: Windows
    OS-release: 7
    machine: AMD64
    processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
    byteorder: little
    LC_ALL: None
    LANG: None

    pandas: 0.17.1
    nose: None
    pip: 8.0.1
    setuptools: 19.4
    Cython: None
    numpy: 1.10.1
    scipy: None
    statsmodels: None
    IPython: None
    sphinx: None
    patsy: None
    dateutil: 2.4.2
    pytz: 2015.7
    blosc: None
    bottleneck: None
    tables: None
    numexpr: None
    matplotlib: None
    openpyxl: None
    xlrd: None
    xlwt: None
    xlsxwriter: None
    lxml: None
    bs4: None
    html5lib: None
    httplib2: None
    apiclient: None
    sqlalchemy: None
    pymysql: None
    psycopg2: None
    Jinja2: None