BUG: iloc can create columns · Issue #6766 · pandas-dev/pandas

After concatenating two DataFrames with the same columns, I wanted to consolidate some data by replacing NaNs in some columns with values from other columns. I ended up with a DataFrame that unexpectedly had an additional column.

This is the minimal example I can give that reproduces the faulty behaviour on current master (70de129):

import numpy as np
import pandas as pd
df1 = pd.DataFrame([{'A':None, 'B':1},{'A':2, 'B':2}])
df2 = pd.DataFrame([{'A':3, 'B':3},{'A':4, 'B':4}])
df = pd.concat([df1, df2], axis=1)
>>> df1
    A  B
0 NaN  1
1   2  2

[2 rows x 2 columns]

>>> df2
   A  B
0  3  3
1  4  4

[2 rows x 2 columns]

>>> df
    A  B  A  B
0 NaN  1  3  3
1   2  2  4  4

[2 rows x 4 columns]
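
Because the concatenated frame carries each label twice, label-based selection is ambiguous here, which is why the rest of this report addresses columns by position. A small sketch of the difference (the variable names `labels_unique`, `by_label` and `by_position` are only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame([{'A': None, 'B': 1}, {'A': 2, 'B': 2}])
df2 = pd.DataFrame([{'A': 3, 'B': 3}, {'A': 4, 'B': 4}])
df = pd.concat([df1, df2], axis=1)

# The column labels are A, B, A, B, so they are not unique.
labels_unique = df.columns.is_unique

# Selecting by label returns every column named 'A' (a DataFrame).
by_label = df['A']

# Selecting by position returns exactly one column (a Series).
by_position = df.iloc[:, 0]
```

With duplicate labels, `df['A']` should give a two-column DataFrame, while `df.iloc[:, 0]` gives a single Series.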

Next, I replaced the NaNs in the column at position 0 with the corresponding values from the column at position 2 (both labelled 'A'). I expected this simply to write a 3 into the NaN cell, which it did. However, it also appended a new column labelled '0' to the end of the DataFrame, even though iloc is not supposed to enlarge the object. This is clearly a bug.

inds = np.isnan(df.iloc[:, 0])
df.iloc[:, 0][inds] = df.iloc[:, 2][inds]
>>> df
   A  B  A  B  0
0  3  1  3  3  3
1  2  2  4  4  2

[2 rows x 5 columns]
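
A possible workaround, sketched here as an assumption rather than a confirmed fix for the commit above, is to avoid the chained `df.iloc[:, 0][inds] = ...` form and perform the write as a single positional `iloc` assignment. The helper name `rows` is only illustrative:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([{'A': None, 'B': 1}, {'A': 2, 'B': 2}])
df2 = pd.DataFrame([{'A': 3, 'B': 3}, {'A': 4, 'B': 4}])
df = pd.concat([df1, df2], axis=1)

# Integer positions of the rows where the first column is missing.
rows = np.flatnonzero(df.iloc[:, 0].isna().to_numpy())

# One positional assignment: rows `rows` of column 0 receive the
# values from the same rows of column 2. Converting to a NumPy array
# keeps the duplicate 'A' labels out of the assignment.
df.iloc[rows, 0] = df.iloc[rows, 2].to_numpy()
```

With this form the frame should keep its original four columns, since the assignment addresses existing cells only and never goes through an intermediate Series.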