BUG: iloc can create columns · Issue #6766 · pandas-dev/pandas
After concatenating two DataFrames with the same columns, I wanted to consolidate some data by replacing NaNs in some columns with values from other columns. I ended up with a DataFrame that had unexpectedly gained an extra column.
This is the minimal example I can give to reproduce the faulty behaviour on current master (70de129):
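import numpy as np
import pandas as pd
df1 = pd.DataFrame([{'A': None, 'B': 1}, {'A': 2, 'B': 2}])
df2 = pd.DataFrame([{'A': 3, 'B': 3}, {'A': 4, 'B': 4}])
df = pd.concat([df1, df2], axis=1)  # side by side, so the column labels are duplicated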
>>> df1
     A  B
0  NaN  1
1    2  2
[2 rows x 2 columns]
>>> df2
   A  B
0  3  3
1  4  4
[2 rows x 2 columns]
>>> df
     A  B  A  B
0  NaN  1  3  3
1    2  2  4  4
[2 rows x 4 columns]
Next, I replace the NaNs in column 0 with the corresponding values from column 2 (the second 'A'). I expected this to simply write a 3 over the NaN, which it did, but it also appended a column labelled 0 to the end of the DataFrame, with the same values as the updated first column, even though iloc is not supposed to enlarge the object. Clearly a bug.
inds = np.isnan(df.iloc[:, 0])
df.iloc[:, 0][inds] = df.iloc[:, 2][inds]
>>> df
   A  B  A  B  0
0  3  1  3  3  3
1  2  2  4  4  2
[2 rows x 5 columns]
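For comparison, here is a minimal sketch of the same replacement written as a single positional assignment instead of the chained `df.iloc[:, 0][inds] = ...`. It assumes a reasonably recent pandas release and is only a possible workaround, not the fix for this issue; the name `mask` is mine, and I have not checked it against the commit above.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame([{'A': None, 'B': 1}, {'A': 2, 'B': 2}])
df2 = pd.DataFrame([{'A': 3, 'B': 3}, {'A': 4, 'B': 4}])
df = pd.concat([df1, df2], axis=1)

# Boolean row mask as a plain NumPy array, so no label alignment is involved.
mask = df.iloc[:, 0].isna().to_numpy()

# One iloc call selects both rows and column, so the write targets
# position 0 directly rather than an intermediate Series.
df.iloc[mask, 0] = df.iloc[mask, 2].to_numpy()

print(df)
```

With this form I would expect the frame to keep its four columns, with only the NaN in the first column overwritten.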