Replacing a column with iloc replaces another column with same name · Issue #22036 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas
a = pandas.DataFrame({'a': ['0'], 'b': ['str']})
print('---')
print(a)
a.iloc[:, 0] = [int(v) for v in a.iloc[:, 0]]
print('---')
print(a)
b = pandas.concat([a, pandas.DataFrame({'b': ['str2']})], axis=1)
print('---')
print(b)
b.iloc[:, 2] = ['str3']
print('---')
print(b)
Problem description
The issue seems to be that when a DataFrame has duplicate column names and mixed dtypes, replacing one column with new values using iloc also replaces another column with the same name.
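One way to check for this behaviour on a given pandas version is to compare every column by position before and after the assignment. The sketch below is only an illustration: `changed_positions` is a hypothetical helper, not part of pandas, and the integer column is created directly (an assumption) instead of through the iloc conversion used in the report, so it may not trigger the bug under exactly the same conditions.

```python
import pandas


def changed_positions(frame, position, values):
    """Assign values to one column position on a copy of frame and
    return the positions whose contents differ afterwards."""
    work = frame.copy()
    before = [work.iloc[:, i].tolist() for i in range(work.shape[1])]
    work.iloc[:, position] = values
    after = [work.iloc[:, i].tolist() for i in range(work.shape[1])]
    return [i for i, (old, new) in enumerate(zip(before, after)) if old != new]


# Assumption: the integer column is created directly here rather than
# through the iloc conversion used in the report.
a = pandas.DataFrame({'a': [0], 'b': ['str']})
b = pandas.concat([a, pandas.DataFrame({'b': ['str2']})], axis=1)

print(b.columns.is_unique)                 # False: the label 'b' occurs twice
print(changed_positions(b, 2, ['str3']))   # only position 2 should be listed
```

If the returned list contains any position other than the one assigned to, the installed version overwrites more than the targeted column.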
Expected Output
The final print should be:
And not:
Interestingly, if I change the concat line to the following (note that column b is renamed to c):
b = pandas.concat([a, pandas.DataFrame({'c': ['str2']})], axis=1)
Then the output is correctly:
Also, if the line a.iloc[:, 0] = [int(v) for v in a.iloc[:, 0]] is commented out, it works correctly as well.
Moreover, the following also works correctly (note that the column index 0 is changed to [0], and similarly 2 to [2]; the latter change is not strictly necessary to make it work):
import pandas
a = pandas.DataFrame({'a': ['0'], 'b': ['str']})
print('---')
print(a)
a.iloc[:, [0]] = [int(v) for v in a.iloc[:, 0]]
print('---')
print(a)
b = pandas.concat([a, pandas.DataFrame({'b': ['str2']})], axis=1)
print('---')
print(b)
b.iloc[:, [2]] = ['str3']
print('---')
print(b)
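Since renaming the duplicate column avoids the problem, a possible workaround is to make the labels unique before assigning by position. The following is a minimal sketch under that assumption: `make_unique` is a hypothetical helper, not a pandas function, and its suffix scheme could still collide with a label such as 'b.1' if one already exists.

```python
import pandas


def make_unique(labels):
    """Return labels with repeats suffixed as 'name.1', 'name.2', ..."""
    seen = {}
    unique = []
    for label in labels:
        count = seen.get(label, 0)
        unique.append(label if count == 0 else '{}.{}'.format(label, count))
        seen[label] = count + 1
    return unique


a = pandas.DataFrame({'a': ['0'], 'b': ['str']})
b = pandas.concat([a, pandas.DataFrame({'b': ['str2']})], axis=1)

# Rename the second 'b' so that every label is distinct.
b.columns = make_unique(list(b.columns))   # ['a', 'b', 'b.1']

# With unique labels, assign to the third column by position.
b.iloc[:, 2] = ['str3']
print(b)
```

This sidesteps the duplicate labels instead of fixing the underlying behaviour, so it only helps when renaming the columns is acceptable.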
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-46-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.3
pytest: None
pip: 18.0
setuptools: 40.0.0
Cython: None
numpy: 1.15.0
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None