BUG: Series.replace not working if there are duplicated columns

Pandas version checks

Reproducible Example

import pandas as pd

df = pd.DataFrame([['a', 'x', 'o'], ['b', 'y', 'p'], ['c', 'z', 'q']], columns=['F', 'F', 'G'])
df['G'].replace('p', 'NEW', inplace=True)
print(df)

Issue Description

When the DataFrame has duplicate column names, calling Series.replace with inplace=True on a column whose name is not duplicated does not work: the replacement is not applied to the DataFrame, and the following warning is raised:

/home/arnau/.virtualenvs/lr/lib/python3.10/site-packages/pandas/core/generic.py:6619: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return self._update_inplace(result)

Some examples:

# doesn't work
df = pd.DataFrame([['a', 'x', 'o'], ['b', 'y', 'p'], ['c', 'z', 'q']], columns=['F', 'F', 'G'])
df['G'].replace('p', 'NEW', inplace=True)
# works (because the result is assigned back instead of using inplace=True)
df = pd.DataFrame([['a', 'x', 'o'], ['b', 'y', 'p'], ['c', 'z', 'q']], columns=['F', 'F', 'G'])
df['G'] = df['G'].replace('p', 'NEW')
# works (because there are no duplicate column names)
df = pd.DataFrame([['a', 'x', 'o'], ['b', 'y', 'p'], ['c', 'z', 'q']], columns=['F', 'H', 'G'])
df['G'].replace('p', 'NEW', inplace=True)
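
Until this is fixed, the assignment form from the second example can be wrapped in a small helper. This is only a sketch: replace_in_column is a hypothetical name, not a pandas function, and it assumes the target column name occurs exactly once, as 'G' does here.

```python
import pandas as pd


def replace_in_column(df, column, to_replace, value):
    """Replace values in one column by assigning the result back.

    Hypothetical helper (not part of pandas). It avoids inplace=True on
    the Series returned by df[column].
    """
    # If the target name itself is duplicated, df[column] is a DataFrame,
    # so this sketch refuses to guess which column was meant.
    if list(df.columns).count(column) != 1:
        raise ValueError(f"column {column!r} must appear exactly once")
    df[column] = df[column].replace(to_replace, value)
    return df


df = pd.DataFrame(
    [['a', 'x', 'o'], ['b', 'y', 'p'], ['c', 'z', 'q']],
    columns=['F', 'F', 'G'],
)
replace_in_column(df, 'G', 'p', 'NEW')
print(df)
```

The duplicated 'F' columns are left untouched; only 'G' is reassigned.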

Expected Behavior

The replacement should be applied to the DataFrame, and no warning should be raised.
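
To compare the two cases on a given pandas version, a small check along these lines may help. It is a sketch, not a pandas test: check_inplace_replace is a hypothetical name, and the outcome is expected to differ between pandas versions, so no output is shown.

```python
import warnings

import pandas as pd


def check_inplace_replace(columns):
    """Run the in-place replace and report what happened.

    Hypothetical helper. Returns (replaced, warning_names), where
    `replaced` says whether df['G'] now contains 'NEW' and
    `warning_names` lists the categories of any warnings raised.
    """
    df = pd.DataFrame(
        [['a', 'x', 'o'], ['b', 'y', 'p'], ['c', 'z', 'q']],
        columns=columns,
    )
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning
        df['G'].replace('p', 'NEW', inplace=True)
    replaced = df['G'].tolist() == ['o', 'NEW', 'q']
    warning_names = [w.category.__name__ for w in caught]
    return replaced, warning_names


# Compare duplicate column names against unique column names.
for cols in (['F', 'F', 'G'], ['F', 'H', 'G']):
    print(cols, check_inplace_replace(cols))
```

The expected behavior described above corresponds to (True, []) for both sets of column names.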

Installed Versions

INSTALLED VERSIONS

commit : 66e3805
python : 3.10.0.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-43-generic
Version : #47~20.04.2-Ubuntu SMP Mon Dec 13 11:06:56 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.5
numpy : 1.21.2
pytz : 2021.1
dateutil : 2.8.2
pip : 21.3.1
setuptools : 44.1.1
Cython : 0.29.24
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.7.1
html5lib : None
pymysql : 0.9.3
psycopg2 : 2.9.1 (dt dec pq3 ext lo64)
jinja2 : 3.0.3
IPython : 7.28.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 6.0.1
pyxlsb : None
s3fs : None
scipy : 1.7.2
sqlalchemy : 1.2.17
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.55.0rc1