BUG: DataFrame.replace() is inconsistent between float64
and Float64
· Issue #55398 · pandas-dev/pandas (original) (raw)
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd pd.DataFrame({"a": 0.0, "b": 0.0}, index=[0]).astype( {"a": "Float64", "b": "float64"} ).replace(False, 1)
Issue Description
Depending on the column dtype, zero values are picked up when replacing False
values in the DataFrame, so the column a
ends up with 1 and column b
ends up with 0.
For the record, other forms of .replace()
seem to have the same issue:
pd.DataFrame({"a": 0.0, "b": 0.0}, index=[0]).astype(
{"a": "Float64", "b": "float64"}
).replace({"a": {False: 1}})
pd.DataFrame({"a": 0.0, "b": 0.0}, index=[0]).astype(
{"a": "Float64", "b": "float64"}
).replace({"a": {False: 1}, "b": {False: 1}})
There is also a possibly unrelated issue of infering the resulting dtype. The following will work, converting the column a
to object
dtype:
pd.DataFrame({"a": [0, 1, 2]}, dtype="float64").replace({1: "A"})
while this next one will end up with a TypeError: Invalid value 'A' for dtype Float64
pd.DataFrame({"a": [0, 1, 2]}, dtype="Float64").replace({1: "A"})
I am not sure if this behavior is an intended property of Float64
or not and whether it is a separate issue or not.
Expected Behavior
Results for both dtypes (in both columns) should be the same. I would expect it to not treat 0 as False
, but either way both replaced values should be the same.
Installed Versions
INSTALLED VERSIONS ------------------ commit : e86ed37python : 3.11.5.final.0 python-bits : 64 OS : Linux OS-release : 6.2.0-33-generic Version : #33~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 7 10:33:52 UTC 2 machine : x86_64 processor : x86_64 byteorder : little LC_ALL : None LANG : en_US.UTF-8 LOCALE : en_US.UTF-8
pandas : 2.1.1
numpy : 1.25.2
pytz : 2023.3
dateutil : 2.8.2
setuptools : 65.5.0
pip : 23.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.1.2
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.16.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.7.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 13.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None