pd.to_numeric(..., errors="coerce") failing silently when strings contain "uint64" · Issue #32394 · pandas-dev/pandas
Problem description
When coercing strings to numeric values with pd.to_numeric(..., errors="coerce"), the presence of the substring "uint64" (but apparently no other dtype-like substring) makes the coercion fail silently: the original string is returned instead of NaN.
import pandas as pd

strs = ["32", "64", "uint32", "float64", "sdnfonsdf uint32 knsdf", "sdnfonsdf uint64 knsdf", "uint64"]
print([pd.to_numeric(s, errors="coerce") for s in strs])
pd.to_numeric(pd.Series(["32", "64", "uint64"]), errors="coerce")
[32, 64, nan, nan, nan, 'sdnfonsdf uint64 knsdf', 'uint64']
0        32
1        64
2    uint64
dtype: object
Expected Output
[32, 64, nan, nan, nan, nan, nan]
0    32.0
1    64.0
2     NaN
dtype: float64
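In the meantime, a possible workaround is to coerce each element by hand. The following is only a minimal sketch: `coerce_to_float` is a hypothetical helper, not part of pandas, and it relies on Python's built-in `float()`, whose parsing rules are not identical to those of `to_numeric` (for example, it always returns floats rather than integers).

```python
import math


def coerce_to_float(values):
    """Hypothetical helper: convert each item to float, using NaN on failure."""
    out = []
    for v in values:
        try:
            out.append(float(v))
        except (TypeError, ValueError):
            # Anything float() cannot parse (including "uint64") becomes NaN.
            out.append(math.nan)
    return out


strs = ["32", "64", "uint32", "float64", "sdnfonsdf uint64 knsdf", "uint64"]
coerced = coerce_to_float(strs)
print(coerced)  # [32.0, 64.0, nan, nan, nan, nan]
```

The resulting list can then be wrapped in a pd.Series if a float64 Series is needed.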
The failure seems to be the same in 0.25.3 and 1.0...
Output of pd.show_versions()
INSTALLED VERSIONS
commit : None
pandas : 0.25.3
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200119
Cython : None
pytest : 5.3.4
hypothesis : None
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : None
pymysql : 0.9.3
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.12
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None