pd.to_numeric(..., errors="coerce") failing silently when strings contain "uint64" · Issue #32394 · pandas-dev/pandas

Problem description

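When coercing strings to numeric values with pd.to_numeric(..., errors="coerce"), any string containing the substring "uint64" is returned unchanged instead of being coerced to NaN, and no error or warning is raised. Other dtype-like substrings (e.g. "uint32", "float64") do not seem to trigger this.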

import pandas as pd

strs = ["32", "64", "uint32", "float64", "sdnfonsdf uint32 knsdf", "sdnfonsdf uint64 knsdf", "uint64"]
print([pd.to_numeric(s, errors="coerce") for s in strs])
pd.to_numeric(pd.Series(["32", "64", "uint64"]), errors="coerce")

[32, 64, nan, nan, nan, 'sdnfonsdf uint64 knsdf', 'uint64']

0        32
1        64
2    uint64
dtype: object

Expected Output

[32, 64, nan, nan, nan, nan, nan]

0    32.0
1    64.0
2     NaN
dtype: float64

Seems to fail equally in 0.25.3 and 1.0...
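Until this is fixed, one possible workaround is to do the coercion in plain Python. The sketch below is not how pandas does it internally; `coerce_to_float` is a hypothetical helper name, and it relies on the built-in `float()`, whose parsing rules are not identical to those of `to_numeric` (for example, it always yields floats rather than integers, and it accepts strings such as "inf").

```python
import math


def coerce_to_float(values):
    """Hypothetical helper: convert each value to float, NaN where parsing fails."""
    result = []
    for value in values:
        try:
            result.append(float(value))
        except (TypeError, ValueError):
            # Anything float() cannot parse, including "uint64", becomes NaN.
            result.append(math.nan)
    return result


strs = ["32", "64", "uint32", "float64", "sdnfonsdf uint64 knsdf", "uint64"]
coerced = coerce_to_float(strs)
print(coerced)  # [32.0, 64.0, nan, nan, nan, nan]
```

The resulting list can then be wrapped with pd.Series(coerced, dtype="float64") to get the float64 Series shown under Expected Output. This will likely be slower than to_numeric on large inputs, since it loops in Python.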

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None

pandas : 0.25.3
numpy : 1.17.3
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200119
Cython : None
pytest : 5.3.4
hypothesis : None
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.4.1
html5lib : None
pymysql : 0.9.3
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.4.1
matplotlib : 3.1.2
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.14.1
pytables : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.12
tables : None
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None