BUG: fix astype conversion string -> float by mlondschien · Pull Request #37974 · pandas-dev/pandas
In [1]: import pandas as pd
In [2]: ser = pd.Series([1, pd.NA, 2], dtype="string")
In [3]: ser.astype("int")
...
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NAType'
Hmm. As far as I understand, int (the non-extension-array dtype) does not support missing values. See:
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: pd.Series([1.0, np.nan], dtype="float").astype("int")
ValueError: Cannot convert non-finite values (NA or inf) to integer
However, using the nullable extension array dtype Int64, the conversion works:
In [4]: pd.Series(["1", pd.NA], dtype="string").astype("Int64")
Out[4]:
0       1
1    <NA>
dtype: Int64
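For reference, the three conversions above can be checked side by side with a small script. This is only a sketch: try_astype is a hypothetical helper written for this comment, not part of pandas, and the exact exception type for string -> int may differ between pandas versions, so the helper catches both TypeError and ValueError.

```python
import numpy as np
import pandas as pd


def try_astype(ser, dtype):
    """Hypothetical helper: return (result, None) on success, (None, exc) on failure."""
    try:
        return ser.astype(dtype), None
    except (TypeError, ValueError) as exc:
        return None, exc


string_ser = pd.Series(["1", pd.NA], dtype="string")
float_ser = pd.Series([1.0, np.nan], dtype="float")

# The NumPy-backed int dtype cannot hold a missing value, so both casts fail.
_, err_string_to_int = try_astype(string_ser, "int")
_, err_float_to_int = try_astype(float_ser, "int")

# The nullable Int64 extension dtype keeps the missing value as pd.NA.
nullable, err_nullable = try_astype(string_ser, "Int64")

print("string -> int:  ", type(err_string_to_int).__name__)
print("float  -> int:  ", type(err_float_to_int).__name__)
print("string -> Int64:", nullable.tolist())
```

Only the target dtype decides whether the missing value survives; the source dtype (string or float) makes no difference to that.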
Would you prefer that the above (string with pd.NA to int) raise a ValueError instead?
Edit: Were you possibly referring to object -> Int64, as in pd.Series(["1", "2"]).astype("Int64")?
Unrelated: Any idea what's up with the failing tests?