BUG: fix astype conversion string -> float by mlondschien · Pull Request #37974 · pandas-dev/pandas

In [1]: import pandas as pd
In [2]: ser = pd.Series([1, pd.NA, 2], dtype="string")
In [3]: ser.astype("int")
...
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NAType'

Hmm. As far as I understand, int (the non-extension array dtype) does not support missing values. See:

In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: pd.Series([1.0, np.nan], dtype="float").astype("int")
ValueError: Cannot convert non-finite values (NA or inf) to integer

However, using the nullable extension array dtype Int64, the conversion works:

In [4]: pd.Series(["1", pd.NA], dtype="string").astype("Int64")
Out[4]: 
0       1
1    <NA>
dtype: Int64
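
For comparison, here is a minimal sketch that runs the three conversions discussed here side by side. The helper `try_astype` is a hypothetical name introduced only for this illustration, not part of pandas. It catches both `TypeError` and `ValueError` because the exception raised for the non-nullable `int` target may differ between pandas versions and string storage backends.

```python
import numpy as np
import pandas as pd


def try_astype(ser, dtype):
    """Hypothetical helper: return (converted, None) on success,
    or (None, exception class name) if the conversion raises."""
    try:
        return ser.astype(dtype), None
    except (TypeError, ValueError) as exc:
        return None, type(exc).__name__


strings = pd.Series(["1", pd.NA], dtype="string")
floats = pd.Series([1.0, np.nan], dtype="float")

# Non-nullable "int" cannot hold a missing value, so the first two
# conversions are expected to raise; nullable "Int64" keeps <NA>.
for ser, dtype in [(strings, "int"), (floats, "int"), (strings, "Int64")]:
    converted, error = try_astype(ser, dtype)
    print(f"{ser.dtype} -> {dtype}: {error or converted.tolist()}")
```

Only the `Int64` target should succeed. Which exception class the first two cases report is exactly the question raised in this thread, so the sketch prints it rather than assuming one.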

Would you prefer that the case above (string with pd.NA to int) raise a ValueError instead?

Edit: Were you possibly referring to object -> Int64, as in pd.Series(["1", "2"]).astype("Int64")?

Unrelated: Any idea what's up with the failing tests?