BUG: conversion from Int64 to string introduces decimals in Pandas 2.2.0 · Issue #57418 · pandas-dev/pandas
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
pd.Series([12, pd.NA], dtype="Int64[pyarrow]").astype("string[pyarrow]")
yields
0    12.0
1    <NA>
dtype: string
Issue Description
Converting from Int64[pyarrow] to string[pyarrow] introduces float notation if NAs are present, i.e. 12 is converted to "12.0". This bug was introduced in Pandas 2.2.0 and is still present on the Pandas 2.2.x branch, but it is not present on the Pandas main branch.
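One plausible explanation for the symptom, which this report does not confirm, is that the integers pass through a floating-point representation (where a missing value can be stored as NaN) before being formatted as strings. The sketch below is plain Python and does not use pandas internals; the names `values`, `via_float` and `direct` are hypothetical and exist only for illustration.

```python
# Hypothetical illustration of the suspected mechanism, not pandas code.
values = [12, None]  # None stands in for a missing value (pd.NA)

# Path 1 (assumed buggy path): represent missing values as NaN, which
# forces every element to float, then format each element as a string.
as_float = [float("nan") if v is None else float(v) for v in values]
via_float = [None if f != f else str(f) for f in as_float]  # NaN != NaN

# Path 2 (expected behaviour): format each integer directly and keep
# missing values missing, so no float notation can appear.
direct = [None if v is None else str(v) for v in values]

print(via_float)  # ['12.0', None]
print(direct)     # ['12', None]
```

If this assumption holds, it would also be consistent with the bug only appearing when NAs are present, since a column without missing values has no need for a NaN placeholder.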
Expected Behavior
import pandas as pd
x = pd.Series([12, pd.NA], dtype="Int64[pyarrow]").astype("string[pyarrow]")
assert x[0] == "12"
Installed Versions
INSTALLED VERSIONS
commit : fd3f571
python : 3.11.7.final.0
python-bits : 64
OS : Darwin
OS-release : 23.2.0
Version : Darwin Kernel Version 23.2.0: Wed Nov 15 21:59:33 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8112
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 2.2.0
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.8.2
setuptools : 69.0.3
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 15.0.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None