BUG: Data types are not preserved while concatenating DataFrames with nullable integers · Issue #27692 · pandas-dev/pandas

Copy-pastable example

input data

import pandas as pd

t1 = pd.DataFrame(index=[0], data={'x':[1]}, dtype='UInt8')
t2 = pd.DataFrame(index=[1], data={'y':[1]}, dtype='UInt8')
t3 = pd.concat([t1,t2], join='outer', sort=False)

'''actual result'''
print(t3.dtypes)

x object

y object

dtype: object

Problem description

Data types are not preserved when concatenating DataFrames with nullable integers. Instead, the concatenated columns hold values of mixed Python types (int for the original values, float for the missing ones), so the column dtypes become object:

type(t3.at[0,'x'])
<class 'int'>
type(t3.at[1,'x'])
<class 'float'>
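A possible workaround, not proposed in the report itself, is to cast the result back to the nullable dtype after concatenating. The sketch below assumes every column is meant to be UInt8, as in the example above; the name `restored` is only illustrative. On pandas versions where `concat` already keeps the dtype, the cast should simply be a no-op.

```python
import pandas as pd

t1 = pd.DataFrame(index=[0], data={"x": [1]}, dtype="UInt8")
t2 = pd.DataFrame(index=[1], data={"y": [1]}, dtype="UInt8")
t3 = pd.concat([t1, t2], join="outer", sort=False)

# Assumption: all columns should be UInt8, so one cast covers the frame.
# Rows that had no value for a column stay missing after the cast.
restored = t3.astype("UInt8")

print(restored.dtypes)
print(restored["x"].isna().tolist())
```

For frames with mixed column types, passing a per-column mapping to `astype` (for example `{"x": "UInt8", "y": "UInt8"}`) would avoid touching unrelated columns.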

Expected Output

'''expected result'''
print(t3.dtypes)

x UInt8

y UInt8

dtype: object
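The expectation above can be turned into a small check that reports which columns changed dtype during concatenation. This is only a sketch: `changed_dtypes` is a hypothetical helper, not part of pandas, and it compares dtype names as strings to keep the comparison simple.

```python
import pandas as pd


def changed_dtypes(frames, result):
    """Map each column to (input dtype, result dtype) where the two differ."""
    changed = {}
    for frame in frames:
        for col, dtype in frame.dtypes.items():
            before, after = str(dtype), str(result[col].dtype)
            if before != after:
                changed[col] = (before, after)
    return changed


t1 = pd.DataFrame(index=[0], data={"x": [1]}, dtype="UInt8")
t2 = pd.DataFrame(index=[1], data={"y": [1]}, dtype="UInt8")
t3 = pd.concat([t1, t2], join="outer", sort=False)

# An empty dict means every column kept its dtype. With the behaviour
# reported here, both x and y would show up as changed to object.
print(changed_dtypes([t1, t2], t3))
```

What the last line prints depends on the installed pandas version, so no output is shown.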

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit : None
python : 3.6.3.final.0
python-bits : 64
OS : Linux
OS-release : 3.10.0-862.11.6.el7.x86_64
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.utf-8
LANG : en_US.utf-8
LOCALE : en_US.UTF-8

pandas : 0.25.0
numpy : 1.16.3
pytz : 2018.4
dateutil : 2.7.3
pip : 19.1.1
setuptools : 39.0.1
Cython : 0.29.12
pytest : 3.3.2
hypothesis : None
sphinx : 1.6.6
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.3.4
html5lib : 0.9999999
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : 6.2.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.0.2
numexpr : 2.6.4
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.11.1
pytables : None
s3fs : None
scipy : 1.1.0
sqlalchemy : None
tables : 3.5.2
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None