BUG: incorrect type when casting to nullable type in multiindex dataframe · Issue #46896 · pandas-dev/pandas

Reproducible Example

import pandas as pd

df = pd.DataFrame(
    columns=pd.MultiIndex.from_tuples([("a", "c"), ("a", "d")]),
    data=[[1, 2], [3, 4]],
)

df["a"] = df["a"].astype("Int64")

print(df.dtypes) # results in type "object" for both columns

print(df["a"].astype("Int64").dtypes) # here the dtypes are correct

Issue Description

When casting to a nullable dtype (like "Int64") in a DataFrame with MultiIndex columns and assigning the result back, the resulting dtype is "object" (where it should be "Int64").

If you cast to a NumPy dtype instead (e.g. .astype(float)), we do see the expected behavior (namely, a DataFrame whose columns have dtype float64).

It seems that the problem is not with the .astype(..) method, but with __setitem__ (i.e., with the re-assignment df["a"] = df["a"]...).
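To separate the two steps, a small diagnostic sketch can record the dtypes right after the cast and again after the assignment. The variable names (`cast`, `cast_dtypes`, `assigned_dtypes`) are illustrative only. The script does not assume which dtypes the assignment produces, since that may depend on the pandas version in use.

```python
import pandas as pd

df = pd.DataFrame(
    columns=pd.MultiIndex.from_tuples([("a", "c"), ("a", "d")]),
    data=[[1, 2], [3, 4]],
)

# Step 1: the cast alone. The report states these dtypes are correct.
cast = df["a"].astype("Int64")
cast_dtypes = [str(t) for t in cast.dtypes]

# Step 2: assign the cast frame back under the top-level key "a".
df["a"] = cast
assigned_dtypes = [str(t) for t in df.dtypes]

print("after astype:     ", cast_dtypes)
print("after __setitem__:", assigned_dtypes)

# If the two lists differ, the dtype was lost during assignment, not during the cast.
if assigned_dtypes != cast_dtypes:
    print("dtype changed during assignment")
```

If the second line of output differs from the first, that would support the reading that the cast itself is fine and the dtype is lost in the assignment.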

Happy to help, but it seems it requires debugging C code which is beyond the capabilities of my tiny brain.

Expected Behavior

We expect that both columns under "a" have dtype "Int64" after the assignment, matching the dtypes returned by .astype("Int64").
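Until the assignment behaves as expected, one possible workaround (not taken from the report, and only a sketch) is to cast and assign each sub-column on its own using its full tuple key, so that each assignment sets a single Series rather than a whole sub-frame:

```python
import pandas as pd

df = pd.DataFrame(
    columns=pd.MultiIndex.from_tuples([("a", "c"), ("a", "d")]),
    data=[[1, 2], [3, 4]],
)

# Assumed workaround: address every column by its full (level 0, level 1) key
# and assign the cast Series one column at a time.
for col in [("a", "c"), ("a", "d")]:
    df[col] = df[col].astype("Int64")

# Each column should now report the nullable Int64 dtype.
print(df.dtypes)
```

This trades the single df["a"] assignment for a loop, which is more verbose but avoids assigning a DataFrame under a partial MultiIndex key.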

Installed Versions

INSTALLED VERSIONS

commit : 9f24918
python : 3.8.13.final.0
python-bits : 64
OS : Darwin
OS-release : 21.4.0
Version : Darwin Kernel Version 21.4.0: Fri Mar 18 00:47:26 PDT 2022; root:xnu-8020.101.4~15/RELEASE_ARM64_T8101
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : None.UTF-8

pandas : 1.5.0.dev0+725.g9f24918759.dirty
numpy : 1.21.6
pytz : 2022.1
dateutil : 2.8.2
pip : 22.0.4
setuptools : 62.1.0
Cython : 0.29.28
pytest : 7.1.2
hypothesis : 6.45.1
sphinx : 4.5.0
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.8.0
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.1
IPython : 8.3.0
pandas_datareader: None
bs4 : 4.11.1
bottleneck : 1.3.4
brotli :
fastparquet : 0.8.0
fsspec : 2021.11.0
gcsfs : 2021.11.0
markupsafe : 2.1.1
matplotlib : 3.5.1
numba : 0.55.1
numexpr : 2.8.0
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 7.0.0
pyreadstat : 1.1.5
pyxlsb : None
s3fs : 2021.11.0
scipy : 1.8.0
snappy :
sqlalchemy : 1.4.36
tables : 3.7.0
tabulate : 0.8.9
xarray : 0.18.2
xlrd : 2.0.1
xlwt : 1.3.0
zstandard : None