BUG: Regression in assert_frame_equal with check_like=True and Multiindex · Issue #48975 · pandas-dev/pandas (original) (raw)
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
df_1 = pd.DataFrame([("a", "a", 1), ("b", "b", 2), ("c", "c", 3)]) df_1[0] = df_1[0].astype(pd.CategoricalDtype(["a", "b", "c"])) df_1[1] = df_1[1].astype(pd.CategoricalDtype(["a", "b", "c"]))
df_2 = pd.DataFrame([("c", "c", 3), ("b", "b", 2), ("a", "a", 1)]) df_2[0] = df_2[0].astype(pd.CategoricalDtype(["a", "b", "c"])) df_2[1] = df_2[1].astype(pd.CategoricalDtype(["a", "b", "c"]))
assert_frame_equal( df_1.set_index([0, 1]), df_2.set_index([0, 1]), check_like=True, )```
Gives an AssertionError
:
```--------------------------------------------------------------------------- AssertionError Traceback (most recent call last) Input In [24], in <cell line: 9>() 6 df_2[0] = df_2[0].astype(pd.CategoricalDtype(["a", "b", "c"])) 7 df_2[1] = df_2[1].astype(pd.CategoricalDtype(["a", "b", "c"])) ----> 9 assert_frame_equal( 10 df_1.set_index([0, 1]), 11 df_2.set_index([0, 1]), 12 check_like=True, 13 )
[... skipping hidden 3 frame]
File ~/miniconda3/envs/gams40/lib/python3.10/site-packages/pandas/_testing/asserters.py:318, in assert_index_equal.._check_types(left, right, obj)
315 if not exact:
316 return
--> 318 assert_class_equal(left, right, exact=exact, obj=obj)
319 assert_attr_equal("inferred_type", left, right, obj=obj)
321 # Skip exact dtype checking when check_categorical
is False
[... skipping hidden 1 frame]
File ~/miniconda3/envs/gams40/lib/python3.10/site-packages/pandas/_testing/asserters.py:677, in raise_assert_detail(obj, message, left, right, diff, index_values) 674 if diff is not None: 675 msg += f"\n[diff]: {diff}" --> 677 raise AssertionError(msg)
AssertionError: MultiIndex level [0] are different
MultiIndex level [0] classes are different [left]: CategoricalIndex(['a', 'b', 'c'], categories=['a', 'b', 'c'], ordered=False, dtype='category', name=0) [right]: Index(['a', 'b', 'c'], dtype='object', name=0)```
Issue Description
The example passes in Pandas 1.4.4, but fails in 1.5.0. Unclear why AssertionError message shows that the class type of [right] might have changed?
Expected Behavior
Expected consistent behavior between Pandas 1.4.4 and 1.5.0.
Installed Versions
INSTALLED VERSIONS
commit : 87cfe4e
python : 3.10.4.final.0
python-bits : 64
OS : Darwin
OS-release : 21.6.0
Version : Darwin Kernel Version 21.6.0: Wed Aug 10 14:25:27 PDT 2022; root:xnu-8020.141.5~2/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.5.0
numpy : 1.23.1
pytz : 2022.4
dateutil : 2.8.2
setuptools : 63.4.1
pip : 22.2.2
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 8.4.0
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.1
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None