merge fails to add suffixes on multiindex columns · Issue #28518 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas as pd

index_tuples = []
for word_group in ["a", "b", "c", "d"]:
    for correctness in ["1", "2", "3"]:
        index_tuples.append([word_group, correctness])
index = pd.MultiIndex.from_tuples(index_tuples, names=["outer", "inner"])

frame_x = pd.DataFrame(columns=index)
frame_x["id"] = ""
frame_y = pd.DataFrame(columns=index)
frame_y["id"] = ""
print(frame_x.merge(frame_y, on="id").columns)
Problem description
I'm trying to merge two dataframes. Both have MultiIndex columns and an "id" column. The merge happens on "id", so the outer level of the MultiIndex should receive suffixes. Whether this works depends on the number of entries in the MultiIndex: only some of the MultiIndex columns receive suffixes, others don't. For the code sample I've set up an empty dataframe; the behaviour is the same when it is filled with data.
The issue seems non-deterministic. Sometimes it happens, sometimes it doesn't. Here is a video:
https://imgur.com/a/rbSvuSl
I'm using pandas version 0.25.1
This is a conda environment, here is its yml file: https://gist.github.com/lhk/ab3cf1f95be37a23789792fd75beef93
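To make the symptom easier to spot than by reading the printed column index, here is a minimal sketch of a check that lists the outer-level labels left without a suffix. The helper name `unsuffixed_outer_labels` and the hand-written `example_columns` are assumptions made for illustration; they are not taken from an actual merge result.

```python
import pandas as pd


def unsuffixed_outer_labels(columns, suffixes=("_x", "_y"), key="id"):
    """List outer-level labels, other than the merge key, that lack a suffix."""
    outer = columns.get_level_values(0).unique()
    return [
        label
        for label in outer
        if label != key and not label.endswith(suffixes)
    ]


# Hypothetical stand-in for the columns of a merge result, written by hand
# so the check behaves the same whether or not the bug reproduces.
example_columns = pd.MultiIndex.from_tuples(
    [("a_x", "1"), ("b", "1"), ("id", ""), ("a_y", "1"), ("b", "1")],
    names=["outer", "inner"],
)

print(unsuffixed_outer_labels(example_columns))  # ['b']
```

Passing the `.columns` of a real merge result instead of `example_columns` should return an empty list when every overlapping column was suffixed.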
Expected Output
All columns in the multiindex should receive either _x or _y as suffix.
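One possible workaround, sketched here under assumptions, is to put the suffixes on the outer level by hand before merging, so the result does not depend on the suffix handling inside `merge`. The helper name `suffix_outer_level` is hypothetical, and the frames are smaller than in the code sample to keep the output short.

```python
import pandas as pd


def suffix_outer_level(frame, suffix, key="id"):
    """Return a copy whose outer column labels carry `suffix`, except `key`."""
    renamed = frame.copy()
    renamed.columns = pd.MultiIndex.from_tuples(
        [
            (outer if outer == key else f"{outer}{suffix}", inner)
            for outer, inner in frame.columns
        ],
        names=frame.columns.names,
    )
    return renamed


# Small frames with the same column layout as in the report; the "id"
# column is written out as the tuple ("id", "").
columns = pd.MultiIndex.from_tuples(
    [("a", "1"), ("a", "2"), ("b", "1"), ("id", "")],
    names=["outer", "inner"],
)
frame_x = pd.DataFrame(columns=columns)
frame_y = pd.DataFrame(columns=columns)

left = suffix_outer_level(frame_x, "_x")
right = suffix_outer_level(frame_y, "_y")

# Outer labels the two renamed frames still have in common.
shared = set(left.columns.get_level_values(0)) & set(right.columns.get_level_values(0))
print(list(left.columns.get_level_values(0)))  # ['a_x', 'a_x', 'b_x', 'id']
print(shared)  # {'id'}
```

After this step the only outer label the two frames share is the key, so a merge of `left` and `right` has no overlapping columns left to suffix.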
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit : None
python : 3.6.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.0.0-21-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 0.25.1
numpy : 1.16.5
pytz : 2019.2
dateutil : 2.8.0
pip : 19.2.2
setuptools : 41.0.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.1.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
s3fs : None
scipy : 1.3.1
sqlalchemy : None
tables : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None