DEPR: Merging on different number of levels · Issue #34862 · pandas-dev/pandas
Two DataFrames with MultiIndex columns that have a different number of levels (three in df1, two in df2).
import pandas as pd

df1_columns = pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1")])
df1 = pd.DataFrame([[1, 2], [10, 20]], columns=df1_columns)

df2_columns = pd.MultiIndex.from_tuples([("X0", "Y0"), ("X1", "Y1")])
df2 = pd.DataFrame([[1, 200], [10, 200]], columns=df2_columns)
Case 1
pd.merge(df1, df2, left_on=[("A0", "B0","C0")], right_on=[("X0", "Y0")])
returns
   (A0, B0, C0)  (A1, B1, C1)  (X0, Y0)  (X1, Y1)
0             1             2         1       200
1            10            20        10       200
This is fine: the columns from both DataFrames keep their full structure. They are flattened into tuples, but that can be fixed (for example with something like this).
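One possible way to turn such tuple labels back into a MultiIndex is sketched below. It is not necessarily the fix linked above: the helper name `pad_to_multiindex` and the choice of padding shorter tuples with an empty string are assumptions made for illustration. The labels are copied by hand from the Case 1 output rather than taken from a live merge.

```python
import pandas as pd

# Flat tuple labels as they appear in the Case 1 output (copied by hand).
flat_columns = [("A0", "B0", "C0"), ("A1", "B1", "C1"), ("X0", "Y0"), ("X1", "Y1")]


def pad_to_multiindex(labels, fill=""):
    """Hypothetical helper: pad shorter tuples with `fill`, then build a MultiIndex."""
    depth = max(len(label) for label in labels)
    padded = [tuple(label) + (fill,) * (depth - len(label)) for label in labels]
    return pd.MultiIndex.from_tuples(padded)


columns = pad_to_multiindex(flat_columns)
print(columns.nlevels)  # number of levels after padding
```

For a merged frame `result`, the same helper could be applied as `result.columns = pad_to_multiindex(list(result.columns))`.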
Case 2
pd.merge(df2, df1, left_on=[("X0", "Y0")], right_on=[("A0", "B0","C0")])
returns
   X0   X1  A0  A1
   Y0   Y1  B0  B1
0   1  200   1   2
1  10  200  10  20
When the left DataFrame has fewer column levels than the right one, the merge result is questionable: the lowest level of the right DataFrame's columns (C0, C1) is dropped.
Is this a bug or a feature (i.e. the result must have the same number of column levels as the left DataFrame)?
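Until the intended behaviour is settled, one way to sidestep the question is to give both frames the same number of column levels before merging, so that the argument order no longer affects the result. The sketch below is only a workaround under assumptions: padding the shallower frame with empty-string levels is my choice, not something pandas does automatically, and `df2_padded` is a name introduced here.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2], [10, 20]],
    columns=pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1")]),
)
df2 = pd.DataFrame(
    [[1, 200], [10, 200]],
    columns=pd.MultiIndex.from_tuples([("X0", "Y0"), ("X1", "Y1")]),
)

# Pad df2's column labels with "" so that both frames have three levels.
n_missing = df1.columns.nlevels - df2.columns.nlevels
df2_padded = df2.copy()
df2_padded.columns = pd.MultiIndex.from_tuples(
    [col + ("",) * n_missing for col in df2.columns]
)

# Same merge in both argument orders; the keys are full-depth tuples.
left_first = pd.merge(
    df1, df2_padded, left_on=[("A0", "B0", "C0")], right_on=[("X0", "Y0", "")]
)
right_first = pd.merge(
    df2_padded, df1, left_on=[("X0", "Y0", "")], right_on=[("A0", "B0", "C0")]
)

print(left_first.columns.nlevels, right_first.columns.nlevels)
```

With equal level counts, neither call mixes levels, so the C0/C1 level should survive regardless of which frame is on the left.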
Notes:
- In both cases the warning is the same: UserWarning: merging between different levels can give an unintended result ...
- Checked with pandas 0.25.1 and 1.0.4.
- See notebook.
- I have searched the [pandas] tag on StackOverflow for similar questions.
- I have asked my usage related question on StackOverflow (see 62490244).