BUG: Joining single-index DataFrame to multiindex DF incorrect for how=left and how=right · Issue #10741 · pandas-dev/pandas
When joining DataFrames where the calling frame is a multiindex DF and the input frame is a single-index DF, how='left' and how='right' produce results that should be swapped (i.e., 'left' returns what 'right' should return, and vice versa). To give a single example using how='left':
import pandas as pd

# Calling frame: MultiIndex on ('first', 'second')
df1 = pd.DataFrame(
    [['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
     ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
     ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
    columns=['first', 'second', 'value1'],
).set_index(['first', 'second'])

# Input frame: single index on 'first', with no entry for 'c'
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

print(df1.join(df2, how='left'))
Expected output (every row of df1 is kept; 'c' has no match in df2):

               value1  value2
first second
a     x      0.471780      10
      y      0.774908      10
      z      0.563634      10
b     x     -0.353756      20
      y      0.368062      20
      z     -1.721840      20
c     x      1.000000     NaN
      y      2.000000     NaN
      z      3.000000     NaN

Actual output (the 'c' rows are dropped, as how='right' would do):

               value1  value2
first second
a     x      0.471780      10
      y      0.774908      10
      z      0.563634      10
b     x     -0.353756      20
      y      0.368062      20
      z     -1.721840      20
However, the correct behavior occurs if the single-index DF is the calling frame and the multiindex DF is the input frame (df2.join(df1, how='left')). Behavior for how='inner' and how='outer' is correct in both situations.
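For anyone wanting to check their own pandas version, here is a rough sketch that builds a reference left join without calling join on the multiindex frame. It flattens both indexes and uses merge instead, then compares row counts with df1.join(df2, how='left'). The helper name left_join_on_level is made up for illustration and is not part of pandas. This is only one possible cross-check, not a statement of how the join is implemented internally.

```python
import pandas as pd

df1 = pd.DataFrame(
    [['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
     ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
     ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
    columns=['first', 'second', 'value1'],
).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])


def left_join_on_level(multi, single, level):
    """Hypothetical helper (not a pandas API): left-join `single` onto
    `multi` using the index level the two frames share."""
    flat = multi.reset_index()  # index levels become ordinary columns
    merged = flat.merge(single.reset_index(), on=level, how='left')
    return merged.set_index(list(multi.index.names))  # restore the MultiIndex


reference = left_join_on_level(df1, df2, 'first')
actual = df1.join(df2, how='left')

# A left join should keep every row of df1, so both counts should equal len(df1).
print(len(df1), len(reference), len(actual))
```

If the last number printed is smaller than the first two, the installed version likely shows the behavior described in this report.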