REGR: fix error caused by deprecation warning in pd.merge when joining two Series by nsarang · Pull Request #47946 · pandas-dev/pandas

@jorisvandenbossche

We wanted to get rid of value-dependent behaviour. Your op looks "correct", but if "a" also has MultiIndex columns, then the tuple name is itself converted into a MultiIndex:

import pandas as pd
from pandas import DataFrame, MultiIndex, Series

a = DataFrame(
    {("A", "B"): 1},
    index=MultiIndex.from_product([["a"], [0, 1]], names=["o", "i"]),
)
b = Series(
    1,
    index=MultiIndex.from_product([["a"], [1, 2]], names=["o", "i"]),
    name=("B", "C"),
)
pd.merge(b, a, on=["o", "i"])
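To see why the tuple name ends up as a MultiIndex here, it may help to look at the column levels of both operands before merging. This is only a sketch: it assumes the Series is turned into a one-column frame in roughly the way `to_frame()` does, and the names `idx_a`, `idx_b` and `b_frame` are made up for illustration.

```python
import pandas as pd
from pandas import DataFrame, MultiIndex, Series

idx_a = MultiIndex.from_product([["a"], [0, 1]], names=["o", "i"])
idx_b = MultiIndex.from_product([["a"], [1, 2]], names=["o", "i"])

a = DataFrame({("A", "B"): 1}, index=idx_a)
b = Series(1, index=idx_b, name=("B", "C"))

# Assumption: a named Series is converted to a one-column frame,
# approximated here with to_frame().
b_frame = b.to_frame()

a_levels = a.columns.nlevels        # levels in the DataFrame's columns
b_levels = b_frame.columns.nlevels  # levels produced from the tuple name
b_is_midx = isinstance(b_frame.columns, MultiIndex)

print(a_levels, b_levels, b_is_midx)
```

With a two-element name and two-level columns, both sides line up, which is why this case looks "correct".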

And the more complicated case:

a = DataFrame(
    {("A", "B"): 1},
    index=MultiIndex.from_product([["a"], [0, 1]], names=["o", "i"]),
)
b = Series(
    1,
    index=MultiIndex.from_product([["a"], [1, 2]], names=["o", "i"]),
    name=("B", "C", "D"),
)
pd.merge(a, b, on=["o", "i"])

raises: ValueError: Length of names must match number of levels in MultiIndex.
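For reference, the same message can be triggered directly on a MultiIndex when the number of names does not match the number of levels. This is a standalone sketch and not the code path inside merge; the names "x" and "y" are placeholders.

```python
from pandas import MultiIndex

try:
    # Three levels in the tuple, but only two names supplied.
    MultiIndex.from_tuples([("B", "C", "D")], names=["x", "y"])
except ValueError as err:
    message = str(err)
else:
    message = None

print(message)
```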

If the DataFrame's columns have more levels than our tuple name has elements, we flatten the DataFrame's MultiIndex columns to tuples too:

a = DataFrame(
    {("A", "B", "C"): 1},
    index=MultiIndex.from_product([["a"], [0, 1]], names=["o", "i"]),
)
b = Series(
    1,
    index=MultiIndex.from_product([["a"], [1, 2]], names=["o", "i"]),
    name=("B", "C"),
)
pd.merge(a, b, on=["o", "i"])
     (A, B, C)  (B, C)
o i                   
a 1          1       1
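"Flattening" in the output above means the column labels become plain tuples in a single-level Index. A rough illustration using `MultiIndex.to_flat_index()`; whether merge goes through this exact method internally is an assumption, not something this thread confirms.

```python
from pandas import MultiIndex

columns = MultiIndex.from_tuples([("A", "B", "C")])

# One level whose labels are tuples, instead of three separate levels.
flat = columns.to_flat_index()
flat_levels = flat.nlevels
first_label = flat[0]

print(flat_levels, first_label)
```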