Rolling correlation for DataFrame with MultiIndex columns broken in Pandas 0.23 · Issue #21157 · pandas-dev/pandas

import numpy as np
import pandas as pd

index = range(5)
data = np.random.random(size=(len(index), 2))

simple_cols = ['M', 'N']
simple_df = pd.DataFrame(data, index=index, columns=simple_cols)

multi_cols = pd.MultiIndex.from_arrays([['M', 'N'], ['P', 'Q']], names=['a', 'b'])
multi_df = pd.DataFrame(data, index=index, columns=multi_cols)

# Works: columns are a plain Index
simple_df.rolling(3).corr()

# Fails: columns are a MultiIndex (raises an exception in 0.23)
multi_df.rolling(3).corr()

Since the Pandas 0.23 release, it is no longer possible to calculate a rolling correlation on a pd.DataFrame whose columns are a pd.MultiIndex. In 0.23 the call raises an exception; in 0.22 it returns a valid rolling correlation result.

Pandas should return the same rolling correlation matrix as it does for a DataFrame with a simple column index, but with the MultiIndex levels present in both the columns and the index, as they appeared in 0.22.
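
Until this is fixed, one possible workaround is to flatten the columns, compute the rolling correlation, and then put the MultiIndex levels back. The sketch below is only an illustration: rolling_corr_multiindex is a hypothetical helper name, it assumes the DataFrame has a single-level row index, and the resulting layout (row label first, then the column levels) is my assumption about what the expected output looks like, not a confirmed match for the 0.22 result.

```python
import numpy as np
import pandas as pd


def rolling_corr_multiindex(df, window):
    """Hypothetical helper: rolling pairwise correlation for MultiIndex columns.

    Assumes df has a single-level row index.
    """
    original_cols = df.columns

    # Swap the MultiIndex columns for plain positional labels 0..n-1,
    # which is the case that still works.
    flat = df.copy()
    flat.columns = range(len(original_cols))
    result = flat.rolling(window).corr()

    # The result is indexed by (row label, positional column label).
    # Map each positional label back to the original column tuple.
    tuples = [(row, *original_cols[pos]) for row, pos in result.index]
    result.index = pd.MultiIndex.from_tuples(
        tuples, names=[df.index.name, *original_cols.names]
    )
    result.columns = original_cols[list(result.columns)]
    return result


rng = np.random.default_rng(0)
multi_cols = pd.MultiIndex.from_arrays([['M', 'N'], ['P', 'Q']], names=['a', 'b'])
multi_df = pd.DataFrame(rng.random((5, 2)), index=range(5), columns=multi_cols)

result = rolling_corr_multiindex(multi_df, 3)
```

The values are the same as those from the simple-column case; only the labels differ. Mapping by positional label rather than by row order avoids relying on how the result happens to be sorted.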