BUG: to_excel throws IndexError for DataFrame with empty (row) MultiIndex · Issue #19543 · pandas-dev/pandas

import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_arrays([
    np.array(['a', 'a', 'b', 'b']),
    np.array(['1', '2', '2', '3']),
    np.array(['alpha', 'beta', 'alpha', 'beta']),
], names=['one', 'two', 'three'])
df = pd.DataFrame(np.random.rand(4, 3), index=mi)
# ('b', '1') does not exist in the index
df2 = df.loc[('b', '1', slice(None)), :]
print(df2.index)

Output:

MultiIndex(levels=[['a', 'b'], ['1', '2', '3'], ['alpha', 'beta']],
           labels=[[], [], []],
           names=['one', 'two', 'three'])

The next line causes an IndexError:

df2.to_excel('test.xlsx')

Note that printing the entire DataFrame gives no indication that the 'empty' MultiIndex is there; it looks like an ordinary empty DataFrame:

Empty DataFrame
Columns: [0, 1, 2]
Index: []
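Since the printed frame hides the index type, it can help to inspect the index directly. The following is only a minimal sketch: `describe_index` is a hypothetical helper (not part of pandas), and `df.iloc[0:0]` is used as a stand-in for the empty selection so that the example does not depend on how `loc` treats missing keys in a given pandas version.

```python
import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], ["1", "2", "2", "3"], ["alpha", "beta", "alpha", "beta"]],
    names=["one", "two", "three"],
)
df = pd.DataFrame(np.random.rand(4, 3), index=mi)

# Stand-in for the empty selection: slicing zero rows keeps the MultiIndex.
empty = df.iloc[0:0]


def describe_index(frame):
    """Hypothetical helper: summarise the row index of a DataFrame."""
    idx = frame.index
    return {
        "is_multi": isinstance(idx, pd.MultiIndex),
        "nlevels": idx.nlevels,
        "names": list(idx.names),
        "length": len(idx),
    }


info = describe_index(empty)
print(info)
```

A frame that reports `is_multi` as true together with a `length` of zero is the kind of 'empty' MultiIndex frame described in this report.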

Problem description

I'm not quite sure which behaviour is the problematic one here: the fact that the first block of code creates an 'empty' MultiIndex that still has values in levels, or the fact that to_excel fails to write a DataFrame with such an index to file. The 'empty' MultiIndex behaviour is a little anomalous in itself, because a similar block of code with a two-level index (below) raises a KeyError. In the three-level example, the KeyError is 'suppressed' by the fact that there is a third level in the MultiIndex.

mi = pd.MultiIndex.from_arrays([
    np.array(['a', 'a', 'b', 'b']),
    np.array(['1', '2', '2', '3']),
], names=['one', 'two'])
df = pd.DataFrame(np.random.rand(4, 3), index=mi)
df2 = df.loc[('b', '1'), :]  # raises KeyError
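To make the two-level case concrete, here is a small sketch that checks whether a full key is present before selecting, and then confirms that `loc` raises a KeyError for the missing key. `has_key` is a hypothetical helper introduced for illustration; it relies only on the `in` operator of the index.

```python
import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], ["1", "2", "2", "3"]],
    names=["one", "two"],
)
df = pd.DataFrame(np.random.rand(4, 3), index=mi)


def has_key(frame, key):
    """Hypothetical helper: True if the full key tuple is in the row index."""
    return key in frame.index


present = has_key(df, ("b", "2"))  # this combination exists
missing = has_key(df, ("b", "1"))  # this combination does not

# Selecting the missing full key raises instead of returning an empty frame.
try:
    df.loc[("b", "1"), :]
    raised = False
except KeyError:
    raised = True

print(present, missing, raised)
```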

I initially ran into this behaviour (the IndexError) when using xs, but since xs seems to be headed for the deprecation guillotine, I figured I'd rewrite this with loc instead.
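One possible workaround, offered as an untested assumption and not as a fix, is to move an empty MultiIndex into ordinary columns before exporting, so that the writer never has to format the hierarchical index. `prepare_for_excel` below is a hypothetical helper, and `df.iloc[0:0]` again stands in for the empty selection.

```python
import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_arrays(
    [["a", "a", "b", "b"], ["1", "2", "2", "3"], ["alpha", "beta", "alpha", "beta"]],
    names=["one", "two", "three"],
)
df = pd.DataFrame(np.random.rand(4, 3), index=mi)
empty = df.iloc[0:0]  # stand-in for the empty selection


def prepare_for_excel(frame):
    """Hypothetical helper: return (frame_to_write, write_index).

    An empty MultiIndex is turned into regular columns via reset_index,
    in which case the index should not be written again.
    """
    if isinstance(frame.index, pd.MultiIndex) and len(frame.index) == 0:
        return frame.reset_index(), False
    return frame, True


flat, flat_write_index = prepare_for_excel(empty)
full, full_write_index = prepare_for_excel(df)

# Assumed usage (needs an Excel engine such as openpyxl installed):
# flat.to_excel("test.xlsx", index=flat_write_index)
```

The level names survive as column headers, so the resulting sheet would still show which index levels the frame had, even though it contains no rows.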

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-43-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 27.2.0
Cython: None
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: None
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None