NaN label in MultiIndex is assigned a non NaN value when writing to excel file · Issue #13511 · pandas-dev/pandas
Given a DataFrame with a MultiIndex: when a label of the MultiIndex is NaN and the DataFrame is written to an Excel file, that label ends up with a non-NaN value in the Excel file.
Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.DataFrame({'c1': [1, 1, 2, 2],
                   'c2': [None] + "b a b".split()})
df
returns a DataFrame where the first element of column 'c2' is NaN:
   c1   c2
0   1  NaN
1   1    b
2   2    a
3   2    b
Set both columns as the index:
df_midx = df.set_index(['c1', 'c2'])
df_midx
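returns a DataFrame with no remaining columns whose MultiIndex has the levels 'c1' and 'c2'; the first label of level 'c2' is NaN.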
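The NaN label can be confirmed on the index itself before anything is written to disk. A minimal sketch, assuming a pandas version that provides `Index.isna` (older releases such as 0.18.1 spell it `isnull`); the names `c2_labels` and `c2_is_missing` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 1, 2, 2],
                   'c2': [None] + "b a b".split()})
df_midx = df.set_index(['c1', 'c2'])

# Pull out level 'c2' of the MultiIndex and flag the missing labels.
c2_labels = df_midx.index.get_level_values('c2')
c2_is_missing = c2_labels.isna()
print(list(c2_is_missing))  # only the first label should be flagged
```

If the first entry is flagged as missing here but comes back as 'b' after the Excel round trip, the value was changed while writing or reading the file, not by `set_index`.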
Write the DataFrame to an Excel file and read it back in:
df_midx.to_excel('df_midx.xlsx')
df_midx_from_xlsx = pd.read_excel('df_midx.xlsx')
df_midx_from_xlsx
returns
    c1 c2
0  1.0  b
1  NaN  b
2  2.0  a
3  NaN  b
The first element of column 'c2' is now set to 'b' instead of NaN.
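The report does not identify the cause, but the wrong value is consistent with how a MultiIndex stores missing labels: each level keeps only its unique non-missing values, and a missing label is encoded as position -1. The sketch below is an assumption about the mechanism, not the confirmed source of the bug. It assumes a pandas version that exposes `MultiIndex.codes` (named `labels` in 0.18.1) and shows that a plain positional lookup of -1 wraps around to the last value, 'b':

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 1, 2, 2],
                   'c2': [None] + "b a b".split()})
mi = df.set_index(['c1', 'c2']).index

level_values = list(mi.levels[1])  # unique non-missing values of 'c2'
codes = list(mi.codes[1])          # position of each label; -1 marks NaN

# Naive positional lookup: -1 silently wraps around to the last value.
naive = [level_values[code] for code in codes]

# NaN-aware lookup, as done by get_level_values.
aware = list(mi.get_level_values('c2'))

print(naive)
print(aware)
```

Any writer that resolves labels the naive way would turn the NaN label into 'b', which matches the value observed in the file.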
Expected Output
    c1   c2
0  1.0  NaN
1  NaN    b
2  2.0    a
3  NaN    b
output of pd.show_versions()
INSTALLED VERSIONS
commit: ac174349b0e1525475c2354e1c0b8ee1ed1cabad
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-88-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: None
pip: 1.5.4
setuptools: 2.2
Cython: None
numpy: 1.11.0
scipy: 0.16.1
statsmodels: 0.6.1
xarray: None
IPython: 4.0.1
sphinx: None
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.5.2
matplotlib: 1.5.0
openpyxl: 2.3.5
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None