ERR: invalid MultiIndex construction · Issue #17464 · pandas-dev/pandas (original) (raw)
Code Sample, a copy-pastable example if possible
import pandas as pd import numpy as np idx0 = range(2) idx1 = np.repeat(range(2), 2)
midx = pd.MultiIndex( levels=[idx0, idx1], labels=[ np.repeat(range(len(idx0)), len(idx1)), np.tile(range(len(idx1)), len(idx0)) ], names=['idx0', 'idx1'] )
df = pd.DataFrame( [ [i2/float(j), 'example{}'.format(i), i3/float(j)] for j in range(1, len(idx0) + 1) for i in range(1, len(idx1) + 1) ], columns=['col0', 'col1', 'col2'], index=midx )
example = df.loc[[(0, 1)]]
For display
In [13]: example Out[13]: col0 col1 col2 idx0 idx1 0 1 16.0 example4 64.0
Problem description
Taken from this StackOverflow post:
Given the following dataframe
col0 col1 col2
idx0 idx1 0 0 1.0 example1 1.0 0 4.0 example2 8.0 1 9.0 example3 27.0 1 16.0 example4 64.0 1 0 0.5 example1 0.5 0 2.0 example2 4.0 1 4.5 example3 13.5 1 8.0 example4 32.0
the .xs operation will select
In [121]: df.xs((0,1), level=[0,1]) Out[121]: col0 col1 col2 idx0 idx1 0 1 9.0 example3 27.0 1 16.0 example4 64.0
whilst the .loc operation will select
In [125]: df.loc[[(0,1)]] Out[125]: col0 col1 col2 idx0 idx1 0 1 16.0 example4 64.0
This is highlighted even further by the following
In [149]: df.loc[pd.IndexSlice[:, 1], :] Out[149]: col0 col1 col2 idx0 idx1 0 1 9.0 example3 27.0 1 16.0 example4 64.0
In [150]: df.loc[pd.IndexSlice[0, 1], :] Out[150]: col0 16 col1 example4 col2 64 Name: (0, 1), dtype: object
Expected Output
Note that this only works for this minimal example because there is only one label in level 0 index axis
In [8]: df.loc[pd.IndexSlice[:, 1], :] Out[8]: col0 col1 col2 idx0 idx1 0 1 9.0 example3 27.0 1 16.0 example4 64.0
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.20.3
pytest: 3.0.7
pip: 9.0.1
setuptools: 36.2.7
Cython: 0.25.2
numpy: 1.13.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.5.0