BUG: Passing multiple levels to stack when having mixed integer/string level names · Issue #8584 · pandas-dev/pandas
Related #7770
Using the example of the docs (http://pandas.pydata.org/pandas-docs/stable/reshaping.html#multiple-levels):
from numpy.random import randn
from pandas import DataFrame, MultiIndex

columns = MultiIndex.from_tuples([('A', 'cat', 'long'), ('B', 'cat', 'long'), ('A', 'dog', 'short'), ('B', 'dog', 'short')],
                                 names=['exp', 'animal', 'hair_length'])
df = DataFrame(randn(4, 4), columns=columns)
Here df.stack(level=['animal', 'hair_length']) and df.stack(level=[1, 2]) are equivalent (feature introduced in #7770). Mixing integer locations and string names (e.g. df.stack(level=['animal', 2])) gives a ValueError.
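A self-contained version of the comparison above, as a minimal sketch. The seeded generator and the variable names by_name and by_position are my own additions for reproducibility, not part of the docs example. The mixed case is left out because its behaviour may depend on the pandas version, and recent pandas versions may emit a FutureWarning about the stack implementation.

```python
import numpy as np
import pandas as pd

# Same column layout as the docs example, with seeded random data
# (the seed is an assumption added here so the frame is reproducible).
columns = pd.MultiIndex.from_tuples(
    [("A", "cat", "long"), ("B", "cat", "long"),
     ("A", "dog", "short"), ("B", "dog", "short")],
    names=["exp", "animal", "hair_length"],
)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 4)), columns=columns)

# Stack the same two levels, once by name and once by position.
by_name = df.stack(level=["animal", "hair_length"])
by_position = df.stack(level=[1, 2])

# Both calls should produce the same frame.
print(by_name.equals(by_position))
```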
But if the level names themselves are of mixed types, several different (and wrong) things happen:
- With a completely different number (one that is not also a valid level position), it still works as it should:
df.columns.names = ['exp', 'animal', 10]
df.stack(level=['animal', 10])
- With the number 1, it treats the 1 as a level number instead of a level name, leading to a wrong result (the same level is stacked twice):
In [42]: df.columns.names = ['exp', 'animal', 1]
In [43]: df.stack(level=['animal', 1])
Out[43]:
exp                     A         B
  animal animal
0 cat    cat    -1.006065  0.401136
  dog    dog     0.526734 -1.753478
1 cat    cat    -0.718401 -0.400386
  dog    dog    -0.951336 -1.074323
2 cat    cat     1.119843 -0.606982
  dog    dog     0.371467 -1.837341
3 cat    cat    -1.467968  1.114524
  dog    dog    -0.040112  0.240026
- With the number 0, it raises an unexpected KeyError:
In [46]: df.columns.names = ['exp', 'animal', 0]
In [47]: df.stack(level=['animal', 0])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-47-4e9507e0708f> in <module>()
----> 1 df.stack(level=['animal', 0])
/home/joris/scipy/pandas/pandas/core/frame.pyc in stack(self, level, dropna)
3390
3391 if isinstance(level, (tuple, list)):
-> 3392 return stack_multiple(self, level, dropna=dropna)
3393 else:
3394 return stack(self, level, dropna=dropna)
....
/home/joris/scipy/pandas/pandas/core/index.pyc in _partial_tup_index(self, tup, side)
3820 raise KeyError('Key length (%d) was greater than MultiIndex'
3821 ' lexsort depth (%d)' %
-> 3822 (len(tup), self.lexsort_depth))
3823
3824 n = len(tup)
KeyError: 'Key length (2) was greater than MultiIndex lexsort depth (0)'
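For all three cases above, the expected behaviour would be that an integer which is also a level name resolves to that named level. The sketch below illustrates such a name-first lookup in plain Python. resolve_level is a hypothetical helper written for this illustration; it is not pandas code and makes no claim about where the actual bug lives.

```python
def resolve_level(names, level):
    """Return the position of `level`, preferring a match on level names.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # First try to interpret `level` as a name, even if it is an integer.
    matches = [i for i, name in enumerate(names) if name == level]
    if len(matches) > 1:
        raise ValueError(f"level name {level!r} occurs multiple times")
    if matches:
        return matches[0]
    # Only fall back to a positional interpretation when no name matches.
    if isinstance(level, int) and not isinstance(level, bool):
        if -len(names) <= level < len(names):
            return level % len(names)
        raise IndexError(f"level {level!r} is out of range")
    raise KeyError(f"level {level!r} not found")


# The three naming schemes from the report: the third level is named 10, 1 or 0.
for third in (10, 1, 0):
    names = ["exp", "animal", third]
    print(third, [resolve_level(names, lev) for lev in ["animal", third]])
# Each line prints the positions [1, 2]
```

With this ordering, level=['animal', 1] and level=['animal', 0] both select the second and third levels, the same as level=['animal', 10].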