BUG: .stack(dropna=False) looks through views incorrectly for dataframe views with multi-index columns · Issue #8844 · pandas-dev/pandas

In the example below, Out[11] incorrectly includes columns of dfa (C, D and E) that should not be visible through dfa1, which selects only the first two columns. The problem does not occur when the columns are not a MultiIndex ([5] and [6]), or when dropna=True (the default; [10] and [12]).

Python 3.4.2 (v3.4.2:ab2c023a9432, Oct  6 2014, 22:16:31) [MSC v.1600 64 bit (AMD64)]

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df = pd.DataFrame(np.zeros((2,5)), columns=list('ABCDE'))

In [4]: df1 = df[df.columns[:2]]

In [5]: df1.stack()
Out[5]:
0  A    0
   B    0
1  A    0
   B    0
dtype: float64

In [6]: df1.stack(dropna=False)
Out[6]:
0  A    0
   B    0
1  A    0
   B    0
dtype: float64

In [7]: dfa = pd.DataFrame(np.zeros((2,5)),
                           columns=pd.MultiIndex.from_tuples([(1,'A'), (1,'B'), (1,'C'), (1,'D'), (1,'E')],
                                                             names=['num', 'let']))

In [8]: dfa1 = dfa[dfa.columns[:2]]

In [10]: dfa1.stack()
Out[10]:
num    1
  let
0 A    0
  B    0
1 A    0
  B    0

In [11]: dfa1.stack(dropna=False)
Out[11]:
num     1
  let
0 A     0
  B     0
  C   NaN
  D   NaN
  E   NaN
1 A     0
  B     0
  C   NaN
  D   NaN
  E   NaN

In [12]: dfa1.stack(dropna=True)
Out[12]:
num    1
  let
0 A    0
  B    0
1 A    0
  B    0

In [13]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.4.2.final.0
python-bits: 64
OS: Windows
OS-release: 8
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.15.1
nose: 1.3.4
Cython: 0.21.1
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.6.0
IPython: 2.3.1
sphinx: None
patsy: 0.3.0
dateutil: 2.2
pytz: 2014.9
bottleneck: 0.8.0
tables: 3.1.1
numexpr: 2.4
matplotlib: 1.4.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.6.3
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: None
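
The report does not say why C, D and E reappear. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is that a sliced MultiIndex keeps the full set of level values from its parent even when only some are still used, and that stack(dropna=False) builds its result from those levels. The sketch below only inspects the column levels of dfa1; it does not call stack. It relies on MultiIndex.remove_unused_levels, which exists in later pandas releases than the 0.15.1 reported here.

```python
import numpy as np
import pandas as pd

# Same frames as in the session above.
cols = pd.MultiIndex.from_tuples(
    [(1, 'A'), (1, 'B'), (1, 'C'), (1, 'D'), (1, 'E')],
    names=['num', 'let'],
)
dfa = pd.DataFrame(np.zeros((2, 5)), columns=cols)
dfa1 = dfa[dfa.columns[:2]]

# Level values attached to the sliced columns. Under the assumption
# above, these may still list letters that dfa1 no longer contains.
sliced_levels = list(dfa1.columns.levels[1])
print("levels on dfa1.columns:", sliced_levels)

# Drop level values that no column label refers to any more.
trimmed = dfa1.columns.remove_unused_levels()
trimmed_levels = list(trimmed.levels[1])
print("levels after trimming:", trimmed_levels)  # ['A', 'B']

# The trimmed index labels the same columns as before.
same_labels = trimmed.equals(dfa1.columns)
print("same column labels:", same_labels)
```

If the first print shows all five letters, that would be consistent with the stale levels leaking into Out[11]. Assigning `trimmed` back to `dfa1.columns` before stacking is a possible workaround on versions that provide `remove_unused_levels`, though it has not been checked against the version in this report.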