BUG: Accessing DataFrame multi-index column seems to modify its content · Issue #8850 · pandas-dev/pandas

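In the example below, the output of [13] differs from the output of [11], even though [12] only reads df['Zeros'] and should have no effect on the data. The value at ('Weird_stuff', 'a') is 1000000 in [11] but is back to 3 in [13].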

Python 2.7.8 (default, Oct 19 2014, 16:02:00) 
Type "copyright", "credits" or "license" for more information.

IPython 2.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: d = {'COLA': {0: 0, 1: 0, 2: 0},
   ...:      'COLB': {0: 'a', 1: 'a', 2: 'a'},
   ...:      'COLC': {0: 0, 1: 0, 2: 1}}

In [4]: df = pd.DataFrame(d)

In [5]: df
Out[5]: 
   COLA COLB  COLC
0     0    a     0
1     0    a     0
2     0    a     1

In [6]: df = df.groupby(['COLA','COLB'])['COLC']\
   ...:         .agg({'Zeros': lambda x: 0,
   ...:               'Averages': lambda x: 100.*x.mean(),
   ...:               'Weird_stuff': np.size})\
   ...:     .unstack()

In [7]: df
Out[7]: 
       Averages Weird_stuff Zeros
COLB          a           a     a
COLA                             
0     33.333333           3     0

In [8]: df.columns
Out[8]: 
MultiIndex(levels=[[u'Averages', u'Weird_stuff', u'Zeros'], [u'a']],
           labels=[[0, 1, 2], [0, 0, 0]],
           names=[None, u'COLB'])

In [9]: df.index
Out[9]: Int64Index([0], dtype='int64')

In [10]: df['Weird_stuff'] = df['Weird_stuff'].apply(lambda x: 1000000., axis=1)
/usr/local/lib/python2.7/site-packages/numpy/lib/function_base.py:3612: FutureWarning: in the future negative indices will not be ignored by `numpy.delete`.
  "`numpy.delete`.", FutureWarning)

In [11]: df
Out[11]: 
       Averages Weird_stuff Zeros
COLB          a           a     a
COLA                             
0     33.333333     1000000     0

In [12]: df['Zeros']
Out[12]: 
COLB  a
COLA   
0     0

In [13]: df  # Compare the ('Weird_stuff', 'a') value here with Out[11] above.
Out[13]: 
       Averages Weird_stuff Zeros
COLB          a           a     a
COLA                             
0     33.333333           3     0

In [14]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Darwin
OS-release: 14.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: pl_PL.UTF-8

pandas: 0.15.1
nose: 1.3.0
Cython: 0.20.2
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
dateutil: 2.2
pytz: 2014.7
bottleneck: 0.8.0
tables: 3.1.0
numexpr: 2.3.1
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.3.2
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9
apiclient: 1.2
rpy2: None
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None
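
For anyone who wants to check this against a newer release, here is a condensed version of the steps above as a script. It is only a sketch, not the exact session: it assumes a pandas version that supports named aggregation (keyword arguments to `agg`), since the dict-renaming form used in [6] is not accepted by later releases, and it swaps `np.size` for the string `"size"`. The helper names `zeros`, `averages`, `build_frame` and `read_leaves_frame_unchanged` are made up for illustration.

```python
import pandas as pd


def zeros(values):
    """Aggregate any group to the constant 0 (stands in for `lambda x: 0`)."""
    return 0


def averages(values):
    """Aggregate a group to 100 times its mean."""
    return 100.0 * values.mean()


def build_frame():
    """Rebuild the frame from the report, up to and including step [10]."""
    raw = pd.DataFrame({
        "COLA": [0, 0, 0],
        "COLB": ["a", "a", "a"],
        "COLC": [0, 0, 1],
    })
    frame = (
        raw.groupby(["COLA", "COLB"])["COLC"]
        # Assumption: named aggregation replaces the dict passed to agg in [6].
        .agg(Averages=averages, Weird_stuff="size", Zeros=zeros)
        .unstack()
    )
    # Same assignment as step [10]: overwrite everything under 'Weird_stuff'.
    frame["Weird_stuff"] = frame["Weird_stuff"].apply(
        lambda row: 1000000.0, axis=1
    )
    return frame


def read_leaves_frame_unchanged(frame, key):
    """Return True if reading frame[key] leaves the frame's contents as they were."""
    before = frame.copy()
    _ = frame[key]  # the read from step [12]; the result is discarded
    return before.equals(frame)


if __name__ == "__main__":
    df = build_frame()
    print(df)
    print(read_leaves_frame_unchanged(df, "Zeros"))
    print(df)
```

On a version without the problem, `read_leaves_frame_unchanged(df, "Zeros")` is expected to return True and both printed frames should show 1000000 under ('Weird_stuff', 'a'). Whether the snapshot taken with `copy()` catches the behaviour on 0.15.1 has not been verified here, so comparing the two printed frames by eye is the more reliable check on old versions.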