BUG: Accessing DataFrame multi-index column seems to modify its content · Issue #8850 · pandas-dev/pandas
In the example below, `[13]` gives a different result than `[11]`, even though you would expect `[12]` to have no effect on the data.
Python 2.7.8 (default, Oct 19 2014, 16:02:00)
Type "copyright", "credits" or "license" for more information.
IPython 2.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: d = {'COLA': {0: 0, 1: 0, 2: 0},
   ...:      'COLB': {0: 'a', 1: 'a', 2: 'a'},
   ...:      'COLC': {0: 0, 1: 0, 2: 1}}
In [4]: df = pd.DataFrame(d)
In [5]: df
Out[5]:
   COLA COLB  COLC
0     0    a     0
1     0    a     0
2     0    a     1
In [6]: df = df.groupby(['COLA','COLB'])['COLC']\
   ...:     .agg({'Zeros': lambda x: 0,
   ...:           'Averages': lambda x: 100.*x.mean(),
   ...:           'Weird_stuff': np.size})\
   ...:     .unstack()
In [7]: df
Out[7]:
       Averages  Weird_stuff  Zeros
COLB          a            a      a
COLA
0     33.333333            3      0
In [8]: df.columns
Out[8]:
MultiIndex(levels=[[u'Averages', u'Weird_stuff', u'Zeros'], [u'a']],
           labels=[[0, 1, 2], [0, 0, 0]],
           names=[None, u'COLB'])
In [9]: df.index
Out[9]: Int64Index([0], dtype='int64')
In [10]: df['Weird_stuff'] = df['Weird_stuff'].apply(lambda x: 1000000., axis=1)
/usr/local/lib/python2.7/site-packages/numpy/lib/function_base.py:3612: FutureWarning: in the future negative indices will not be ignored by `numpy.delete`.
"`numpy.delete`.", FutureWarning)
In [11]: df
Out[11]:
       Averages  Weird_stuff  Zeros
COLB          a            a      a
COLA
0     33.333333      1000000      0
In [12]: df['Zeros']
Out[12]:
COLB  a
COLA
0     0
In [13]: df  # Compare the ('Weird_stuff', 'a') value here with Out[11] above.
Out[13]:
       Averages  Weird_stuff  Zeros
COLB          a            a      a
COLA
0     33.333333            3      0
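For anyone trying to rebuild the frame from `Out[7]` on a newer pandas, here is a minimal sketch. It assumes pandas 0.25 or later, where named aggregation is available; the dict-of-names form used in `In [6]` is not accepted by current releases. The names `raw` and `wide`, and the string `"size"` standing in for `np.size`, are choices made for this sketch and are not part of the original report.

```python
import pandas as pd

raw = pd.DataFrame({
    "COLA": [0, 0, 0],
    "COLB": ["a", "a", "a"],
    "COLC": [0, 0, 1],
})

# Named aggregation: each keyword becomes one output column.
wide = (
    raw.groupby(["COLA", "COLB"])["COLC"]
    .agg(
        Zeros=lambda x: 0,
        Averages=lambda x: 100.0 * x.mean(),
        Weird_stuff="size",  # assumption: stands in for np.size
    )
    .unstack()  # move COLB from the row index into the columns
)

print(wide)
print(wide.columns.names)
```

The result should again have two column levels, with the aggregate names on the outer level and `COLB` on the inner one.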
In [14]: pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Darwin
OS-release: 14.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: pl_PL.UTF-8
pandas: 0.15.1
nose: 1.3.0
Cython: 0.20.2
numpy: 1.9.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: 2.1.0
sphinx: 1.2.2
patsy: 0.2.1
dateutil: 2.2
pytz: 2014.7
bottleneck: 0.8.0
tables: 3.1.0
numexpr: 2.3.1
matplotlib: 1.3.1
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.3.2
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9
apiclient: 1.2
rpy2: None
sqlalchemy: 0.9.7
pymysql: None
psycopg2: None
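The expectation stated at the top, that a read such as `df['Zeros']` leaves the frame untouched, can be written as a small check. This is only a sketch: the frame is built by hand to match the shape of `Out[7]`, the write uses the full `('Weird_stuff', 'a')` column key rather than the `apply` call from `In [10]`, and the name `before` is introduced here for illustration. It does not claim to reproduce the bug on any particular version.

```python
import pandas as pd

# Hand-built frame shaped like Out[7]: one row, two-level columns.
columns = pd.MultiIndex.from_tuples(
    [("Averages", "a"), ("Weird_stuff", "a"), ("Zeros", "a")],
    names=[None, "COLB"],
)
df = pd.DataFrame(
    [[100.0 / 3, 3.0, 0.0]],
    index=pd.Index([0], name="COLA"),
    columns=columns,
)

# Overwrite one column, addressed by its full (level 0, level 1) key.
df[("Weird_stuff", "a")] = 1000000.0

before = df.copy(deep=True)  # snapshot taken after the write
_ = df["Zeros"]              # pure read, as in In [12]

# Raises AssertionError if the read changed any value in df.
pd.testing.assert_frame_equal(df, before)
```

If the read had side effects like those shown in `Out[13]`, `assert_frame_equal` would raise instead of returning silently.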