Assignment with MultiIndex replaces dataframe contents with NaNs · Issue #3738 · pandas-dev/pandas
Bug: df.loc['bar'] = df.loc['bar'] * 2; assignment where the right-hand side has a MultiIndex
Related to http://stackoverflow.com/questions/16833842/assign-new-values-to-slice-from-multiindex-dataframe
In [246]: arrays = [np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
   .....:           np.array(['one', 'two', 'one', 'one', 'two', 'one']),
   .....:           np.arange(0, 6, 1)]

In [247]: df = pd.DataFrame(randn(6, 3),
   .....:                   index=arrays,
   .....:                   columns=['A', 'B', 'C']).sort_index()

In [248]: df
Out[248]:
                  A         B         C
bar one 0  2.186362 -1.749287 -0.776363
        5  0.953196 -0.717784 -0.528991
    two 1  0.692802 -0.166100  1.403452
baz one 2 -0.065485  0.642380 -1.154436
qux one 3  1.515987  1.138678 -0.690564
    two 4  0.261254 -0.066171 -1.352841

In [249]: df.loc['bar']
Out[249]:
              A         B         C
one 0  2.186362 -1.749287 -0.776363
    5  0.953196 -0.717784 -0.528991
two 1  0.692802 -0.166100  1.403452

In [250]: df.loc['bar'] = df.loc['bar'] * 2

In [251]: df
Out[251]:
                  A         B         C
bar one 0       NaN       NaN       NaN
        5       NaN       NaN       NaN
    two 1       NaN       NaN       NaN
baz one 2 -0.065485  0.642380 -1.154436
qux one 3  1.515987  1.138678 -0.690564
    two 4  0.261254 -0.066171 -1.352841
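One plausible reading of the NaNs (an assumption on my part, not something confirmed in this report) is that the right-hand side no longer carries the same index as the rows being assigned to: as Out[249] shows, df.loc['bar'] drops the outer level. The sketch below only demonstrates that mismatch in index depth; it uses a seeded NumPy generator in place of randn so it runs on its own, and it does not attempt to reproduce the NaN result itself.

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(0, 6, 1),
]
# Seeded generator stands in for randn(6, 3) from the report.
values = np.random.default_rng(0).standard_normal((6, 3))
df = pd.DataFrame(values, index=arrays, columns=['A', 'B', 'C']).sort_index()

# The right-hand side of the failing assignment.
rhs = df.loc['bar'] * 2

target_nlevels = df.index.nlevels  # index depth of the frame being assigned into
rhs_nlevels = rhs.index.nlevels    # index depth after .loc['bar'] dropped the outer level
rhs_labels = list(rhs.index)       # remaining (level 1, level 2) labels

print(target_nlevels, rhs_nlevels)
print(rhs_labels)
```

The target rows are keyed by three-level labels such as ('bar', 'one', 0), while the right-hand side is keyed by two-level labels such as ('one', 0), so label-based alignment has nothing to match on.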
This however works:
In [270]: df
Out[270]:
                  A         B         C
bar one 0  1.506060 -2.921602  1.057659
        5 -1.264088  6.483417  3.259962
    two 1  0.580368 -2.877875  2.916771
baz one 2 -0.003460 -1.779877  0.197661
qux one 3  0.463015  0.193559 -0.705382
    two 4  0.074625 -0.258451  0.389310

In [271]: idx = [x for x in df.index if x[0] == 'bar']

In [272]: df.loc[idx] = df.loc[idx] * 2

In [273]: df
Out[273]:
                  A         B         C
bar one 0  3.012121 -5.843204  2.115318
        5 -2.528176 12.966833  6.519925
    two 1  1.160736 -5.755750  5.833541
baz one 2 -0.003460 -1.779877  0.197661
qux one 3  0.463015  0.193559 -0.705382
    two 4  0.074625 -0.258451  0.389310
I was also wondering: if I want to assign to a selection via MultiIndex, what is the best way to achieve that? Examples:
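# What I would like to write (not valid Python: cannot assign to a call)
df.xs('blah', level=2) = df.xs('blah', level=2) * 2

# Collect the full index tuples, then assign through .ix or .loc
crit = [x for x in df.index if x[2] == 'blah']
df.ix[crit] = df.ix[crit] * 2
df.loc[crit] = df.loc[crit] * 2

# Move the level into a column and use a boolean mask
df = df.reset_index(level=2)
df[df.level_2 == 'blah'] = df[df.level_2 == 'blah'] * 2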
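For what it's worth, here is a minimal sketch of one more option, not necessarily the best one: build a boolean mask from the level values and assign the raw array, so that no index alignment is involved on the right-hand side. The helper name scale_level and the sample frame are hypothetical and only for illustration; get_level_values and to_numpy are existing pandas methods in recent versions.

```python
import numpy as np
import pandas as pd


def scale_level(df, level, label, factor):
    """Hypothetical helper: scale, in place, the rows whose `level` equals `label`."""
    # Boolean mask over the rows, built from one level of the MultiIndex.
    mask = df.index.get_level_values(level) == label
    # Assign a plain ndarray so the right-hand side carries no index to align on.
    df.loc[mask] = df.loc[mask].to_numpy() * factor
    return df


# Illustrative data (not from the report): 'blah' appears in the third level.
index = pd.MultiIndex.from_tuples([
    ('bar', 'one', 'blah'),
    ('bar', 'two', 'x'),
    ('baz', 'one', 'blah'),
    ('qux', 'one', 'y'),
])
df = pd.DataFrame(np.arange(12, dtype=float).reshape(4, 3),
                  index=index, columns=['A', 'B', 'C'])

scale_level(df, level=2, label='blah', factor=2)
print(df)
```

Only the two rows labelled 'blah' in the third level are doubled; the others keep their values. The sketch assumes all selected columns are numeric.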