AssertionError while boolean indexing with non-unique columns · Issue #4879 · pandas-dev/pandas

I hit this AssertionError: cannot create BlockManager._ref_locs ... when boolean indexing a DataFrame that has multiple empty column names. It fails with 0.12/master and passes with 0.11.

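import numpy as np
import pandas as pd
# Two of the four columns share the empty-string label
df = pd.DataFrame(np.random.rand(3,4), columns=['', '', 'C', 'D'])
# Boolean indexing on column C raises the AssertionError below
df[df.C > .5]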
Traceback (most recent call last):
  File "test_fail.py", line 5, in <module>
    df[df.C > .5]
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/frame.py", line 1830, in __getitem__
    return self._getitem_frame(key)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/frame.py", line 1898, in _getitem_frame
    return self.where(key)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/generic.py", line 2372, in where
    cond = cond.reindex(**self._construct_axes_dict())
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/generic.py", line 1118, in reindex
    return self._reindex_axes(axes, level, limit, method, fill_value, copy, takeable=takeable)._propogate_attributes(self)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/frame.py", line 2413, in _reindex_axes
    fill_value, limit, takeable=takeable)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/frame.py", line 2436, in _reindex_columns
    copy=copy, fill_value=fill_value, allow_dups=takeable)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/generic.py", line 1218, in _reindex_with_indexers
    fill_value=fill_value, allow_dups=allow_dups)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/internals.py", line 2840, in reindex_indexer
    return self._reindex_indexer_items(new_axis, indexer, fill_value)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/internals.py", line 2885, in _reindex_indexer_items
    return self.__class__(new_blocks, new_axes)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/internals.py", line 1735, in __init__
    self._set_ref_locs(do_refs=True)
  File "/home/gmd/.virtualenvs/pandas-master/local/lib/python2.7/site-packages/pandas-0.12.0_476_gb889fda-py2.7-linux-i686.egg/pandas/core/internals.py", line 1863, in _set_ref_locs
    "does not have _ref_locs set" % (block, labels))
AssertionError: cannot create BlockManager._ref_locs because block [BoolBlock: [C], 1 x 3, dtype: bool] with duplicate items [Index([u'', u'', u'C', u'D'], dtype=object)] does not have _ref_locs set
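
Until this is fixed, one possible workaround is to make the column labels unique before filtering. The following is only a sketch: the helper name dedupe_columns is hypothetical (it is not part of pandas), the suffix scheme is an arbitrary choice, and I have not checked it against the 0.12 build from the traceback above.

```python
import numpy as np
import pandas as pd


def dedupe_columns(columns):
    """Hypothetical helper: make labels unique by suffixing repeats with _1, _2, ..."""
    used = set()
    result = []
    for label in columns:
        candidate, n = label, 0
        # Keep bumping the suffix until the label has not been used yet
        while candidate in used:
            n += 1
            candidate = "%s_%d" % (label, n)
        used.add(candidate)
        result.append(candidate)
    return result


df = pd.DataFrame(np.random.rand(3, 4), columns=['', '', 'C', 'D'])

# Work on a copy so the original frame keeps its duplicate labels
unique_df = df.copy()
if not unique_df.columns.is_unique:
    unique_df.columns = dedupe_columns(unique_df.columns)

# Same boolean filter as in the report, now on unique column labels
filtered = unique_df[unique_df['C'] > .5]
```

The renamed labels can be mapped back afterwards with filtered.columns = df.columns if the duplicate names need to be preserved in the result.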