indexing.py: "'bool' object has no attribute 'any'" with duplicate time index · Issue #17105 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd

trange = pd.date_range(start=pd.Timestamp(year=2017, month=1, day=1), end=pd.Timestamp(year=2017, month=1, day=5))
# make a duplicate
trange = trange.insert(loc=5, item=pd.Timestamp(year=2017, month=1, day=5))
df = pd.DataFrame(0, index=trange, columns=["A", "B"])
bool_idx = np.array([True, False, False, False, False, True])
df.loc[trange[bool_idx], "A"] += 1
Throws error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-32-b0e70145e9a6> in <module>()
----> 1 df.loc[trange[bool_idx], "A"] += 1
/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
176 else:
177 key = com._apply_if_callable(key, self.obj)
--> 178 indexer = self._get_setitem_indexer(key)
179 self._setitem_with_indexer(indexer, value)
180
/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in _get_setitem_indexer(self, key)
155 if isinstance(key, tuple):
156 try:
--> 157 return self._convert_tuple(key, is_setter=True)
158 except IndexingError:
159 pass
/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in _convert_tuple(self, key, is_setter)
222 if i >= self.obj.ndim:
223 raise IndexingError('Too many indexers')
--> 224 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
225 keyidx.append(idx)
226 return tuple(keyidx)
/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
1228
1229 mask = check == -1
-> 1230 if mask.any():
1231 raise KeyError('%s not in index' % objarr[mask])
1232
AttributeError: 'bool' object has no attribute 'any'
Problem description
At first I wasn't aware that I had a duplicated index in my code; I only realised this when trying to reproduce the error in the example above. I used a time index because my code has a time index too.
The problem is in indexing.py, lines 1235 and 1236, where the variable mask is a Python builtin bool (instead of a numpy array) and therefore does not support the method .any().
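The failure mode can be shown in isolation. The sketch below is not the pandas code path: `has_missing` and `has_missing_safe` are hypothetical helpers, and the plain integer passed in is only an assumed stand-in for whatever non-array value `check` holds when the index has duplicates. It shows that comparing a non-array object with -1 yields a builtin bool, which has no `.any()`, and that wrapping the comparison in `np.asarray` would avoid the AttributeError.

```python
import numpy as np


def has_missing(check):
    # Mirrors the two lines shown in the traceback: compare, then call .any().
    mask = check == -1
    return mask.any()


def has_missing_safe(check):
    # Hypothetical defensive variant: coerce the comparison result to an
    # array first, so .any() exists even when the result is a builtin bool.
    mask = np.asarray(check == -1)
    return bool(mask.any())


# With a numpy array the comparison is element-wise and .any() works.
print(has_missing(np.array([0, -1, 2])))

# With a plain Python object the comparison returns a builtin bool.
try:
    has_missing(5)
except AttributeError as exc:
    print("AttributeError:", exc)

# The defensive variant handles both cases.
print(has_missing_safe(5))
print(has_missing_safe(np.array([0, -1, 2])))
```

Whether pandas should coerce the mask this way or raise a dedicated error for duplicate labels is a separate design question; the sketch only explains why the message mentions 'bool'.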
Expected Output
I am not sure if the code is legal with duplicates (I will filter duplicates in my code now). Nonetheless, I think a decent error message should be raised in such a case.
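As a possible workaround on the user side, duplicates can be detected and filtered before the assignment. This is a minimal sketch using `Index.is_unique` and `Index.duplicated`; the helper `require_unique_index` and its error message are hypothetical and only illustrate what a more descriptive error could look like. Keeping the first occurrence of each label is an assumption and may not suit every dataset.

```python
import pandas as pd


def require_unique_index(frame):
    """Hypothetical helper: fail early with a readable message."""
    if not frame.index.is_unique:
        dupes = frame.index[frame.index.duplicated()].unique()
        raise ValueError(f"index contains duplicate labels: {list(dupes)}")
    return frame


# Rebuild the frame from the report, including the duplicated timestamp.
trange = pd.date_range(start="2017-01-01", end="2017-01-05")
trange = trange.insert(5, pd.Timestamp("2017-01-05"))
df = pd.DataFrame(0, index=trange, columns=["A", "B"])

# Drop repeated labels, keeping the first occurrence of each.
deduped = df[~df.index.duplicated(keep="first")].copy()
require_unique_index(deduped)

# Label-based assignment on the now-unique index.
labels = deduped.index[[0, 4]]
deduped.loc[labels, "A"] += 1
print(deduped)
```

Calling `require_unique_index(df)` on the original frame raises a ValueError that names the duplicated timestamp, which is the kind of message the report asks for.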
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-58-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None