BUG: list subclass objects don't work as indexers for data frames in 1.3.x · Issue #42433 · pandas-dev/pandas



Code Sample, a copy-pastable example

from pandas import DataFrame

df = DataFrame(dict(x=[1, 2, 3]))

class MyList(list): ...

key = MyList(['x'])
print(df[key])

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    df[key]
  File "/.../python3.7/site-packages/pandas/core/frame.py", line 3445, in __getitem__
    if com.is_bool_indexer(key):
  File "/.../python3.7/site-packages/pandas/core/common.py", line 146, in is_bool_indexer
    return len(key) > 0 and lib.is_bool_list(key)
TypeError: Argument 'obj' has incorrect type (expected list, got MyList)

Problem description

This works in v1.2.*.

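This is because the following check in DataFrame.__getitem__ (frame.py):

if com.is_bool_indexer(key):
    return self._getitem_bool_array(key)

now raises a TypeError for list subclasses instead of returning False, so MyList objects never reach the code path that treats them as a collection of keys.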

Stepping into is_bool_indexer in v1.3.0, the list branch is:

elif isinstance(key, list):
    # check if np.array(key).dtype would be bool
    return len(key) > 0 and lib.is_bool_list(key)
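
The mismatch can be reproduced without pandas. The sketch below assumes, based on the wording of the traceback, that lib.is_bool_list only accepts an object whose type is exactly list. The function strict_is_bool_list is a hypothetical pure-Python stand-in, not the pandas implementation, and its element check is a simplification.

```python
def strict_is_bool_list(obj):
    """Hypothetical stand-in for lib.is_bool_list (an assumption, not pandas code).

    It rejects anything whose type is not exactly ``list``, mirroring the
    error message in the traceback above.
    """
    if type(obj) is not list:
        raise TypeError(
            "Argument 'obj' has incorrect type "
            f"(expected list, got {type(obj).__name__})"
        )
    # Simplified element check: every item must be a Python bool.
    return all(isinstance(item, bool) for item in obj)


class MyList(list):
    pass


key = MyList(["x"])

# The branch condition accepts the subclass instance ...
passes_isinstance = isinstance(key, list)

# ... but the exact-type check inside the helper rejects it.
try:
    strict_is_bool_list(key)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(passes_isinstance)
print(error_message)
```

The point is that isinstance(key, list) and an exact-type check disagree for subclasses, so the branch is entered with an object that the helper refuses.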

Compared to v1.2.5:

elif isinstance(key, list):
    try:
        arr = np.asarray(key)
        return arr.dtype == np.bool_ and len(arr) == len(key)
    except TypeError:  # pragma: no cover
        return False

A possible solution would be:

elif isinstance(key, list):
    # check if np.array(key).dtype would be bool
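
The snippet above stops after the comment. One way the branch could be completed is to copy a list subclass into a plain list before the strict call. This is an assumption for illustration, not necessarily the reporter's patch or the fix adopted in pandas. The names strict_is_bool_list and is_bool_list_indexer are hypothetical, and the sketch covers only the list branch.

```python
def strict_is_bool_list(obj):
    """Hypothetical stand-in for lib.is_bool_list; accepts only an exact list."""
    if type(obj) is not list:
        raise TypeError(
            "Argument 'obj' has incorrect type "
            f"(expected list, got {type(obj).__name__})"
        )
    # Simplified element check: every item must be a Python bool.
    return all(isinstance(item, bool) for item in obj)


def is_bool_list_indexer(key):
    """Sketch of the list branch only (not the pandas implementation)."""
    if not isinstance(key, list):
        return False
    if len(key) == 0:
        # An empty list is never treated as a boolean mask.
        return False
    if type(key) is not list:
        # Copy a subclass instance into a plain list so the strict
        # helper accepts it.
        key = list(key)
    return strict_is_bool_list(key)


class MyList(list):
    pass


print(is_bool_list_indexer(MyList(["x"])))
print(is_bool_list_indexer(MyList([True, False])))
```

With this normalization, a MyList of column labels is reported as a non-boolean indexer instead of raising, so it would fall through to ordinary column selection.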

Expected Output

Print out the data frame df.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : f00ed8f
python : 3.7.8.final.0
python-bits : 64
OS : Linux
OS-release : 4.4.0-19041-Microsoft
Version : #488-Microsoft Mon Sep 01 13:43:00 PST 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0
numpy : 1.21.0
pytz : 2020.4
dateutil : 2.8.1
pip : 19.3.1
setuptools : 52.0.0.post20210125
Cython : None
pytest : 6.2.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.3
sqlalchemy : 1.3.20
tables : None
tabulate : 0.8.6
xarray : None
xlrd : None
xlwt : None
numba : None