BUG: list subclass objects don't work as indexers for data frames in 1.3.x · Issue #42433 · pandas-dev/pandas
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
from pandas import DataFrame

df = DataFrame(dict(x=[1, 2, 3]))

class MyList(list): ...

key = MyList(['x'])
print(df[key])
Traceback (most recent call last):
File "<input>", line 1, in <module>
df[key]
File "/.../python3.7/site-packages/pandas/core/frame.py", line 3445, in __getitem__
if com.is_bool_indexer(key):
File "/.../python3.7/site-packages/pandas/core/common.py", line 146, in is_bool_indexer
return len(key) > 0 and lib.is_bool_list(key)
TypeError: Argument 'obj' has incorrect type (expected list, got MyList)
Problem description
This works in v1.2.*.
This is because the following check in DataFrame.__getitem__:
if com.is_bool_indexer(key):
    return self._getitem_bool_array(key)
raises for list subclasses instead of letting them through, so MyList objects are never treated as a collection of keys.
If we step into com.is_bool_indexer:
elif isinstance(key, list):
    # check if np.array(key).dtype would be bool
    return len(key) > 0 and lib.is_bool_list(key)
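The mismatch can be illustrated without pandas. The sketch below is only an emulation: strict_is_bool_list is a hypothetical pure-Python stand-in for lib.is_bool_list, written on the assumption (suggested by the TypeError message in the traceback) that the real function accepts only exact list instances. It shows how a key can pass the isinstance check and still be rejected one call later.

```python
class MyList(list):
    """A trivial list subclass, as in the code sample above."""


def strict_is_bool_list(obj):
    # Hypothetical stand-in for lib.is_bool_list (an assumption, not the
    # pandas implementation): accept only exact `list` instances and
    # raise the same kind of TypeError seen in the traceback otherwise.
    if type(obj) is not list:
        raise TypeError(
            "Argument 'obj' has incorrect type "
            f"(expected list, got {type(obj).__name__})"
        )
    return all(isinstance(item, bool) for item in obj)


key = MyList(["x"])

# The subclass satisfies isinstance, so the `elif isinstance(key, list)`
# branch is entered ...
print(isinstance(key, list))

# ... but it is not an exact list, so the strict helper rejects it.
print(type(key) is list)

try:
    strict_is_bool_list(key)
except TypeError as exc:
    message = str(exc)
    print(message)
```

A plain list such as ['x'] goes through the same helper without raising, which matches the observation that only subclasses are affected.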
Compared to v1.2.5:
elif isinstance(key, list):
    try:
        arr = np.asarray(key)
        return arr.dtype == np.bool_ and len(arr) == len(key)
    except TypeError:  # pragma: no cover
        return False
A possible solution would be:
elif isinstance(key, list):
    # check if np.array(key).dtype would be bool
    try:
        return len(key) > 0 and lib.is_bool_list(key)
    except TypeError:
        return False
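To make the trade-off of this proposal concrete, here is a small pure-Python sketch. Everything in it is hypothetical: strict_is_bool_list again stands in for lib.is_bool_list under the assumption that it rejects list subclasses, is_bool_list_key_try mirrors the try/except proposed above, and is_bool_list_key_cast is an alternative (not taken from the report) that copies a subclass into a plain list before checking.

```python
class MyList(list):
    """A trivial list subclass, as in the code sample above."""


def strict_is_bool_list(obj):
    # Hypothetical stand-in for lib.is_bool_list: exact `list` only.
    if type(obj) is not list:
        raise TypeError(
            "Argument 'obj' has incorrect type "
            f"(expected list, got {type(obj).__name__})"
        )
    return all(isinstance(item, bool) for item in obj)


def is_bool_list_key_try(key):
    """Mirror of the proposed fix: a TypeError means 'not a boolean key'."""
    if isinstance(key, list):
        try:
            return len(key) > 0 and strict_is_bool_list(key)
        except TypeError:
            return False
    return False


def is_bool_list_key_cast(key):
    """Alternative sketch: copy list subclasses into a plain list first."""
    if isinstance(key, list):
        if type(key) is not list:
            key = list(key)
        return len(key) > 0 and strict_is_bool_list(key)
    return False


label_key = MyList(["x"])
mask_key = MyList([True, False, True])

# Both variants stop raising for a subclass holding column labels.
print(is_bool_list_key_try(label_key), is_bool_list_key_cast(label_key))

# They differ for a subclass holding booleans: the try/except variant
# reports False, the casting variant reports True.
print(is_bool_list_key_try(mask_key), is_bool_list_key_cast(mask_key))
```

Both variants fix the case in this report. The difference matters only for a list subclass that holds booleans: the v1.2.5 code quoted above would presumably have recognised it as a boolean indexer through np.asarray, the try/except variant would not, and the casting variant would.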
Expected Output
The data frame df is printed.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : f00ed8f
python : 3.7.8.final.0
python-bits : 64
OS : Linux
OS-release : 4.4.0-19041-Microsoft
Version : #488-Microsoft Mon Sep 01 13:43:00 PST 2020
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.3.0
numpy : 1.21.0
pytz : 2020.4
dateutil : 2.8.1
pip : 19.3.1
setuptools : 52.0.0.post20210125
Cython : None
pytest : 6.2.2
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.4
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.3
sqlalchemy : 1.3.20
tables : None
tabulate : 0.8.6
xarray : None
xlrd : None
xlwt : None
numba : None