BUG: boolean indexing error with .drop() · Issue #16877 · pandas-dev/pandas (original) (raw)
Code Sample, a copy-pastable example if possible
df = pd.DataFrame( data = { 'acol' : np.arange(4), 'bcol' : 2*np.arange(4) }) df.drop(df.bcol > 2, axis=0, inplace=True)
print(df)
Expected Output
Observed Output
acol bcol
2 2 4 3 3 6 4 4 8
Problem description
The anticipated behavior was that rows with bcol
> 2 would be dropped. The actual behavior is that the boolean gets converted to 0/1, and then treated as index label. So row numbers 0 and/or 1 are dropped... but all other rows will be kept.
The documentation did not make it clear what was happening.
Solutions might include documentation clarifying that .drop()
cannot be used with boolean indexing, or a warning when receiving the (attempted) boolean index.
Output of pd.show_versions()
INSTALLED VERSIONS ------------------ commit: None python: 3.6.1.final.0 python-bits: 64 OS: Linux OS-release: 2.6.32-573.12.1.el6.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8
pandas: 0.20.2
pytest: 3.1.2
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.1
xarray: 0.9.6
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.5.0a1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.8.0
bs4: 4.5.3
html5lib: 0.9999999
sqlalchemy: 1.1.11
pymysql: 0.7.9.None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: 0.1.1
pandas_gbq: None
pandas_datareader: None