BUG: drop on axis with duplicate values doesn't raise when label is absent · Issue #19186 · pandas-dev/pandas (original) (raw)

Code Sample, a copy-pastable example if possible

>>> items
             item   name
store1      dress   V
store2       food   A
store2      water   C

CASE 1 : axis set to column and given a value which is not available in the structure. Gives ValueError.

>>> items.drop('abcd',axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/pandas/core/generic.py", line 2530, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/generic.py", line 2562, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3744, in drop
    labels[mask])
ValueError: labels ['abcd'] not contained in axis

CASE 2 : axis set to row, and given a value which is not available in structure should behave in the same way as the above case does.
Instead it just prints the structure table. Shouldn’t it be throwing ValueError ?

>>> items.drop('abcd',axis=0)
             item   name
store1      dress   V
store2       food   A
store2      water   C
>>> 

Problem description

When provided invalid column name(axis=1) it throws "ValueError". Same happens with row as well(axis=0). But when there are duplicate indices in the DataFrame it doesn't.

Expected Output

>>> items.drop('d',axis=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/pandas/core/generic.py", line 2530, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/generic.py", line 2562, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3744, in drop
    labels[mask])
ValueError: labels ['d'] not contained in axis

Output of pd.show_versions()

INSTALLED VERSIONS ------------------ commit: None python: 3.6.4.final.0 python-bits: 64 OS: Darwin OS-release: 17.3.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_AU.UTF-8 LOCALE: en_AU.UTF-8

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0
Cython: 0.27.3
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None