Calling drop_duplicates method for empty pandas dataframe throws error · Issue #20516 · pandas-dev/pandas
Code Sample
pd.DataFrame().drop_duplicates()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/analytical-monk/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3098, in drop_duplicates
    duplicated = self.duplicated(subset, keep=keep)
  File "/home/analytical-monk/miniconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 3144, in duplicated
    labels, shape = map(list, zip(*map(f, vals)))
ValueError: not enough values to unpack (expected 2, got 0)
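The failing line in the traceback, `labels, shape = map(list, zip(*map(f, vals)))`, can be reproduced without pandas. The sketch below assumes that a frame with no columns leaves `vals` empty, so `zip()` yields nothing and the two-name unpacking fails. The function `unpack_labels_and_shape` and its inner `f` are hypothetical stand-ins, not the pandas internals.

```python
def unpack_labels_and_shape(columns):
    """Mimic the unpacking pattern from the traceback on plain lists."""

    def f(values):
        # Hypothetical stand-in for a per-column step that returns
        # (integer labels, number of distinct values).
        uniques = sorted(set(values))
        labels = [uniques.index(v) for v in values]
        return labels, len(uniques)

    # Same shape as the line in the traceback: transpose the per-column
    # (labels, size) pairs into two lists.
    labels, shape = map(list, zip(*map(f, columns)))
    return labels, shape


# With at least one column, the unpacking succeeds.
print(unpack_labels_and_shape([[1, 1, 2]]))

# With no columns, zip() produces nothing, so there is nothing to unpack.
try:
    unpack_labels_and_shape([])
except ValueError as exc:
    print(exc)
```

If the assumption holds, the second call raises the same `ValueError` message as the one reported above.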
Problem description
Currently, calling the drop_duplicates method on an empty dataframe object (simply pd.DataFrame()) throws an error.
Ideally it should return the empty dataframe, just like it does when at least one column is present.
Expected Output
>>> pd.DataFrame().drop_duplicates()
Empty DataFrame
Columns: []
Index: []
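Until the method handles this case itself, a small guard can produce the expected output. This is only a sketch: `drop_duplicates_safe` is a hypothetical helper name, not part of pandas, and it assumes that a frame without columns has nothing to deduplicate.

```python
import pandas as pd


def drop_duplicates_safe(df, **kwargs):
    """Return df without duplicate rows, tolerating frames with no columns.

    Hypothetical helper, not a pandas API. Keyword arguments are passed
    through to DataFrame.drop_duplicates unchanged.
    """
    # A frame with no columns has no values to compare, so return a copy
    # instead of calling drop_duplicates.
    if len(df.columns) == 0:
        return df.copy()
    return df.drop_duplicates(**kwargs)


print(drop_duplicates_safe(pd.DataFrame()))
print(drop_duplicates_safe(pd.DataFrame({"a": [1, 1, 2]})))
```

The guard checks `df.columns` rather than `df.empty`, because the report says the call already works when at least one column is present.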
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-58-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_IN
LOCALE: en_IN.ISO8859-1
pandas: 0.20.3
pytest: None
pip: 9.0.1
setuptools: 36.6.0
Cython: None
numpy: 1.13.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 1.0b10
sqlalchemy: 1.1.14
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None