BUG/ERR: wrong error in DataFrame.drop with non-unique datetime index + invalid keys · Issue #30399 · pandas-dev/pandas (original) (raw)
Consider this example, where there is a DataFrame with a non-unique DatetimeIndex:
In [8]: df = pd.DataFrame(np.random.randn(5, 3), columns=['a', 'b', 'c'], index=pd.date_range("2012", freq='H', periods=5))
In [9]: df = df.iloc[[0, 2, 2, 3]]
In [10]: df
Out[10]:
a b c
2012-01-01 00:00:00 -1.534726 -0.559295 0.207194
2012-01-01 02:00:00 -1.072027 0.376595 0.407512
2012-01-01 02:00:00 -1.072027 0.376595 0.407512
2012-01-01 03:00:00 0.581614 1.782635 -0.678197
If you then use drop
to drop some columns, but forget to specify columns=
or axis=1
(so you are actually dropping rows), you get a wrong error and very confusing error message:
In [10]: df.drop(['a', 'b'])
...
~/scipy/pandas/pandas/core/indexes/base.py in get_indexer_non_unique(self, target)
4559 tgt_values = target._ndarray_values
4560
-> 4561 indexer, missing = self._engine.get_indexer_non_unique(tgt_values)
4562 return ensure_platform_int(indexer), missing
4563
~/scipy/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_indexer_non_unique()
TypeError: 'NoneType' object is not iterable
Tested with pandas 0.25 and pandas master.