API: better error-handling for df.set_index · Issue #22484 · pandas-dev/pandas
Description
Splitting up #22236.

Let's have `df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))`.

The error handling of `df.set_index` can be improved in at least three cases:
- `df.set_index(['A', 'A'], drop=False)` works, while `df.set_index(['A', 'A'], drop=True)` yields `KeyError: 'A'`
- Objects of unknown type yield `KeyError` instead of `TypeError`: `df.set_index(map(str, df.A))` yields `KeyError: "None of [Index([...], dtype='object')] are in the [columns]"`
- `df.set_index(['foo', 'bar', 'baz'])` only shows one missing key, `KeyError: 'foo'` (in a huge stacktrace)
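
The three cases above can be checked side by side with a small script. This is only a sketch: `outcome` is a hypothetical helper introduced here, not part of pandas, and the exceptions and messages it reports depend on the installed pandas version, so no expected output is shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))


def outcome(keys, **kwargs):
    """Return 'ok' if df.set_index succeeds, else '<ExceptionType>: <message>'.

    Hypothetical helper for this report only; not a pandas API.
    """
    try:
        df.set_index(keys, **kwargs)
    except Exception as exc:  # deliberately broad: we want to see what is raised
        return f"{type(exc).__name__}: {exc}"
    return "ok"


# Case 1: duplicate column labels, with and without dropping the columns.
print(outcome(['A', 'A'], drop=False))
print(outcome(['A', 'A'], drop=True))

# Case 2: an object of a type the issue describes as unknown (a map iterator).
print(outcome(map(str, df.A)))

# Case 3: several missing column labels at once.
print(outcome(['foo', 'bar', 'baz']))
```

Comparing the printed lines across pandas versions shows which of the three cases a given release still exhibits.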
Better would be:
- gracefully handle duplicate column names when `drop=True`
- raise a better error message, e.g. `TypeError: only allowed types are: ...`
- show all missing keys: `KeyError: "['foo', 'bar', 'baz']"`
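
To make the last two proposals concrete, here is a minimal sketch of an up-front check that could run before `set_index` does any work. It is an illustration under stated assumptions, not how pandas implements or should implement this: the name `check_set_index_keys`, the set of accepted types, and the exact message wording are all hypothetical. Duplicate-label handling for `drop=True` is not covered.

```python
import numpy as np
import pandas as pd


def check_set_index_keys(df, keys):
    """Hypothetical pre-check for set_index keys; not pandas' implementation.

    Raises TypeError for an entry of unsupported type, and a single
    KeyError that names every missing column label.
    """
    if not isinstance(keys, list):
        keys = [keys]

    # Assumption: these array-like types are used as index values directly.
    array_like = (pd.Series, pd.Index, np.ndarray, list)

    missing = []
    for key in keys:
        if isinstance(key, array_like):
            continue
        # Assumption: scalars and tuples are treated as column labels.
        if pd.api.types.is_scalar(key) or isinstance(key, tuple):
            if key not in df.columns:
                missing.append(key)
            continue
        raise TypeError(
            "only allowed types are: column label, Series, Index, "
            f"np.ndarray, list; got {type(key).__name__}"
        )

    if missing:
        # Report all missing labels at once rather than only the first.
        raise KeyError(str(missing))


df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))
check_set_index_keys(df, ['A', 'B'])  # valid labels pass silently
```

With this sketch, `check_set_index_keys(df, ['foo', 'bar', 'baz'])` raises one `KeyError` listing all three labels, and `check_set_index_keys(df, map(str, df.A))` raises `TypeError` instead of `KeyError`.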