pandas.Index.drop_duplicates — pandas 3.0.0.dev0+2099.g3832e85779 documentation
Index.drop_duplicates(*, keep='first')
Return Index with duplicate values removed.
Parameters:
keep : {‘first’, ‘last’, False}, default ‘first’
- ‘first’ : Drop duplicates except for the first occurrence.
- ‘last’ : Drop duplicates except for the last occurrence.
- False : Drop all duplicates.
Returns:
Index
A new Index object with the duplicate values removed.
Examples
Generate a pandas.Index with duplicate values.
idx = pd.Index(["llama", "cow", "llama", "beetle", "llama", "hippo"])
The keep parameter controls which duplicate values are removed. The value ‘first’ keeps the first occurrence for each set of duplicated entries. The default value of keep is ‘first’.
idx.drop_duplicates(keep="first")
Index(['llama', 'cow', 'beetle', 'hippo'], dtype='object')
The value ‘last’ keeps the last occurrence for each set of duplicated entries.
idx.drop_duplicates(keep="last")
Index(['cow', 'beetle', 'llama', 'hippo'], dtype='object')
The value False discards all sets of duplicated entries.
idx.drop_duplicates(keep=False)
Index(['cow', 'beetle', 'hippo'], dtype='object')
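The three keep modes can also be sketched in plain Python on an ordinary list, which may make the selection rule easier to follow. This is only an illustration of the semantics described above, not how pandas implements the method; the helper name drop_duplicates_sketch is hypothetical and not part of pandas.

```python
from collections import Counter


def drop_duplicates_sketch(values, keep="first"):
    """Illustrate the three `keep` modes on a plain list (hypothetical helper)."""
    if keep == "first":
        seen = set()
        result = []
        for value in values:
            # Keep a value only the first time it is encountered.
            if value not in seen:
                seen.add(value)
                result.append(value)
        return result
    if keep == "last":
        # Keeping the last occurrence equals keeping the first occurrence
        # when scanning from the end, then restoring the original order.
        return drop_duplicates_sketch(values[::-1], keep="first")[::-1]
    if keep is False:
        # Discard every value that occurs more than once.
        counts = Counter(values)
        return [value for value in values if counts[value] == 1]
    raise ValueError("keep must be 'first', 'last' or False")


labels = ["llama", "cow", "llama", "beetle", "llama", "hippo"]
print(drop_duplicates_sketch(labels, keep="first"))  # ['llama', 'cow', 'beetle', 'hippo']
print(drop_duplicates_sketch(labels, keep="last"))   # ['cow', 'beetle', 'llama', 'hippo']
print(drop_duplicates_sketch(labels, keep=False))    # ['cow', 'beetle', 'hippo']
```

In each mode the surviving labels keep their original relative order, which matches the outputs shown in the examples above.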