BUG: in _nsorted for frame with duplicated values index · Issue #13412 · pandas-dev/pandas
Description
The function below is implemented incorrectly. If the frame's index contains duplicated values, the result has more than n rows and is not properly sorted. As a consequence, nsmallest and nlargest for DataFrame do not return a correct frame in this case.
def _nsorted(self, columns, n, method, keep):
    if not com.is_list_like(columns):
        columns = [columns]
    columns = list(columns)
    ser = getattr(self[columns[0]], method)(n, keep=keep)
    ascending = dict(nlargest=False, nsmallest=True)[method]
    return self.loc[ser.index].sort_values(columns, ascending=ascending,
                                           kind='mergesort')
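
A minimal sketch of the mechanism, using only public pandas calls rather than `_nsorted` itself: the label-based lookup `self.loc[ser.index]` returns every row that shares a selected label, so duplicated labels inflate the result. The frame `df` and its values are made up for illustration, and the positional selection at the end is only one possible way to avoid the problem, not necessarily the fix adopted in pandas.

```python
import pandas as pd

# Hypothetical frame whose index contains duplicated labels.
df = pd.DataFrame({"a": [1, 2, 3, 4]}, index=[0, 0, 1, 1])

# Step 1 of _nsorted: take the n largest values of the first column.
# The Series result has exactly one row (value 4, label 1).
ser = df["a"].nlargest(1)

# Step 2 of _nsorted: look the rows up again by label.
# Label 1 occurs twice, so both rows come back even though n == 1.
selected = df.loc[ser.index]
print(len(ser), len(selected))

# Illustrative alternative (an assumption, not the upstream fix):
# select by position so duplicated labels cannot match extra rows.
positions = df["a"].reset_index(drop=True).nlargest(1).index.to_numpy()
by_position = df.iloc[positions]
print(len(by_position))
```

Selecting by position keeps the original labels on the returned rows while guaranteeing that at most n rows are selected.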