ENH: Support sorting DataFrames by a combination of columns and index levels · Issue #14353 · pandas-dev/pandas

Background

During the review of @jreback's PR last year to clean up the sorting API (#10726), there was some discussion of how the DataFrame API could eventually support sorting by a combination of columns and index levels. I'm interested in implementing this soon and would like to continue the discussion of where it should fit into the DataFrame sorting API.

In #10726 (comment) @jorisvandenbossche made the following suggestion:

If we want to add this enhancement to simultaneously specify to sort on index levels and columns (the 5d option of above), then the question is: where do we add this functionality and how? In sorted, sort_index or both? I would then lean towards saying: only add it in sorted, where the by keyword can also denote a index level name.

This approach makes good sense to me. Each object passed to the by keyword of sort_values (referred to as sorted in the quote above) could refer to either a column or an index level. For backwards compatibility, a name that matches both a column and an index level would resolve to the column. My assumption is that we would want to preserve the index when sorting by a combination of columns and index levels this way.
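To make the proposed semantics concrete, here is a minimal sketch that emulates them with existing pandas calls. The helper name `sort_by_columns_and_levels` is hypothetical and not part of pandas; it only illustrates the rules described above (a name resolves to a column first, otherwise to an index level, and the original index is kept on the result).

```python
import pandas as pd


def sort_by_columns_and_levels(df, by, ascending=True):
    """Hypothetical helper (not a pandas API): sort ``df`` by names that
    may refer to columns or to index levels, keeping the index intact."""
    if isinstance(by, str):
        by = [by]

    # Resolve each name: columns take precedence over index levels.
    # get_level_values raises KeyError if the name is not an index level.
    keys = pd.DataFrame(
        {
            i: (
                df[name].to_numpy()
                if name in df.columns
                else df.index.get_level_values(name).to_numpy()
            )
            for i, name in enumerate(by)
        }
    )

    # ``keys`` has a default 0..n-1 index, so after sorting its index
    # holds the row positions of ``df`` in sorted order.
    order = keys.sort_values(by=list(keys.columns), ascending=ascending).index
    return df.iloc[order.to_numpy()]


df = pd.DataFrame(
    {"value": [3, 1, 2, 1]},
    index=pd.MultiIndex.from_tuples(
        [("b", 1), ("a", 2), ("a", 1), ("b", 2)], names=["outer", "inner"]
    ),
)

# "outer" is an index level, "value" is a column.
result = sort_by_columns_and_levels(df, ["outer", "value"])
print(result)
```

Sorting positions rather than calling reset_index/set_index is one way to guarantee the index survives unchanged; whether the real implementation should work this way is an open question for this issue.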

This proposal is the sorting analog of the groupby proposal in #5677 (which I will be working on soon).