topkrows - Top rows in sorted order - MATLAB
Syntax
Description
Array Data
B = topkrows(X,k) sorts the rows in X and returns the top k rows of the sorted data. The rows are sorted in descending order (for numeric data) or reverse alphabetical order (for text data). topkrows sorts based on the elements in the first column. When the first column contains elements of equal value, topkrows sorts according to the elements in the next column and repeats this behavior for succeeding equal values.
B = topkrows(X,k,col) sorts the results by the columns specified by col. Use this syntax to perform multiple column sorts in succession. For example, topkrows(X,k,5) sorts the rows of X in descending order based on the elements in the fifth column. topkrows(X,k,[4 6]) first sorts the rows in descending order by the elements in the fourth column, and then it sorts based on the elements in the sixth column to break ties.
B = topkrows(X,___,direction) specifies the direction of the sorting using any of the previous syntaxes. For example, topkrows(A,2,[2 3],{'ascend' 'descend'}) gets the top 2 rows by first sorting rows in ascending order by the elements in column 2. Then, it sorts the rows with equal entries in column 2 in descending order by the elements in column 3.
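As a small illustration of the col and direction arguments used together, the following sketch applies that call to a made-up 4-by-3 matrix A (the values are an assumption for illustration, not data from this page) in which two rows tie in column 2.

```matlab
% Hypothetical data: rows 2 and 3 tie in column 2.
A = [1 5 2
     2 3 7
     3 3 9
     4 8 1];

% Sort ascending by column 2, break ties descending by column 3,
% and keep the top 2 rows.
B = topkrows(A,2,[2 3],{'ascend','descend'});

disp(B)
% Expected under the rules described above:
%      3     3     9
%      2     3     7
```

Both returned rows have the smallest value in column 2, and the row with the larger value in column 3 comes first.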
B = topkrows(X,___,'ComparisonMethod',method) specifies how to compare complex numbers in X. The comparison method can be 'auto', 'real', or 'abs'.
[B,I] = topkrows(X,___) also returns an index vector I that describes the order of the selected rows such that B = X(I,:).
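The relationship B = X(I,:) can be checked with a short sketch. The matrix below is invented for illustration and contains a tie in column 1 so that the tie-breaking rule is visible.

```matlab
% Hypothetical 4-by-2 matrix; rows 1 and 3 tie in column 1.
X = [3 1
     7 2
     3 9
     5 4];

% Top 3 rows in descending order. Ties in column 1 are broken by column 2.
[B,I] = topkrows(X,3);

disp(B)              % rows [7 2], [5 4], [3 9]
disp(I')             % 2 4 3
isequal(B,X(I,:))    % logical 1
```

I holds positions in the original matrix, so it can also be used to pull the matching rows out of any other array that shares the row order of X.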
Table Data
B = topkrows(T,k) returns the first k rows in table or timetable T, in sorted order. Table rows are in descending sorted order by all of their variables, and timetable rows are in descending sorted order by time.
B = topkrows(T,k,vars) sorts the results by the variables specified by vars. Use this syntax to sort with multiple variables in succession. For example, topkrows(T,k,{'Var1','Var2'}) first sorts the rows of T based on the elements in Var1, and then it sorts by the elements in Var2.
B = topkrows(T,k,vars,direction) specifies the direction of the sorting. For example, use 'ascend' to sort T in ascending order.
B = topkrows(T,k,vars,___,'ComparisonMethod',method) specifies how to compare complex numbers in T.
[B,I] = topkrows(T,___) also returns an index vector I that describes the order of the selected rows such that B = T(I,:).
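The table syntaxes can be tried on a small table before moving to real data. In the sketch below, the table T, its variable names Name and Score, and all values are hypothetical. The scores are distinct, so the result does not depend on how ties are resolved.

```matlab
% Hypothetical table with made-up names and scores.
Name  = ["Ana"; "Ben"; "Cy"; "Dee"];
Score = [88; 95; 91; 70];
T = table(Name,Score);

% Top 2 rows by Score in descending order (the default direction).
% I holds the positions of those rows in the original table.
[B,I] = topkrows(T,2,"Score");

disp(B)              % Ben (95), then Cy (91)
disp(I')             % 2 3
isequal(B,T(I,:))    % logical 1
```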
Examples
Sort the rows of a matrix using different sorting orders and view the top rows.
Create a 20-by-5 matrix of random integers between 1 and 10.
rng default % for reproducibility
X = randi(10,20,5);
Sort the rows of X in descending order and return the top 4 rows. By default, topkrows sorts using the first column of the matrix. For any rows that have equal elements in a particular column, the sorting is based on the column immediately to the right.
TA = topkrows(X,4)
TA = 4×5
10 10 8 7 6
10 7 8 2 4
10 4 4 3 5
10 3 7 9 6
When called with three input arguments, topkrows bases the sort entirely on the column specified in the third argument. This means that rows with equal values in the specified column remain in their original order. Sort X in descending order using the values in the third column and return the top 5 rows.
TB = topkrows(X,5,3)
TB = 5×5
5 7 10 2 6
2 9 8 6 6
10 10 8 7 6
10 7 8 2 4
10 2 8 3 6
Sort X using both the third and fourth columns. In this case, topkrows sorts the rows by column 3. Then, for any rows with equal values in column 3, it sorts by column 4.
TC = topkrows(X,5,[3 4])
TC = 5×5
5 7 10 2 6
10 10 8 7 6
2 9 8 6 6
10 2 8 3 6
10 7 8 2 4
Sort a matrix using several columns with different sorting directions.
Create a 100-by-5 matrix of random integers between 1 and 10.
rng default % for reproducibility
X = randi(10,100,5);
Sort X using the first three columns and return the top 10 rows. Specify a sorting direction for each column using a cell array.
TA = topkrows(X,10,1:3,{'descend','ascend','ascend'})
TA = 10×5
10 1 4 6 7
10 1 8 5 1
10 2 3 4 7
10 3 5 10 5
10 4 7 2 4
10 5 5 2 7
10 5 5 6 7
10 6 5 5 7
10 6 6 1 5
10 7 7 8 1
Sort rows of heterogeneous data in a table.
Create a table from the patients.mat data set, which includes basic health information for a group of patients. Include the patients' age, gender, height, and self-assessed health status in the table. Make the SelfAssessedHealthStatus variable an ordinal categorical array.
load patients
vals = {'Poor','Fair','Good','Excellent'};
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus,vals,'Ordinal',true);
T = table(Age,Gender,Height,SelfAssessedHealthStatus);
Find the top 10 rows when the table is sorted in descending order. The result is sorted by the first variable, Age, in descending order. The remaining columns are subsorted to break ties:
- The Gender variable is subsorted to break ties with age.
- The Height variable breaks ties with gender.
- The SelfAssessedHealthStatus variable breaks ties with height.
TA = topkrows(T,10)
TA=10×4 table
Age Gender Height SelfAssessedHealthStatus
___ __________ ______ ________________________
50 {'Male' } 72 Excellent
50 {'Male' } 68 Good
49 {'Male' } 70 Fair
49 {'Male' } 68 Poor
49 {'Female'} 64 Good
49 {'Female'} 63 Good
48 {'Male' } 71 Good
48 {'Male' } 71 Good
48 {'Male' } 66 Fair
48 {'Female'} 66 Excellent
Find the top 10 rows containing the youngest women by sorting on the Gender variable and subsorting on the Age variable.
TB = topkrows(T,10,{'Gender','Age'},'ascend')
TB=10×4 table
Age Gender Height SelfAssessedHealthStatus
___ __________ ______ ________________________
25 {'Female'} 63 Good
25 {'Female'} 64 Excellent
27 {'Female'} 69 Fair
28 {'Female'} 65 Good
28 {'Female'} 65 Good
28 {'Female'} 66 Good
29 {'Female'} 63 Excellent
29 {'Female'} 68 Excellent
29 {'Female'} 64 Good
30 {'Female'} 67 Excellent
Find the top 10 oldest women by changing the sorting direction of the Age variable to 'descend'.
TB = topkrows(T,10,{'Gender','Age'},{'ascend','descend'})
TB=10×4 table
Age Gender Height SelfAssessedHealthStatus
___ __________ ______ ________________________
49 {'Female'} 64 Good
49 {'Female'} 63 Good
48 {'Female'} 65 Excellent
48 {'Female'} 66 Excellent
48 {'Female'} 64 Excellent
48 {'Female'} 64 Good
48 {'Female'} 66 Excellent
47 {'Female'} 66 Excellent
46 {'Female'} 68 Good
45 {'Female'} 68 Excellent
Sort a matrix of complex numbers by absolute value and then by real part.
Create a 100-by-2 matrix of random complex numbers.
valRange = [-10 10];
X = randi(valRange,100,2) + 1i*randi(valRange,100,2);
Find the top 10 rows of the matrix. By default, topkrows compares the complex numbers by absolute value.
TA = topkrows(X,10)
TA = 10×2 complex
-10.0000 + 9.0000i 10.0000 - 2.0000i
-8.0000 + 9.0000i 2.0000 - 8.0000i
9.0000 + 8.0000i 4.0000 + 7.0000i
-6.0000 +10.0000i -8.0000 - 7.0000i
6.0000 -10.0000i -1.0000 - 5.0000i
6.0000 -10.0000i 0.0000 + 5.0000i
-7.0000 + 9.0000i -2.0000 - 5.0000i
9.0000 - 7.0000i 10.0000 + 7.0000i
9.0000 - 7.0000i 6.0000 + 6.0000i
-9.0000 - 7.0000i 9.0000 + 9.0000i
Find the top 10 rows of the matrix using only the real part of the complex numbers by specifying the 'ComparisonMethod' name-value pair.
TB = topkrows(X,10,'ComparisonMethod','real')
TB = 10×2 complex
10.0000 + 4.0000i -3.0000 - 7.0000i
10.0000 + 3.0000i 4.0000 + 5.0000i
10.0000 + 2.0000i 5.0000 - 7.0000i
10.0000 - 1.0000i -1.0000 - 8.0000i
10.0000 - 1.0000i -6.0000 +10.0000i
10.0000 - 4.0000i -9.0000 + 0.0000i
10.0000 - 5.0000i -8.0000 - 3.0000i
9.0000 + 8.0000i 4.0000 + 7.0000i
9.0000 + 5.0000i -10.0000 + 0.0000i
9.0000 + 1.0000i 1.0000 - 9.0000i
Input Arguments
X - Input array, specified as a numeric, logical, character, string, categorical, datetime, or duration array.
- If X is a nonordinal categorical array, then topkrows sorts the elements in descending order based on the order of the categories returned by categories(X).
- If X contains NaN, NaT, or other missing values, then topkrows places the missing values at the end of a descending sort.
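The placement of missing values can be seen with a small matrix. The data below is made up for illustration, and the sketch only exercises the descending case described above.

```matlab
% Hypothetical matrix with a NaN in the first (sorting) column.
X = [2   10
     NaN  5
     7    1
     4    3];

% Request all 4 rows so that the position of the NaN row is visible.
B = topkrows(X,4);

disp(B)
% The NaN row comes last in the descending sort:
%      7     1
%      4     3
%      2    10
%    NaN     5
```

With k smaller than the number of non-missing rows, the NaN row does not appear in the result at all.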
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char | string | categorical | datetime | duration
Complex Number Support: Yes
T - Input table, specified as a table or timetable.
Data Types: table | timetable
k - Number of rows to return, specified as a nonnegative scalar integer. If k is greater than the number of rows in X, then topkrows returns all of the rows in X.
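A quick sketch of the case where k exceeds the number of rows, using an invented 3-by-2 matrix:

```matlab
% Hypothetical matrix with only 3 rows.
X = [1 2
     3 4
     5 6];

% Ask for more rows than exist. All rows come back, sorted descending.
B = topkrows(X,10);

disp(size(B))   % 3 2
disp(B)         % rows [5 6], [3 4], [1 2]
```

Because the result is capped at the number of rows in X, code that relies on size(B,1) being equal to k should check the actual size.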
col - Columns to sort by, specified as a positive scalar integer or a vector of positive integers.
Example: B = topkrows(X,100,[1 3]) sorts over the first and third columns before returning the top 100 rows.
vars - Variables to sort by, specified as one of the options in this table.
Option | Example | Description |
---|---|---|
positive integer | topkrows(T,k,3) | The integer n specifies the index of the variable to sort by as returned by T.Properties.VariableNames{n}. |
vector of positive integers | topkrows(T,k,[1 3]) | The vector [n1 n2 …] specifies the indices of several variables to sort by as returned by T.Properties.VariableNames{[n1 n2 …]}. |
logical vector | topkrows(T,k,[true false true]) | Specifies one or more variables to sort by using values of true or false. |
variable name | topkrows(T,k,"Var3") | Specifies the sorting variable as one of the variable names listed in T.Properties.VariableNames. |
string array | topkrows(T,k,["Var1","Var3"]) | Specifies several sorting variables selected from T.Properties.VariableNames. |
cell array of character vectors | topkrows(T,k,{'Var1','Var3'}) | Specifies several sorting variables selected from T.Properties.VariableNames. |
pattern scalar | topkrows(T,k,"V" + wildcardPattern) | Specifies several sorting variables selected from T.Properties.VariableNames. |
'RowNames' | topkrows(T,k,'RowNames') | For tables only. This option sorts the results by the row names. |
Example: B = topkrows(X,k,[1 3]) sorts over the first and third columns.
Example: B = topkrows(X,k,"Year") sorts using the Year variable.
direction - Sorting direction, specified as either 'descend', 'ascend', or a string array or cell array of character vectors that specifies some combination of these values.
If direction is a cell array, then it must contain 'descend' or 'ascend' for each sorting column specified by col or vars. If you do not specify col or vars, then the cell array must contain 'descend' or 'ascend' for each column in X or variable in T.
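The rule for a cell array without col can be sketched as follows. The 3-by-2 matrix is hypothetical, and the cell array supplies one direction per column of X, as the paragraph above requires.

```matlab
% Hypothetical matrix with two columns; rows 1 and 2 tie in column 1.
X = [1 5
     1 2
     3 4];

% No col argument, so the cell array needs one entry per column of X:
% column 1 descending, column 2 ascending (used to break ties).
B = topkrows(X,2,{'descend','ascend'});

disp(B)
% Under the rules described above, this should display:
%      3     4
%      1     2
```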
method - Comparison method for numeric input, specified as one of these values:
- 'auto' (default): Compares real numbers according to 'real' and complex numbers according to 'abs'.
- 'real': Compares numbers by real part real(A). Numbers with equal real part are subsorted by imaginary part imag(A).
- 'abs': Compares numbers by absolute value abs(A). Numbers with equal magnitude are subsorted by phase angle angle(A).
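The difference between the comparison methods is easiest to see on a few hand-picked values. The column vector below is an assumption for illustration. Its entries have distinct magnitudes and distinct real parts, so no tie-breaking is involved.

```matlab
% Hypothetical complex column vector.
% Magnitudes: 5, 6, about 1.41, 2.  Real parts: 3, -6, 1, 2.
X = [3+4i; -6; 1-1i; 2];

% 'abs' ranks by magnitude, which is also what 'auto' uses for complex input.
Babs  = topkrows(X,2,'ComparisonMethod','abs');

% 'real' ranks by real part only.
Breal = topkrows(X,2,'ComparisonMethod','real');

disp(Babs)    % -6, then 3+4i
disp(Breal)   % 3+4i, then 2
```

The value -6 ranks first by magnitude but falls out of the top 2 entirely when only the real part is compared.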
Output Arguments
B - Requested rows, returned as an array, table, or timetable. B is the same type as the input data.
I - Row indices, returned as a column vector. I describes the order of the selected rows such that B = X(I,:) or B = T(I,:).
Tips
topkrows does not do a full sort of the input data, so it is generally faster than sort and sortrows when the number of requested rows is small.
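One way to see what topkrows returns, and what work it avoids, is to compare it with a full sort followed by indexing. The sketch below assumes numeric data without missing values. The sizes and the names B1, S, and B2 are arbitrary choices for illustration.

```matlab
rng default              % for reproducibility
X = randi(100,1000,3);   % hypothetical test data
k = 5;

% Partial approach: ask only for the top k rows.
B1 = topkrows(X,k);

% Full-sort approach: sort every row in descending order, then keep k.
S  = sortrows(X,'descend');
B2 = S(1:k,:);

isequal(B1,B2)           % logical 1
```

Both approaches give the same rows here. Any speed difference depends on the data size and on k, so it is worth measuring with timeit on representative inputs.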
Extended Capabilities
The topkrows function supports tall arrays with the following usage notes and limitations:
- The ComparisonMethod name-value argument is not supported.
- The RowNames option for tables is not supported.
For more information, see Tall Arrays.
Usage notes and limitations:
- The following types are not supported: cell, table, categorical, duration, and datetime.
- For fixed size compilation, the value of k must be constant.
- The vars input argument does not support pattern expressions.
Version History
Introduced in R2016b
The topkrows function shows improved performance for numeric data, including complex data, and logical data when k is at least 2048 and is at least 20% of the total number of rows.
For example, this code returns the top 5000 rows from a numeric matrix containing 10,000 rows in descending order, primarily based on the first column. The code is about 2x faster than in the previous release.
function timingTest
A = rand(1e4,10);
for i = 1:120
    T = topkrows(A,5000);
end
end
The approximate execution times are:
R2024b: 0.14 s
R2025a: 0.07 s
The code was timed on a Windows® 11, AMD EPYC™ 74F3 24-Core Processor @ 3.19 GHz test system using the timeit function.
Some behaviors of topkrows operating on tall arrays have changed:
- topkrows places NaN, NaT, and other missing values at the end of a descending sort. In previous releases, topkrows placed missing values at the beginning of a descending sort.
- topkrows no longer accepts tall cell arrays containing only scalar numeric values as inputs. Use cell2mat to convert the tall cell array of scalar numeric values into a tall matrix before using topkrows.