rankdata — SciPy v1.15.2 Manual

scipy.stats.rankdata(a, method='average', *, axis=None, nan_policy='propagate')

Assign ranks to data, dealing with ties appropriately.

By default (axis=None), the data array is first flattened, and a flat array of ranks is returned. Separately reshape the rank array to the shape of the data array if desired (see Examples).

Ranks begin at 1. The method argument controls how ranks are assigned to equal values. See [1] for further discussion of ranking methods.
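As a quick illustration of the two statements above, the sketch below ranks a small array with no repeated values (the sample data is made up for this example). Because there are no ties, the method argument should make no difference, and the smallest value receives rank 1.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical sample data with no repeated values.
a = [10, 30, 20]

methods = ["average", "min", "max", "dense", "ordinal"]
ranks_by_method = {m: rankdata(a, method=m) for m in methods}

for m, r in ranks_by_method.items():
    print(m, r)

# With no ties, every method assigns the same ranks, starting at 1.
all_agree = all(np.array_equal(r, [1, 3, 2]) for r in ranks_by_method.values())
print(all_agree)
```

The methods only diverge once the input contains equal values.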

Parameters:

a : array_like

The array of values to be ranked.

method : {‘average’, ‘min’, ‘max’, ‘dense’, ‘ordinal’}, optional

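The method used to assign ranks to tied elements. The following methods are available (default is ‘average’):

‘average’: The average of the ranks that would have been assigned to all the tied values is assigned to each value.

‘min’: The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)

‘max’: The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.

‘dense’: Like ‘min’, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.

‘ordinal’: All values are given a distinct rank, corresponding to the order that the values occur in a.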
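The relationships between the tie-handling methods can be checked numerically. The sketch below is not how rankdata is implemented; it only rebuilds three of the methods from plain NumPy operations for a 1-D array without nans and compares them with the library's output. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata

a = np.array([0, 2, 3, 2])

r_min = rankdata(a, method="min")      # [1, 2, 4, 2]
r_max = rankdata(a, method="max")      # [1, 3, 4, 3]
r_avg = rankdata(a, method="average")  # [1. , 2.5, 4. , 2.5]

# A tied group occupies consecutive positions in sorted order, so its
# average rank is the midpoint of its 'min' and 'max' ranks.
midpoint = (r_min + r_max) / 2

# 'dense': position of each value among the sorted unique values, plus 1.
_, inverse = np.unique(a, return_inverse=True)
dense_from_unique = inverse + 1

# 'ordinal': position of each element in a stable sort, plus 1, so that
# tied values are ranked in the order they occur in `a`.
order = np.argsort(a, kind="stable")
ordinal_from_argsort = np.empty_like(order)
ordinal_from_argsort[order] = np.arange(1, a.size + 1)

print(midpoint)
print(dense_from_unique)
print(ordinal_from_argsort)
```

For this input the three reconstructions match rankdata with method='average', 'dense', and 'ordinal' respectively.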

axis : {None, int}, optional

Axis along which to perform the ranking. If None, the data array is first flattened.
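To make the effect of axis concrete, the following sketch ranks the same 2-D array three ways: over all elements (the default), within each row, and within each column. The array is the one used in the Examples; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata

a = np.array([[0, 2, 2],
              [3, 2, 5]])

# axis=None (default): the six values are ranked together and a flat
# array is returned; reshape it to recover the layout of `a`.
flat = rankdata(a)
same_shape = flat.reshape(a.shape)

# axis=1: each row is ranked independently.
by_row = rankdata(a, axis=1)

# axis=0: each column is ranked independently.
by_col = rankdata(a, axis=0)

print(same_shape)  # [[1. 3. 3.] [5. 3. 6.]]
print(by_row)      # [[1.  2.5 2.5] [2.  1.  3. ]]
print(by_col)      # [[1.  1.5 1. ] [2.  1.5 2. ]]
```

Note that reshaping the flattened ranks is not the same as ranking along an axis: the former compares every element with every other element, while the latter only compares elements within the same row or column.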

nan_policy : {‘propagate’, ‘omit’, ‘raise’}, optional

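Defines how to handle input that contains nan. The following options are available (default is ‘propagate’):

‘propagate’: propagates nans through the rank calculation.

‘omit’: performs the calculations ignoring nan values.

‘raise’: raises an error.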

Note

When nan_policy is ‘propagate’, the output is an array of all nans because ranks relative to nans in the input are undefined. When nan_policy is ‘omit’, nans in a are ignored when ranking the other values, and the corresponding locations of the output are nan.
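The three policies can be compared side by side on one input. This is a minimal sketch assuming SciPy 1.10 or later (where nan_policy is available) and assuming that ‘raise’ signals the problem with a ValueError; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata

a = [0, 2, 3, np.nan, -2, np.nan]

# 'propagate' (default): with axis=None, one nan makes every rank nan.
propagated = rankdata(a, nan_policy="propagate")

# 'omit': the non-nan values are ranked among themselves, and the
# positions that held nan in the input hold nan in the output.
omitted = rankdata(a, nan_policy="omit")

# 'raise': the call fails instead of returning ranks.
# Assumption: the exception type is ValueError.
try:
    rankdata(a, nan_policy="raise")
    raised = False
except ValueError:
    raised = True

print(propagated)
print(omitted)
print(raised)
```

‘omit’ is usually the policy to reach for when missing values should simply not take part in the ranking, while ‘raise’ is useful when nans indicate a bug upstream.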

Added in version 1.10.

Returns:

ranks : ndarray

An array of size equal to the size of a, containing rank scores.

References

Examples

>>> import numpy as np
>>> from scipy.stats import rankdata
>>> rankdata([0, 2, 3, 2])
array([ 1. , 2.5, 4. , 2.5])
>>> rankdata([0, 2, 3, 2], method='min')
array([ 1, 2, 4, 2])
>>> rankdata([0, 2, 3, 2], method='max')
array([ 1, 3, 4, 3])
>>> rankdata([0, 2, 3, 2], method='dense')
array([ 1, 2, 3, 2])
>>> rankdata([0, 2, 3, 2], method='ordinal')
array([ 1, 2, 4, 3])
>>> rankdata([[0, 2], [3, 2]]).reshape(2,2)
array([[1. , 2.5],
       [4. , 2.5]])
>>> rankdata([[0, 2, 2], [3, 2, 5]], axis=1)
array([[1. , 2.5, 2.5],
       [2. , 1. , 3. ]])
>>> rankdata([0, 2, 3, np.nan, -2, np.nan], nan_policy="propagate")
array([nan, nan, nan, nan, nan, nan])
>>> rankdata([0, 2, 3, np.nan, -2, np.nan], nan_policy="omit")
array([ 2., 3., 4., nan, 1., nan])