pandas.DataFrame.quantile — pandas 2.2.3 documentation (original) (raw)
DataFrame.quantile(q=0.5, axis=0, numeric_only=False, interpolation='linear', method='single')[source]#
Return values at the given quantile over requested axis.
Parameters:
qfloat or array-like, default 0.5 (50% quantile)
Value between 0 <= q <= 1, the quantile(s) to compute.
axis{0 or ‘index’, 1 or ‘columns’}, default 0
Equals 0 or ‘index’ for row-wise, 1 or ‘columns’ for column-wise.
numeric_onlybool, default False
Include only float, int or boolean data.
Changed in version 2.0.0: The default value of numeric_only
is now False
.
interpolation{‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}
This optional parameter specifies the interpolation method to use, when the desired quantile lies between two data points i and j:
- linear: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
- lower: i.
- higher: j.
- nearest: i or j whichever is nearest.
- midpoint: (i + j) / 2.
method{‘single’, ‘table’}, default ‘single’
Whether to compute quantiles per-column (‘single’) or over all columns (‘table’). When ‘table’, the only allowed interpolation methods are ‘nearest’, ‘lower’, and ‘higher’.
Returns:
Series or DataFrame
If q
is an array, a DataFrame will be returned where the
index is q
, the columns are the columns of self, and the values are the quantiles.
If q
is a float, a Series will be returned where the
index is the columns of self and the values are the quantiles.
Examples
df = pd.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]), ... columns=['a', 'b']) df.quantile(.1) a 1.3 b 3.7 Name: 0.1, dtype: float64 df.quantile([.1, .5]) a b 0.1 1.3 3.7 0.5 2.5 55.0
Specifying method=’table’ will compute the quantile over all columns.
df.quantile(.1, method="table", interpolation="nearest") a 1 b 1 Name: 0.1, dtype: int64 df.quantile([.1, .5], method="table", interpolation="nearest") a b 0.1 1 1 0.5 3 100
Specifying numeric_only=False will also compute the quantile of datetime and timedelta data.
df = pd.DataFrame({'A': [1, 2], ... 'B': [pd.Timestamp('2010'), ... pd.Timestamp('2011')], ... 'C': [pd.Timedelta('1 days'), ... pd.Timedelta('2 days')]}) df.quantile(0.5, numeric_only=False) A 1.5 B 2010-07-02 12:00:00 C 1 days 12:00:00 Name: 0.5, dtype: object