Series — Polars documentation (original) (raw)

This page gives an overview of all public Series methods.

class polars.Series(

name: str | ArrayLike | None = None,

values: ArrayLike | None = None,

dtype: PolarsDataType | None = None,

strict: bool = True,

nan_to_null: bool = False,

)[source]

A Series represents a single column in a Polars DataFrame.

Parameters:

name : str, default None

Name of the Series. Will be used as a column name when used in a DataFrame. When not specified, name is set to an empty string.

values : ArrayLike, default None

One-dimensional data in various forms. Supported are: Sequence, Series, pyarrow Array, and numpy ndarray.

dtype : DataType, default None

Data type of the resulting Series. If set to None (default), the data type is inferred from the values input. The strategy for data type inference depends on the strict parameter:

If strict is set to True (default), the inferred data type is equal to the first non-null value, or Null if all values are null.
If strict is set to False, the inferred data type is the supertype of the values, or Object if no supertype can be found. WARNING: A full pass over the values is required to determine the supertype.
If no values were passed, the resulting data type is Null.

strict : bool, default True

Throw an error if any value does not exactly match the given or inferred data type. If set to False, values that do not match the data type are cast to that data type or, if casting is not possible, set to null instead.

nan_to_null : bool, default False

In case a numpy array is used to create this Series, indicate how to deal with np.nan values. (This parameter is a no-op on non-numpy data).

Examples

Constructing a Series by specifying name and values positionally:

>>> s = pl.Series("a", [1, 2, 3])
>>> s
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]

Notice that the dtype is automatically inferred as a Polars Int64.

Constructing a Series with a specific dtype:

>>> s2 = pl.Series("a", [1, 2, 3], dtype=pl.Float32)
>>> s2
shape: (3,)
Series: 'a' [f32]
[
        1.0
        2.0
        3.0
]

It is possible to construct a Series with values as the first positional argument. This syntax is considered an anti-pattern, but it can be useful in certain scenarios. You must specify any other arguments through keywords.

>>> s3 = pl.Series([1, 2, 3])
>>> s3
shape: (3,)
Series: '' [i64]
[
        1
        2
        3
]

abs() → Series[source]

Compute absolute values.

Same as abs(series).

Examples

>>> s = pl.Series([1, -2, -3])
>>> s.abs()
shape: (3,)
Series: '' [i64]
[
        1
        2
        3
]

alias(name: str) → Series[source]

Rename the series.

Parameters:

name

The new name.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.alias("b")
shape: (3,)
Series: 'b' [i64]
[
        1
        2
        3
]

all(*, ignore_nulls: bool = True) → bool | None [source]

Return whether all values in the column are True.

Only works on columns of data type Boolean.

Parameters:

ignore_nulls

If set to True (default), null values are ignored. If there are no non-null values, the output is True.
If set to False, Kleene logic is used to deal with nulls: if the column contains any null values and no False values, the output is None.

Returns:

bool or None

Examples

>>> pl.Series([True, True]).all()
True
>>> pl.Series([False, True]).all()
False
>>> pl.Series([None, True]).all()
True

Enable Kleene logic by setting ignore_nulls=False.

>>> pl.Series([None, True]).all(ignore_nulls=False)  # Returns None

any(*, ignore_nulls: bool = True) → bool | None [source]

Return whether any of the values in the column are True.

Only works on columns of data type Boolean.

Parameters:

ignore_nulls

If set to True (default), null values are ignored. If there are no non-null values, the output is False.
If set to False, Kleene logic is used to deal with nulls: if the column contains any null values and no True values, the output is None.

Returns:

bool or None

Examples

>>> pl.Series([True, False]).any()
True
>>> pl.Series([False, False]).any()
False
>>> pl.Series([None, False]).any()
False

Enable Kleene logic by setting ignore_nulls=False.

>>> pl.Series([None, False]).any(ignore_nulls=False)  # Returns None
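The null handling described for all() and any() can be modelled in plain Python. The sketch below is not Polars code: series_all and series_any are hypothetical helper names, and None stands in for a null value. It only mirrors the rules stated in the parameter descriptions above.

```python
from typing import Optional, Sequence


def series_all(values: Sequence[Optional[bool]], ignore_nulls: bool = True) -> Optional[bool]:
    """Plain-Python model of all(); None stands in for a null value."""
    if any(v is False for v in values):
        return False  # one False decides the result in both modes
    if not ignore_nulls and any(v is None for v in values):
        return None  # Kleene logic: nulls present and no False
    return True


def series_any(values: Sequence[Optional[bool]], ignore_nulls: bool = True) -> Optional[bool]:
    """Plain-Python model of any(); None stands in for a null value."""
    if any(v is True for v in values):
        return True  # one True decides the result in both modes
    if not ignore_nulls and any(v is None for v in values):
        return None  # Kleene logic: nulls present and no True
    return False


print(series_all([None, True]))                       # True
print(series_all([None, True], ignore_nulls=False))   # None
print(series_any([None, False]))                      # False
print(series_any([None, False], ignore_nulls=False))  # None
print(series_all([None, False], ignore_nulls=False))  # False
```

The last call shows that a definite False still wins under Kleene logic: the result is only None when the nulls could change the outcome.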

append(other: Series) → Self[source]

Append a Series to this one.

The resulting series will consist of multiple chunks.

Parameters:

other

Series to append.

Warning

This method modifies the series in-place. The series is returned for convenience only.

Examples

>>> a = pl.Series("a", [1, 2, 3])
>>> b = pl.Series("b", [4, 5])
>>> a.append(b)
shape: (5,)
Series: 'a' [i64]
[
        1
        2
        3
        4
        5
]

approx_n_unique() → PythonLiteral | None [source]

Approximate count of unique values.

This is done using the HyperLogLog++ algorithm for cardinality estimation.

arccos() → Series[source]

Compute the element-wise value for the inverse cosine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arccos()
shape: (3,)
Series: 'a' [f64]
[
        0.0
        1.570796
        3.141593
]

arccosh() → Series[source]

Compute the element-wise value for the inverse hyperbolic cosine.

Examples

>>> s = pl.Series("a", [5.0, 1.0, 0.0, -1.0])
>>> s.arccosh()
shape: (4,)
Series: 'a' [f64]
[
        2.292432
        0.0
        NaN
        NaN
]

arcsin() → Series[source]

Compute the element-wise value for the inverse sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arcsin()
shape: (3,)
Series: 'a' [f64]
[
        1.570796
        0.0
        -1.570796
]

arcsinh() → Series[source]

Compute the element-wise value for the inverse hyperbolic sine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arcsinh()
shape: (3,)
Series: 'a' [f64]
[
        0.881374
        0.0
        -0.881374
]

arctan() → Series[source]

Compute the element-wise value for the inverse tangent.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.arctan()
shape: (3,)
Series: 'a' [f64]
[
        0.785398
        0.0
        -0.785398
]

arctanh() → Series[source]

Compute the element-wise value for the inverse hyperbolic tangent.

Examples

>>> s = pl.Series("a", [2.0, 1.0, 0.5, 0.0, -0.5, -1.0, -1.1])
>>> s.arctanh()
shape: (7,)
Series: 'a' [f64]
[
        NaN
        inf
        0.549306
        0.0
        -0.549306
        -inf
        NaN
]

arg_max() → int | None [source]

Get the index of the maximal value.

Returns:

int

Examples

>>> s = pl.Series("a", [3, 2, 1])
>>> s.arg_max()
0

arg_min() → int | None [source]

Get the index of the minimal value.

Returns:

int

Examples

>>> s = pl.Series("a", [3, 2, 1])
>>> s.arg_min()
2

arg_sort(

descending: bool = False,

nulls_last: bool = False,

) → Series[source]

Get the index values that would sort this Series.

Parameters:

descending

Sort in descending order.

nulls_last

Place null values last instead of first.

Examples

>>> s = pl.Series("a", [5, 3, 4, 1, 2])
>>> s.arg_sort()
shape: (5,)
Series: 'a' [u32]
[
        3
        4
        1
        2
        0
]
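To make the meaning of "index values that would sort this Series" concrete, here is a plain-Python sketch of the same idea. The helper name arg_sort is only illustrative, and null handling (nulls_last) is not modelled.

```python
def arg_sort(values, descending=False):
    """Return the positions that would sort `values` (nulls not modelled)."""
    return sorted(range(len(values)), key=values.__getitem__, reverse=descending)


values = [5, 3, 4, 1, 2]
order = arg_sort(values)
print(order)                       # [3, 4, 1, 2, 0]
print([values[i] for i in order])  # [1, 2, 3, 4, 5]
```

Indexing the original data with the returned positions yields the sorted values, which is the defining property of an arg-sort.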

arg_true() → Series[source]

Get index values where Boolean Series evaluate True.

Returns:

Series

Series of data type UInt32.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> (s == 2).arg_true()
shape: (1,)
Series: 'a' [u32]
[
        1
]

arg_unique() → Series[source]

Get the index of the first occurrence of each unique value as a Series.

Returns:

Series

Examples

>>> s = pl.Series("a", [1, 2, 2, 3])
>>> s.arg_unique()
shape: (3,)
Series: 'a' [u32]
[
        0
        1
        3
]

backward_fill(limit: int | None = None) → Series[source]

Fill missing values with the next non-null value.

This is an alias of .fill_null(strategy="backward").

Parameters:

limit

The number of consecutive null values to backward fill.

bitwise_and() → PythonLiteral | None [source]

Perform an aggregation of bitwise ANDs.

bitwise_count_ones() → Self[source]

Evaluate the number of set bits.

bitwise_count_zeros() → Self[source]

Evaluate the number of unset bits.

bitwise_leading_ones() → Self[source]

Evaluate the number of most-significant set bits before seeing an unset bit.

bitwise_leading_zeros() → Self[source]

Evaluate the number of most-significant unset bits before seeing a set bit.

bitwise_or() → PythonLiteral | None [source]

Perform an aggregation of bitwise ORs.

bitwise_trailing_ones() → Self[source]

Evaluate the number of least-significant set bits before seeing an unset bit.

bitwise_trailing_zeros() → Self[source]

Evaluate the number of least-significant unset bits before seeing a set bit.

bitwise_xor() → PythonLiteral | None [source]

Perform an aggregation of bitwise XORs.
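The bitwise methods above come without examples, so the following plain-Python sketch illustrates what each count means. It is not Polars code: the helper names are hypothetical, and the fixed width of 8 bits (as for a UInt8 value) is an assumption made for the illustration.

```python
from functools import reduce
from operator import and_, or_, xor

WIDTH = 8  # assumption: values are viewed as 8-bit unsigned integers (UInt8)


def bits(value: int) -> str:
    """Return the fixed-width binary string, most-significant bit first."""
    return format(value, f"0{WIDTH}b")


def count_ones(value: int) -> int:
    return bits(value).count("1")


def count_zeros(value: int) -> int:
    return bits(value).count("0")


def leading_ones(value: int) -> int:
    b = bits(value)
    return len(b) - len(b.lstrip("1"))


def leading_zeros(value: int) -> int:
    b = bits(value)
    return len(b) - len(b.lstrip("0"))


def trailing_ones(value: int) -> int:
    b = bits(value)
    return len(b) - len(b.rstrip("1"))


def trailing_zeros(value: int) -> int:
    b = bits(value)
    return len(b) - len(b.rstrip("0"))


x = 0b11100101
print(bits(x), count_ones(x), count_zeros(x))  # 11100101 5 3
print(leading_ones(x), leading_zeros(x))       # 3 0
print(trailing_ones(x), trailing_zeros(x))     # 1 0

# The aggregations fold one operator over all values of the column.
column = [0b0111, 0b0101, 0b1100]
print(reduce(and_, column), reduce(or_, column), reduce(xor, column))  # 4 15 14
```

The per-element helpers correspond to the methods that return Self, while the three reduce calls correspond to bitwise_and, bitwise_or, and bitwise_xor, which return a single value.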

bottom_k(k: int = 5) → Series[source]

Return the k smallest elements.

Non-null elements are always preferred over null elements. The output is not guaranteed to be in any particular order; call sort() after this function if you want the output to be sorted.

This has time complexity:

\[O(n)\]

Parameters:

k

Number of elements to return.

Examples

>>> s = pl.Series("a", [2, 5, 1, 4, 3])
>>> s.bottom_k(3)
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]

bottom_k_by(

by: IntoExpr | Iterable[IntoExpr],

k: int = 5,

reverse: bool | Sequence[bool] = False,

) → Series[source]

Return the k smallest elements of the by column.

Non-null elements are always preferred over null elements, regardless of the value of reverse. The output is not guaranteed to be in any particular order; call sort() after this function if you want the output to be sorted.

This has time complexity:

\[O(n \log{n})\]

Parameters:

by

Column used to determine the smallest elements. Accepts expression input. Strings are parsed as column names.

k

Number of elements to return.

reverse

Consider the k largest elements of the by column (instead of the k smallest). This can be specified per column by passing a sequence of booleans.

Examples

>>> s = pl.Series("a", [2, 5, 1, 4, 3])
>>> s.bottom_k_by("a", 3)
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]

cast(

dtype: type[int | float | str | bool] | PolarsDataType,

strict: bool = True,

wrap_numerical: bool = False,

) → Self[source]

Cast between data types.

Parameters:

dtype

DataType to cast to.

strict

If True, invalid casts generate exceptions instead of nulls.

wrap_numerical

If True, numeric casts wrap overflowing values instead of marking the cast as invalid.

Examples

>>> s = pl.Series("a", [True, False, True])
>>> s
shape: (3,)
Series: 'a' [bool]
[
        true
        false
        true
]

>>> s.cast(pl.UInt32)
shape: (3,)
Series: 'a' [u32]
[
        1
        0
        1
]

cbrt() → Series[source]

Compute the cube root of the elements.

This is an optimization for:

>>> pl.Series([1, 2]) ** (1.0 / 3)
shape: (2,)
Series: '' [f64]
[
        1.0
        1.259921
]

Examples

>>> s = pl.Series([1, 2, 3])
>>> s.cbrt()
shape: (3,)
Series: '' [f64]
[
        1.0
        1.259921
        1.44225
]

ceil() → Series[source]

Rounds up to the nearest integer value.

Only works on floating point Series.

Examples

>>> s = pl.Series("a", [1.12345, 2.56789, 3.901234])
>>> s.ceil()
shape: (3,)
Series: 'a' [f64]
[
        2.0
        3.0
        4.0
]

chunk_lengths() → list[int][source]

Get the length of each individual chunk.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("a", [4, 5, 6])

Concatenate Series with rechunk=True:

>>> pl.concat([s, s2], rechunk=True).chunk_lengths()
[6]

Concatenate Series with rechunk=False:

>>> pl.concat([s, s2], rechunk=False).chunk_lengths()
[3, 3]

clear(n: int = 0) → Series[source]

Create an empty copy of the current Series, with zero to ‘n’ elements.

The copy has an identical name/dtype, but no data.

Parameters:

n

Number of (empty) elements to return in the cleared Series.

See also

clone

Cheap deepcopy/clone.

Examples

>>> s = pl.Series("a", [None, True, False])
>>> s.clear()
shape: (0,)
Series: 'a' [bool]
[
]

>>> s.clear(n=2)
shape: (2,)
Series: 'a' [bool]
[
        null
        null
]

clip(

lower_bound: NumericLiteral | TemporalLiteral | IntoExprColumn | None = None,

upper_bound: NumericLiteral | TemporalLiteral | IntoExprColumn | None = None,

) → Series[source]

Set values outside the given boundaries to the boundary value.

Parameters:

lower_bound

Lower bound. Accepts expression input. Non-expression inputs are parsed as literals. If set to None (default), no lower bound is applied.

upper_bound

Upper bound. Accepts expression input. Non-expression inputs are parsed as literals. If set to None (default), no upper bound is applied.

Notes

This method only works for numeric and temporal columns. To clip other data types, consider writing a when-then-otherwise expression. See when().

Examples

Specifying both a lower and upper bound:

>>> s = pl.Series([-50, 5, 50, None])
>>> s.clip(1, 10)
shape: (4,)
Series: '' [i64]
[
        1
        5
        10
        null
]

Specifying only a single bound:

>>> s.clip(upper_bound=10)
shape: (4,)
Series: '' [i64]
[
        -50
        5
        10
        null
]

clone() → Self[source]

Create a copy of this Series.

This is a cheap operation that does not copy data.

See also

clear

Create an empty copy of the current Series, with identical schema but no data.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.clone()
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        3
]

cos() → Series[source]

Compute the element-wise value for the cosine.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.cos()
shape: (3,)
Series: 'a' [f64]
[
        1.0
        6.1232e-17
        -1.0
]

cosh() → Series[source]

Compute the element-wise value for the hyperbolic cosine.

Examples

>>> s = pl.Series("a", [1.0, 0.0, -1.0])
>>> s.cosh()
shape: (3,)
Series: 'a' [f64]
[
        1.543081
        1.0
        1.543081
]

cot() → Series[source]

Compute the element-wise value for the cotangent.

Examples

>>> import math
>>> s = pl.Series("a", [0.0, math.pi / 2.0, math.pi])
>>> s.cot()
shape: (3,)
Series: 'a' [f64]
[
        inf
        6.1232e-17
        -8.1656e15
]

count() → int [source]

Return the number of non-null elements in the column.

Examples

>>> s = pl.Series("a", [1, 2, None])
>>> s.count()
2

cum_count(*, reverse: bool = False) → Self[source]

Return the cumulative count of the non-null values in the column.

Parameters:

reverse

Reverse the operation.

Examples

>>> s = pl.Series(["x", "k", None, "d"])
>>> s.cum_count()
shape: (4,)
Series: '' [u32]
[
        1
        2
        2
        3
]

cum_max(*, reverse: bool = False) → Series[source]

Get an array with the cumulative max computed at every element.

Parameters:

reverse

Reverse the operation.

Examples

>>> s = pl.Series("s", [3, 5, 1])
>>> s.cum_max()
shape: (3,)
Series: 's' [i64]
[
        3
        5
        5
]

cum_min(*, reverse: bool = False) → Series[source]

Get an array with the cumulative min computed at every element.

Parameters:

reverse

Reverse the operation.

Examples

>>> s = pl.Series("s", [1, 2, 3])
>>> s.cum_min()
shape: (3,)
Series: 's' [i64]
[
        1
        1
        1
]

cum_prod(*, reverse: bool = False) → Series[source]

Get an array with the cumulative product computed at every element.

Parameters:

reverse

Reverse the operation.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before multiplying to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.cum_prod()
shape: (3,)
Series: 'a' [i64]
[
        1
        2
        6
]

cum_sum(*, reverse: bool = False) → Series[source]

Get an array with the cumulative sum computed at every element.

Parameters:

reverse

Reverse the operation.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.cum_sum()
shape: (3,)
Series: 'a' [i64]
[
        1
        3
        6
]

cumulative_eval(

expr: Expr,

min_samples: int = 1,

parallel: bool = False,

) → Series[source]

Run an expression over a sliding window that increases 1 slot every iteration.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

expr

Expression to evaluate

min_samples

Number of valid values there should be in the window before the expression is evaluated. The number of valid values is the window length minus its null count (length - null_count).

parallel

Run in parallel. Don’t do this in a group by or another operation that already has much parallelization.

Warning

This can be really slow as it can have O(n^2) complexity. Don’t use this for operations that visit all elements.

Examples

>>> s = pl.Series("values", [1, 2, 3, 4, 5])
>>> s.cumulative_eval(pl.element().first() - pl.element().last() ** 2)
shape: (5,)
Series: 'values' [i64]
[
        0
        -3
        -8
        -15
        -24
]
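The growing-window evaluation can be sketched in plain Python. This is only a model of the behaviour described above, not the Polars implementation: the function name and the callable argument (standing in for the expression) are assumptions, and None stands in for null.

```python
def cumulative_eval(values, func, min_samples=1):
    """Evaluate `func` on values[:1], values[:2], ... (a model, not Polars code)."""
    out = []
    for end in range(1, len(values) + 1):
        window = values[:end]  # the window grows by one slot per iteration
        valid = sum(v is not None for v in window)  # length - null_count
        out.append(func(window) if valid >= min_samples else None)
    return out


# Same computation as the documented example: first() - last() ** 2
result = cumulative_eval([1, 2, 3, 4, 5], lambda w: w[0] - w[-1] ** 2)
print(result)  # [0, -3, -8, -15, -24]
```

Because every window is re-evaluated from its first element, a function that visits all elements of the window does O(n) work n times, which is where the O(n^2) warning comes from.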

cut(

breaks: Sequence[float],

labels: Sequence[str] | None = None,

left_closed: bool = False,

include_breaks: bool = False,

) → Series[source]

Bin continuous values into discrete categories.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Parameters:

breaks

List of unique cut points.

labels

Names of the categories. The number of labels must be equal to the number of cut points plus one.

left_closed

Set the intervals to be left-closed instead of right-closed.

include_breaks

Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

Returns:

Series

Series of data type Categorical if include_breaks is set to False (default), otherwise a Series of data type Struct.

Examples

Divide the column into three categories.

>>> s = pl.Series("foo", [-2, -1, 0, 1, 2])
>>> s.cut([-1, 1], labels=["a", "b", "c"])
shape: (5,)
Series: 'foo' [cat]
[
        "a"
        "a"
        "b"
        "b"
        "c"
]

Create a DataFrame with the breakpoint and category for each value.

>>> cut = s.cut([-1, 1], include_breaks=True).alias("cut")
>>> s.to_frame().with_columns(cut).unnest("cut")
shape: (5, 3)
┌─────┬────────────┬────────────┐
│ foo ┆ breakpoint ┆ category   │
│ --- ┆ ---        ┆ ---        │
│ i64 ┆ f64        ┆ cat        │
╞═════╪════════════╪════════════╡
│ -2  ┆ -1.0       ┆ (-inf, -1] │
│ -1  ┆ -1.0       ┆ (-inf, -1] │
│ 0   ┆ 1.0        ┆ (-1, 1]    │
│ 1   ┆ 1.0        ┆ (-1, 1]    │
│ 2   ┆ inf        ┆ (1, inf]   │
└─────┴────────────┴────────────┘
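The binning rule, including the difference between right-closed and left-closed intervals, can be sketched with the standard-library bisect module. This is a plain-Python model under the assumption that breaks are sorted and unique. The helper name is hypothetical and it returns a list of labels rather than a Categorical Series.

```python
from bisect import bisect_left, bisect_right


def cut(values, breaks, labels, left_closed=False):
    """Assign each value the label of the interval it falls in (a model only)."""
    if len(labels) != len(breaks) + 1:
        raise ValueError("expected len(breaks) + 1 labels")
    # Right-closed bins (a, b]: a value equal to a break belongs to the lower bin.
    # Left-closed bins [a, b): a value equal to a break belongs to the upper bin.
    locate = bisect_right if left_closed else bisect_left
    return [labels[locate(breaks, v)] for v in values]


values = [-2, -1, 0, 1, 2]
print(cut(values, [-1, 1], ["a", "b", "c"]))
# ['a', 'a', 'b', 'b', 'c']
print(cut(values, [-1, 1], ["a", "b", "c"], left_closed=True))
# ['a', 'b', 'b', 'c', 'c']
```

The first call reproduces the labels of the documented example. The second shows that only the values that coincide with a break (-1 and 1) change bins when left_closed is set.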

describe(

percentiles: Sequence[float] | float | None = (0.25, 0.5, 0.75),

interpolation: QuantileMethod = 'nearest',

) → DataFrame[source]

Quick summary statistics of a Series.

Series with mixed datatypes will return summary statistics for the datatype of the first value.

Parameters:

percentiles

One or more percentiles to include in the summary statistics (if the Series has a numeric dtype). All values must be in the range [0, 1].

interpolation : {‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’, ‘equiprobable’}

Interpolation method used when calculating percentiles.

Returns:

DataFrame

Mapping with summary statistics of a Series.

Notes

The median is included by default as the 50% percentile.

Examples

>>> s = pl.Series([1, 2, 3, 4, 5])
>>> s.describe()
shape: (9, 2)
┌────────────┬──────────┐
│ statistic  ┆ value    │
│ ---        ┆ ---      │
│ str        ┆ f64      │
╞════════════╪══════════╡
│ count      ┆ 5.0      │
│ null_count ┆ 0.0      │
│ mean       ┆ 3.0      │
│ std        ┆ 1.581139 │
│ min        ┆ 1.0      │
│ 25%        ┆ 2.0      │
│ 50%        ┆ 3.0      │
│ 75%        ┆ 4.0      │
│ max        ┆ 5.0      │
└────────────┴──────────┘

Non-numeric data types may not have all statistics available.

>>> s = pl.Series(["aa", "aa", None, "bb", "cc"])
>>> s.describe()
shape: (4, 2)
┌────────────┬───────┐
│ statistic  ┆ value │
│ ---        ┆ ---   │
│ str        ┆ str   │
╞════════════╪═══════╡
│ count      ┆ 4     │
│ null_count ┆ 1     │
│ min        ┆ aa    │
│ max        ┆ cc    │
└────────────┴───────┘

diff(n: int = 1, null_behavior: NullBehavior = 'ignore') → Series[source]

Calculate the first discrete difference between shifted items.

Parameters:

n

Number of slots to shift.

null_behavior : {‘ignore’, ‘drop’}

How to handle null values.

Examples

>>> s = pl.Series("s", values=[20, 10, 30, 25, 35], dtype=pl.Int8)
>>> s.diff()
shape: (5,)
Series: 's' [i8]
[
        null
        -10
        20
        -5
        10
]

>>> s.diff(n=2)
shape: (5,)
Series: 's' [i8]
[
        null
        null
        10
        15
        5
]

>>> s.diff(n=2, null_behavior="drop")
shape: (3,)
Series: 's' [i8]
[
        10
        15
        5
]
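The "difference between shifted items" can be written out in plain Python. The sketch below is a model only: it assumes a non-negative n no larger than the input length, uses None for null, and does not model Int8 overflow.

```python
def diff(values, n=1, null_behavior="ignore"):
    """Subtract from each item the item n slots earlier (a model, not Polars code)."""
    shifted = [None] * n + values[: len(values) - n]  # values shifted by n slots
    out = [
        None if a is None or b is None else a - b
        for a, b in zip(values, shifted)
    ]
    # "drop" removes the n leading nulls introduced by the shift.
    return out[n:] if null_behavior == "drop" else out


data = [20, 10, 30, 25, 35]
print(diff(data))                             # [None, -10, 20, -5, 10]
print(diff(data, n=2))                        # [None, None, 10, 15, 5]
print(diff(data, n=2, null_behavior="drop"))  # [10, 15, 5]
```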

dot(other: Series | ArrayLike) → int | float | None [source]

Compute the dot/inner product between two Series.

Parameters:

other

Series (or array) to compute dot product with.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4.0, 5.0, 6.0])
>>> s.dot(s2)
32.0

drop_nans() → Series[source]

Drop all floating point NaN values.

The original order of the remaining elements is preserved.

Notes

A NaN value is not the same as a null value. To drop null values, use drop_nulls().

Examples

>>> s = pl.Series([1.0, None, 3.0, float("nan")])
>>> s.drop_nans()
shape: (3,)
Series: '' [f64]
[
        1.0
        null
        3.0
]

drop_nulls() → Series[source]

Drop all null values.

The original order of the remaining elements is preserved.

Notes

A null value is not the same as a NaN value. To drop NaN values, use drop_nans().

Examples

>>> s = pl.Series([1.0, None, 3.0, float("nan")])
>>> s.drop_nulls()
shape: (3,)
Series: '' [f64]
[
        1.0
        3.0
        NaN
]

property dtype: DataType[source]

Get the data type of this Series.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.dtype
Int64

entropy(

base: float = 2.718281828459045,

normalize: bool = True,

) → float | None [source]

Computes the entropy.

Uses the formula -sum(pk * log(pk)), where pk are discrete probabilities.

Parameters:

base

Base of the logarithm. Defaults to e.

normalize

Normalize pk if it doesn’t sum to 1.

Examples

>>> a = pl.Series([0.99, 0.005, 0.005])
>>> a.entropy(normalize=True)
0.06293300616044681
>>> b = pl.Series([0.65, 0.10, 0.25])
>>> b.entropy(normalize=True)
0.8568409950394724
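As a check on the formula, here is a plain-Python sketch that evaluates -sum(pk * log(pk)) directly. The function name is illustrative, and skipping zero probabilities (treating 0 * log(0) as 0) is an assumption of this sketch, not something the documentation states.

```python
import math


def entropy(pk, base=math.e, normalize=True):
    """Evaluate -sum(pk * log(pk)) in the given base (a model, not Polars code)."""
    if normalize:
        total = sum(pk)
        pk = [p / total for p in pk]  # rescale so the probabilities sum to 1
    # Assumption: zero probabilities contribute nothing to the sum.
    return -sum(p * math.log(p, base) for p in pk if p > 0)


print(round(entropy([0.99, 0.005, 0.005]), 6))  # 0.062933
print(round(entropy([0.65, 0.10, 0.25]), 6))    # 0.856841
print(entropy([0.5, 0.5], base=2))              # 1.0
```

The first two results agree with the documented outputs to six decimals. The last line shows the effect of base: a fair coin has an entropy of exactly one bit in base 2.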

eq(other: Any) → Series | Expr[source]

Method equivalent of operator expression series == other.

eq_missing(other: Any) → Series | Expr[source]

Method equivalent of equality operator series == other where None == None.

This differs from the standard eq where null values are propagated.

Parameters:

other

A literal or expression value to compare with.

Examples

>>> s1 = pl.Series("a", [333, 200, None])
>>> s2 = pl.Series("a", [100, 200, None])
>>> s1.eq(s2)
shape: (3,)
Series: 'a' [bool]
[
        false
        true
        null
]
>>> s1.eq_missing(s2)
shape: (3,)
Series: 'a' [bool]
[
        false
        true
        true
]
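The difference between eq and eq_missing comes down to how a null operand is treated. A plain-Python model, with None standing in for null and hypothetical helper names, might look like this:

```python
def eq(left, right):
    """Element-wise equality where a null on either side propagates as null."""
    return [None if a is None or b is None else a == b for a, b in zip(left, right)]


def eq_missing(left, right):
    """Element-wise equality where null compares equal to null."""
    return [a == b for a, b in zip(left, right)]  # in Python, None == None is True


s1 = [333, 200, None]
s2 = [100, 200, None]
print(eq(s1, s2))          # [False, True, None]
print(eq_missing(s1, s2))  # [False, True, True]
```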

equals(

other: Series,

check_dtypes: bool = False,

check_names: bool = False,

null_equal: bool = True,

) → bool [source]

Check whether the Series is equal to another Series.

Changed in version 0.20.31: The strict parameter was renamed check_dtypes.

Parameters:

other

Series to compare with.

check_dtypes

Require data types to match.

check_names

Require names to match.

null_equal

Consider null values as equal.

Examples

>>> s1 = pl.Series("a", [1, 2, 3])
>>> s2 = pl.Series("b", [4, 5, 6])
>>> s1.equals(s1)
True
>>> s1.equals(s2)
False

estimated_size(unit: SizeUnit = 'b') → int | float [source]

Return an estimation of the total (heap) allocated size of the Series.

Estimated size is given in the specified unit (bytes by default).

This estimation is the sum of the sizes of its buffers and validity bitmaps, including those of nested arrays. Multiple arrays may share buffers and bitmaps. Therefore, the size of 2 arrays is not the sum of the sizes computed from this function. In particular, a StructArray’s size is an upper bound.

When an array is sliced, its allocated size remains constant because the buffer is unchanged. However, this function will yield a smaller number. This is because this function returns the visible size of the buffer, not its total capacity.

FFI buffers are included in this estimation.

Parameters:

unit : {‘b’, ‘kb’, ‘mb’, ‘gb’, ‘tb’}

Scale the returned size to the given unit.

Notes

For data with Object dtype, the estimated size only reports the pointer size, which is a huge underestimation.

Examples

>>> s = pl.Series("values", list(range(1_000_000)), dtype=pl.UInt32)
>>> s.estimated_size()
4000000
>>> s.estimated_size("mb")
3.814697265625
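The numbers in the example can be reproduced by hand. The sketch below assumes that each unit step is a factor of 1024. That factor is inferred from the documented output (4000000 bytes reported as 3.814697265625 mb), not stated in the text, and the names are illustrative only.

```python
# Assumption: units scale by powers of 1024, inferred from the documented output.
UNIT_FACTORS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}

n_values = 1_000_000
bytes_per_value = 4  # a UInt32 value occupies 32 bits = 4 bytes

size_b = n_values * bytes_per_value
size_mb = size_b / UNIT_FACTORS["mb"]

print(size_b)   # 4000000
print(size_mb)  # 3.814697265625
```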

ewm_mean(

com: float | None = None,

span: float | None = None,

half_life: float | None = None,

alpha: float | None = None,

adjust: bool = True,

min_samples: int = 1,

ignore_nulls: bool = False,

) → Series[source]

Compute exponentially-weighted moving average.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

com

Specify decay in terms of center of mass, \(\gamma\), with

\[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]

span

Specify decay in terms of span, \(\theta\), with

\[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]

half_life

Specify decay in terms of half-life, \(\tau\), with

\[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \tau } \right\} \; \forall \; \tau > 0\]

alpha

Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\).

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings

When adjust=True (the default) the EW function is calculated using weights \(w_i = (1 - \alpha)^i\)

When adjust=False the EW function is calculated recursively by
\[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]

min_samples

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

When ignore_nulls=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

When ignore_nulls=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

Examples

>>> s = pl.Series([1, 2, 3])
>>> s.ewm_mean(com=1, ignore_nulls=False)
shape: (3,)
Series: '' [f64]
[
        1.0
        1.666667
        2.428571
]
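The two adjust modes can be worked through in plain Python. This sketch assumes the usual normalized form for adjust=True, that is, the weighted sum divided by the sum of the weights \(w_i = (1 - \alpha)^i\), with \(w_0\) applied to the most recent value. Nulls and min_samples are not modelled, and the function name is illustrative.

```python
def ewm_mean(xs, alpha, adjust=True):
    """Exponentially weighted mean of a list without nulls (a model only)."""
    if not xs:
        return []
    out = []
    if adjust:
        for t in range(len(xs)):
            weights = [(1 - alpha) ** i for i in range(t + 1)]  # w_0 weights x_t
            weighted_sum = sum(w * xs[t - i] for i, w in enumerate(weights))
            out.append(weighted_sum / sum(weights))
    else:
        y = float(xs[0])  # y_0 = x_0
        out.append(y)
        for x in xs[1:]:
            y = (1 - alpha) * y + alpha * x  # y_t = (1 - alpha) y_{t-1} + alpha x_t
            out.append(y)
    return out


com = 1
alpha = 1 / (1 + com)  # center of mass to smoothing factor: alpha = 0.5

print([round(v, 6) for v in ewm_mean([1, 2, 3], alpha)])
# [1.0, 1.666667, 2.428571]
print(ewm_mean([1, 2, 3], alpha, adjust=False))
# [1.0, 1.5, 2.25]
```

The adjust=True result matches the documented output for com=1. With adjust=False the early values are pulled more strongly towards \(x_0\), because no correction is applied for the short history.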

ewm_mean_by(by: IntoExpr, *, half_life: str | timedelta) → Series[source]

Compute time-based exponentially weighted moving average.

Given observations \(x_0, x_1, \ldots, x_{n-1}\) at times \(t_0, t_1, \ldots, t_{n-1}\), the EWMA is calculated as

\[\begin{aligned}y_0 &= x_0\\ \alpha_i &= 1 - \exp \left\{ \frac{ -\ln(2)(t_i-t_{i-1}) } { \tau } \right\}\\ y_i &= \alpha_i x_i + (1 - \alpha_i) y_{i-1}; \quad i > 0\end{aligned}\]

where \(\tau\) is the half_life.

Parameters:

by

Times to calculate average by. Should be Datetime, Date, UInt64, UInt32, Int64, or Int32 data type.

half_life

Unit over which observation decays to half its value.

Can be created either from a timedelta, or by using the following string language:

1ns (1 nanosecond)
1us (1 microsecond)
1ms (1 millisecond)
1s (1 second)
1m (1 minute)
1h (1 hour)
1d (1 day)
1w (1 week)
1i (1 index count)

Or combine them: “3d12h4m25s” # 3 days, 12 hours, 4 minutes, and 25 seconds

Note that half_life is treated as a constant duration. Calendar durations such as months (or even days in the time-zone-aware case) are not supported. Please express your duration in an approximately equivalent number of hours (e.g. ‘730h’ instead of ‘1mo’).

Returns:

Series

Float32 if input is Float32, otherwise Float64.

Examples

>>> from datetime import date, timedelta
>>> df = pl.DataFrame(
...     {
...         "values": [0, 1, 2, None, 4],
...         "times": [
...             date(2020, 1, 1),
...             date(2020, 1, 3),
...             date(2020, 1, 10),
...             date(2020, 1, 15),
...             date(2020, 1, 17),
...         ],
...     }
... ).sort("times")
>>> df["values"].ewm_mean_by(df["times"], half_life="4d")
shape: (5,)
Series: 'values' [f64]
[
        0.0
        0.292893
        1.492474
        null
        3.254508
]
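The recurrence for the time-based EWMA can be evaluated by hand in plain Python. The sketch below makes two assumptions that are not stated in the text but reproduce the documented output: a null observation yields a null result and is otherwise skipped, and the time gap is measured from the last non-null observation. The function name and the half_life_days parameter are illustrative.

```python
import math
from datetime import date


def ewm_mean_by(values, times, half_life_days):
    """Time-based EWMA over (value, date) pairs; None models null (a sketch)."""
    out = []
    prev_y = None
    prev_t = None
    for x, t in zip(values, times):
        if x is None:
            out.append(None)  # assumption: nulls stay null and are skipped
            continue
        if prev_y is None:
            y = float(x)  # y_0 = x_0
        else:
            gap = (t - prev_t).days  # t_i - t_{i-1}, in days
            alpha = 1 - math.exp(-math.log(2) * gap / half_life_days)
            y = alpha * x + (1 - alpha) * prev_y
        out.append(y)
        prev_y, prev_t = y, t
    return out


values = [0, 1, 2, None, 4]
times = [
    date(2020, 1, 1),
    date(2020, 1, 3),
    date(2020, 1, 10),
    date(2020, 1, 15),
    date(2020, 1, 17),
]
result = ewm_mean_by(values, times, half_life_days=4)
print([None if v is None else round(v, 6) for v in result])
# [0.0, 0.292893, 1.492474, None, 3.254508]
```

For the second observation the gap is 2 days, half of the half-life, so \(\alpha = 1 - 2^{-1/2} \approx 0.292893\), which is exactly the second output value because \(y_0 = 0\) and \(x_1 = 1\).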

ewm_std(

com: float | None = None,

span: float | None = None,

half_life: float | None = None,

alpha: float | None = None,

adjust: bool = True,

bias: bool = False,

min_samples: int = 1,

ignore_nulls: bool = False,

) → Series[source]

Compute exponentially-weighted moving standard deviation.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

com

Specify decay in terms of center of mass, \(\gamma\), with

\[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]

span

Specify decay in terms of span, \(\theta\), with

\[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]

half_life

Specify decay in terms of half-life, \(\lambda\), with

\[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]

alpha

Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\).

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings

When adjust=True (the default) the EW function is calculated using weights \(w_i = (1 - \alpha)^i\)

When adjust=False the EW function is calculated recursively by
\[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]

bias

When bias=False, apply a correction to make the estimate statistically unbiased.

min_samples

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

When ignore_nulls=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

When ignore_nulls=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.ewm_std(com=1, ignore_nulls=False)
shape: (3,)
Series: 'a' [f64]
[
        0.0
        0.707107
        0.963624
]

ewm_var(

com: float | None = None,

span: float | None = None,

half_life: float | None = None,

alpha: float | None = None,

adjust: bool = True,

bias: bool = False,

min_samples: int = 1,

ignore_nulls: bool = False,

) → Series[source]

Compute exponentially-weighted moving variance.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

com

Specify decay in terms of center of mass, \(\gamma\), with

\[\alpha = \frac{1}{1 + \gamma} \; \forall \; \gamma \geq 0\]

span

Specify decay in terms of span, \(\theta\), with

\[\alpha = \frac{2}{\theta + 1} \; \forall \; \theta \geq 1\]

half_life

Specify decay in terms of half-life, \(\lambda\), with

\[\alpha = 1 - \exp \left\{ \frac{ -\ln(2) }{ \lambda } \right\} \; \forall \; \lambda > 0\]

alpha

Specify smoothing factor alpha directly, \(0 < \alpha \leq 1\).

adjust

Divide by decaying adjustment factor in beginning periods to account for imbalance in relative weightings

When adjust=True (the default) the EW function is calculated using weights \(w_i = (1 - \alpha)^i\)

When adjust=False the EW function is calculated recursively by
\[\begin{split}y_0 &= x_0 \\ y_t &= (1 - \alpha)y_{t - 1} + \alpha x_t\end{split}\]

bias

When bias=False, apply a correction to make the estimate statistically unbiased.

min_samples

Minimum number of observations in window required to have a value (otherwise result is null).

ignore_nulls

Ignore missing values when calculating weights.

When ignore_nulls=False (default), weights are based on absolute positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \((1-\alpha)^2\) and \(1\) if adjust=True, and \((1-\alpha)^2\) and \(\alpha\) if adjust=False.

When ignore_nulls=True, weights are based on relative positions. For example, the weights of \(x_0\) and \(x_2\) used in calculating the final weighted average of [\(x_0\), None, \(x_2\)] are \(1-\alpha\) and \(1\) if adjust=True, and \(1-\alpha\) and \(\alpha\) if adjust=False.

Examples

>>> s = pl.Series("a", [1, 2, 3])
>>> s.ewm_var(com=1, ignore_nulls=False)
shape: (3,)
Series: 'a' [f64]
[
        0.0
        0.5
        0.928571
]
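The effect of bias can be illustrated with a plain-Python sketch for adjust=True and data without nulls. The correction factor used here, \((\sum w)^2 / ((\sum w)^2 - \sum w^2)\), is the standard reliability-weights correction. That Polars applies this exact factor is an assumption of the sketch, supported only by the fact that it reproduces the documented output. Returning 0.0 for a single observation is likewise chosen to match that output.

```python
import math


def ewm_var(xs, alpha, bias=False):
    """Exponentially weighted variance, adjust=True, no nulls (a sketch)."""
    out = []
    for t in range(len(xs)):
        weights = [(1 - alpha) ** i for i in range(t + 1)]  # w_0 weights x_t
        window = [xs[t - i] for i in range(t + 1)]
        total = sum(weights)
        mean = sum(w * x for w, x in zip(weights, window)) / total
        var = sum(w * (x - mean) ** 2 for w, x in zip(weights, window)) / total
        if not bias:
            # Assumed correction: (sum w)^2 / ((sum w)^2 - sum w^2)
            denom = total**2 - sum(w * w for w in weights)
            var = var * total**2 / denom if denom > 0 else 0.0
        out.append(var)
    return out


alpha = 1 / (1 + 1)  # com = 1
variances = ewm_var([1, 2, 3], alpha)
print([round(v, 6) for v in variances])             # [0.0, 0.5, 0.928571]
print([round(math.sqrt(v), 6) for v in variances])  # [0.0, 0.707107, 0.963624]
```

The square roots in the last line agree with the documented ewm_std output for the same input.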

exp() → Series[source]

Compute the exponential, element-wise.

Examples

>>> s = pl.Series([1, 2, 3])
>>> s.exp()
shape: (3,)
Series: '' [f64]
[
        2.718282
        7.389056
        20.085537
]

explode() → Series[source]

Explode a list Series.

This means that every item is expanded to a new row.

Returns:

Series

Series with the data type of the list elements.

Examples

>>> s = pl.Series("a", [[1, 2, 3], [4, 5, 6]])
>>> s
shape: (2,)
Series: 'a' [list[i64]]
[
        [1, 2, 3]
        [4, 5, 6]
]
>>> s.explode()
shape: (6,)
Series: 'a' [i64]
[
        1
        2
        3
        4
        5
        6
]

extend(other: Series) → Self[source]

Extend the memory backed by this Series with the values from another.

Different from append, which adds the chunks from other to the chunks of this series, extend appends the data from other to the underlying memory locations and thus may cause a reallocation (which is expensive).

If this does not cause a reallocation, the resulting data structure will not have any extra chunks and thus will yield faster queries.

Prefer extend over append when you want to do a query after a single append. For instance, during online operations where you add n rows and rerun a query.

Prefer append over extend when you want to append many times before doing a query. For instance, when you read in multiple files and want to store them in a single Series. In the latter case, finish the sequence of append operations with a rechunk.

Parameters:

other

Series to extend the series with.

Warning

This method modifies the series in-place. The series is returned for convenience only.

Examples

a = pl.Series("a", [1, 2, 3]) b = pl.Series("b", [4, 5]) a.extend(b) shape: (5,) Series: 'a' [i64] [ 1 2 3 4 5 ]

The resulting series will consist of a single chunk.

extend_constant(value: IntoExpr, n: int | IntoExprColumn) → Series[source]

Extremely fast method for extending the Series with ‘n’ copies of a value.

Parameters:

value

A constant literal value or a unit expression with which to extend the expression result Series; can pass None to extend with nulls.

n

The number of additional values that will be added.

Examples

s = pl.Series([1, 2, 3]) s.extend_constant(99, n=2) shape: (5,) Series: '' [i64] [ 1 2 3 99 99 ]

fill_nan(value: int | float | Expr | None) → Series[source]

Fill floating point NaN values with a fill value.

Parameters:

value

Value used to fill NaN values.

Notes

A NaN value is not the same as a null value. To fill null values, use fill_null().

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, float("nan")]) s.fill_nan(0) shape: (4,) Series: 'a' [f64] [ 1.0 2.0 3.0 0.0 ]

fill_null(

value: Any | Expr | None = None,

strategy: FillNullStrategy | None = None,

limit: int | None = None,

) → Series[source]

Fill null values using the specified value or strategy.

Parameters:

value

Value used to fill null values.

strategy{None, ‘forward’, ‘backward’, ‘min’, ‘max’, ‘mean’, ‘zero’, ‘one’}

Strategy used to fill null values.

limit

Number of consecutive null values to fill when using the ‘forward’ or ‘backward’ strategy.

Notes

A null value is not the same as a NaN value. To fill NaN values, use fill_nan().

Examples

s = pl.Series("a", [1, 2, 3, None]) s.fill_null(strategy="forward") shape: (4,) Series: 'a' [i64] [ 1 2 3 3 ] s.fill_null(strategy="min") shape: (4,) Series: 'a' [i64] [ 1 2 3 1 ] s = pl.Series("b", ["x", None, "z"]) s.fill_null(pl.lit("")) shape: (3,) Series: 'b' [str] [ "x" "" "z" ]

filter(predicate: Series | Iterable[bool]) → Self[source]

Filter elements by a boolean mask.

The original order of the remaining elements is preserved.

Elements where the filter does not evaluate to True are discarded, including nulls.

Parameters:

predicate

Boolean mask.

Examples

s = pl.Series("a", [1, 2, 3]) mask = pl.Series("", [True, False, True]) s.filter(mask) shape: (2,) Series: 'a' [i64] [ 1 3 ]

first() → PythonLiteral | None [source]

Get the first element of the Series.

Returns None if the Series is empty.

property flags: dict[str, bool][source]

Get flags that are set on the Series.

Examples

s = pl.Series("a", [1, 2, 3]) s.flags {'SORTED_ASC': False, 'SORTED_DESC': False}

floor() → Series[source]

Rounds down to the nearest integer value.

Only works on floating point Series.

Examples

s = pl.Series("a", [1.12345, 2.56789, 3.901234]) s.floor() shape: (3,) Series: 'a' [f64] [ 1.0 2.0 3.0 ]

forward_fill(limit: int | None = None) → Series[source]

Fill missing values with the last non-null value.

This is an alias of .fill_null(strategy="forward").

Parameters:

limit

The number of consecutive null values to forward fill.

gather(indices: int | list[int] | Expr | Series | np.ndarray[Any, Any]) → Series[source]

Take values by index.

Parameters:

indices

Index location used for selection.

Examples

s = pl.Series("a", [1, 2, 3, 4]) s.gather([1, 3]) shape: (2,) Series: 'a' [i64] [ 2 4 ]

gather_every(n: int, offset: int = 0) → Series[source]

Take every nth value in the Series and return as new Series.

Parameters:

n

Gather every n-th row.

offset

Start the row index at this offset.

Examples

s = pl.Series("a", [1, 2, 3, 4]) s.gather_every(2) shape: (2,) Series: 'a' [i64] [ 1 3 ] s.gather_every(2, offset=1) shape: (2,) Series: 'a' [i64] [ 2 4 ]

ge(other: Any) → Series | Expr[source]

Method equivalent of operator expression series >= other.

get_chunks() → list[Series][source]

Get the chunks of this Series as a list of Series.

Examples

s1 = pl.Series("a", [1, 2, 3]) s2 = pl.Series("a", [4, 5, 6]) s = pl.concat([s1, s2], rechunk=False) s.get_chunks() [shape: (3,) Series: 'a' [i64] [ 1 2 3 ], shape: (3,) Series: 'a' [i64] [ 4 5 6 ]]

gt(other: Any) → Series | Expr[source]

Method equivalent of operator expression series > other.

has_nulls() → bool [source]

Check whether the Series contains one or more null values.

Examples

s = pl.Series([1, 2, None]) s.has_nulls() True s[:2].has_nulls() False

has_validity() → bool [source]

Check whether the Series contains one or more null values.

Deprecated since version 0.20.30: Use the has_nulls() method instead.

hash(

seed: int = 0,

seed_1: int | None = None,

seed_2: int | None = None,

seed_3: int | None = None,

) → Series[source]

Hash the Series.

The hash value is of type UInt64.

Parameters:

seed

Random seed parameter. Defaults to 0.

seed_1

Random seed parameter. Defaults to seed if not set.

seed_2

Random seed parameter. Defaults to seed if not set.

seed_3

Random seed parameter. Defaults to seed if not set.

Notes

This implementation of hash does not guarantee stable results across different Polars versions. Its stability is only guaranteed within a single version.

Examples

s = pl.Series("a", [1, 2, 3]) s.hash(seed=42)
shape: (3,) Series: 'a' [u64] [ 10734580197236529959 3022416320763508302 13756996518000038261 ]

head(n: int = 10) → Series[source]

Get the first n elements.

Parameters:

n

Number of elements to return. If a negative value is passed, return all elements except the last abs(n).

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.head(3) shape: (3,) Series: 'a' [i64] [ 1 2 3 ]

Pass a negative value to get all rows except the last abs(n).

s.head(-3) shape: (2,) Series: 'a' [i64] [ 1 2 ]

hist(

bins: list[float] | None = None,

bin_count: int | None = None,

include_category: bool = True,

include_breakpoint: bool = True,

) → DataFrame[source]

Bin values into buckets and count their occurrences.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Parameters:

bins

Bin edges. If None given, we determine the edges based on the data.

bin_count

If bins is not provided, bin_count uniform bins are created that fully encompass the data.

include_breakpoint

Include a column that indicates the upper breakpoint.

include_category

Include a column that shows the intervals as categories.

Returns:

DataFrame

Examples

a = pl.Series("a", [1, 3, 8, 8, 2, 1, 3])
a.hist(bin_count=4)
shape: (4, 3)
┌────────────┬─────────────┬───────┐
│ breakpoint ┆ category    ┆ count │
│ ---        ┆ ---         ┆ ---   │
│ f64        ┆ cat         ┆ u32   │
╞════════════╪═════════════╪═══════╡
│ 2.75       ┆ [1.0, 2.75] ┆ 3     │
│ 4.5        ┆ (2.75, 4.5] ┆ 2     │
│ 6.25       ┆ (4.5, 6.25] ┆ 0     │
│ 8.0        ┆ (6.25, 8.0] ┆ 2     │
└────────────┴─────────────┴───────┘
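How bin_count turns into breakpoints and counts can be sketched in plain Python. This is an illustration that reproduces the example output above, assuming equal-width, right-closed bins with the minimum included in the first bin. uniform_hist is a hypothetical helper, not Polars' implementation.

```python
def uniform_hist(values, bin_count):
    """Count values in `bin_count` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count
    # Upper edge of each bin.
    breakpoints = [lo + width * (i + 1) for i in range(bin_count)]
    counts = [0] * bin_count
    for v in values:
        # Bins are right-closed; the first bin also includes the minimum.
        for i, upper in enumerate(breakpoints):
            if v <= upper:
                counts[i] += 1
                break
    return breakpoints, counts


breakpoints, counts = uniform_hist([1, 3, 8, 8, 2, 1, 3], bin_count=4)
print(breakpoints)  # [2.75, 4.5, 6.25, 8.0]
print(counts)       # [3, 2, 0, 2]
```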

implode() → Self[source]

Aggregate values into a list.

Examples

s = pl.Series("a", [1, 2, 3]) s.implode() shape: (1,) Series: 'a' [list[i64]] [ [1, 2, 3] ]

index_of(element: IntoExpr) → int | None [source]

Get the index of the first occurrence of a value, or None if it’s not found.

Parameters:

element

Value to find.

Examples

s = pl.Series("a", [1, None, 17]) s.index_of(17) 2 s.index_of(None) # search for a null 1 s.index_of(55) is None True

interpolate(method: InterpolationMethod = 'linear') → Series[source]

Interpolate intermediate values.

Nulls at the beginning and end of the series remain null.

Parameters:

method{‘linear’, ‘nearest’}

Interpolation method.

Examples

s = pl.Series("a", [1, 2, None, None, 5]) s.interpolate() shape: (5,) Series: 'a' [f64] [ 1.0 2.0 3.0 4.0 5.0 ]

interpolate_by(by: IntoExpr) → Series[source]

Interpolate intermediate values with x-coordinate based on another column.

Nulls at the beginning and end of the series remain null.

Parameters:

by

Column to interpolate values based on.

Examples

Fill null values using linear interpolation.

s = pl.Series([1, None, None, 3]) by = pl.Series([1, 2, 7, 8]) s.interpolate_by(by) shape: (4,) Series: '' [f64] [ 1.0 1.285714 2.714286 3.0 ]
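The numbers in this example follow from ordinary linear interpolation along the by coordinates. The plain-Python sketch below is not the Polars implementation; it assumes by is sorted and free of nulls, and the function name simply mirrors the method for readability.

```python
def interpolate_by(values, by):
    """Linearly interpolate interior None entries of `values`
    against the x-coordinates in `by`."""
    known = [i for i, v in enumerate(values) if v is not None]
    out = [None if v is None else float(v) for v in values]
    # Only gaps between two known values are filled, so leading and
    # trailing nulls stay None.
    for left, right in zip(known, known[1:]):
        x0, x1 = by[left], by[right]
        y0, y1 = values[left], values[right]
        for i in range(left + 1, right):
            out[i] = y0 + (y1 - y0) * (by[i] - x0) / (x1 - x0)
    return out


result = interpolate_by([1, None, None, 3], [1, 2, 7, 8])
print([round(v, 6) for v in result])  # [1.0, 1.285714, 2.714286, 3.0]
```

The second value is 1 + 2 * (2 - 1) / (8 - 1), which is why it sits much closer to 1 than a position-based interpolation would put it.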

is_between(

lower_bound: IntoExpr,

upper_bound: IntoExpr,

closed: ClosedInterval = 'both',

) → Series[source]

Get a boolean mask of the values that are between the given lower/upper bounds.

Parameters:

lower_bound

Lower bound value. Accepts expression input. Non-expression inputs (including strings) are parsed as literals.

upper_bound

Upper bound value. Accepts expression input. Non-expression inputs (including strings) are parsed as literals.

closed{‘both’, ‘left’, ‘right’, ‘none’}

Define which sides of the interval are closed (inclusive).

Notes

If the value of lower_bound is greater than that of upper_bound, then the result will be False, as no value can satisfy the condition.

Examples

s = pl.Series("num", [1, 2, 3, 4, 5]) s.is_between(2, 4) shape: (5,) Series: 'num' [bool] [ false true true true false ]

Use the closed argument to include or exclude the values at the bounds:

s.is_between(2, 4, closed="left") shape: (5,) Series: 'num' [bool] [ false true true false false ]

You can also use strings as well as numeric/temporal values:

s = pl.Series("s", ["a", "b", "c", "d", "e"]) s.is_between("b", "d", closed="both") shape: (5,) Series: 's' [bool] [ false true true true false ]
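The four values of closed correspond to four pairs of comparison operators. The following plain-Python sketch only illustrates the semantics described above; the table name _COMPARISONS and the list-based is_between are assumptions made for the example.

```python
import operator

# (comparison against lower bound, comparison against upper bound)
_COMPARISONS = {
    "both": (operator.ge, operator.le),
    "left": (operator.ge, operator.lt),
    "right": (operator.gt, operator.le),
    "none": (operator.gt, operator.lt),
}


def is_between(values, lower, upper, closed="both"):
    above, below = _COMPARISONS[closed]
    return [above(v, lower) and below(v, upper) for v in values]


nums = [1, 2, 3, 4, 5]
for closed in ("both", "left", "right", "none"):
    print(closed, is_between(nums, 2, 4, closed))

# A lower bound above the upper bound can never be satisfied.
print(is_between(nums, 4, 2))
```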

is_duplicated() → Series[source]

Get mask of all duplicated values.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series("a", [1, 2, 2, 3]) s.is_duplicated() shape: (4,) Series: 'a' [bool] [ false true true false ]

is_empty() → bool [source]

Check if the Series is empty.

Examples

s = pl.Series("a", [], dtype=pl.Float32) s.is_empty() True

is_finite() → Series[source]

Returns a boolean Series indicating which values are finite.

Returns:

Series

Series of data type Boolean.

Examples

import numpy as np s = pl.Series("a", [1.0, 2.0, np.inf]) s.is_finite() shape: (3,) Series: 'a' [bool] [ true true false ]

is_first_distinct() → Series[source]

Return a boolean mask indicating the first occurrence of each distinct value.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series([1, 1, 2, 3, 2]) s.is_first_distinct() shape: (5,) Series: '' [bool] [ true false true true false ]

is_in(other: Series | Collection[Any], *, nulls_equal: bool = False) → Series[source]

Check if elements of this Series are in the other Series.

Parameters:

nulls_equalbool, default False

If True, treat null as a distinct value. Null values will not propagate.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series("a", [1, 2, 3]) s2 = pl.Series("b", [2, 4, None]) s2.is_in(s) shape: (3,) Series: 'b' [bool] [ true false null ]

when nulls_equal=True, None is treated as a distinct value
s2.is_in(s, nulls_equal=True) shape: (3,) Series: 'b' [bool] [ true false false ]

check if some values are a member of sublists
sets = pl.Series("sets", [[1, 2, 3], [1, 2], [9, 10]]) optional_members = pl.Series("optional_members", [1, 2, 3]) print(sets) shape: (3,) Series: 'sets' [list[i64]] [ [1, 2, 3] [1, 2] [9, 10] ] print(optional_members) shape: (3,) Series: 'optional_members' [i64] [ 1 2 3 ] optional_members.is_in(sets) shape: (3,) Series: 'optional_members' [bool] [ true true false ]

is_infinite() → Series[source]

Returns a boolean Series indicating which values are infinite.

Returns:

Series

Series of data type Boolean.

Examples

import numpy as np s = pl.Series("a", [1.0, 2.0, np.inf]) s.is_infinite() shape: (3,) Series: 'a' [bool] [ false false true ]

is_last_distinct() → Series[source]

Return a boolean mask indicating the last occurrence of each distinct value.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series([1, 1, 2, 3, 2]) s.is_last_distinct() shape: (5,) Series: '' [bool] [ false true false true true ]
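The relation between is_first_distinct and is_last_distinct can be sketched in plain Python: the last occurrence of a value is its first occurrence when scanning from the end. This is an illustrative sketch on Python lists, not how Polars computes the masks.

```python
def is_first_distinct(values):
    seen = set()
    mask = []
    for v in values:
        # True only the first time a value is encountered.
        mask.append(v not in seen)
        seen.add(v)
    return mask


def is_last_distinct(values):
    # Scan the reversed input, then flip the mask back.
    return is_first_distinct(values[::-1])[::-1]


data = [1, 1, 2, 3, 2]
print(is_first_distinct(data))  # [True, False, True, True, False]
print(is_last_distinct(data))   # [False, True, False, True, True]
```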

is_nan() → Series[source]

Returns a boolean Series indicating which values are NaN.

Returns:

Series

Series of data type Boolean.

Examples

import numpy as np s = pl.Series("a", [1.0, 2.0, 3.0, np.nan]) s.is_nan() shape: (4,) Series: 'a' [bool] [ false false false true ]

is_not_nan() → Series[source]

Returns a boolean Series indicating which values are not NaN.

Returns:

Series

Series of data type Boolean.

Examples

import numpy as np s = pl.Series("a", [1.0, 2.0, 3.0, np.nan]) s.is_not_nan() shape: (4,) Series: 'a' [bool] [ true true true false ]

is_not_null() → Series[source]

Returns a boolean Series indicating which values are not null.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, None]) s.is_not_null() shape: (4,) Series: 'a' [bool] [ true true true false ]

is_null() → Series[source]

Returns a boolean Series indicating which values are null.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, None]) s.is_null() shape: (4,) Series: 'a' [bool] [ false false false true ]

is_sorted(*, descending: bool = False, nulls_last: bool = False) → bool [source]

Check if the Series is sorted.

Parameters:

descending

Check if the Series is sorted in descending order

nulls_last

Set nulls at the end of the Series in sorted check.

Examples

s = pl.Series([1, 3, 2]) s.is_sorted() False

s = pl.Series([3, 2, 1]) s.is_sorted(descending=True) True

is_unique() → Series[source]

Get mask of all unique values.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series("a", [1, 2, 2, 3]) s.is_unique() shape: (4,) Series: 'a' [bool] [ true false false true ]

item(index: int | None = None) → Any [source]

Return the Series as a scalar, or return the element at the given index.

If no index is provided, this is equivalent to s[0], with a check that the shape is (1,). With an index, this is equivalent to s[index].

Examples

s1 = pl.Series("a", [1]) s1.item() 1 s2 = pl.Series("a", [9, 8, 7]) s2.cum_sum().item(-1) 24

kurtosis(*, fisher: bool = True, bias: bool = True) → float | None [source]

Compute the kurtosis (Fisher or Pearson) of a dataset.

Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution. If bias is False, then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.

See scipy.stats for more information.

Parameters:

fisherbool, optional

If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).

biasbool, optional

If False, the calculations are corrected for statistical bias.

Examples

s = pl.Series("grades", [66, 79, 54, 97, 96, 70, 69, 85, 93, 75]) s.kurtosis() -1.0522623626787952 s.kurtosis(fisher=False) 1.9477376373212048 s.kurtosis(fisher=False, bias=False) 2.1040361802642717
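The definition above can be checked by hand. The sketch below computes the kurtosis from central moments in plain Python; the bias correction is written in the form used by scipy.stats.kurtosis, which is an assumption about the formula rather than a statement about Polars internals. It agrees with the documented values for the grades example.

```python
def kurtosis(values, *, fisher=True, bias=True):
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # biased variance
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    pearson = m4 / m2**2
    if not bias:
        # k-statistics correction (same form as scipy.stats.kurtosis).
        pearson = ((n**2 - 1) * pearson - 3 * (n - 1) ** 2) / ((n - 2) * (n - 3)) + 3
    # Fisher's definition subtracts 3 so that a normal distribution gives 0.
    return pearson - 3 if fisher else pearson


grades = [66, 79, 54, 97, 96, 70, 69, 85, 93, 75]
print(kurtosis(grades))
print(kurtosis(grades, fisher=False))
print(kurtosis(grades, fisher=False, bias=False))
```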

last() → PythonLiteral | None [source]

Get the last element of the Series.

Returns None if the Series is empty.

le(other: Any) → Series | Expr[source]

Method equivalent of operator expression series <= other.

len() → int [source]

Return the number of elements in the Series.

Null values count towards the total.

Examples

s = pl.Series("a", [1, 2, None]) s.len() 3

limit(n: int = 10) → Series[source]

Get the first n elements.

Alias for Series.head().

Parameters:

n

Number of elements to return. If a negative value is passed, return all elements except the last abs(n).

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.limit(3) shape: (3,) Series: 'a' [i64] [ 1 2 3 ]

Pass a negative value to get all rows except the last abs(n).

s.limit(-3) shape: (2,) Series: 'a' [i64] [ 1 2 ]

log(base: float = 2.718281828459045) → Series[source]

Compute the logarithm to a given base.

Examples

s = pl.Series([1, 2, 3]) s.log() shape: (3,) Series: '' [f64] [ 0.0 0.693147 1.098612 ]

log10() → Series[source]

Compute the base 10 logarithm of the input array, element-wise.

Examples

s = pl.Series([10, 100, 1000]) s.log10() shape: (3,) Series: '' [f64] [ 1.0 2.0 3.0 ]

log1p() → Series[source]

Compute the natural logarithm of the input array plus one, element-wise.

Examples

s = pl.Series([1, 2, 3]) s.log1p() shape: (3,) Series: '' [f64] [ 0.693147 1.098612 1.386294 ]

lower_bound() → Self[source]

Return the lower bound of this Series’ dtype as a unit Series.

See also

upper_bound

return the upper bound of the given Series’ dtype.

Examples

s = pl.Series("s", [-1, 0, 1], dtype=pl.Int32) s.lower_bound() shape: (1,) Series: 's' [i32] [ -2147483648 ]

s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float32) s.lower_bound() shape: (1,) Series: 's' [f32] [ -inf ]

lt(other: Any) → Series | Expr[source]

Method equivalent of operator expression series < other.

map_elements(

function: Callable[[Any], Any],

return_dtype: PolarsDataType | None = None,

skip_nulls: bool = True,

) → Self[source]

Map a custom/user-defined function (UDF) over elements in this Series.

Warning

This method is much slower than the native expressions API. Only use it if you cannot implement your logic otherwise.

Suppose that the function is: x ↦ sqrt(x):

For mapping elements of a series, consider: s.sqrt().
For mapping inner elements of lists, consider: s.list.eval(pl.element().sqrt()).
For mapping elements of struct fields, consider: s.struct.field("field_name").sqrt().

If the function returns a different datatype, the return_dtype arg should be set, otherwise the method will fail.

Implementing logic using a Python function is almost always significantly slower and more memory intensive than implementing the same logic using the native expression API because:

The native expression engine runs in Rust; UDFs run in Python.
Use of Python UDFs forces the DataFrame to be materialized in memory.
Polars-native expressions can be parallelised (UDFs typically cannot).
Polars-native expressions can be logically optimised (UDFs cannot).

Wherever possible you should strongly prefer the native expression API to achieve the best performance.

Parameters:

function

Custom function or lambda.

return_dtype

Output datatype. If not set, the dtype will be inferred based on the first non-null value that is returned by the function.

skip_nulls

Nulls are skipped and not passed to the Python function. This is faster because the Python call is avoided for those values and because more specialized functions can be used.

Returns:

Series

Warning

If return_dtype is not provided, this may lead to unexpected results. We allow this, but it is considered a bug in the user’s query.

Notes

If your function is expensive and you don’t want it to be called more than once for a given input, consider applying an @lru_cache decorator to it. If your data is suitable you may achieve significant speedups.

Examples

s = pl.Series("a", [1, 2, 3]) s.map_elements(lambda x: x + 10, return_dtype=pl.Int64)
shape: (3,) Series: 'a' [i64] [ 11 12 13 ]

max() → PythonLiteral | None [source]

Get the maximum value in this Series.

Examples

s = pl.Series("a", [1, 2, 3]) s.max() 3

mean() → PythonLiteral | None [source]

Reduce this Series to the mean value.

Examples

s = pl.Series("a", [1, 2, 3]) s.mean() 2.0

median() → PythonLiteral | None [source]

Get the median of this Series.

Examples

s = pl.Series("a", [1, 2, 3]) s.median() 2.0

min() → PythonLiteral | None [source]

Get the minimal value in this Series.

Examples

s = pl.Series("a", [1, 2, 3]) s.min() 1

mode() → Series[source]

Compute the most occurring value(s).

Can return multiple values.

Examples

s = pl.Series("a", [1, 2, 2, 3]) s.mode() shape: (1,) Series: 'a' [i64] [ 2 ]

n_chunks() → int [source]

Get the number of chunks that this Series contains.

Examples

s = pl.Series("a", [1, 2, 3]) s.n_chunks() 1 s2 = pl.Series("a", [4, 5, 6])

Concatenate Series with rechunk = True

pl.concat([s, s2], rechunk=True).n_chunks() 1

Concatenate Series with rechunk = False

pl.concat([s, s2], rechunk=False).n_chunks() 2

n_unique() → int [source]

Count the number of unique values in this Series.

Examples

s = pl.Series("a", [1, 2, 2, 3]) s.n_unique() 3

property name: str[source]

Get the name of this Series.

Examples

s = pl.Series("a", [1, 2, 3]) s.name 'a'

nan_max()

Get maximum value, but propagate/poison encountered NaN values.

This differs from NumPy’s nanmax, which ignores NaN values: NumPy’s max propagates NaN values by default, whereas Polars’ max ignores them, so the NaN-propagating variant is a separate method here.

Examples

s = pl.Series("a", [1, 3, 4]) s.nan_max() 4

s = pl.Series("a", [1.0, float("nan"), 4.0]) s.nan_max() nan

nan_min()

Get minimum value, but propagate/poison encountered NaN values.

This differs from NumPy’s nanmin, which ignores NaN values: NumPy’s min propagates NaN values by default, whereas Polars’ min ignores them, so the NaN-propagating variant is a separate method here.

Examples

s = pl.Series("a", [1, 3, 4]) s.nan_min() 1

s = pl.Series("a", [1.0, float("nan"), 4.0]) s.nan_min() nan

ne(other: Any) → Series | Expr[source]

Method equivalent of operator expression series != other.

ne_missing(other: Any) → Series | Expr[source]

Method equivalent of equality operator series != other where None == None.

This differs from the standard ne where null values are propagated.

Parameters:

other

A literal or expression value to compare with.

Examples

s1 = pl.Series("a", [333, 200, None]) s2 = pl.Series("a", [100, 200, None]) s1.ne(s2) shape: (3,) Series: 'a' [bool] [ true false null ] s1.ne_missing(s2) shape: (3,) Series: 'a' [bool] [ true false false ]

new_from_index(index: int, length: int) → Self[source]

Create a new Series filled with values from the given index.

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.new_from_index(1, 3) shape: (3,) Series: 'a' [i64] [ 2 2 2 ]

not_() → Series[source]

Negate a boolean Series.

Returns:

Series

Series of data type Boolean.

Examples

s = pl.Series("a", [True, False, False]) s.not_() shape: (3,) Series: 'a' [bool] [ false true true ]

null_count() → int [source]

Count the null values in this Series.

Examples

s = pl.Series([1, None, None]) s.null_count() 2

pct_change(n: int | IntoExprColumn = 1) → Series[source]

Computes percentage change between values.

Percentage change (as fraction) between current element and most-recent non-null element at least n period(s) before the current element.

Computes the change from the previous row by default.

Parameters:

n

Periods to shift for forming percent change.

Examples

pl.Series(range(10)).pct_change() shape: (10,) Series: '' [f64] [ null inf 1.0 0.5 0.333333 0.25 0.2 0.166667 0.142857 0.125 ]

pl.Series([1, 2, 4, 8, 16, 32, 64, 128, 256, 512]).pct_change(2) shape: (10,) Series: '' [f64] [ null null 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 ]
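The result is a fraction, not a percentage: 3.0 means the value is 300% larger than the one n periods earlier. A plain-Python sketch of the arithmetic, assuming no nulls and no zeros in the input (the list-based pct_change below is illustrative only):

```python
def pct_change(values, n=1):
    out = []
    for i, current in enumerate(values):
        if i < n:
            # No element n periods back yet.
            out.append(None)
        else:
            previous = values[i - n]
            out.append((current - previous) / previous)
    return out


print(pct_change([1, 2, 4, 8, 16], n=2))  # [None, None, 3.0, 3.0, 3.0]
print(pct_change([10, 11, 22]))           # [None, 0.1, 1.0]
```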

peak_max() → Self[source]

Get a boolean mask of the local maximum peaks.

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.peak_max() shape: (5,) Series: 'a' [bool] [ false false false false true ]

peak_min() → Self[source]

Get a boolean mask of the local minimum peaks.

Examples

s = pl.Series("a", [4, 1, 3, 2, 5]) s.peak_min() shape: (5,) Series: 'a' [bool] [ false true false true false ]

property plot_: SeriesPlot_[source]

Create a plot namespace.

Warning

This functionality is currently considered unstable. It may be changed at any point without it being considered a breaking change.

Changed in version 1.6.0: In prior versions of Polars, HvPlot was the plotting backend. If you would like to restore the previous plotting functionality, all you need to do is add import hvplot.polars at the top of your script and replace df.plot with df.hvplot.

Polars does not implement plotting logic itself, but instead defers to Altair:

s.plot.hist(**kwargs) is shorthand for alt.Chart(s.to_frame()).mark_bar(tooltip=True).encode(x=alt.X(f'{s.name}:Q', bin=True), y='count()', **kwargs).interactive()
s.plot.kde(**kwargs) is shorthand for alt.Chart(s.to_frame()).transform_density(s.name, as_=[s.name, 'density']).mark_area(tooltip=True).encode(x=s.name, y='density:Q', **kwargs).interactive()
for any other attribute attr, s.plot.attr(**kwargs) is shorthand for alt.Chart(s.to_frame().with_row_index()).mark_attr(tooltip=True).encode(x='index', y=s.name, **kwargs).interactive()

For configuration, we suggest reading Chart Configuration. For example, you can:

Change the width/height/title with .properties(width=500, height=350, title="My amazing plot").
Change the x-axis label rotation with .configure_axisX(labelAngle=30).
Change the opacity of the points in your scatter plot with .configure_point(opacity=.5).

Examples

Histogram:

s = pl.Series([1, 4, 4, 6, 2, 4, 3, 5, 5, 7, 1]) s.plot.hist()

KDE plot:

s.plot.kde()

Line plot:

s.plot.line()

pow(

exponent: int | float | Series,

) → Series[source]

Raise to the power of the given exponent.

If the exponent is a float, the result follows the dtype of the exponent. Otherwise, it follows the dtype of the base.

Parameters:

exponent

The exponent. Accepts Series input.

Examples

Raising integers to positive integers results in integers:

s = pl.Series("foo", [1, 2, 3, 4]) s.pow(3) shape: (4,) Series: 'foo' [i64] [ 1 8 27 64 ]

In order to raise integers to negative integers, you can cast either the base or the exponent to float:

s.pow(-3.0) shape: (4,) Series: 'foo' [f64] [ 1.0 0.125 0.037037 0.015625 ]

product() → int | float [source]

Reduce this Series to the product value.

Notes

If there are no non-null values, then the output is 1. If you would prefer empty products to return None, you can use s.product() if s.count() else None instead of s.product().

Examples

s = pl.Series("a", [1, 2, 3]) s.product() 6

qcut(

quantiles: Sequence[float] | int,

labels: Sequence[str] | None = None,

left_closed: bool = False,

allow_duplicates: bool = False,

include_breaks: bool = False,

) → Series[source]

Bin continuous values into discrete categories based on their quantiles.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Parameters:

quantiles

Either a list of quantile probabilities between 0 and 1 or a positive integer determining the number of bins with uniform probability.

labels

Names of the categories. The number of labels must be equal to the number of cut points plus one.

left_closed

Set the intervals to be left-closed instead of right-closed.

allow_duplicates

If set to True, duplicates in the resulting quantiles are dropped, rather than raising a DuplicateError. This can happen even with unique probabilities, depending on the data.

include_breaks

Include a column with the right endpoint of the bin each observation falls in. This will change the data type of the output from a Categorical to a Struct.

Returns:

Series

Series of data type Categorical if include_breaks is set to False (default), otherwise a Series of data type Struct.

Examples

Divide a column into three categories according to pre-defined quantile probabilities.

s = pl.Series("foo", [-2, -1, 0, 1, 2]) s.qcut([0.25, 0.75], labels=["a", "b", "c"]) shape: (5,) Series: 'foo' [cat] [ "a" "a" "b" "b" "c" ]

Divide a column into two categories using uniform quantile probabilities.

s.qcut(2, labels=["low", "high"], left_closed=True) shape: (5,) Series: 'foo' [cat] [ "low" "low" "high" "high" "high" ]

Create a DataFrame with the breakpoint and category for each value.

cut = s.qcut([0.25, 0.75], include_breaks=True).alias("cut")
s.to_frame().with_columns(cut).unnest("cut")
shape: (5, 3)
┌─────┬────────────┬────────────┐
│ foo ┆ breakpoint ┆ category   │
│ --- ┆ ---        ┆ ---        │
│ i64 ┆ f64        ┆ cat        │
╞═════╪════════════╪════════════╡
│ -2  ┆ -1.0       ┆ (-inf, -1] │
│ -1  ┆ -1.0       ┆ (-inf, -1] │
│ 0   ┆ 1.0        ┆ (-1, 1]    │
│ 1   ┆ 1.0        ┆ (-1, 1]    │
│ 2   ┆ inf        ┆ (1, inf]   │
└─────┴────────────┴────────────┘

quantile(

quantile: float,

interpolation: QuantileMethod = 'nearest',

) → float | None [source]

Get the quantile value of this Series.

Parameters:

quantile

Quantile between 0.0 and 1.0.

interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’, ‘equiprobable’}

Interpolation method.

Examples

s = pl.Series("a", [1, 2, 3]) s.quantile(0.5) 2.0

rank(

method: RankMethod = 'average',

descending: bool = False,

seed: int | None = None,

) → Series[source]

Assign ranks to data, dealing with ties appropriately.

Parameters:

method{‘average’, ‘min’, ‘max’, ‘dense’, ‘ordinal’, ‘random’}

The method used to assign ranks to tied elements. The following methods are available (default is ‘average’):

‘average’ : The average of the ranks that would have been assigned to all the tied values is assigned to each value.
‘min’ : The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)
‘max’ : The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.
‘dense’ : Like ‘min’, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.
‘ordinal’ : All values are given a distinct rank, corresponding to the order that the values occur in the Series.
‘random’ : Like ‘ordinal’, but the rank for ties is not dependent on the order that the values occur in the Series.

descending

Rank in descending order.

seed

If method="random", use this as seed.

Examples

The ‘average’ method:

s = pl.Series("a", [3, 6, 1, 1, 6]) s.rank() shape: (5,) Series: 'a' [f64] [ 3.0 4.5 1.5 1.5 4.5 ]

The ‘ordinal’ method:

s = pl.Series("a", [3, 6, 1, 1, 6]) s.rank("ordinal") shape: (5,) Series: 'a' [u32] [ 3 4 1 2 5 ]
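The tie-breaking rules listed under method can be compared side by side with a small plain-Python sketch. It follows the descriptions above and reproduces the two documented examples; it is not the Polars implementation, and the 'random' method is left out.

```python
def rank(values, method="average"):
    # Stable sort: equal values keep their order of occurrence.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ordinal = [0] * len(values)
    for position, i in enumerate(order, start=1):
        ordinal[i] = position
    if method == "ordinal":
        return ordinal

    distinct = sorted(set(values))
    ranks = []
    for v in values:
        # Ordinal ranks occupied by all elements tied with v.
        tied = [ordinal[j] for j, w in enumerate(values) if w == v]
        if method == "average":
            ranks.append(sum(tied) / len(tied))
        elif method == "min":
            ranks.append(min(tied))
        elif method == "max":
            ranks.append(max(tied))
        elif method == "dense":
            ranks.append(distinct.index(v) + 1)
        else:
            raise ValueError(f"unsupported method: {method}")
    return ranks


data = [3, 6, 1, 1, 6]
for method in ("average", "min", "max", "dense", "ordinal"):
    print(method, rank(data, method))
```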

rechunk(*, in_place: bool = False) → Self[source]

Create a single chunk of memory for this Series.

Parameters:

in_place

In place or not.

Examples

s1 = pl.Series("a", [1, 2, 3]) s1.n_chunks() 1 s2 = pl.Series("a", [4, 5, 6]) s = pl.concat([s1, s2], rechunk=False) s.n_chunks() 2 s.rechunk(in_place=True) shape: (6,) Series: 'a' [i64] [ 1 2 3 4 5 6 ] s.n_chunks() 1

reinterpret(*, signed: bool = True) → Series[source]

Reinterpret the underlying bits as a signed/unsigned integer.

This operation is only allowed for 64-bit integers. For integers with fewer bits, you can safely use the cast operation.

Parameters:

signed

If True, reinterpret as pl.Int64. Otherwise, reinterpret as pl.UInt64.

Examples

s = pl.Series("a", [-(2**60), -2, 3]) s shape: (3,) Series: 'a' [i64] [ -1152921504606846976 -2 3 ] s.reinterpret(signed=False) shape: (3,) Series: 'a' [u64] [ 17293822569102704640 18446744073709551614 3 ]
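Reinterpreting keeps the 64 bits unchanged and only changes how they are read. The values in the example can be reproduced with the standard library's struct module; reinterpret_as_unsigned is a hypothetical helper used only to illustrate the idea.

```python
import struct


def reinterpret_as_unsigned(value: int) -> int:
    """Read the 8 bytes of a signed 64-bit integer as an unsigned one."""
    return struct.unpack("<Q", struct.pack("<q", value))[0]


for v in (-(2**60), -2, 3):
    print(v, "->", reinterpret_as_unsigned(v))
```

For negative inputs this is the same as adding 2**64, which is why -2 becomes 18446744073709551614.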

rename(name: str) → Series[source]

Rename this Series.

Alias for Series.alias().

Parameters:

name

New name.

Examples

s = pl.Series("a", [1, 2, 3]) s.rename("b") shape: (3,) Series: 'b' [i64] [ 1 2 3 ]

repeat_by(by: int | IntoExprColumn) → Self[source]

Repeat the elements in this Series as specified in the given expression.

The repeated elements are expanded into a List.

Parameters:

by

Numeric column that determines how often the values will be repeated. The column will be coerced to UInt32. Give this dtype to make the coercion a no-op.

Returns:

Series

Series of data type List, where the inner data type is equal to the original data type.

replace(old, new, default, return_dtype)

Replace values by different values of the same data type.

Parameters:

old

Value or sequence of values to replace. Also accepts a mapping of values to their replacement as syntactic sugar for replace(old=Series(mapping.keys()), new=Series(mapping.values())).

new

Value or sequence of values to replace by. Length must match the length of old or have length 1.

default

Set values that were not replaced to this value. Defaults to keeping the original value. Accepts expression input. Non-expression inputs are parsed as literals.

Deprecated since version 0.20.31: Use replace_strict() instead to set a default while replacing values.

return_dtype

The data type of the resulting expression. If set to None (default), the data type is determined automatically based on the other inputs.

Deprecated since version 0.20.31: Use replace_all() instead to set a return data type while replacing values.

Notes

The global string cache must be enabled when replacing categorical values.

Examples

Replace a single value by another value. Values that were not replaced remain unchanged.

s = pl.Series([1, 2, 2, 3]) s.replace(2, 100) shape: (4,) Series: '' [i64] [ 1 100 100 3 ]

Replace multiple values by passing sequences to the old and new parameters.

s.replace([2, 3], [100, 200]) shape: (4,) Series: '' [i64] [ 1 100 100 200 ]

Passing a mapping with replacements is also supported as syntactic sugar.

mapping = {2: 100, 3: 200} s.replace(mapping) shape: (4,) Series: '' [i64] [ 1 100 100 200 ]

The original data type is preserved when replacing by values of a different data type. Use replace_strict() to replace and change the return data type.

s = pl.Series(["x", "y", "z"]) mapping = {"x": 1, "y": 2, "z": 3} s.replace(mapping) shape: (3,) Series: '' [str] [ "1" "2" "3" ]

Replace all values by different values.

Parameters:

old

Value or sequence of values to replace. Also accepts a mapping of values to their replacement as syntactic sugar for replace_strict(old=Series(mapping.keys()), new=Series(mapping.values())).

new

Value or sequence of values to replace by. Length must match the length of old or have length 1.

default

Set values that were not replaced to this value. If no default is specified (the default), an error is raised if any values were not replaced. Accepts expression input. Non-expression inputs are parsed as literals.

return_dtype

The data type of the resulting Series. If set to None (default), the data type is determined automatically based on the other inputs.

Raises:

InvalidOperationError

If any non-null values in the original column were not replaced, and no default was specified.

Notes

The global string cache must be enabled when replacing categorical values.

Examples

Replace values by passing sequences to the old and new parameters.

s = pl.Series([1, 2, 2, 3]) s.replace_strict([1, 2, 3], [100, 200, 300]) shape: (4,) Series: '' [i64] [ 100 200 200 300 ]

Passing a mapping with replacements is also supported as syntactic sugar.

mapping = {1: 100, 2: 200, 3: 300} s.replace_strict(mapping) shape: (4,) Series: '' [i64] [ 100 200 200 300 ]

By default, an error is raised if any non-null values were not replaced. Specify a default to set all values that were not matched.

mapping = {2: 200, 3: 300} s.replace_strict(mapping)
Traceback (most recent call last): ... polars.exceptions.InvalidOperationError: incomplete mapping specified for replace_strict s.replace_strict(mapping, default=-1) shape: (4,) Series: '' [i64] [ -1 200 200 300 ]

The default can be another Series.

default = pl.Series([2.5, 5.0, 7.5, 10.0]) s.replace_strict(2, 200, default=default) shape: (4,) Series: '' [f64] [ 2.5 200.0 200.0 10.0 ]

Replacing by values of a different data type sets the return type based on a combination of the new data type and the default data type.

s = pl.Series(["x", "y", "z"]) mapping = {"x": 1, "y": 2, "z": 3} s.replace_strict(mapping) shape: (3,) Series: '' [i64] [ 1 2 3 ] s.replace_strict(mapping, default="x") shape: (3,) Series: '' [str] [ "1" "2" "3" ]

Set the return_dtype parameter to control the resulting data type directly.

s.replace_strict(mapping, return_dtype=pl.UInt8) shape: (3,) Series: '' [u8] [ 1 2 3 ]
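The difference between replace and replace_strict for unmapped values can be sketched in plain Python. This is an illustration of the documented semantics only, not Polars code: the functions below are hypothetical list-based stand-ins, they assume non-null inputs, and they do not model data type handling.

```python
_MISSING = object()  # sentinel meaning "no default was given"


def replace(values, mapping):
    """Swap mapped values; leave everything else untouched."""
    return [mapping.get(v, v) for v in values]


def replace_strict(values, mapping, default=_MISSING):
    """Swap mapped values; unmapped values take `default` or raise."""
    result = []
    for v in values:
        if v in mapping:
            result.append(mapping[v])
        elif default is not _MISSING:
            result.append(default)
        else:
            raise ValueError(f"incomplete mapping: no replacement for {v!r}")
    return result


data = [1, 2, 2, 3]
print(replace(data, {2: 100, 3: 200}))                    # unmapped 1 is kept
print(replace_strict(data, {2: 200, 3: 300}, default=-1))  # unmapped 1 becomes -1
```

Calling `replace_strict(data, {2: 200, 3: 300})` without a default raises in this sketch, mirroring the InvalidOperationError described above.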

reshape(dimensions: tuple[int, ...]) → Series[source]

Reshape this Series to a flat Series or an Array Series.

Parameters:

dimensions

Tuple of the dimension sizes. If a -1 is used in any of the dimensions, that dimension is inferred.

Returns:

Series

If a single dimension is given, results in a Series of the original data type. If multiple dimensions are given, results in a Series of data type Array with shape dimensions.

Examples

s = pl.Series("foo", [1, 2, 3, 4, 5, 6, 7, 8, 9]) square = s.reshape((3, 3)) square shape: (3,) Series: 'foo' [array[i64, 3]] [ [1, 2, 3] [4, 5, 6] [7, 8, 9] ] square.reshape((9,)) shape: (9,) Series: 'foo' [i64] [ 1 2 3 4 5 6 7 8 9 ]

reverse() → Series[source]

Return Series in reverse order.

Examples

s = pl.Series("a", [1, 2, 3], dtype=pl.Int8) s.reverse() shape: (3,) Series: 'a' [i8] [ 3 2 1 ]

rle() → Series[source]

Compress the Series data using run-length encoding.

Run-length encoding (RLE) encodes data by storing each run of identical values as a single value and its length.

Returns:

Series

Series of data type Struct with fields len of data type UInt32 and value of the original data type.

Examples

s = pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3])
s.rle().struct.unnest()
shape: (6, 2)
┌─────┬───────┐
│ len ┆ value │
│ --- ┆ ---   │
│ u32 ┆ i64   │
╞═════╪═══════╡
│ 2   ┆ 1     │
│ 1   ┆ 2     │
│ 1   ┆ 1     │
│ 1   ┆ null  │
│ 1   ┆ 1     │
│ 2   ┆ 3     │
└─────┴───────┘
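Run-length encoding itself is easy to sketch with the standard library. The function `run_length_encode` below is a hypothetical pure-Python illustration of the technique, not the Polars implementation; it uses `None` in place of a null value.

```python
from itertools import groupby


def run_length_encode(values):
    """Return (len, value) pairs, one per run of consecutive equal values."""
    return [(sum(1 for _ in run), value) for value, run in groupby(values)]


runs = run_length_encode([1, 1, 2, 1, None, 1, 3, 3])
print(runs)
```

Each pair corresponds to one row of the unnested struct in the example above.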

rle_id() → Series[source]

Get a distinct integer ID for each run of identical values.

The ID starts at 0 and increases by one each time the value of the column changes.

Returns:

Series

Series of data type UInt32.

Notes

This functionality is especially useful for defining a new group for every time a column’s value changes, rather than for every distinct value of that column.

Examples

s = pl.Series("s", [1, 1, 2, 1, None, 1, 3, 3]) s.rle_id() shape: (8,) Series: 's' [u32] [ 0 0 1 2 3 4 5 5 ]
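The ID assignment can be sketched as a single pass that increments a counter whenever the value changes. The helper `run_ids` is hypothetical and works on a plain list, with `None` standing in for null.

```python
def run_ids(values):
    """Assign an ID to each element that increases whenever the value changes."""
    ids = []
    current = -1
    previous = object()  # sentinel that compares unequal to every real value
    for value in values:
        if value != previous:
            current += 1
            previous = value
        ids.append(current)
    return ids


ids = run_ids([1, 1, 2, 1, None, 1, 3, 3])
print(ids)
```

Note that the value 1 receives three different IDs (0, 2 and 4) because it appears in three separate runs, which is what distinguishes this from grouping by distinct value.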

rolling_kurtosis(

window_size: int,

fisher: bool = True,

bias: bool = True,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Compute a rolling kurtosis.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Parameters:

window_size

Integer size of the rolling window.

fisherbool, optional

If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).

biasbool, optional

If False, the calculations are corrected for statistical bias.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

pl.Series([1, 4, 2, 9]).rolling_kurtosis(3) shape: (4,) Series: '' [f64] [ null null -1.5 -1.5 ]

rolling_map(

function: Callable[[Series], Any],

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Compute a custom rolling window function.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

function

Custom aggregation function.

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Warning

Computing custom functions is extremely slow. Use specialized rolling functions such as Series.rolling_sum() if at all possible.

Examples

from numpy import nansum s = pl.Series([11.0, 2.0, 9.0, float("nan"), 8.0]) s.rolling_map(nansum, window_size=3) shape: (5,) Series: '' [f64] [ null null 22.0 11.0 17.0 ]

rolling_max(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Apply a rolling max (moving max) over the values in this array.

A window of length window_size will traverse the array. The values that fill this window will (optionally) be multiplied with the weights given by the weights vector. The resulting values will be aggregated to their max.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

s = pl.Series("a", [100, 200, 300, 400, 500]) s.rolling_max(window_size=2) shape: (5,) Series: 'a' [i64] [ null 200 300 400 500 ]
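The trailing-window behaviour and the role of min_samples can be sketched in plain Python. The function `rolling` below is a hypothetical illustration working on lists, with `None` standing in for null; it ignores the weights and center parameters and is not how Polars computes rolling aggregations.

```python
def rolling(values, window_size, aggregate, min_samples=None):
    """Apply `aggregate` to the row itself plus the window_size - 1 rows before it."""
    if min_samples is None:
        min_samples = window_size  # same default as documented above
    out = []
    for i in range(len(values)):
        start = max(0, i - window_size + 1)
        # Only non-null values count towards min_samples.
        window = [v for v in values[start : i + 1] if v is not None]
        out.append(aggregate(window) if len(window) >= min_samples else None)
    return out


data = [100, 200, 300, 400, 500]
print(rolling(data, 2, max))                 # first row lacks a full window
print(rolling(data, 2, max, min_samples=1))  # a single value is now enough
```

With the default min_samples, the first window_size - 1 results are null because their windows are incomplete; lowering min_samples fills them in.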

rolling_mean(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Apply a rolling mean (moving mean) over the values in this array.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

s = pl.Series("a", [100, 200, 300, 400, 500]) s.rolling_mean(window_size=2) shape: (5,) Series: 'a' [f64] [ null 150.0 250.0 350.0 450.0 ]

rolling_median(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Compute a rolling median.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]) s.rolling_median(window_size=3) shape: (6,) Series: 'a' [f64] [ null null 2.0 3.0 4.0 6.0 ]

rolling_min(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Apply a rolling min (moving min) over the values in this array.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

s = pl.Series("a", [100, 200, 300, 400, 500]) s.rolling_min(window_size=3) shape: (5,) Series: 'a' [i64] [ null null 100 200 300 ]

rolling_quantile(

quantile: float,

interpolation: QuantileMethod = 'nearest',

window_size: int = 2,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Compute a rolling quantile.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

quantile

Quantile between 0.0 and 1.0.

interpolation{‘nearest’, ‘higher’, ‘lower’, ‘midpoint’, ‘linear’, ‘equiprobable’}

Interpolation method.

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]) s.rolling_quantile(quantile=0.33, window_size=3) shape: (6,) Series: 'a' [f64] [ null null 1.0 2.0 3.0 4.0 ] s.rolling_quantile(quantile=0.33, interpolation="linear", window_size=3) shape: (6,) Series: 'a' [f64] [ null null 1.66 2.66 3.66 5.32 ]
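The "linear" results in the example can be reproduced by interpolating between the two nearest order statistics of each window. The sketch below is a hypothetical pure-Python helper, `linear_quantile`, that matches the values shown above; it makes no claim about the other interpolation methods or about Polars' internal code path.

```python
import math


def linear_quantile(window, quantile):
    """Interpolate linearly between the two order statistics around the quantile."""
    ordered = sorted(window)
    position = quantile * (len(ordered) - 1)  # fractional index into the sorted window
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


print(linear_quantile([1.0, 2.0, 3.0], 0.33))  # first full window
print(linear_quantile([4.0, 6.0, 8.0], 0.33))  # last window
```

For the window [1.0, 2.0, 3.0] the fractional index is 0.33 * 2 = 0.66, so the result lies 66% of the way from 1.0 to 2.0.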

rolling_skew(

window_size: int,

bias: bool = True,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Compute a rolling skew.

Warning

This functionality is considered unstable. It may be changed at any point without it being considered a breaking change.

The window at a given row includes the row itself and the window_size - 1 elements before it.

Parameters:

window_size

Integer size of the rolling window.

bias

If False, the calculations are corrected for statistical bias.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

pl.Series([1, 4, 2, 9]).rolling_skew(3) shape: (4,) Series: '' [f64] [ null null 0.381802 0.47033 ]

Note how the values match

pl.Series([1, 4, 2]).skew(), pl.Series([4, 2, 9]).skew() (0.38180177416060584, 0.47033046033698594)

rolling_std(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

ddof: int = 1,

) → Series[source]

Compute a rolling std dev.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

ddof

“Delta Degrees of Freedom”: The divisor for a length N window is N - ddof

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]) s.rolling_std(window_size=3) shape: (6,) Series: 'a' [f64] [ null null 1.0 1.0 1.527525 2.0 ]

rolling_sum(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

) → Series[source]

Apply a rolling sum (moving sum) over the values in this array.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.rolling_sum(window_size=2) shape: (5,) Series: 'a' [i64] [ null 3 5 7 9 ]

rolling_var(

window_size: int,

weights: list[float] | None = None,

min_samples: int | None = None,

center: bool = False,

ddof: int = 1,

) → Series[source]

Compute a rolling variance.

The window at a given row will include the row itself and the window_size - 1 elements before it.

Changed in version 1.21.0: The min_periods parameter was renamed min_samples.

Parameters:

window_size

The length of the window in number of elements.

weights

An optional slice with the same length as the window that will be multiplied elementwise with the values in the window.

min_samples

The number of values in the window that should be non-null before computing a result. If set to None (default), it will be set equal to window_size.

center

Set the labels at the center of the window.

ddof

“Delta Degrees of Freedom”: The divisor for a length N window is N - ddof

Examples

s = pl.Series("a", [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]) s.rolling_var(window_size=3) shape: (6,) Series: 'a' [f64] [ null null 1.0 1.0 2.333333 4.0 ]
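The effect of ddof on a single window can be checked by hand. The helper `variance` below is a hypothetical pure-Python sketch of the N - ddof divisor, applied to individual windows from the example above; it is not the Polars implementation.

```python
import math


def variance(window, ddof=1):
    """Sum of squared deviations from the mean, divided by N - ddof."""
    n = len(window)
    mean = sum(window) / n
    return sum((x - mean) ** 2 for x in window) / (n - ddof)


print(variance([3.0, 4.0, 6.0]))             # fifth window of the example
print(math.sqrt(variance([3.0, 4.0, 6.0])))  # the matching rolling standard deviation
print(variance([3.0, 4.0, 6.0], ddof=0))     # population variance divides by N
```

With ddof=1 the divisor for a window of three values is 2; with ddof=0 it is 3, giving a smaller result.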

round(decimals: int = 0, mode: RoundMode = 'half_to_even') → Series[source]

Round underlying floating point data by decimals digits.

The default rounding mode is “half to even” (also known as “bankers’ rounding”).

Parameters:

decimals

Number of decimals to round by.

mode{‘half_to_even’, ‘half_away_from_zero’}

Rounding mode.

Examples

s = pl.Series("a", [1.12345, 2.56789, 3.901234]) s.round(2) shape: (3,) Series: 'a' [f64] [ 1.12 2.57 3.9 ]

s = pl.Series([-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]) s.round(mode="half_to_even") shape: (8,) Series: '' [f64] [ -4.0 -2.0 -2.0 -0.0 0.0 2.0 2.0 4.0 ]
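The two rounding modes differ only on exact ties. The sketch below illustrates this with the standard library decimal module; `round_value` is a hypothetical helper, and converting through `str(x)` is an assumption made here to keep ties exact in decimal, so results may differ from a float-based implementation for values that are not exactly representable.

```python
from decimal import ROUND_HALF_EVEN, ROUND_HALF_UP, Decimal

_MODES = {
    "half_to_even": ROUND_HALF_EVEN,
    # decimal's ROUND_HALF_UP sends ties away from zero.
    "half_away_from_zero": ROUND_HALF_UP,
}


def round_value(x, decimals=0, mode="half_to_even"):
    """Round x to `decimals` digits using the named tie-breaking mode."""
    step = Decimal(1).scaleb(-decimals)  # 1, 0.1, 0.01, ...
    return float(Decimal(str(x)).quantize(step, rounding=_MODES[mode]))


ties = [-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
half_even = [round_value(x) for x in ties]
half_away = [round_value(x, mode="half_away_from_zero") for x in ties]
print(half_even)
print(half_away)
```

Under "half to even", 2.5 rounds to 2.0 and 3.5 rounds to 4.0; under "half away from zero", they round to 3.0 and 4.0.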

round_sig_figs(digits: int) → Series[source]

Round to a number of significant figures.

Parameters:

digits

Number of significant figures to round to.

Examples

s = pl.Series([0.01234, 3.333, 3450.0]) s.round_sig_figs(2) shape: (3,) Series: '' [f64] [ 0.012 3.3 3500.0 ]

sample(

n: int | None = None,

fraction: float | None = None,

with_replacement: bool = False,

shuffle: bool = False,

seed: int | None = None,

) → Series[source]

Sample from this Series.

Parameters:

n

Number of items to return. Cannot be used with fraction. Defaults to 1 if fraction is None.

fraction

Fraction of items to return. Cannot be used with n.

with_replacement

Allow values to be sampled more than once.

shuffle

Shuffle the order of sampled data points.

seed

Seed for the random number generator. If set to None (default), a random seed is generated for each sample operation.

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.sample(2, seed=0)
shape: (2,) Series: 'a' [i64] [ 1 5 ]

scatter(

indices: Series | Iterable[int] | int | np.ndarray[Any, Any],

values: Series | Iterable[PythonLiteral] | PythonLiteral | None,

) → Series[source]

Set values at the index locations.

Parameters:

indices

Integers representing the index locations.

values

Replacement values.

Notes

Use of this function is frequently an anti-pattern, as it can block optimization (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

Examples

s = pl.Series("a", [1, 2, 3]) s.scatter(1, 10) shape: (3,) Series: 'a' [i64] [ 1 10 3 ]

It is better to implement this as follows:

s.to_frame().with_row_index().select(
    pl.when(pl.col("index") == 1).then(10).otherwise(pl.col("a"))
)
shape: (3, 1)
┌─────────┐
│ literal │
│ ---     │
│ i64     │
╞═════════╡
│ 1       │
│ 10      │
│ 3       │
└─────────┘

search_sorted(

element: IntoExpr | np.ndarray[Any, Any] | None,

side: SearchSortedSide = 'any',

descending: bool = False,

) → int | Series[source]

Find indices where elements should be inserted to maintain order.

\[a[i-1] < v <= a[i]\]

Parameters:

element

Expression or scalar value.

side{‘any’, ‘left’, ‘right’}

If ‘any’, the index of the first suitable location found is given. If ‘left’, the index of the leftmost suitable location found is given. If ‘right’, the index of the rightmost suitable location found is given.

descending

Boolean indicating whether the values are descending or not (they are required to be sorted either way).

Examples

s = pl.Series("set", [1, 2, 3, 4, 4, 5, 6, 7]) s.search_sorted(4) 3 s.search_sorted(4, "left") 3 s.search_sorted(4, "right") 5 s.search_sorted([1, 4, 5]) shape: (3,) Series: 'set' [u32] [ 0 3 5 ] s.search_sorted([1, 4, 5], "left") shape: (3,) Series: 'set' [u32] [ 0 3 5 ] s.search_sorted([1, 4, 5], "right") shape: (3,) Series: 'set' [u32] [ 1 5 6 ]
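The "left" and "right" sides correspond to the two classic binary-search variants. As a point of comparison outside Polars, the standard library bisect module gives the same indices for the ascending data in the example; this is an analogy, not a statement about how Polars implements the search.

```python
from bisect import bisect_left, bisect_right

sorted_values = [1, 2, 3, 4, 4, 5, 6, 7]
queries = [1, 4, 5]

# Leftmost insertion point: before any existing equal elements.
left = [bisect_left(sorted_values, q) for q in queries]
# Rightmost insertion point: after any existing equal elements.
right = [bisect_right(sorted_values, q) for q in queries]

print(left)
print(right)
```

The two sides only differ when the searched value is already present: for 4, which occurs twice, the leftmost position is 3 and the rightmost is 5.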

set(

filter: Series,

value: int | float | str | bool | None,

) → Series[source]

Set masked values.

Parameters:

filter

Boolean mask.

value

Value with which to replace the masked values.

Notes

Use of this function is frequently an anti-pattern, as it can block optimisation (predicate pushdown, etc). Consider using pl.when(predicate).then(value).otherwise(self) instead.

Examples

s = pl.Series("a", [1, 2, 3]) s.set(s == 2, 10) shape: (3,) Series: 'a' [i64] [ 1 10 3 ]

It is better to implement this as follows:

s.to_frame().select(
    pl.when(pl.col("a") == 2).then(10).otherwise(pl.col("a"))
)
shape: (3, 1)
┌─────────┐
│ literal │
│ ---     │
│ i64     │
╞═════════╡
│ 1       │
│ 10      │
│ 3       │
└─────────┘

set_sorted(*, descending: bool = False) → Self[source]

Flags the Series as ‘sorted’.

Enables downstream code to use fast paths for sorted arrays.

Parameters:

descending

If the Series order is descending.

Warning

This can lead to incorrect results if this Series is not sorted!! Use with care!

Examples

s = pl.Series("a", [1, 2, 3]) s.set_sorted().max() 3

property shape: tuple[int][source]

Shape of this Series.

Examples

s = pl.Series("a", [1, 2, 3]) s.shape (3,)

shift(n: int = 1, *, fill_value: IntoExpr | None = None) → Series[source]

Shift values by the given number of indices.

Parameters:

n

Number of indices to shift forward. If a negative value is passed, values are shifted in the opposite direction instead.

fill_value

Fill the resulting null values with this value. Accepts scalar expression input. Non-expression inputs are parsed as literals.

Notes

This method is similar to the LAG operation in SQL when the value for n is positive. With a negative value for n, it is similar to LEAD.

Examples

By default, values are shifted forward by one index.

s = pl.Series([1, 2, 3, 4]) s.shift() shape: (4,) Series: '' [i64] [ null 1 2 3 ]

Pass a negative value to shift in the opposite direction instead.

s.shift(-2) shape: (4,) Series: '' [i64] [ 3 4 null null ]

Specify fill_value to fill the resulting null values.

s.shift(-2, fill_value=100) shape: (4,) Series: '' [i64] [ 3 4 100 100 ]
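The shifting rule can be sketched on a plain list: positive n pushes values towards the end and pads the front, negative n does the reverse. The function `shift` below is a hypothetical stand-in using `None` for null, not the Polars implementation.

```python
def shift(values, n=1, fill_value=None):
    """Shift values by n positions, padding the vacated slots with fill_value."""
    length = len(values)
    if n >= 0:
        k = min(n, length)
        return [fill_value] * k + values[: length - k]
    k = min(-n, length)
    return values[k:] + [fill_value] * k


data = [1, 2, 3, 4]
print(shift(data))                      # LAG-like: previous value on each row
print(shift(data, -2))                  # LEAD-like: value two rows ahead
print(shift(data, -2, fill_value=100))
```

The output always has the same length as the input; values shifted past either end are dropped.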

shrink_dtype() → Series[source]

Shrink numeric columns to the minimal required datatype.

Shrink to the dtype needed to fit the extrema of this Series. This can be used to reduce memory pressure.

Examples

s = pl.Series("a", [1, 2, 3, 4, 5, 6]) s shape: (6,) Series: 'a' [i64] [ 1 2 3 4 5 6 ] s.shrink_dtype() shape: (6,) Series: 'a' [i8] [ 1 2 3 4 5 6 ]
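The idea of fitting a data type to the extrema can be sketched as a range check against each candidate width. The sketch below is hypothetical and only covers signed integer types; the helper name and the candidate list are assumptions for illustration, and the exact rules Polars applies to other types are not modelled here.

```python
# Candidate signed integer types, narrowest first (illustrative list).
SIGNED_TYPES = [("Int8", 8), ("Int16", 16), ("Int32", 32), ("Int64", 64)]


def smallest_signed_type(values):
    """Return the narrowest signed type whose range contains min and max."""
    lowest, highest = min(values), max(values)
    for name, bits in SIGNED_TYPES:
        if -(1 << (bits - 1)) <= lowest and highest <= (1 << (bits - 1)) - 1:
            return name
    raise OverflowError("values do not fit in a 64-bit signed integer")


print(smallest_signed_type([1, 2, 3, 4, 5, 6]))
print(smallest_signed_type([1, 128]))
```

Only the minimum and maximum matter: a single value outside the Int8 range of -128 to 127 forces the next wider type.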

shrink_to_fit(*, in_place: bool = False) → Series[source]

Shrink Series memory usage.

Shrinks the underlying array capacity to exactly fit the actual data. (Note that this function does not change the Series data type).

shuffle(seed: int | None = None) → Series[source]

Shuffle the contents of this Series.

Parameters:

seed

Seed for the random number generator. If set to None (default), a random seed is generated each time the shuffle is called.

Examples

s = pl.Series("a", [1, 2, 3]) s.shuffle(seed=1) shape: (3,) Series: 'a' [i64] [ 2 1 3 ]

sign() → Series[source]

Compute the element-wise sign function on numeric types.

The returned value is computed as follows:

-1 if x < 0.
1 if x > 0.
x otherwise (typically 0, but could be NaN if the input is).

Null values are preserved as-is, and the dtype of the input is preserved.

Examples

s = pl.Series("a", [-9.0, -0.0, 0.0, 4.0, float("nan"), None]) s.sign() shape: (6,) Series: 'a' [f64] [ -1.0 -0.0 0.0 1.0 NaN null ]
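The three rules above translate directly into a scalar function. The sketch below is a hypothetical pure-Python version, with `None` standing in for null; it shows why NaN and negative zero pass through unchanged.

```python
import math


def sign(x):
    """Scalar sign following the three documented rules; None is preserved."""
    if x is None:
        return None
    if x < 0:
        return type(x)(-1)  # keep the input's numeric type
    if x > 0:
        return type(x)(1)
    return x  # 0, -0.0 and NaN fall through unchanged


results = [sign(v) for v in [-9.0, -0.0, 0.0, 4.0, float("nan"), None]]
print(results)
```

NaN reaches the final branch because both comparisons with NaN are false.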

sin() → Series[source]

Compute the element-wise value for the sine.

Examples

import math s = pl.Series("a", [0.0, math.pi / 2.0, math.pi]) s.sin() shape: (3,) Series: 'a' [f64] [ 0.0 1.0 1.2246e-16 ]

sinh() → Series[source]

Compute the element-wise value for the hyperbolic sine.

Examples

s = pl.Series("a", [1.0, 0.0, -1.0]) s.sinh() shape: (3,) Series: 'a' [f64] [ 1.175201 0.0 -1.175201 ]

skew(*, bias: bool = True) → float | None [source]

Compute the sample skewness of a data set.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

See scipy.stats for more information.

Parameters:

biasbool, optional

If False, the calculations are corrected for statistical bias.

Notes

The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.

\[g_1=\frac{m_3}{m_2^{3/2}}\]

where

\[m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i\]

is the biased sample \(i\texttt{th}\) central moment, and \(\bar{x}\) is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.

\[G_1 = \frac{k_3}{k_2^{3/2}} = \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}\]

Examples

s = pl.Series([1, 2, 2, 4, 5]) s.skew() 0.34776706224699483
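The formulas in the Notes can be evaluated directly. The following pure-Python sketch computes \(g_1\) from the biased central moments and applies the \(\sqrt{N(N-1)}/(N-2)\) correction when bias is False; the function is a hypothetical illustration written from the formulas above, not the Polars implementation.

```python
import math


def skew(values, bias=True):
    """Fisher-Pearson skewness g1, or the adjusted G1 when bias is False."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # biased second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # biased third central moment
    g1 = m3 / m2**1.5
    if bias:
        return g1
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


print(skew([1, 2, 2, 4, 5]))
print(skew([1, 2, 2, 4, 5], bias=False))
```

For the example data the mean is 2.8, \(m_2 = 2.16\) and \(m_3 = 1.104\), which gives \(g_1 \approx 0.3478\).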

slice(offset: int, length: int | None = None) → Series[source]

Get a slice of this Series.

Parameters:

offset

Start index. Negative indexing is supported.

length

Length of the slice. If set to None, all rows starting at the offset will be selected.

Examples

s = pl.Series("a", [1, 2, 3, 4]) s.slice(1, 2) shape: (2,) Series: 'a' [i64] [ 2 3 ]

sort(

descending: bool = False,

nulls_last: bool = False,

multithreaded: bool = True,

in_place: bool = False,

) → Self[source]

Sort this Series.

Parameters:

descending

Sort in descending order.

nulls_last

Place null values last instead of first.

multithreaded

Sort using multiple threads.

in_place

Sort in-place.

Examples

s = pl.Series("a", [1, 3, 4, 2]) s.sort() shape: (4,) Series: 'a' [i64] [ 1 2 3 4 ] s.sort(descending=True) shape: (4,) Series: 'a' [i64] [ 4 3 2 1 ]

sqrt() → Series[source]

Compute the square root of the elements.

Syntactic sugar for

pl.Series([1, 2]) ** 0.5 shape: (2,) Series: '' [f64] [ 1.0 1.414214 ]

Examples

s = pl.Series([1, 2, 3]) s.sqrt() shape: (3,) Series: '' [f64] [ 1.0 1.414214 1.732051 ]

std(ddof: int = 1) → float | timedelta | None [source]

Get the standard deviation of this Series.

Parameters:

ddof

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Examples

s = pl.Series("a", [1, 2, 3]) s.std() 1.0

sum() → int | float [source]

Reduce this Series to the sum value.

Notes

Dtypes in {Int8, UInt8, Int16, UInt16} are cast to Int64 before summing to prevent overflow issues.
If there are no non-null values, then the output is 0. If you would prefer empty sums to return None, you can use s.sum() if s.count() else None instead of s.sum().

Examples

s = pl.Series("a", [1, 2, 3]) s.sum() 6

tail(n: int = 10) → Series[source]

Get the last n elements.

Parameters:

n

Number of elements to return. If a negative value is passed, return all elements except the first abs(n).

Examples

s = pl.Series("a", [1, 2, 3, 4, 5]) s.tail(3) shape: (3,) Series: 'a' [i64] [ 3 4 5 ]

Pass a negative value to get all rows except the first abs(n).

s.tail(-3) shape: (2,) Series: 'a' [i64] [ 4 5 ]

tan() → Series[source]

Compute the element-wise value for the tangent.

Examples

import math s = pl.Series("a", [0.0, math.pi / 2.0, math.pi]) s.tan() shape: (3,) Series: 'a' [f64] [ 0.0 1.6331e16 -1.2246e-16 ]

tanh() → Series[source]

Compute the element-wise value for the hyperbolic tangent.

Examples

s = pl.Series("a", [1.0, 0.0, -1.0]) s.tanh() shape: (3,) Series: 'a' [f64] [ 0.761594 0.0 -0.761594 ]

to_arrow(

compat_level: CompatLevel | None = None,

) → Array [source]

Return the underlying Arrow array.

If the Series contains only a single chunk this operation is zero copy.

Changed in version 1.24: The future parameter was renamed compat_level.

Parameters:

compat_level

Use a specific compatibility level when exporting Polars’ internal data structures.

Examples

s = pl.Series("a", [1, 2, 3]) s = s.to_arrow() s
<pyarrow.lib.Int64Array object at ...> [ 1, 2, 3 ]

to_dummies(*, separator: str = '_', drop_first: bool = False) → DataFrame[source]

Get dummy/indicator variables.

Parameters:

separator

Separator/delimiter used when generating column names.

drop_first

Remove the first category from the variable being encoded.

Examples

s = pl.Series("a", [1, 2, 3])
s.to_dummies()
shape: (3, 3)
┌─────┬─────┬─────┐
│ a_1 ┆ a_2 ┆ a_3 │
│ --- ┆ --- ┆ --- │
│ u8  ┆ u8  ┆ u8  │
╞═════╪═════╪═════╡
│ 1   ┆ 0   ┆ 0   │
│ 0   ┆ 1   ┆ 0   │
│ 0   ┆ 0   ┆ 1   │
└─────┴─────┴─────┘

s.to_dummies(drop_first=True)
shape: (3, 2)
┌─────┬─────┐
│ a_2 ┆ a_3 │
│ --- ┆ --- │
│ u8  ┆ u8  │
╞═════╪═════╡
│ 0   ┆ 0   │
│ 1   ┆ 0   │
│ 0   ┆ 1   │
└─────┴─────┘

to_frame(name: str | None = None) → DataFrame[source]

Cast this Series to a DataFrame.

Parameters:

name

Optional name for the Series column in the new DataFrame. If given, the column is renamed.

Examples

s = pl.Series("a", [123, 456])
df = s.to_frame()
df
shape: (2, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 123 │
│ 456 │
└─────┘

df = s.to_frame("xyz")
df
shape: (2, 1)
┌─────┐
│ xyz │
│ --- │
│ i64 │
╞═════╡
│ 123 │
│ 456 │
└─────┘

to_init_repr(n: int = 1000) → str [source]

Convert Series to instantiable string representation.

Parameters:

n

Only use first n elements.

Examples

s = pl.Series("a", [1, 2, None, 4], dtype=pl.Int16) print(s.to_init_repr()) pl.Series('a', [1, 2, None, 4], dtype=pl.Int16) s_from_str_repr = eval(s.to_init_repr()) s_from_str_repr shape: (4,) Series: 'a' [i16] [ 1 2 null 4 ]

to_jax(device: jax.Device | str | None = None) → jax.Array[source]

Convert this Series to a Jax Array.

Added in version 0.20.27.

Warning

This functionality is currently considered unstable. It may be changed at any point without it being considered a breaking change.

Parameters:

device

Specify the jax Device on which the array will be created; can provide a string (such as “cpu”, “gpu”, or “tpu”) in which case the device is retrieved as jax.devices(string)[0]. For more specific control you can supply the instantiated Device directly. If None, arrays are created on the default device.

Examples

s = pl.Series("x", [10.5, 0.0, -10.0, 5.5]) s.to_jax() Array([ 10.5, 0. , -10. , 5.5], dtype=float32)

to_list() → list[Any][source]

Convert this Series to a Python list.

This operation copies data.

Examples

s = pl.Series("a", [1, 2, 3]) s.to_list() [1, 2, 3] type(s.to_list()) <class 'list'>

to_numpy(

writable: bool = False,

allow_copy: bool = True,

use_pyarrow: bool | None = None,

zero_copy_only: bool | None = None,

) → ndarray[Any, Any][source]

Convert this Series to a NumPy ndarray.

This operation copies data only when necessary. The conversion is zero copy when all of the following hold:

The data type is an integer, float, Datetime, Duration, or Array.
The Series contains no null values.
The Series consists of a single chunk.
The writable parameter is set to False (default).

Parameters:

writable

Ensure the resulting array is writable. This will force a copy of the data if the array was created without copy as the underlying Arrow data is immutable.

allow_copy

Allow memory to be copied to perform the conversion. If set to False, causes conversions that are not zero-copy to fail.

use_pyarrow

First convert to PyArrow, then call pyarrow.Array.to_numpy to convert to NumPy. If set to False, Polars’ own conversion logic is used.

Deprecated since version 0.20.28: Polars now uses its native engine by default for conversion to NumPy. To use PyArrow’s engine, call .to_arrow().to_numpy() instead.

zero_copy_only

Raise an exception if the conversion to NumPy would require copying the underlying data. Data copy occurs, for example, when the Series contains nulls or non-numeric types.

Deprecated since version 0.20.10: Use the allow_copy parameter instead, which is the inverse of this one.

Examples

Numeric data without nulls can be converted without copying data. The resulting array will not be writable.

s = pl.Series([1, 2, 3], dtype=pl.Int8) arr = s.to_numpy() arr array([1, 2, 3], dtype=int8) arr.flags.writeable False

Set writable=True to force data copy to make the array writable.

s.to_numpy(writable=True).flags.writeable True

Integer Series containing nulls will be cast to a float type with nan representing a null value. This requires data to be copied.

s = pl.Series([1, 2, None], dtype=pl.UInt16) s.to_numpy() array([ 1., 2., nan], dtype=float32)

Set allow_copy=False to raise an error if data would be copied.

s.to_numpy(allow_copy=False)
Traceback (most recent call last): ... RuntimeError: copy not allowed: cannot convert to a NumPy array without copying data

Series of data type Array and Struct will result in an array with more than one dimension.

s = pl.Series([[1, 2, 3], [4, 5, 6]], dtype=pl.Array(pl.Int64, 3)) s.to_numpy() array([[1, 2, 3], [4, 5, 6]])

to_pandas(

use_pyarrow_extension_array: bool = False,

**kwargs: Any,

) → pd.Series[Any][source]

Convert this Series to a pandas Series.

This operation copies data if use_pyarrow_extension_array is not enabled.

Parameters:

use_pyarrow_extension_array

Use a PyArrow-backed extension array instead of a NumPy array for the pandas Series. This allows zero copy operations and preservation of null values. Subsequent operations on the resulting pandas Series may trigger conversion to NumPy if those operations are not supported by PyArrow compute functions.

**kwargs

Additional keyword arguments to be passed to pyarrow.Array.to_pandas().

Returns:

pandas.Series

Notes

This operation requires that both pandas and pyarrow are installed.

Examples

s = pl.Series("a", [1, 2, 3]) s.to_pandas() 0 1 1 2 2 3 Name: a, dtype: int64

Null values are converted to NaN.

s = pl.Series("b", [1, 2, None]) s.to_pandas() 0 1.0 1 2.0 2 NaN Name: b, dtype: float64

Pass use_pyarrow_extension_array=True to get a pandas Series backed by a PyArrow extension array. This will preserve null values.

s.to_pandas(use_pyarrow_extension_array=True) 0 1 1 2 2 <NA> Name: b, dtype: int64[pyarrow]

to_physical() → Series[source]

Cast to physical representation of the logical dtype.

polars.datatypes.Date() -> polars.datatypes.Int32()
polars.datatypes.Datetime() -> polars.datatypes.Int64()
polars.datatypes.Time() -> polars.datatypes.Int64()
polars.datatypes.Duration() -> polars.datatypes.Int64()
polars.datatypes.Categorical() -> polars.datatypes.UInt32()
List(inner) -> List(physical of inner)
Array(inner) -> Array(physical of inner)
Struct(fields) -> Struct(physical of fields)
Other data types will be left unchanged.

Warning

The physical representations are an implementation detail and not guaranteed to be stable.

Examples

Replicating the pandas pd.Series.factorize method.

s = pl.Series("values", ["a", None, "x", "a"]) s.cast(pl.Categorical).to_physical() shape: (4,) Series: 'values' [u32] [ 0 null 1 0 ]

to_torch() → torch.Tensor[source]

Convert this Series to a PyTorch Tensor.

Added in version 0.20.23.

Warning

This functionality is currently considered unstable. It may be changed at any point without it being considered a breaking change.

Notes

PyTorch tensors do not support UInt16, UInt32, or UInt64; these dtypes will be automatically cast to Int32, Int64, and Int64, respectively.

Examples

s = pl.Series("x", [1, 0, 1, 2, 0], dtype=pl.UInt8) s.to_torch() tensor([1, 0, 1, 2, 0], dtype=torch.uint8) s = pl.Series("x", [5.5, -10.0, 2.5], dtype=pl.Float32) s.to_torch() tensor([ 5.5000, -10.0000, 2.5000])

top_k(k: int = 5) → Series[source]

Return the k largest elements.

Non-null elements are always preferred over null elements. The output is not guaranteed to be in any particular order; call sort() after this function if you need sorted output.

This has time complexity:

\[O(n)\]

Parameters:

k

Number of elements to return.

Examples

s = pl.Series("a", [2, 5, 1, 4, 3]) s.top_k(3) shape: (3,) Series: 'a' [i64] [ 5 4 3 ]

top_k_by(

by: IntoExpr | Iterable[IntoExpr],

k: int = 5,

reverse: bool | Sequence[bool] = False,

) → Series[source]

Return the k largest elements of the by column.

This has time complexity:

\[O(n \log{n})\]

Parameters:

by

Column used to determine the largest elements. Accepts expression input. Strings are parsed as column names.

k

Number of elements to return.

reverse

Consider the k smallest elements of the by column (instead of the k largest). This can be specified per column by passing a sequence of booleans.

Examples

s = pl.Series("a", [2, 5, 1, 4, 3]) s.top_k_by("a", 3) shape: (3,) Series: 'a' [i64] [ 5 4 3 ]

unique(*, maintain_order: bool = False) → Series[source]

Get unique elements in series.

Parameters:

maintain_order

Maintain order of data. This requires more work.

Examples

s = pl.Series("a", [1, 2, 2, 3]) s.unique().sort() shape: (3,) Series: 'a' [i64] [ 1 2 3 ]

unique_counts() → Series[source]

Return a count of the unique values in the order of appearance.

Examples

s = pl.Series("id", ["a", "b", "b", "c", "c", "c"]) s.unique_counts() shape: (3,) Series: 'id' [u32] [ 1 2 3 ]

upper_bound() → Self[source]

Return the upper bound of this Series’ dtype as a unit Series.

See also

lower_bound

return the lower bound of the given Series’ dtype.

Examples

s = pl.Series("s", [-1, 0, 1], dtype=pl.Int8) s.upper_bound() shape: (1,) Series: 's' [i8] [ 127 ]

s = pl.Series("s", [1.0, 2.5, 3.0], dtype=pl.Float64) s.upper_bound() shape: (1,) Series: 's' [f64] [ inf ]

value_counts(

sort: bool = False,

parallel: bool = False,

name: str | None = None,

normalize: bool = False,

) → DataFrame[source]

Count the occurrences of unique values.

Parameters:

sort

Sort the output by count, in descending order. If set to False (default), the order is non-deterministic.

parallel

Execute the computation in parallel.

Note

This option should likely not be enabled in a group_by context, as the computation will already be parallelized per group.

name

Give the resulting count column a specific name; if normalize is True this defaults to “proportion”, otherwise defaults to “count”.

normalize

If True, the count is returned as the relative frequency of unique values normalized to 1.0.

Returns:

DataFrame

Columns map the unique values to their count (or proportion).

Examples

s = pl.Series("color", ["red", "blue", "red", "green", "blue", "blue"])
s.value_counts()
shape: (3, 2)
┌───────┬───────┐
│ color ┆ count │
│ ---   ┆ ---   │
│ str   ┆ u32   │
╞═══════╪═══════╡
│ red   ┆ 2     │
│ green ┆ 1     │
│ blue  ┆ 3     │
└───────┴───────┘

Sort the output by count and customize the count column name.

s.value_counts(sort=True, name="n")
shape: (3, 2)
┌───────┬─────┐
│ color ┆ n   │
│ ---   ┆ --- │
│ str   ┆ u32 │
╞═══════╪═════╡
│ blue  ┆ 3   │
│ red   ┆ 2   │
│ green ┆ 1   │
└───────┴─────┘

Return the count as a relative frequency, normalized to 1.0.

s.value_counts(sort=True, normalize=True, name="fraction")
shape: (3, 2)
┌───────┬──────────┐
│ color ┆ fraction │
│ ---   ┆ ---      │
│ str   ┆ f64      │
╞═══════╪══════════╡
│ blue  ┆ 0.5      │
│ red   ┆ 0.333333 │
│ green ┆ 0.166667 │
└───────┴──────────┘

var(ddof: int = 1) → float | timedelta | None[source]

Get variance of this Series.

Parameters:

ddof

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is 1.

Examples

s = pl.Series("a", [1, 2, 3]) s.var() 1.0

zip_with(mask: Series, other: Series) → Self[source]

Take values from self or other based on the given mask.

Where mask evaluates true, take values from self. Where mask evaluates false, take values from other.

Parameters:

mask

Boolean Series.

other

Series of same type.

Returns:

Series

Examples

s1 = pl.Series([1, 2, 3, 4, 5]) s2 = pl.Series([5, 4, 3, 2, 1]) s1.zip_with(s1 < s2, s2) shape: (5,) Series: '' [i64] [ 1 2 3 2 1 ] mask = pl.Series([True, False, True, False, True]) s1.zip_with(mask, s2) shape: (5,) Series: '' [i64] [ 1 4 3 2 5 ]