pyarrow.dataset.Expression - Apache Arrow v20.0.0

class pyarrow.dataset.Expression

Bases: _Weakrefable

A logical expression to be evaluated against some input.

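To create an expression, reference a column with pyarrow.compute.field(), wrap constants with pyarrow.compute.scalar() where needed, compare them with operators such as < and !=, combine the results with operators such as |, or call predicate methods such as isin().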

Examples

>>> import pyarrow.compute as pc
>>> (pc.field("a") < pc.scalar(3)) | (pc.field("b") > 7)
<pyarrow.compute.Expression ((a < 3) or (b > 7))>
>>> pc.field('a') != 3
<pyarrow.compute.Expression (a != 3)>
>>> pc.field('a').isin([1, 2, 3])
<pyarrow.compute.Expression is_in(a, {value_set=int64:[ 1, 2, 3 ], null_matching_behavior=MATCH})>

__init__(*args, **kwargs)

Methods

cast(self, type=None, safe=None, options=None)

Explicitly set or change the expression’s data type.

This creates a new expression equivalent to calling the cast compute function on this expression.

Parameters:

type : DataType, default None

Type to cast the expression to.

safe : bool, default True

Whether to check for conversion errors such as overflow.

options : CastOptions, default None

Additional checks passed via CastOptions.

Returns:

cast : Expression

equals(self, Expression other)

Parameters:

other : pyarrow.dataset.Expression

Returns:

bool

static from_substrait(message)

Deserialize an expression from Substrait.

The serialized message must be an ExtendedExpression message that has only a single expression. The name of the expression and the schema the expression was bound to will be ignored. Use pyarrow.substrait.deserialize_expressions if this information is needed or if the message might contain multiple expressions.

Parameters:

message : bytes or Buffer or a protobuf Message

The Substrait message to deserialize

Returns:

Expression

The deserialized expression

is_nan(self)

Check whether the expression is NaN.

This creates a new expression equivalent to calling the is_nan compute function on this expression.

Returns:

is_nan : Expression

is_null(self, bool nan_is_null=False)

Check whether the expression is null.

This creates a new expression equivalent to calling the is_null compute function on this expression.

Parameters:

nan_is_null : bool, default False

Whether floating-point NaNs are considered null.

Returns:

is_null : Expression

is_valid(self)

Check whether the expression is not-null (valid).

This creates a new expression equivalent to calling the is_valid compute function on this expression.

Returns:

is_valid : Expression

isin(self, values)

Check whether the expression is contained in values.

This creates a new expression equivalent to calling the is_in compute function on this expression.

Parameters:

values : Array or iterable

The values to check for.

Returns:

isin : Expression

A new expression that, when evaluated, checks whether this expression’s value is contained in values.

to_substrait(self, Schema schema, bool allow_arrow_extensions=False)

Serialize the expression using Substrait.

The expression will be serialized as an ExtendedExpression message that has a single expression named “expression”.

Parameters:

schema : Schema

The input schema the expression will be bound to.

allow_arrow_extensions : bool, default False

If False, only functions that are part of the core Substrait function definitions are allowed. Set this to True to allow pyarrow-specific functions, but note that the result may not be accepted by other compute libraries.

Returns:

Buffer

A buffer containing the serialized Protobuf plan.