LogicalPlanBuilder in datafusion::logical_expr - Rust

Struct LogicalPlanBuilder

pub struct LogicalPlanBuilder { /* private fields */ }

Builder for logical plans

§Example building a simple plan

// Create a plan similar to
// SELECT last_name
// FROM employee
// WHERE salary < 1000
let plan = table_scan(Some("employee"), &employee_schema(), None)?
 // Keep only rows where salary < 1000
 .filter(col("salary").lt(lit(1000)))?
 // only show "last_name" in the final results
 .project(vec![col("last_name")])?
 .build()?;

// Convert from plan back to builder
let builder = LogicalPlanBuilder::from(plan);

Source

Create a builder from an existing plan

Source

Create a builder from an existing plan

Source

Return the output schema of the plan built so far

Source

Return the LogicalPlan of the plan built so far

Source

Create an empty relation.

When produce_one_row is true, this empty node produces a single placeholder row.

Source

Convert a regular plan into a recursive query. is_distinct indicates whether the recursive term should be de-duplicated (UNION) after each iteration or not (UNION ALL).
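
The effect of is_distinct can be illustrated with a small plain-Rust model of a recursive query. This is only a sketch of the SQL semantics, not DataFusion code: the function recursive_query, its seed and step parameters, and the use of i64 rows are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Evaluate a recursive query in the style of SQL `WITH RECURSIVE`.
/// `seed` plays the role of the static term and `step` the recursive term,
/// which is applied to the rows produced by the previous iteration.
/// Hypothetical helper for illustration; not part of DataFusion.
fn recursive_query<F>(seed: Vec<i64>, step: F, is_distinct: bool) -> Vec<i64>
where
    F: Fn(&[i64]) -> Vec<i64>,
{
    let mut seen: HashSet<i64> = HashSet::new();
    let mut result = Vec::new();
    let mut working = seed;

    if is_distinct {
        // UNION: de-duplicate the static term as well.
        working.retain(|row| seen.insert(*row));
    }
    while !working.is_empty() {
        result.extend_from_slice(&working);
        let mut next = step(&working);
        if is_distinct {
            // UNION: drop rows already produced in this or an earlier iteration.
            next.retain(|row| seen.insert(*row));
        }
        // UNION ALL keeps every row, so `step` itself must eventually return nothing.
        working = next;
    }
    result
}
```

In this model a cyclic recursive term terminates only when is_distinct is true, because rows that were already produced are discarded; with UNION ALL the recursive term has to run dry on its own.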

Source

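Create a values-list-based relation whose schema is inferred from the data, consuming value. See the Postgres VALUES documentation for more details.

By default, it assigns the names column1, column2, etc. to the columns of a VALUES table. The column names are not specified by the SQL standard and different database systems do it differently, so it’s usually better to override the default names with a table alias list.

If the values include params/binders such as $1, $2, $3, etc., then the param_data_types should be provided.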

Source

Create a values-list-based relation whose schema is inferred from the data itself, or from the table schema if provided, consuming value. See the Postgres VALUES documentation for more details.

By default, it assigns the names column1, column2, etc. to the columns of a VALUES table. The column names are not specified by the SQL standard and different database systems do it differently, so it’s usually better to override the default names with a table alias list.

If the values include params/binders such as $1, $2, $3, etc., then the param_data_types should be provided.

Source

Convert a table provider into a builder with a TableScan

Note that if you pass a string as table_name, it is treated as a SQL identifier, as described on TableReference, and is therefore normalized.

§Example:

// Scan table_source with the name "mytable" (after normalization)
let scan = LogicalPlanBuilder::scan("MyTable", table, None);

// Scan table_source with the name "MyTable" by enclosing in quotes
let scan = LogicalPlanBuilder::scan(r#""MyTable""#, table, None);

// Scan table_source with the name "MyTable" by forming the table reference
let table_reference = TableReference::bare("MyTable");
let scan = LogicalPlanBuilder::scan(table_reference, table, None);
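
The normalization rule used in the example above can be modelled in a few lines of plain Rust. This is a deliberately simplified sketch: normalize_ident is a hypothetical helper, and it ignores escaped quotes and multi-part names, which a real SQL identifier parser has to handle.

```rust
/// Simplified model of SQL identifier normalization (hypothetical helper,
/// not a DataFusion API): a double-quoted identifier keeps its case,
/// an unquoted identifier is lowercased.
fn normalize_ident(ident: &str) -> String {
    match ident
        .strip_prefix('"')
        .and_then(|rest| rest.strip_suffix('"'))
    {
        // Quoted: strip the quotes and preserve the case.
        Some(quoted) => quoted.to_string(),
        // Unquoted: fold to lowercase.
        None => ident.to_lowercase(),
    }
}
```

Under this model, passing MyTable yields mytable, while passing the quoted form keeps MyTable, matching the first two scans in the example.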

Source

Create a CopyTo for copying the contents of this builder to the specified file(s)

Source

Create a DmlStatement for inserting the contents of this builder into the named table.

Note, use a DefaultTableSource to insert into a TableProvider

§Example:

// VALUES (1), (2)
let input = LogicalPlanBuilder::values(vec![vec![lit(1)], vec![lit(2)]])?
  .build()?;
// INSERT INTO MyTable VALUES (1), (2)
let insert_plan = LogicalPlanBuilder::insert_into(
  input,
  "MyTable",
  table_source,
  InsertOp::Append,
)?;

Source

Convert a table provider into a builder with a TableScan

Source

Convert a table provider into a builder with a TableScan with filter and fetch

Source

Wrap a plan in a window

Source

Apply a projection without alias.

Source

Apply a projection without alias with optional validation (true to validate, false to not validate)

Source

Select the given column indices

Source

Apply a filter

Source

Apply a filter which is used for a having clause

Source

Make a builder for a prepare logical plan from the builder’s plan

Source

Limit the number of rows returned

skip - Number of rows to skip before fetching any rows.

fetch - Maximum number of rows to fetch, after skipping skip rows, if specified.
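
The interaction of skip and fetch can be sketched with standard iterators. The function below is a plain-Rust model of the semantics described above, not the builder method itself; its name and signature are assumptions made for illustration.

```rust
/// Model of LIMIT semantics (hypothetical helper, not a DataFusion API):
/// skip `skip` rows first, then return at most `fetch` rows if a fetch is given.
fn limit<T>(rows: impl IntoIterator<Item = T>, skip: usize, fetch: Option<usize>) -> Vec<T> {
    let remaining = rows.into_iter().skip(skip);
    match fetch {
        // A fetch caps the number of rows returned after skipping.
        Some(n) => remaining.take(n).collect(),
        // No fetch: return everything after the skipped rows.
        None => remaining.collect(),
    }
}
```

For example, skipping 2 rows of 1..=10 and fetching 3 returns rows 3, 4 and 5; a skip that exceeds the input length returns no rows.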

Source

Limit the number of rows returned

Similar to limit but uses expressions for skip and fetch

Source

Apply an alias

Source

Apply a sort by provided expressions with default direction

Source

Apply a sort

Source

Apply a union, preserving duplicate rows

Source

Apply a union by name, preserving duplicate rows

Source

Apply a union by name, removing duplicate rows

Source

Apply a union, removing duplicate rows

Source

Apply deduplication: only distinct (different) values are returned.

Source

Project first values of the specified expression list according to the provided sorting expressions grouped by the DISTINCT ON clause expressions.
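
The behaviour of DISTINCT ON can be sketched in plain Rust as "sort, then keep the first row of each group". The Employee type and the function top_paid_per_dept are hypothetical and only model a query of the form SELECT DISTINCT ON (dept) ... ORDER BY dept, salary DESC; they are not part of DataFusion.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Employee {
    dept: &'static str,
    name: &'static str,
    salary: u32,
}

/// Model of `SELECT DISTINCT ON (dept) * ... ORDER BY dept, salary DESC`
/// (hypothetical helper): the first row of each `dept` group, in sort order.
fn top_paid_per_dept(mut rows: Vec<Employee>) -> Vec<Employee> {
    // Sort by the DISTINCT ON key first, then by the remaining sort expression.
    rows.sort_by(|a, b| a.dept.cmp(b.dept).then(b.salary.cmp(&a.salary)));
    // Keep only the first row of each run of equal keys; `dedup_by` removes
    // the later element when the closure returns true.
    rows.dedup_by(|later, earlier| later.dept == earlier.dept);
    rows
}
```

The sort expressions decide which row counts as "first" within each group, which is why they must be given alongside the DISTINCT ON expressions.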

Source

Apply a join to right using explicitly specified columns and an optional filter expression.

See join_on for a more concise way to specify the join condition. Since DataFusion will automatically identify and optimize equality predicates, there is no performance difference between this function and join_on.

left_cols and right_cols are used to form “equijoin” predicates (see example below), which are then combined with the optional filter expression.

Note that in case of outer join, the filter is applied to only matched rows.

Source

Apply a join using the specified expressions.

Note that DataFusion automatically optimizes joins, including identifying and optimizing equality predicates.

§Example

let example_schema = Arc::new(Schema::new(vec![
    Field::new("a", DataType::Int32, false),
    Field::new("b", DataType::Int32, false),
    Field::new("c", DataType::Int32, false),
]));
let table_source = Arc::new(LogicalTableSource::new(example_schema));
let left_table = table_source.clone();
let right_table = table_source.clone();

let right_plan = LogicalPlanBuilder::scan("right", right_table, None)?.build()?;

// Form the expression `(left.a = right.a)` AND `(left.b != right.b)`
let exprs = vec![
    col("left.a").eq(col("right.a")),
    col("left.b").not_eq(col("right.b"))
 ];

// Perform the equivalent of
// `left INNER JOIN right ON (left.a = right.a AND left.b != right.b)`,
// finding all pairs of rows from `left` and `right`
// where `left.a = right.a` and `left.b != right.b`.
let plan = LogicalPlanBuilder::scan("left", left_table, None)?
    .join_on(right_plan, JoinType::Inner, exprs)?
    .build()?;

Source

Apply a join with on constraint and specified null equality.

The behavior is the same as join except that it allows specifying the null equality behavior.

If null_equals_null=true, rows where both join keys are null will be emitted. Otherwise rows where either or both join keys are null will be omitted.
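
The two null-equality modes can be modelled with Option values standing in for nullable join keys. This is a plain-Rust sketch of the rule stated above, using a hypothetical join_on_key function and a naive nested loop; it makes no claim about how DataFusion executes the join.

```rust
/// Inner join of two single-column inputs on a nullable key
/// (hypothetical helper, not a DataFusion API). `None` models SQL NULL.
fn join_on_key(
    left: &[Option<i32>],
    right: &[Option<i32>],
    null_equals_null: bool,
) -> Vec<(Option<i32>, Option<i32>)> {
    let mut out = Vec::new();
    for l in left {
        for r in right {
            let matched = match (l, r) {
                // Two non-null keys match when they are equal.
                (Some(a), Some(b)) => a == b,
                // Two null keys match only when null_equals_null is set.
                (None, None) => null_equals_null,
                // A null key never matches a non-null key.
                _ => false,
            };
            if matched {
                out.push((*l, *r));
            }
        }
    }
    out
}
```

With null_equals_null set to false, the (NULL, NULL) pair is dropped; with true, it is emitted alongside the ordinary matches.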

Source

Apply a join with a USING constraint, which duplicates all join columns in the output schema.

Source

Apply a cross join

Source

Repartition

Source

Apply window functions to extend the schema

Source

Apply an aggregate: group on the group_expr expressions and calculate the aggr_expr aggregates for each distinct value of group_expr.
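
As a rough illustration of "one aggregate value per distinct group key", the following plain-Rust sketch models SELECT key, SUM(value) ... GROUP BY key. The function sum_by_group and its tuple rows are assumptions made for illustration and are unrelated to DataFusion's actual aggregation code.

```rust
use std::collections::BTreeMap;

/// Model of `SELECT key, SUM(value) FROM rows GROUP BY key`
/// (hypothetical helper, not a DataFusion API).
fn sum_by_group(rows: &[(&str, i64)]) -> BTreeMap<String, i64> {
    let mut groups = BTreeMap::new();
    for (key, value) in rows {
        // One accumulator per distinct group key, starting at zero.
        *groups.entry(key.to_string()).or_insert(0) += value;
    }
    groups
}
```

Here the key plays the role of group_expr and the running sum the role of a single aggr_expr; each distinct key produces exactly one output row.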

Source

Create an expression to represent the explanation of the plan

If analyze is true, runs the actual plan and produces information about metrics gathered during the run.

If verbose is true, prints out additional details.

Source

Process intersect set operator

Source

Process except set operator

Source

Build the plan

Source

Apply a join with both explicit equijoin and non equijoin predicates.

Note this is a low level API that requires identifying specific predicate types. Most users should use join_on that automatically identifies predicates appropriately.

equi_exprs defines equijoin predicates, of the form (l = r) for each (l, r) tuple. l, the first element of the tuple, must only refer to columns from the existing input. r, the second element of the tuple, must only refer to columns from the right input.

filter contains any other filter expression to apply during the join. Note that equi_exprs predicates are evaluated more efficiently than the filter expressions, so they are preferred.
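
One common reason equijoin predicates are cheaper is that they allow a hash lookup, leaving the residual filter to run only on rows whose keys already match. The sketch below shows that general strategy in plain Rust; it is an assumption-laden illustration (the Pair type and hash_join_with_filter are hypothetical) and not a description of DataFusion's internals.

```rust
use std::collections::HashMap;

/// A row of the form (join key, value). Hypothetical type for illustration.
type Pair = (i32, i32);

/// Inner hash join on the key column, followed by a residual filter
/// (hypothetical helper, not a DataFusion API).
fn hash_join_with_filter(
    left: &[Pair],
    right: &[Pair],
    filter: impl Fn(&Pair, &Pair) -> bool,
) -> Vec<(Pair, Pair)> {
    // Build phase: index the right input by its join key (the equijoin predicate).
    let mut index: HashMap<i32, Vec<Pair>> = HashMap::new();
    for r in right {
        index.entry(r.0).or_default().push(*r);
    }
    // Probe phase: only rows with equal keys ever reach the filter.
    let mut out = Vec::new();
    for l in left {
        if let Some(candidates) = index.get(&l.0) {
            for r in candidates {
                if filter(l, r) {
                    out.push((*l, *r));
                }
            }
        }
    }
    out
}
```

In this model the filter closure is evaluated once per key-matching pair rather than once per pair in the full cross product, which is the intuition behind preferring equi_exprs where possible.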

Source

Unnest the given column.

Source

Unnest the given columns with the given UnnestOptions