LogicalPlanBuilder in datafusion::logical_expr - Rust
Struct LogicalPlanBuilder
pub struct LogicalPlanBuilder { /* private fields */ }
Builder for logical plans
§Example building a simple plan
// Create a plan similar to
// SELECT last_name
// FROM employee
// WHERE salary < 1000
let plan = table_scan(Some("employee"), &employee_schema(), None)?
// Keep only rows where salary < 1000
.filter(col("salary").lt(lit(1000)))?
// only show "last_name" in the final results
.project(vec![col("last_name")])?
.build()?;
// Convert from plan back to builder
let builder = LogicalPlanBuilder::from(plan);
Create a builder from an existing plan
Create a builder from an existing plan
Return the output schema of the plan built so far
Return the LogicalPlan of the plan built so far
Create an empty relation.
produce_one_row set to true means this empty node needs to produce a placeholder row.
Convert a regular plan into a recursive query.
is_distinct indicates whether the recursive term should be de-duplicated (UNION) after each iteration or not (UNION ALL).
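The effect of is_distinct can be sketched with a toy fixpoint loop in plain Rust. This is only an illustrative model over single-column integer rows, not DataFusion's implementation; the function name recursive_query, the step closure, and the max_iters safety cap are all hypothetical.

```rust
use std::collections::HashSet;

/// Toy model of a recursive query (hypothetical helper, not a DataFusion API).
/// `seed` stands in for the static term, `step` for the recursive term.
/// `max_iters` is an assumed safety cap so that UNION ALL cannot loop forever.
fn recursive_query(
    seed: Vec<i64>,
    step: impl Fn(i64) -> Option<i64>,
    is_distinct: bool,
    max_iters: usize,
) -> Vec<i64> {
    let mut seen: HashSet<i64> = HashSet::new();
    let mut result: Vec<i64> = Vec::new();
    let mut working: Vec<i64> = Vec::new();

    // With UNION semantics, rows already produced are discarded.
    for row in seed {
        if !is_distinct || seen.insert(row) {
            working.push(row);
        }
    }

    let mut iters = 0;
    while !working.is_empty() && iters < max_iters {
        result.extend(working.iter().copied());
        let mut next = Vec::new();
        for &row in &working {
            if let Some(new_row) = step(row) {
                if !is_distinct || seen.insert(new_row) {
                    next.push(new_row);
                }
            }
        }
        working = next;
        iters += 1;
    }
    result
}
```

With a seed of two identical rows, the UNION ALL variant carries both copies through every iteration, while the UNION variant keeps one copy of each value.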
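Create a values list based relation, and the schema is inferred from data, consuming value. See the Postgres VALUES documentation for more details.
By default, it assigns the names column1, column2, etc. to the columns of a VALUES table. The column names are not specified by the SQL standard and different database systems do it differently, so it’s usually better to override the default names with a table alias list.
If the values include params/binders such as $1, $2, $3, etc., then the param_data_types should be provided.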
Create a values list based relation, and the schema is inferred from the data itself or from the table schema if provided, consuming value. See the Postgres VALUES documentation for more details.
By default, it assigns the names column1, column2, etc. to the columns of a VALUES table. The column names are not specified by the SQL standard and different database systems do it differently, so it’s usually better to override the default names with a table alias list.
If the values include params/binders such as $1, $2, $3, etc., then the param_data_types should be provided.
Convert a table provider into a builder with a TableScan
Note that if you pass a string as table_name, it is treated as a SQL identifier, as described on TableReference, and thus is normalized.
§Example:
// Scan table_source with the name "mytable" (after normalization)
let scan = LogicalPlanBuilder::scan("MyTable", table, None);
// Scan table_source with the name "MyTable" by enclosing in quotes
let scan = LogicalPlanBuilder::scan(r#""MyTable""#, table, None);
// Scan table_source with the name "MyTable" by forming the table reference
let table_reference = TableReference::bare("MyTable");
let scan = LogicalPlanBuilder::scan(table_reference, table, None);
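The normalization rule behind the three scans above can be approximated in a few lines of plain Rust. This is a deliberately simplified sketch, assuming a single undotted identifier with no escaped quotes; normalize_ident is a hypothetical helper and does not reproduce the real TableReference parsing.

```rust
/// Simplified model of SQL identifier normalization (hypothetical helper).
/// Unquoted names are lowercased; names wrapped in double quotes keep
/// their case and lose the quotes. Dotted names and escaped quotes are
/// intentionally not handled here.
fn normalize_ident(name: &str) -> String {
    let quoted = name.len() >= 2 && name.starts_with('"') && name.ends_with('"');
    if quoted {
        // Strip the surrounding quotes and preserve case.
        name[1..name.len() - 1].to_string()
    } else {
        name.to_lowercase()
    }
}
```

Under this model, MyTable maps to mytable, while the quoted form keeps its original case.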
Create a CopyTo for copying the contents of this builder to the specified file(s)
Create a DmlStatement for inserting the contents of this builder into the named table.
Note, use a DefaultTableSource to insert into a TableProvider
§Example:
// VALUES (1), (2)
let input = LogicalPlanBuilder::values(vec![vec![lit(1)], vec![lit(2)]])?
.build()?;
// INSERT INTO MyTable VALUES (1), (2)
let insert_plan = LogicalPlanBuilder::insert_into(
input,
"MyTable",
table_source,
InsertOp::Append,
)?;
Convert a table provider into a builder with a TableScan
Convert a table provider into a builder with a TableScan with filter and fetch
Wrap a plan in a window
Apply a projection without alias.
Apply a projection without alias with optional validation (true to validate, false to not validate)
Select the given column indices
Apply a filter
Apply a filter which is used for a having clause
Make a builder for a prepare logical plan from the builder’s plan
Limit the number of rows returned
skip - Number of rows to skip before fetching any row.
fetch - Maximum number of rows to fetch, after skipping skip rows, if specified.
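The interaction of skip and fetch can be modeled with standard iterator adapters. This is a minimal sketch over an in-memory slice; the free function limit below is hypothetical and only mirrors the parameter names described above.

```rust
/// Toy model of skip/fetch semantics over an in-memory slice
/// (hypothetical helper, not the builder method itself).
fn limit<T: Clone>(rows: &[T], skip: usize, fetch: Option<usize>) -> Vec<T> {
    // First drop `skip` rows, then keep at most `fetch` rows if a bound is given.
    let remaining = rows.iter().skip(skip);
    match fetch {
        Some(n) => remaining.take(n).cloned().collect(),
        None => remaining.cloned().collect(),
    }
}
```

Passing None for fetch returns every row after the skipped prefix, and a skip larger than the input yields an empty result.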
Limit the number of rows returned
Similar to limit but uses expressions for skip and fetch
Apply an alias
Apply a sort by provided expressions with default direction
Apply a sort
Apply a union, preserving duplicate rows
Apply a union by name, preserving duplicate rows
Apply a union by name, removing duplicate rows
Apply a union, removing duplicate rows
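The difference between the duplicate-preserving and duplicate-removing unions can be illustrated with two small helpers. Both functions are hypothetical stand-ins in plain Rust. The deduplicating version happens to keep first-occurrence order, which is a choice of this sketch and not something SQL guarantees.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// UNION ALL: concatenate both inputs, keeping duplicates.
fn union_all<T: Clone>(left: &[T], right: &[T]) -> Vec<T> {
    left.iter().chain(right.iter()).cloned().collect()
}

/// UNION: concatenate both inputs, then keep one copy of each row.
/// First-occurrence order is preserved only as a convenience of this model.
fn union_distinct<T: Clone + Eq + Hash>(left: &[T], right: &[T]) -> Vec<T> {
    let mut seen = HashSet::new();
    left.iter()
        .chain(right.iter())
        .filter(|row| seen.insert((*row).clone()))
        .cloned()
        .collect()
}
```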
Apply deduplication: only distinct (different) values are returned.
Project first values of the specified expression list according to the provided sorting expressions grouped by the DISTINCT ON clause expressions.
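As a rough illustration of the DISTINCT ON behavior described above, the sketch below sorts rows and then keeps the first row for each key. It assumes a fixed (department, salary) row shape and a fixed ordering; distinct_on_dept is a hypothetical name chosen for this example.

```rust
use std::collections::HashSet;

/// Toy model of
///   SELECT DISTINCT ON (department) department, salary
///   ORDER BY department ASC, salary DESC
/// over (department, salary) rows. Hypothetical helper for illustration.
fn distinct_on_dept(mut rows: Vec<(&'static str, i32)>) -> Vec<(&'static str, i32)> {
    // Apply the sort expressions first.
    rows.sort_by(|a, b| a.0.cmp(&b.0).then(b.1.cmp(&a.1)));
    // Then keep only the first row seen for each DISTINCT ON key.
    let mut seen = HashSet::new();
    rows.into_iter().filter(|row| seen.insert(row.0)).collect()
}
```

Because salaries are sorted in descending order within each department, the surviving row per department is the one with the highest salary.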
Apply a join to right using explicitly specified columns and an optional filter expression.
See join_on for a more concise way to specify the join condition. Since DataFusion will automatically identify and optimize equality predicates, there is no performance difference between this function and join_on.
left_cols and right_cols are used to form “equijoin” predicates (see example below), which are then combined with the optional filter expression.
Note that in case of outer join, the filter is applied to only matched rows.
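One reading of the outer-join note is that the filter acts as part of the join condition rather than as a filter on the output. The nested-loop sketch below models a left join under that assumption: a left row whose candidates all fail the filter still appears, padded with None. The function name and the (key, value) row shape are hypothetical.

```rust
/// Toy LEFT JOIN over (key, value) rows with an extra filter on the values.
/// Assumption of this sketch: the filter participates in matching, so a
/// left row with no surviving match is emitted once with `None`.
fn left_join_with_filter(
    left: &[(i32, i32)],
    right: &[(i32, i32)],
    filter: impl Fn(i32, i32) -> bool,
) -> Vec<((i32, i32), Option<(i32, i32)>)> {
    let mut out = Vec::new();
    for &l in left {
        let mut matched = false;
        for &r in right {
            // Equijoin predicate on the key, combined with the filter.
            if l.0 == r.0 && filter(l.1, r.1) {
                out.push((l, Some(r)));
                matched = true;
            }
        }
        if !matched {
            out.push((l, None));
        }
    }
    out
}
```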
Apply a join using the specified expressions.
Note that DataFusion automatically optimizes joins, including identifying and optimizing equality predicates.
§Example
let example_schema = Arc::new(Schema::new(vec![
Field::new("a", DataType::Int32, false),
Field::new("b", DataType::Int32, false),
Field::new("c", DataType::Int32, false),
]));
let table_source = Arc::new(LogicalTableSource::new(example_schema));
let left_table = table_source.clone();
let right_table = table_source.clone();
let right_plan = LogicalPlanBuilder::scan("right", right_table, None)?.build()?;
// Form the expression `(left.a = right.a)` AND `(left.b != right.b)`
let exprs = vec![
col("left.a").eq(col("right.a")),
col("left.b").not_eq(col("right.b"))
];
// Perform the equivalent of `left INNER JOIN right ON (a = a2 AND b != b2)`,
// finding all pairs of rows from `left` and `right`
// where `a = a2` and `b != b2`.
let plan = LogicalPlanBuilder::scan("left", left_table, None)?
.join_on(right_plan, JoinType::Inner, exprs)?
.build()?;
Apply a join with an ON constraint and the specified null equality.
The behavior is the same as join except that it allows specifying the null equality behavior.
If null_equals_null=true, rows where both join keys are null will be emitted. Otherwise rows where either or both join keys are null will be omitted.
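The null equality rule can be modeled by treating join keys as Option values. This is a small nested-loop sketch in plain Rust, where None stands for SQL NULL; join_keys is a hypothetical helper that returns the matching (left index, right index) pairs.

```rust
/// Toy inner join on nullable keys, returning matching index pairs.
/// `None` models a SQL NULL key. Hypothetical helper for illustration.
fn join_keys(
    left: &[Option<i32>],
    right: &[Option<i32>],
    null_equals_null: bool,
) -> Vec<(usize, usize)> {
    let mut pairs = Vec::new();
    for (i, l) in left.iter().enumerate() {
        for (j, r) in right.iter().enumerate() {
            let is_match = match (l, r) {
                (Some(a), Some(b)) => a == b,
                // Two NULL keys match only when the flag is set.
                (None, None) => null_equals_null,
                // A NULL on exactly one side never matches.
                _ => false,
            };
            if is_match {
                pairs.push((i, j));
            }
        }
    }
    pairs
}
```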
Apply a join with a USING constraint, which duplicates all join columns in the output schema.
Apply a cross join
Repartition
Apply window functions to extend the schema
Apply an aggregate: grouping on the group_expr expressions and calculating aggr_expr aggregates for each distinct value of the group_expr.
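The grouping behavior can be pictured as building one accumulator per distinct group key. The sketch below assumes a single string group column and a SUM aggregate; sum_by_group is a hypothetical helper, and a BTreeMap is used only to make the output order deterministic.

```rust
use std::collections::BTreeMap;

/// Toy model of `SELECT key, SUM(value) ... GROUP BY key`
/// over (key, value) rows. Hypothetical helper for illustration.
fn sum_by_group(rows: &[(&str, i64)]) -> BTreeMap<String, i64> {
    let mut groups = BTreeMap::new();
    for &(key, value) in rows {
        // One running total per distinct group key.
        *groups.entry(key.to_string()).or_insert(0) += value;
    }
    groups
}
```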
Create an expression to represent the explanation of the plan.
If analyze is true, runs the actual plan and produces information about metrics during the run.
If verbose is true, prints out additional details.
Process intersect set operator
Process except set operator
Build the plan
Apply a join with both explicit equijoin and non equijoin predicates.
Note this is a low level API that requires identifying specific predicate types. Most users should use join_on, which automatically identifies predicates appropriately.
equi_exprs defines equijoin predicates, of the form l = r for each (l, r) tuple. l, the first element of the tuple, must only refer to columns from the existing input. r, the second element of the tuple, must only refer to columns from the right input.
filter contains any other filter expression to apply during the join. Note that equi_exprs predicates are evaluated more efficiently than the filter expressions, so they are preferred.
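One common reason equijoin predicates are cheaper is that they allow a hash lookup, leaving the remaining filter to run only on the candidates that share a key. The sketch below shows that general hash-join strategy in plain Rust; it is not claimed to be DataFusion's execution plan, and hash_join and its (key, value) row shape are hypothetical.

```rust
use std::collections::HashMap;

/// Toy inner hash join over (key, value) rows.
/// The equijoin predicate `l.key = r.key` is answered by a hash lookup;
/// the residual `filter` is evaluated only on rows that share a key.
fn hash_join(
    left: &[(i32, i32)],
    right: &[(i32, i32)],
    filter: impl Fn(i32, i32) -> bool,
) -> Vec<((i32, i32), (i32, i32))> {
    // Build phase: index the right input by its join key.
    let mut index: HashMap<i32, Vec<(i32, i32)>> = HashMap::new();
    for &r in right {
        index.entry(r.0).or_default().push(r);
    }
    // Probe phase: look up each left row, then apply the residual filter.
    let mut out = Vec::new();
    for &l in left {
        if let Some(candidates) = index.get(&l.0) {
            for &r in candidates {
                if filter(l.1, r.1) {
                    out.push((l, r));
                }
            }
        }
    }
    out
}
```

A predicate expressed only through filter would instead have to be checked against every pair of rows in a naive plan, which is why equality conditions are worth stating as equi_exprs.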
Unnest the given column.
Unnest the given columns with the given UnnestOptions
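Unnesting can be pictured as expanding each list-valued cell into one output row per element. The following is a minimal sketch over (id, list) rows; the unnest function here is a hypothetical stand-in, and the handling of null or empty lists, which the real API controls through UnnestOptions, is not modeled beyond empty lists producing no rows.

```rust
/// Toy model of unnesting a list column: each (id, items) row becomes
/// one (id, item) row per element. Empty lists yield no rows in this sketch.
fn unnest(rows: &[(i32, Vec<i32>)]) -> Vec<(i32, i32)> {
    rows.iter()
        .flat_map(|(id, items)| items.iter().map(move |item| (*id, *item)))
        .collect()
}
```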