predict - Predict responses using regression tree model - MATLAB
Predict responses using regression tree model
Syntax
Yfit = predict(tree,X)
Yfit = predict(tree,X,Subtrees=subtrees)
[Yfit,node] = predict(___)
Description
Yfit = predict(tree,X) returns a vector of predicted responses for the predictor data in the table or matrix X, based on the trained regression tree tree.
Yfit = predict(tree,X,Subtrees=subtrees) also prunes tree to the level specified by subtrees before predicting responses.
[Yfit,node] = predict(___)
also returns a vector of predicted node numbers for the responses, using any of the input argument combinations in the previous syntaxes.
Examples
Load the carsmall data set. Consider Displacement, Horsepower, and Weight as predictors of the response MPG.
load carsmall
X = [Displacement Horsepower Weight];
Grow a regression tree using the entire data set.
Mdl = fitrtree(X,MPG);
Predict the MPG for a car with 200 cubic inch engine displacement, 150 horsepower, and that weighs 3000 lbs.
X0 = [200 150 3000];
MPG0 = predict(Mdl,X0)
The regression tree predicts the car's efficiency to be 21.94 mpg.
Input Arguments
X: Predictor data used to predict responses, specified as a numeric matrix or a table.
Each row of X corresponds to one observation, and each column corresponds to one variable.
For a numeric matrix:
- The variables that make up the columns of X must have the same order as the predictor variables used to train tree.
- If you train tree using a table (for example, Tbl), X can be a numeric matrix if Tbl contains only numeric predictor variables. To treat numeric predictors in Tbl as categorical during training, specify categorical predictors using the CategoricalPredictors name-value argument of fitrtree. If Tbl contains heterogeneous predictor variables (for example, numeric and categorical data types) and X is a numeric matrix, predict issues an error.
For a table:
- predict does not support multicolumn variables or cell arrays other than cell arrays of character vectors.
- If you train tree using a table (for example, Tbl), all predictor variables in X must have the same variable names and data types as those used to train tree (stored in tree.PredictorNames). However, the column order of X does not need to correspond to the column order of Tbl. Tbl and X can contain additional variables, such as response variables and observation weights, but predict ignores them.
- If you train tree using a numeric matrix, the predictor names in tree.PredictorNames must be the same as the corresponding predictor variable names in X. To specify predictor names during training, use the PredictorNames name-value argument of fitrtree. All predictor variables in X must be numeric vectors. X can contain additional variables, such as response variables and observation weights, but predict ignores them.
Data Types: single | double | table
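The table rules above can be illustrated with a minimal sketch. It assumes the carsmall data set from the Examples section; the variable names Tbl, Tnew, YfitTable, and YfitMatrix are illustrative, and Acceleration stands in for an arbitrary extra variable that is not a predictor.

```matlab
load carsmall

% Train on a table; MPG is the response and the remaining variables
% (Displacement, Horsepower, Weight) are numeric predictors.
Tbl = table(Displacement,Horsepower,Weight,MPG);
Mdl = fitrtree(Tbl,"MPG");

% New observation as a table. The predictor columns appear in a different
% order than in Tbl, and Acceleration is an extra variable (illustrative)
% that was not used for training.
Tnew = table(3000,150,200,12, ...
    VariableNames=["Weight","Horsepower","Displacement","Acceleration"]);
YfitTable = predict(Mdl,Tnew);

% Tbl contains only numeric predictors, so a numeric matrix is also
% accepted, provided its columns follow the order in Mdl.PredictorNames.
YfitMatrix = predict(Mdl,[200 150 3000]);
```

Both calls describe the same car, so the two predictions should agree. With matrix input the column order carries the meaning, whereas with table input the variable names do.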
Subtrees: Pruning level, specified as a vector of nonnegative integers in ascending order or "all".
If you specify a vector, then all elements must be at least 0 and at most max(tree.PruneList). 0 indicates the full, unpruned tree, and max(tree.PruneList) indicates the completely pruned tree (that is, just the root node).
If you specify "all", then predict operates on all subtrees (that is, the entire pruning sequence). This specification is equivalent to using 0:max(tree.PruneList).
predict prunes tree to each level specified by subtrees, and then estimates the corresponding output arguments. The size of subtrees determines the size of some output arguments.
To use Subtrees, the properties PruneList and PruneAlpha of tree must be nonempty. In other words, grow tree by setting Prune="on" when you use fitrtree, or prune tree by using prune.
Data Types: single | double | char | string
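As a rough illustration of the pruning-level argument, the following sketch predicts at several levels of the pruning sequence. It assumes the carsmall data from the Examples section and relies on the fitrtree default Prune="on" to populate PruneList; the names maxLevel, levels, YfitLevels, and YfitAll are illustrative.

```matlab
load carsmall
X = [Displacement Horsepower Weight];

% With the default Prune="on", fitrtree computes the pruning sequence,
% so Mdl.PruneList and Mdl.PruneAlpha are nonempty.
Mdl = fitrtree(X,MPG);

X0 = [200 150 3000];
maxLevel = max(Mdl.PruneList);

% Predict with the full tree (level 0) and with the tree pruned back
% to the root node (level maxLevel): one prediction per level.
levels = [0 maxLevel];
YfitLevels = predict(Mdl,X0,Subtrees=levels);

% "all" requests the entire pruning sequence, 0:maxLevel.
YfitAll = predict(Mdl,X0,Subtrees="all");
```

Here YfitLevels holds one predicted response per requested level for the single observation X0, and YfitAll holds one per level in the full sequence.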
Output Arguments
Yfit: Predicted response values, returned as a numeric column vector with the same number of rows as X. Each row of Yfit gives the predicted response to the corresponding row of X, based on the regression model tree.
node: Node numbers for the predictions, returned as a numeric vector. Each entry corresponds to the predicted leaf node in tree for the corresponding row of X.
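A short sketch of how the second output relates to the first, assuming the carsmall data from the Examples section. It uses the NodeMean and IsBranchNode properties of the tree to look up each returned node; the names Xnew, isLeaf, and leafMeans are illustrative.

```matlab
load carsmall
X = [Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

% Predict for the first five cars and request the node numbers as well.
Xnew = X(1:5,:);
[Yfit,node] = predict(Mdl,Xnew);

% Look up each returned node in the tree. For observations without
% missing predictor values, the node is expected to be a leaf, and the
% mean response stored at that node should equal the prediction.
isLeaf = ~Mdl.IsBranchNode(node);
leafMeans = Mdl.NodeMean(node);
```

Comparing Yfit with leafMeans is a quick way to check which part of the tree produced a given prediction.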
Alternative Functionality
Simulink Block
To integrate the prediction of a regression tree model into Simulink®, you can use the RegressionTree Predict block in the Statistics and Machine Learning Toolbox™ library or a MATLAB® Function block with the predict function. For examples, see Predict Responses Using RegressionTree Predict Block and Predict Class Labels Using MATLAB Function Block.
When deciding which approach to use, consider the following:
- If you use the Statistics and Machine Learning Toolbox library block, you can use the Fixed-Point Tool (Fixed-Point Designer) to convert a floating-point model to fixed point.
- Support for variable-size arrays must be enabled for a MATLAB Function block with the predict function.
- If you use a MATLAB Function block, you can use MATLAB functions for preprocessing or post-processing before or after predictions in the same MATLAB Function block.
Extended Capabilities
Tall Arrays
This function fully supports tall arrays. You can use models trained on either in-memory or tall data with this function.
For more information, see Tall Arrays.
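A minimal sketch of tall-array prediction, assuming a model trained on in-memory carsmall data as in the Examples section. The names tX, tYfit, and YfitInMemory are illustrative.

```matlab
load carsmall
X = [Displacement Horsepower Weight];

% Train on in-memory data.
Mdl = fitrtree(X,MPG);

% Convert the predictor data to a tall array and predict. The result is
% an unevaluated tall array until gather is called.
tX = tall(X);
tYfit = predict(Mdl,tX);

% Bring the predictions back into memory.
YfitInMemory = gather(tYfit);
```

For data this small the tall workflow offers no advantage; the sketch only shows that the call pattern matches the in-memory case.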
C/C++ Code Generation
Usage notes and limitations:
- You can generate C/C++ code for both predict and update by using a coder configurer. Or, generate code only for predict by using saveLearnerForCoder, loadLearnerForCoder, and codegen.
  - Code generation for predict and update: Create a coder configurer by using learnerCoderConfigurer and then generate code by using generateCode. Then you can update model parameters in the generated code without having to regenerate the code.
  - Code generation for predict: Save a trained model by using saveLearnerForCoder. Define an entry-point function that loads the saved model by using loadLearnerForCoder and calls the predict function. Then use codegen (MATLAB Coder) to generate code for the entry-point function.
- To generate single-precision C/C++ code for predict, specify DataType="single" when you call the loadLearnerForCoder function.
- You can also generate fixed-point C/C++ code for predict. Fixed-point code generation requires an additional step that defines the fixed-point data types of the variables required for prediction. Create a fixed-point data type structure by using the data type function generated by generateLearnerDataTypeFcn, and then use the structure as an input argument of loadLearnerForCoder in an entry-point function. Generating fixed-point C/C++ code requires MATLAB Coder™ and Fixed-Point Designer™.
- This table contains notes about the arguments of predict. Arguments not included in this table are fully supported.

| Argument | Notes and Limitations |
| --- | --- |
| tree | For the usage notes and limitations of the model object, see Code Generation of the CompactRegressionTree object. |
| X | For general code generation, X must be a single-precision or double-precision matrix or a table containing numeric variables, categorical variables, or both. In the coder configurer workflow, X must be a single-precision or double-precision matrix. For fixed-point code generation, X must be a fixed-point matrix. The number of rows, or observations, in X can be a variable size, but the number of columns in X must be fixed. If you want to specify X as a table, then your model must be trained using a table, and your entry-point function for prediction must do the following: (1) accept data as arrays, (2) create a table from the data input arguments and specify the variable names in the table, and (3) pass the table to predict. For an example of this table workflow, see Generate Code to Classify Data in Table. For more information on using tables in code generation, see Code Generation for Tables (MATLAB Coder) and Table Limitations for Code Generation (MATLAB Coder). |
| Subtrees | Names in name-value arguments must be compile-time constants. For example, to allow user-defined pruning levels in the generated code, include {coder.Constant("Subtrees"),coder.typeof(0,[1,n],[0,1])} in the -args value of codegen (MATLAB Coder), where n is max(tree.PruneList). The Subtrees name-value argument is not supported in the coder configurer workflow. For fixed-point code generation, the Subtrees value must be coder.Constant("all") or have an integer data type. |
For more information, see Introduction to Code Generation.
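The save, load, and predict workflow described above might look like the following entry-point function. This is a sketch under assumptions: the function name predictMPG and the file name RegTreeModel are hypothetical, and the model is assumed to have been saved beforehand with saveLearnerForCoder(Mdl,'RegTreeModel').

```matlab
function Yfit = predictMPG(X) %#codegen
% predictMPG  Hypothetical entry-point function for code generation.
% Assumes a trained regression tree was saved earlier with
%   saveLearnerForCoder(Mdl,'RegTreeModel')
% where the file name 'RegTreeModel' is an illustrative choice.

% Reconstruct the model from the saved file.
Mdl = loadLearnerForCoder('RegTreeModel');

% X is a numeric matrix whose columns follow the training predictor order.
Yfit = predict(Mdl,X);
end
```

Assuming three predictor columns and a variable number of rows, code for this function could then be generated with a command along the lines of codegen predictMPG -args {coder.typeof(0,[Inf 3],[1 0])}. The function can also be called directly in MATLAB to check that it reproduces the predictions of the original model.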
GPU Arrays
Usage notes and limitations:
- The predict function does not support decision tree models trained with surrogate splits.
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2011a