addInteractions - Add interaction terms to univariate generalized additive model (GAM)

Add interaction terms to univariate generalized additive model (GAM)

Since R2021a

Syntax

UpdatedMdl = addInteractions(Mdl,Interactions)
UpdatedMdl = addInteractions(Mdl,Interactions,Name,Value)

Description

UpdatedMdl = addInteractions(Mdl,Interactions) returns an updated model UpdatedMdl by adding the interaction terms in Interactions to the univariate generalized additive model Mdl. The model Mdl must contain only linear terms for predictors.

If you want to resume training for the existing terms in Mdl, use the resume function.
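
To make the distinction concrete, the following minimal sketch contrasts the two functions on the carbig data. It assumes the two-argument form resume(Mdl,NumTrees); the tree count 100 and the request for 2 interaction terms are arbitrary illustrative values.

```matlab
% Train a univariate GAM (linear terms for predictors only).
load carbig
tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);
Mdl = fitrgam(tbl,'MPG');

% Continue boosting the terms that Mdl already contains.
% Assumption: resume(Mdl,NumTrees) trains for up to NumTrees more iterations.
ResumedMdl = resume(Mdl,100);

% Add new interaction terms instead of retraining the existing terms.
UpdatedMdl = addInteractions(Mdl,2);
```

ResumedMdl keeps the same set of terms as Mdl, whereas UpdatedMdl gains interaction terms listed in its Interactions property.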

UpdatedMdl = addInteractions(Mdl,Interactions,Name,Value) specifies additional options using one or more name-value arguments. For example, 'MaxPValue',0.05 includes only the interaction terms whose _p_-values are not greater than 0.05.

Examples

Train a univariate GAM, which contains linear terms for predictors, and then add interaction terms to the trained model by using the addInteractions function.

Load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s.

load carbig

Create a table that contains the predictor variables (Acceleration, Displacement, Horsepower, and Weight) and the response variable (MPG).

tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);

Train a univariate GAM that contains linear terms for predictors in tbl.

Mdl = fitrgam(tbl,'MPG');

Add the five most important interaction terms to the trained model.

UpdatedMdl = addInteractions(Mdl,5);

Mdl is a univariate GAM, and UpdatedMdl is an updated GAM that contains all the terms in Mdl and five additional interaction terms. Display the interaction terms in UpdatedMdl.

UpdatedMdl.Interactions

ans = 5×2

Each row of the Interactions property represents one interaction term and contains the column indexes of the predictor variables for the interaction term. You can use the Interactions property to check the interaction terms in the model and the order in which addInteractions adds them to the model.

Train a univariate GAM, which contains linear terms for predictors, and then add interaction terms to the trained model by using the addInteractions function. Specify the 'MaxPValue' name-value argument to add interaction terms whose _p_-values are not greater than the 'MaxPValue' value.

Load Fisher's iris data set. Create a table that contains observations for versicolor and virginica.

load fisheriris
inds = strcmp(species,'versicolor') | strcmp(species,'virginica');
Tbl = array2table(meas(inds,:),'VariableNames',["x1","x2","x3","x4"]);
Tbl.Y = species(inds,:);

Train a univariate GAM that contains linear terms for predictors in Tbl.

Mdl = fitcgam(Tbl,'Y');

Add important interaction terms to the trained model Mdl. Specify 'all' for the Interactions argument, and set the 'MaxPValue' name-value argument to 0.05. Among all available interaction terms, addInteractions identifies those whose _p_-values are not greater than the 'MaxPValue' value and adds them to the model. The default 'MaxPValue' is 1 so that the function adds all specified interaction terms to the model.

UpdatedMdl = addInteractions(Mdl,'all','MaxPValue',0.05);
UpdatedMdl.Interactions

ans = 5×2

Mdl is a univariate GAM, and UpdatedMdl is an updated GAM that contains all the terms in Mdl and five additional interaction terms. UpdatedMdl includes five of the six available pairs of interaction terms.

Input Arguments

Interactions: Number or list of interaction terms to include in the candidate set _S_, specified as a nonnegative integer scalar, a logical matrix, or 'all'.

- Number of interaction terms, specified as a nonnegative integer: _S_ includes the specified number of important interaction terms, selected based on the _p_-values of the terms.
- List of interaction terms, specified as a logical matrix: _S_ includes the terms specified by a t-by-p logical matrix, where t is the number of interaction terms, and p is the number of predictors used to train the model. For example, logical([1 1 0; 0 1 1]) represents two pairs of interaction terms: a pair of the first and second predictors, and a pair of the second and third predictors.
  If addInteractions uses a subset of input variables as predictors, then the function indexes the predictors using only the subset. That is, the column indexes of the logical matrix do not count the response and observation weight variables. The indexes also do not count any variables not used by the function.
- 'all': _S_ includes all possible pairs of interaction terms, which is p*(p - 1)/2 terms in total.

Among the interaction terms in _S_, the addInteractions function identifies those whose _p_-values are not greater than the 'MaxPValue' value and uses them to build a set of interaction trees. Use the default value ('MaxPValue',1) to build interaction trees using all terms in _S_.
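
As an illustration of the logical-matrix form, the following sketch builds the t-by-p matrix from pairs of predictor indexes. The predictor count p = 4 and the selected pairs are hypothetical values chosen for the example; only base MATLAB functions are used.

```matlab
p = 4;                              % number of predictors (hypothetical)
pairs = nchoosek(1:p,2);            % all p*(p-1)/2 index pairs, one per row
t = size(pairs,1);                  % number of candidate interaction terms

% Mark the two predictors of each pair in the corresponding row.
L = false(t,p);
rows = repmat((1:t)',1,2);
L(sub2ind([t p],rows,pairs)) = true;

% Keep only selected pairs, here (1,2) and (2,3).
keep = ismember(pairs,[1 2; 2 3],'rows');
Lsub = L(keep,:);
```

Assuming Mdl was trained with four predictors, Lsub could then be passed as addInteractions(Mdl,Lsub), and L lists the same pairs that 'all' would consider.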

Data Types: single | double | logical | char | string

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: addInteractions(Mdl,'all','MaxPValue',0.05,'Verbose',1,'NumPrints',10) specifies to include all available interaction terms whose _p_-values are not greater than 0.05 and to display diagnostic messages every 10 iterations.

InitialLearnRateForInteractions: Initial learning rate of gradient boosting for interaction terms, specified as a numeric scalar in the interval (0,1].

For each boosting iteration for interaction trees, addInteractions starts fitting with the initial learning rate. For classification, the function halves the learning rate until it finds a rate that improves the model fit. For regression, the function uses the initial rate throughout the training.

Training a model using a small learning rate requires more learning iterations, but often achieves better accuracy.

For more details about gradient boosting, see Gradient Boosting Algorithm.

Example: 'InitialLearnRateForInteractions',0.1

Data Types: single | double

MaxNumSplitsPerInteraction: Maximum number of decision splits (or branch nodes) for each interaction tree (boosted tree for an interaction term), specified as a positive integer scalar.

Example: 'MaxNumSplitsPerInteraction',5

Data Types: single | double
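
The learning rate and tree size options can be combined in one call. The following sketch reuses the carbig setup from the Examples section; the values 0.1 and 5 are simply the example values quoted in the argument descriptions, and the request for 3 terms is arbitrary, so none of these are tuned settings.

```matlab
load carbig
tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);
Mdl = fitrgam(tbl,'MPG');

% Use a smaller initial learning rate and cap each interaction tree
% at five decision splits.
UpdatedMdl = addInteractions(Mdl,3, ...
    'InitialLearnRateForInteractions',0.1, ...
    'MaxNumSplitsPerInteraction',5);

UpdatedMdl.Interactions   % one row of predictor indexes per added term
```

A smaller learning rate generally needs more boosting iterations to reach the same fit, so it is worth comparing the result against the default settings rather than assuming an improvement.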

MaxPValue: Maximum _p_-value for detecting interaction terms, specified as a numeric scalar in the interval [0,1].

addInteractions first finds the candidate set _S_ of interaction terms from the Interactions value. Then the function identifies the interaction terms whose _p_-values are not greater than the 'MaxPValue' value and uses them to build a set of interaction trees.

The default value ('MaxPValue',1) builds interaction trees for all interaction terms in the candidate set S.

For more details about detecting interaction terms, see Interaction Term Detection.

Example: 'MaxPValue',0.05

Data Types: single | double

Output Arguments

More About

Deviance is a generalization of the residual sum of squares. It measures the goodness of fit compared to the saturated model.

The deviance of a fitted model is twice the difference between the loglikelihoods of the model and the saturated model:

$$-2(\log L - \log L_s),$$

where $L$ and $L_s$ are the likelihoods of the fitted model and the saturated model, respectively. The saturated model is the model with the maximum number of parameters that you can estimate.

addInteractions uses the deviance to measure the goodness of model fit and finds a learning rate that reduces the deviance at each iteration. Specify 'Verbose' as 1 or 2 to display the deviance and learning rate in the Command Window.
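
As a worked illustration of the formula, the following sketch computes the deviance for a small binary classification problem. It assumes a Bernoulli likelihood, and the labels and fitted probabilities are hypothetical; it is not a description of the internal computation in addInteractions.

```matlab
% Hypothetical binary labels and fitted class-1 probabilities.
y    = [1 0 1 1];
pHat = [0.9 0.2 0.6 0.8];

% Loglikelihood of the fitted model under a Bernoulli likelihood.
logL = sum(y.*log(pHat) + (1-y).*log(1-pHat));

% The saturated model reproduces every label exactly, so its
% likelihood is 1 and its loglikelihood is 0.
logLs = 0;

deviance = -2*(logL - logLs);
```

For a Gaussian response with unit variance, the same formula reduces to the residual sum of squares, which is the sense in which deviance generalizes it.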

Algorithms

Gradient Boosting Algorithm

addInteractions adds sets of interaction trees (boosted trees for interaction terms for predictors) to a univariate generalized additive model by using a gradient boosting algorithm (Least-Squares Boosting for regression and Adaptive Logistic Regression for classification). The algorithm iterates at most 'NumTreesPerInteraction' times for interaction trees.

For each boosting iteration, addInteractions builds a set of interaction trees with the initial learning rate 'InitialLearnRateForInteractions'.

When building a set of trees, the function trains one tree at a time. It fits a tree to the residual that is the difference between the response (observed response values for regression or scores of observed classes for classification) and the aggregated prediction from all trees grown previously. To control the boosting learning speed, the function shrinks the tree by the learning rate and then adds the tree to the model and updates the residual.
- Updated model = current model + (learning rate)·(new tree)
- Updated residual = current residual – (learning rate)·(response explained by new tree)

If adding the set of trees improves the model fit (that is, reduces the deviance of the fit by a value larger than the tolerance), then addInteractions moves to the next iteration.

Otherwise, for classification, addInteractions halves the learning rate and uses it to update the model and residual. The function continues to halve the learning rate until it finds a rate that improves the model fit. If the function cannot find such a learning rate for interaction trees, then it terminates the model fitting. For regression, if adding the set of trees does not improve the model fit with the initial learning rate, then the function terminates the model fitting.

You can determine why training stopped by checking the ReasonForTermination property of the trained model.
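
The update rules above can be sketched for the regression case as a generic least-squares boosting loop. This is an illustration under stated assumptions, not the internal implementation: it fits ordinary regression trees with fitrtree on toy data, uses the sum of squared residuals as the deviance, and the learning rate, iteration cap, tree size, and tolerance are all hypothetical values.

```matlab
rng(0)                                   % reproducible toy data
X = rand(200,2);
y = X(:,1).*X(:,2) + 0.05*randn(200,1);  % response driven by an interaction

learnRate = 0.5;                         % hypothetical learning rate
numIter   = 20;                          % hypothetical iteration cap
tol       = 1e-6;                        % hypothetical deviance tolerance

F = repmat(mean(y),size(y));             % current model: constant fit
r = y - F;                               % current residual
dev = sum(r.^2);                         % squared-error deviance
devHistory = dev;

for k = 1:numIter
    tree = fitrtree(X,r,'MaxNumSplits',3);   % fit a small tree to the residual
    h = predict(tree,X);                     % response explained by the new tree
    newDev = sum((r - learnRate*h).^2);
    if dev - newDev <= tol
        break                                % regression rule: stop if no improvement
    end
    F = F + learnRate*h;                     % updated model
    r = r - learnRate*h;                     % updated residual
    dev = newDev;
    devHistory(end+1) = dev; %#ok<SAGROW>
end
```

For classification, the description above would replace the break with repeated halving of learnRate until the deviance improves, stopping only if no such rate is found.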

Interaction Term Detection

For each pairwise interaction term $x_i x_j$ (specified by Interactions), the software performs an _F_-test to examine whether the term is statistically significant.

To speed up the process, addInteractions bins numeric predictors into at most 8 equiprobable bins. The number of bins can be less than 8 if a predictor has fewer than 8 unique values. The _F_-test examines the null hypothesis that the bins created by $x_i$ and $x_j$ have equal responses versus the alternative that at least one bin has a different response value from the others. A small _p_-value indicates that differences are significant, which implies that the corresponding interaction term is significant and, therefore, including the term can improve the model fit.

addInteractions builds a set of interaction trees using the terms whose _p_-values are not greater than the 'MaxPValue' value. You can use the default 'MaxPValue' value 1 to build interaction trees using all terms specified by Interactions.
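
The binning and testing steps can be approximated with standard functions. The sketch below is an assumption-laden illustration rather than the software's actual procedure: it uses toy data, applies a one-way ANOVA _F_-test (anova1) directly to the raw response across the two-dimensional bins, and uses a hypothetical threshold of 0.05.

```matlab
rng(1)                                    % reproducible toy data
n  = 500;
x1 = rand(n,1);
x2 = rand(n,1);
y  = 4*(x1 > 0.5).*(x2 > 0.5) + 0.1*randn(n,1);

% Bin each predictor into 8 equiprobable bins using sample quantiles.
numBins = 8;
q = (1:numBins-1)/numBins;
edges1 = [-Inf; reshape(quantile(x1,q),[],1); Inf];
edges2 = [-Inf; reshape(quantile(x2,q),[],1); Inf];
b1 = discretize(x1,edges1);
b2 = discretize(x2,edges2);

% One group per occupied (b1,b2) cell, then an F-test across the cells.
cellId = findgroups(b1,b2);
pValue = anova1(y,cellId,'off');

% Keep the term only if its p-value does not exceed the threshold.
maxPValue = 0.05;                         % hypothetical 'MaxPValue' setting
keepTerm  = pValue <= maxPValue;
```

Sorting candidate terms by pValue in ascending order would mirror the importance ordering that the Interactions property reports.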

addInteractions adds interaction terms to the model in the order of importance based on the _p_-values. Use the Interactions property of the returned model to check the order of the interaction terms added to the model.
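
To read the order of the added terms by predictor name rather than by index, the Interactions property can be combined with PredictorNames. A minimal sketch, reusing the carbig setup from the Examples section with an arbitrary request for 3 terms:

```matlab
load carbig
tbl = table(Acceleration,Displacement,Horsepower,Weight,MPG);
Mdl = fitrgam(tbl,'MPG');
UpdatedMdl = addInteractions(Mdl,3);

idx = UpdatedMdl.Interactions;               % t-by-2 predictor indexes, in the order added
termNames = UpdatedMdl.PredictorNames(idx);  % same shape, with predictor names
disp(termNames)
```

Each row of termNames names the two predictors of one interaction term, with the most important term first.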

Version History

Introduced in R2021a