RegressionKernel - Gaussian kernel regression model using random feature expansion - MATLAB (original) (raw)
Gaussian kernel regression model using random feature expansion
Description
RegressionKernel
is a trained model object for Gaussian kernel regression using random feature expansion. RegressionKernel
is more practical for big data applications that have large training sets but can also be applied to smaller data sets that fit in memory.
Unlike other regression models, and for economical memory usage,RegressionKernel
model objects do not store the training data. However, they do store information such as the dimension of the expanded space, the kernel scale parameter, and the regularization strength.
You can use trained RegressionKernel
models to continue training using the training data, predict responses for new data, and compute the mean squared error or epsilon-insensitive loss. For details, see resume,predict, and loss.
Creation
Create a RegressionKernel
object using the fitrkernel function. This function maps data in a low-dimensional space into a high-dimensional space, then fits a linear model in the high-dimensional space by minimizing the regularized objective function. Obtaining the linear model in the high-dimensional space is equivalent to applying the Gaussian kernel to the model in the low-dimensional space. Available linear regression models include regularized support vector machines (SVM) and least-squares regression models.
Properties
Kernel Regression Properties
Half the width of the epsilon-insensitive band, specified as a nonnegative scalar.
If Learner
is not 'svm'
, thenEpsilon
is an empty array ([]
).
Data Types: single
| double
Linear regression model type, specified as'leastsquares'
or'svm'
.
In the following table, f(x)=T(x)β+b.
- x is an observation (row vector) from p predictor variables.
- T(·) is a transformation of an observation (row vector) for feature expansion. T(x) maps x in ℝp to a high-dimensional space (ℝm).
- β is a vector of coefficients.
- b is the scalar bias.
Value | Algorithm | Loss Function | FittedLoss Value |
---|---|---|---|
'svm' | Support vector machine regression | Epsilon insensitive: ℓ[y,f(x)]=max[0,|y−f(x) | −ε] |
'leastsquares' | Linear regression through ordinary least squares | Mean squared error (MSE): ℓ[y,f(x)]=12[y−f(x)]2 | 'mse' |
Number of dimensions of the expanded space, specified as a positive integer.
Data Types: single
| double
Kernel scale parameter, specified as a positive scalar.
Data Types: single
| double
Box constraint, specified as a positive scalar.
Data Types: double
| single
Regularization term strength, specified as a nonnegative scalar.
Data Types: single
| double
Since R2023b
This property is read-only.
Predictor means, specified as a numeric vector. If you specify Standardize
as 1
or true
when you train the kernel model, then the length of the Mu
vector is equal to the number of expanded predictors (see ExpandedPredictorNames
). The vector contains 0
values for dummy variables corresponding to expanded categorical predictors.
If you set Standardize
to 0
or false
when you train the kernel model, then the Mu
value is an empty vector ([]
).
Data Types: double
Since R2023b
This property is read-only.
Predictor standard deviations, specified as a numeric vector. If you specify Standardize
as 1
or true
when you train the kernel model, then the length of the Sigma
vector is equal to the number of expanded predictors (see ExpandedPredictorNames
). The vector contains 1
values for dummy variables corresponding to expanded categorical predictors.
If you set Standardize
to 0
or false
when you train the kernel model, then the Sigma
value is an empty vector ([]
).
Data Types: double
Loss function used to fit the linear model, specified as'epsiloninsensitive'
or'mse'
.
Value | Algorithm | Loss Function | Learner Value |
---|---|---|---|
'epsiloninsensitive' | Support vector machine regression | Epsilon insensitive: ℓ[y,f(x)]=max[0,|y−f(x) | −ε] |
'mse' | Linear regression through ordinary least squares | Mean squared error (MSE): ℓ[y,f(x)]=12[y−f(x)]2 | 'leastsquares' |
Other Regression Properties
Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors
contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and p
, where p
is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty ([]
).
Data Types: single
| double
Parameters used for training the RegressionKernel
model, specified as a structure.
Access fields of ModelParameters
using dot notation. For example, access the relative tolerance on the linear coefficients and the bias term by usingMdl.ModelParameters.BetaTolerance
.
Data Types: struct
Predictor names in order of their appearance in the predictor data, specified as a cell array of character vectors. The length ofPredictorNames
is equal to the number of columns used as predictor variables in the training dataX
or Tbl
.
Data Types: cell
Response variable name, specified as a character vector.
Data Types: char
Response transformation function to apply to predicted responses, specified as 'none'
or a function handle.
For kernel regression models and before the response transformation, the predicted response for the observation x (row vector) is f(x)=T(x)β+b.
- T(·) is a transformation of an observation for feature expansion.
- β corresponds to
Mdl.Beta
. - b corresponds to
Mdl.Bias
.
For a MATLAB® function or a function that you define, enter its function handle. For example, you can enter Mdl.ResponseTransform = @_function_
, where_function_
accepts a numeric vector of the original responses and returns a numeric vector of the same size containing the transformed responses.
Data Types: char
| function_handle
Object Functions
gather | Gather properties of Statistics and Machine Learning Toolbox object from GPU |
---|---|
incrementalLearner | Convert kernel regression model to incremental learner |
lime | Local interpretable model-agnostic explanations (LIME) |
loss | Regression loss for Gaussian kernel regression model |
partialDependence | Compute partial dependence |
plotPartialDependence | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots |
predict | Predict responses for Gaussian kernel regression model |
resume | Resume training of Gaussian kernel regression model |
shapley | Shapley values |
Examples
Train a kernel regression model for a tall array by using SVM.
When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the mapreducer function.
Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat 'NA'
values as missing data so that datastore
replaces them with NaN
values. Select a subset of the variables to use. Create a tall table on top of the datastore.
varnames = {'ArrTime','DepTime','ActualElapsedTime'}; ds = datastore('airlinesmall.csv','TreatAsMissing','NA',... 'SelectedVariableNames',varnames); t = tall(ds);
Specify DepTime
and ArrTime
as the predictor variables (X
) and ActualElapsedTime
as the response variable (Y
). Select the observations for which ArrTime
is later than DepTime
.
daytime = t.ArrTime>t.DepTime; Y = t.ActualElapsedTime(daytime); % Response data X = t{daytime,{'DepTime' 'ArrTime'}}; % Predictor data
Standardize the predictor variables.
Z = zscore(X); % Standardize the data
Train a default Gaussian kernel regression model with the standardized predictors. Extract a fit summary to determine how well the optimization algorithm fits the model to the data.
[Mdl,FitInfo] = fitrkernel(Z,Y)
Found 6 chunks. |========================================================================= | Solver | Iteration / | Objective | Gradient | Beta relative | | | Data Pass | | magnitude | change | |========================================================================= | INIT | 0 / 1 | 4.307833e+01 | 9.925486e-02 | NaN | | LBFGS | 0 / 2 | 2.782790e+01 | 7.202403e-03 | 9.891473e-01 | | LBFGS | 1 / 3 | 2.781351e+01 | 1.806211e-02 | 3.220672e-03 | | LBFGS | 2 / 4 | 2.777773e+01 | 2.727737e-02 | 9.309939e-03 | | LBFGS | 3 / 5 | 2.768591e+01 | 2.951422e-02 | 2.833343e-02 | | LBFGS | 4 / 6 | 2.755857e+01 | 5.124144e-02 | 7.935278e-02 | | LBFGS | 5 / 7 | 2.738896e+01 | 3.089571e-02 | 4.644920e-02 | | LBFGS | 6 / 8 | 2.716704e+01 | 2.552696e-02 | 8.596406e-02 | | LBFGS | 7 / 9 | 2.696409e+01 | 3.088621e-02 | 1.263589e-01 | | LBFGS | 8 / 10 | 2.676203e+01 | 2.021303e-02 | 1.533927e-01 | | LBFGS | 9 / 11 | 2.660322e+01 | 1.221361e-02 | 1.351968e-01 | | LBFGS | 10 / 12 | 2.645504e+01 | 1.486501e-02 | 1.175476e-01 | | LBFGS | 11 / 13 | 2.631323e+01 | 1.772835e-02 | 1.161909e-01 | | LBFGS | 12 / 14 | 2.625264e+01 | 5.837906e-02 | 1.422851e-01 | | LBFGS | 13 / 15 | 2.619281e+01 | 1.294441e-02 | 2.966283e-02 | | LBFGS | 14 / 16 | 2.618220e+01 | 3.791806e-03 | 9.051274e-03 | | LBFGS | 15 / 17 | 2.617989e+01 | 3.689255e-03 | 6.364132e-03 | | LBFGS | 16 / 18 | 2.617426e+01 | 4.200232e-03 | 1.213026e-02 | | LBFGS | 17 / 19 | 2.615914e+01 | 7.339928e-03 | 2.803348e-02 | | LBFGS | 18 / 20 | 2.620704e+01 | 2.298098e-02 | 1.749830e-01 | |========================================================================= | Solver | Iteration / | Objective | Gradient | Beta relative | | | Data Pass | | magnitude | change | |========================================================================= | LBFGS | 18 / 21 | 2.615554e+01 | 1.164689e-02 | 8.580878e-02 | | LBFGS | 19 / 22 | 2.614367e+01 | 3.395507e-03 | 3.938314e-02 | | LBFGS | 20 / 23 | 2.614090e+01 | 2.349246e-03 | 1.495049e-02 | |========================================================================|
Mdl = RegressionKernel ResponseName: 'Y' Learner: 'svm' NumExpansionDimensions: 64 KernelScale: 1 Lambda: 8.5385e-06 BoxConstraint: 1 Epsilon: 5.9303
Properties, Methods
FitInfo = struct with fields: Solver: 'LBFGS-tall' LossFunction: 'epsiloninsensitive' Lambda: 8.5385e-06 BetaTolerance: 1.0000e-03 GradientTolerance: 1.0000e-05 ObjectiveValue: 26.1409 GradientMagnitude: 0.0023 RelativeChangeInBeta: 0.0150 FitTime: 13.1051 History: [1×1 struct]
Mdl
is a RegressionKernel
model. To inspect the regression error, you can pass Mdl
and the training data or new data to the loss function. Or, you can pass Mdl
and new predictor data to the predict function to predict responses for new observations. You can also pass Mdl
and the training data to the resume function to continue training.
FitInfo
is a structure array containing optimization information. Use FitInfo
to determine whether optimization termination measurements are satisfactory.
For improved accuracy, you can increase the maximum number of optimization iterations ('IterationLimit'
) and decrease the tolerance values ('BetaTolerance'
and 'GradientTolerance'
) by using the name-value pair arguments of fitrkernel
. Doing so can improve measures like ObjectiveValue
and RelativeChangeInBeta
in FitInfo
. You can also optimize model parameters by using the 'OptimizeHyperparameters'
name-value pair argument.
Resume training a Gaussian kernel regression model for more iterations to improve the regression loss.
Load the carbig
data set.
Specify the predictor variables (X
) and the response variable (Y
).
X = [Acceleration,Cylinders,Displacement,Horsepower,Weight]; Y = MPG;
Delete rows of X
and Y
where either array has NaN
values. Removing rows with NaN
values before passing data to fitrkernel
can speed up training and reduce memory usage.
R = rmmissing([X Y]); % Data with missing entries removed X = R(:,1:5); Y = R(:,end);
Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.
rng(10) % For reproducibility N = length(Y); cvp = cvpartition(N,'Holdout',0.1); idxTrn = training(cvp); % Training set indices idxTest = test(cvp); % Test set indices
Train a kernel regression model. Standardize the training data, set the iteration limit to 5, and specify 'Verbose',1
to display diagnostic information.
Xtrain = X(idxTrn,:); Ytrain = Y(idxTrn);
Mdl = fitrkernel(Xtrain,Ytrain,'Standardize',true, ... 'IterationLimit',5,'Verbose',1)
|=================================================================================================================| | Solver | Pass | Iteration | Objective | Step | Gradient | Relative | sum(beta~=0) | | | | | | | magnitude | change in Beta | | |=================================================================================================================| | LBFGS | 1 | 0 | 5.691016e+00 | 0.000000e+00 | 5.852758e-02 | | 0 | | LBFGS | 1 | 1 | 5.086537e+00 | 8.000000e+00 | 5.220869e-02 | 9.846711e-02 | 256 | | LBFGS | 1 | 2 | 3.862301e+00 | 5.000000e-01 | 3.796034e-01 | 5.998808e-01 | 256 | | LBFGS | 1 | 3 | 3.460613e+00 | 1.000000e+00 | 3.257790e-01 | 1.615091e-01 | 256 | | LBFGS | 1 | 4 | 3.136228e+00 | 1.000000e+00 | 2.832861e-02 | 8.006254e-02 | 256 | | LBFGS | 1 | 5 | 3.063978e+00 | 1.000000e+00 | 1.475038e-02 | 3.314455e-02 | 256 | |=================================================================================================================|
Mdl = RegressionKernel ResponseName: 'Y' Learner: 'svm' NumExpansionDimensions: 256 KernelScale: 1 Lambda: 0.0028 BoxConstraint: 1 Epsilon: 0.8617
Properties, Methods
Mdl
is a RegressionKernel
model.
Estimate the epsilon-insensitive error for the test set.
Xtest = X(idxTest,:); Ytest = Y(idxTest);
L = loss(Mdl,Xtest,Ytest,'LossFun','epsiloninsensitive')
Continue training the model by using resume
. This function continues training with the same options used for training Mdl
.
UpdatedMdl = resume(Mdl,Xtrain,Ytrain);
|=================================================================================================================|
| Solver | Pass | Iteration | Objective | Step | Gradient | Relative | sum(beta=0) |
| | | | | | magnitude | change in Beta | |
|=================================================================================================================|
| LBFGS | 1 | 0 | 3.063978e+00 | 0.000000e+00 | 1.475038e-02 | | 256 |
| LBFGS | 1 | 1 | 3.007822e+00 | 8.000000e+00 | 1.391637e-02 | 2.603966e-02 | 256 |
| LBFGS | 1 | 2 | 2.817171e+00 | 5.000000e-01 | 5.949008e-02 | 1.918084e-01 | 256 |
| LBFGS | 1 | 3 | 2.807294e+00 | 2.500000e-01 | 6.798867e-02 | 2.973097e-02 | 256 |
| LBFGS | 1 | 4 | 2.791060e+00 | 1.000000e+00 | 2.549575e-02 | 1.639328e-02 | 256 |
| LBFGS | 1 | 5 | 2.767821e+00 | 1.000000e+00 | 6.154419e-03 | 2.468903e-02 | 256 |
| LBFGS | 1 | 6 | 2.738163e+00 | 1.000000e+00 | 5.949008e-02 | 9.476263e-02 | 256 |
| LBFGS | 1 | 7 | 2.719146e+00 | 1.000000e+00 | 1.699717e-02 | 1.849972e-02 | 256 |
| LBFGS | 1 | 8 | 2.705941e+00 | 1.000000e+00 | 3.116147e-02 | 4.152590e-02 | 256 |
| LBFGS | 1 | 9 | 2.701162e+00 | 1.000000e+00 | 5.665722e-03 | 9.401466e-03 | 256 |
| LBFGS | 1 | 10 | 2.695341e+00 | 5.000000e-01 | 3.116147e-02 | 4.968046e-02 | 256 |
| LBFGS | 1 | 11 | 2.691277e+00 | 1.000000e+00 | 8.498584e-03 | 1.017446e-02 | 256 |
| LBFGS | 1 | 12 | 2.689972e+00 | 1.000000e+00 | 1.983003e-02 | 9.938921e-03 | 256 |
| LBFGS | 1 | 13 | 2.688979e+00 | 1.000000e+00 | 1.416431e-02 | 6.606316e-03 | 256 |
| LBFGS | 1 | 14 | 2.687787e+00 | 1.000000e+00 | 1.621956e-03 | 7.089542e-03 | 256 |
| LBFGS | 1 | 15 | 2.686539e+00 | 1.000000e+00 | 1.699717e-02 | 1.169701e-02 | 256 |
| LBFGS | 1 | 16 | 2.685356e+00 | 1.000000e+00 | 1.133144e-02 | 1.069310e-02 | 256 |
| LBFGS | 1 | 17 | 2.685021e+00 | 5.000000e-01 | 1.133144e-02 | 2.104248e-02 | 256 |
| LBFGS | 1 | 18 | 2.684002e+00 | 1.000000e+00 | 2.832861e-03 | 6.175231e-03 | 256 |
| LBFGS | 1 | 19 | 2.683507e+00 | 1.000000e+00 | 5.665722e-03 | 3.724026e-03 | 256 |
| LBFGS | 1 | 20 | 2.683343e+00 | 5.000000e-01 | 5.665722e-03 | 9.549119e-03 | 256 |
|=================================================================================================================|
| Solver | Pass | Iteration | Objective | Step | Gradient | Relative | sum(beta=0) |
| | | | | | magnitude | change in Beta | |
|=================================================================================================================|
| LBFGS | 1 | 21 | 2.682897e+00 | 1.000000e+00 | 5.665722e-03 | 7.172867e-03 | 256 |
| LBFGS | 1 | 22 | 2.682682e+00 | 1.000000e+00 | 2.832861e-03 | 2.587726e-03 | 256 |
| LBFGS | 1 | 23 | 2.682485e+00 | 1.000000e+00 | 2.832861e-03 | 2.953648e-03 | 256 |
| LBFGS | 1 | 24 | 2.682326e+00 | 1.000000e+00 | 2.832861e-03 | 7.777294e-03 | 256 |
| LBFGS | 1 | 25 | 2.681914e+00 | 1.000000e+00 | 2.832861e-03 | 2.778555e-03 | 256 |
| LBFGS | 1 | 26 | 2.681867e+00 | 5.000000e-01 | 1.031085e-03 | 3.638352e-03 | 256 |
| LBFGS | 1 | 27 | 2.681725e+00 | 1.000000e+00 | 5.665722e-03 | 1.515199e-03 | 256 |
| LBFGS | 1 | 28 | 2.681692e+00 | 5.000000e-01 | 1.314940e-03 | 1.850055e-03 | 256 |
| LBFGS | 1 | 29 | 2.681625e+00 | 1.000000e+00 | 2.832861e-03 | 1.456903e-03 | 256 |
| LBFGS | 1 | 30 | 2.681594e+00 | 5.000000e-01 | 2.832861e-03 | 8.704875e-04 | 256 |
| LBFGS | 1 | 31 | 2.681581e+00 | 5.000000e-01 | 8.498584e-03 | 3.934768e-04 | 256 |
| LBFGS | 1 | 32 | 2.681579e+00 | 1.000000e+00 | 8.498584e-03 | 1.847866e-03 | 256 |
| LBFGS | 1 | 33 | 2.681553e+00 | 1.000000e+00 | 9.857038e-04 | 6.509825e-04 | 256 |
| LBFGS | 1 | 34 | 2.681541e+00 | 5.000000e-01 | 8.498584e-03 | 6.635528e-04 | 256 |
| LBFGS | 1 | 35 | 2.681499e+00 | 1.000000e+00 | 5.665722e-03 | 6.194735e-04 | 256 |
| LBFGS | 1 | 36 | 2.681493e+00 | 5.000000e-01 | 1.133144e-02 | 1.617763e-03 | 256 |
| LBFGS | 1 | 37 | 2.681473e+00 | 1.000000e+00 | 9.869233e-04 | 8.418484e-04 | 256 |
| LBFGS | 1 | 38 | 2.681469e+00 | 1.000000e+00 | 5.665722e-03 | 1.069722e-03 | 256 |
| LBFGS | 1 | 39 | 2.681432e+00 | 1.000000e+00 | 2.832861e-03 | 8.501930e-04 | 256 |
| LBFGS | 1 | 40 | 2.681423e+00 | 2.500000e-01 | 1.133144e-02 | 9.543716e-04 | 256 |
|=================================================================================================================|
| Solver | Pass | Iteration | Objective | Step | Gradient | Relative | sum(beta~=0) |
| | | | | | magnitude | change in Beta | |
|=================================================================================================================|
| LBFGS | 1 | 41 | 2.681416e+00 | 1.000000e+00 | 2.832861e-03 | 8.763251e-04 | 256 |
| LBFGS | 1 | 42 | 2.681413e+00 | 5.000000e-01 | 2.832861e-03 | 4.101888e-04 | 256 |
| LBFGS | 1 | 43 | 2.681403e+00 | 1.000000e+00 | 5.665722e-03 | 2.713209e-04 | 256 |
| LBFGS | 1 | 44 | 2.681392e+00 | 1.000000e+00 | 2.832861e-03 | 2.115241e-04 | 256 |
| LBFGS | 1 | 45 | 2.681383e+00 | 1.000000e+00 | 2.832861e-03 | 2.872858e-04 | 256 |
| LBFGS | 1 | 46 | 2.681374e+00 | 1.000000e+00 | 8.498584e-03 | 5.771001e-04 | 256 |
| LBFGS | 1 | 47 | 2.681353e+00 | 1.000000e+00 | 2.832861e-03 | 3.160871e-04 | 256 |
| LBFGS | 1 | 48 | 2.681334e+00 | 5.000000e-01 | 8.498584e-03 | 1.045502e-03 | 256 |
| LBFGS | 1 | 49 | 2.681314e+00 | 1.000000e+00 | 7.878714e-04 | 1.505118e-03 | 256 |
| LBFGS | 1 | 50 | 2.681306e+00 | 1.000000e+00 | 2.832861e-03 | 4.756894e-04 | 256 |
| LBFGS | 1 | 51 | 2.681301e+00 | 1.000000e+00 | 1.133144e-02 | 3.664873e-04 | 256 |
| LBFGS | 1 | 52 | 2.681288e+00 | 1.000000e+00 | 2.832861e-03 | 1.449821e-04 | 256 |
| LBFGS | 1 | 53 | 2.681287e+00 | 2.500000e-01 | 1.699717e-02 | 2.357176e-04 | 256 |
| LBFGS | 1 | 54 | 2.681282e+00 | 1.000000e+00 | 5.665722e-03 | 2.046663e-04 | 256 |
| LBFGS | 1 | 55 | 2.681278e+00 | 1.000000e+00 | 2.832861e-03 | 2.546349e-04 | 256 |
| LBFGS | 1 | 56 | 2.681276e+00 | 2.500000e-01 | 1.307940e-03 | 1.966786e-04 | 256 |
| LBFGS | 1 | 57 | 2.681274e+00 | 5.000000e-01 | 1.416431e-02 | 1.005310e-04 | 256 |
| LBFGS | 1 | 58 | 2.681271e+00 | 5.000000e-01 | 1.118892e-03 | 1.147324e-04 | 256 |
| LBFGS | 1 | 59 | 2.681269e+00 | 1.000000e+00 | 2.832861e-03 | 1.332914e-04 | 256 |
| LBFGS | 1 | 60 | 2.681268e+00 | 2.500000e-01 | 1.132045e-03 | 5.441369e-05 | 256 |
|=================================================================================================================|
Estimate the epsilon-insensitive error for the test set using the updated model.
UpdatedL = loss(UpdatedMdl,Xtest,Ytest,'LossFun','epsiloninsensitive')
The regression error decreases by a factor of about 0.08
after resume
updates the regression model with more iterations.
Extended Capabilities
Usage notes and limitations:
- The following object functions fully support GPU arrays:
- The object functions execute on a GPU if at least one of the following applies:
- The model was fitted with GPU arrays.
- The predictor data that you pass to the object function is a GPU array.
- The response data that you pass to the object function is a GPU array.
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2018a
You can fit a RegressionKernel
object with GPU arrays by usingfitrkernel. Most RegressionKernel
object functions now support GPU array input arguments so that the functions can execute on a GPU. The object functions that do not support GPU array inputs are incrementalLearner, lime, and shapley.
You can also gather the properties of a RegressionKernel
model object from the GPU using the gather function.
The fitckernel
and fitrkernel
functions support the standardization of numeric predictors. That is, in the call to either function, you can specify the Standardize
value as true
to center and scale each numeric predictor variable. Kernel models include Mu
and Sigma
properties that contain the means and standard deviations, respectively, used to standardize the predictors before training. The properties are empty when the fitting function does not perform any standardization.
You can generate C/C++ code for the predict
function.