coefTest
Linear hypothesis test on linear regression model coefficients
Syntax
Description
p = coefTest(mdl) computes the _p_-value for an _F_-test that all coefficient estimates in mdl, except for the intercept term, are zero.
p = coefTest(mdl,H) performs an _F_-test that H × B = 0, where B represents the coefficient vector. Use H to specify the coefficients to include in the _F_-test.
p = coefTest(mdl,H,C) performs an _F_-test that H × B = C.
[p,F] = coefTest(___) also returns the _F_-test statistic F using any of the input argument combinations in previous syntaxes.
[p,F,r] = coefTest(___) also returns the numerator degrees of freedom r for the test.
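The syntaxes above can be sketched side by side as follows. This is a minimal illustration, assuming Statistics and Machine Learning Toolbox and the carsmall data set that ships with it. The contrast matrix H and the value C = -5 are arbitrary choices made for this sketch, and the variable names p_all, p_equal, p_diff, F_diff, and r_diff are hypothetical.

```matlab
% Sketch only: requires Statistics and Machine Learning Toolbox.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

% Check the coefficient order before building H.
disp(mdl.CoefficientNames)

% coefTest(mdl): F-test that all coefficients except the intercept are zero.
p_all = coefTest(mdl);

% coefTest(mdl,H): test H*B = 0. This H is an illustrative contrast that
% asks whether the Model_Year_76 and Model_Year_82 coefficients are equal.
H = [0 0 0 1 -1];
p_equal = coefTest(mdl,H);

% coefTest(mdl,H,C): test H*B = C. C = -5 is an arbitrary value chosen for
% illustration (Model_Year_76 minus Model_Year_82 equals -5).
C = -5;
[p_diff,F_diff,r_diff] = coefTest(mdl,H,C);
```

Because H has a single row here, r_diff should be 1. A matrix H with several rows tests all of its rows jointly.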
Examples
Fit a linear regression model and test the coefficients of the fitted model to see if they are zero.
Load the carsmall data set and create a table in which the Model_Year predictor is categorical.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Weight,Model_Year);
Fit a linear regression model of mileage as a function of the weight, weight squared, and model year.
mdl = fitlm(tbl,'MPG ~ Model_Year + Weight^2')
mdl = 
Linear regression model:
    MPG ~ 1 + Weight + Model_Year + Weight^2

Estimated Coefficients:
                       Estimate            SE      tStat        pValue
                     __________    __________    _______    __________
    (Intercept)          54.206        4.7117     11.505    2.6648e-19
    Weight            -0.016404     0.0031249    -5.2493    1.0283e-06
    Model_Year_76        2.0887       0.71491     2.9215     0.0044137
    Model_Year_82        8.1864       0.81531     10.041    2.6364e-16
    Weight^2         1.5573e-06    4.9454e-07      3.149     0.0022303

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.78
R-squared: 0.885,  Adjusted R-Squared: 0.88
F-statistic vs. constant model: 172, p-value = 5.52e-41
The last line of the model display shows the _F_-statistic value of the regression model and the corresponding _p_-value. The small _p_-value indicates that the model fits significantly better than a degenerate model consisting of only an intercept term. You can return these two values by using coefTest.
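A minimal sketch of that call, repeating the fit so the snippet runs on its own (assuming Statistics and Machine Learning Toolbox is installed). The returned values should agree with the last line of the model display, that is, F of about 172 and p of about 5.52e-41.

```matlab
% Sketch only: refit the model from this example, then request the
% overall F-test (all coefficients except the intercept equal to zero).
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Model_Year + Weight^2');

[p,F] = coefTest(mdl)
```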
Fit a linear regression model and test the significance of a specified coefficient in the fitted model by using coefTest. You can also use anova to test the significance of each predictor in the model.
Load the carsmall data set and create a table in which the Model_Year predictor is categorical.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
Fit a linear regression model of mileage as a function of acceleration, weight, and model year.
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight')
mdl = 
Linear regression model:
    MPG ~ 1 + Acceleration + Weight + Model_Year

Estimated Coefficients:
                       Estimate            SE       tStat        pValue
                     __________    __________    ________    __________
    (Intercept)          40.523        2.5293      16.021    5.8302e-28
    Acceleration      -0.023438       0.11353    -0.20644       0.83692
    Weight           -0.0066799    0.00045796     -14.586    2.5314e-25
    Model_Year_76        1.9898       0.80696      2.4657      0.015591
    Model_Year_82        7.9661       0.89745      8.8763    6.7725e-14

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.93
R-squared: 0.873,  Adjusted R-Squared: 0.867
F-statistic vs. constant model: 153, p-value = 5.86e-39
The model display includes the _p_-value for the _t_-statistic for each coefficient to test the null hypothesis that the corresponding coefficient is zero.
You can examine the significance of a coefficient by using coefTest. For example, test the significance of the Acceleration coefficient. According to the model display, Acceleration corresponds to the second coefficient, after the intercept. Specify the coefficient by using a numeric index vector.
[p_Acceleration,F_Acceleration,r_Acceleration] = coefTest(mdl,[0 1 0 0 0])
p_Acceleration is the _p_-value corresponding to the _F_-statistic value F_Acceleration, and r_Acceleration is the numerator degrees of freedom for the _F_-test. The returned _p_-value indicates that Acceleration is not statistically significant in the fitted model. Note that p_Acceleration is equal to the _p_-value of the _t_-statistic (tStat) in the model display, and F_Acceleration is the square of tStat.
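The relationship between the single-coefficient _F_-test and the _t_-test can be checked numerically. The following is a minimal sketch, assuming Statistics and Machine Learning Toolbox. The names t_Acceleration and pT_Acceleration are hypothetical, and the row index 2 relies on Acceleration being the second row of the coefficient table, as in the model display.

```matlab
% Sketch only: compare the coefTest result with the t-statistic stored in
% the Coefficients table of the fitted model.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

[p_Acceleration,F_Acceleration] = coefTest(mdl,[0 1 0 0 0]);

% Row 2 of mdl.Coefficients is Acceleration (row 1 is the intercept).
t_Acceleration  = mdl.Coefficients.tStat(2);
pT_Acceleration = mdl.Coefficients.pValue(2);

fprintf('F = %.6f, tStat^2 = %.6f\n',F_Acceleration,t_Acceleration^2);
fprintf('p (coefTest) = %.5f, p (t-test) = %.5f\n',p_Acceleration,pT_Acceleration);
```

The two printed pairs should match up to floating-point rounding, because an _F_-test with one numerator degree of freedom is equivalent to the two-sided _t_-test on that coefficient.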
Test the significance of the categorical predictor Model_Year. Instead of testing Model_Year_76 and Model_Year_82 separately, you can perform a single test for the categorical predictor Model_Year. Specify Model_Year_76 and Model_Year_82 by using a numeric index matrix.
[p_Model_Year,F_Model_Year,r_Model_Year] = coefTest(mdl,[0 0 0 1 0; 0 0 0 0 1])
p_Model_Year = 2.7408e-14
The returned _p_-value indicates that Model_Year is statistically significant in the fitted model.
You can also return these values by using anova.
anova(mdl)
ans=4×5 table
                       SumSq    DF     MeanSq           F        pValue
                     _______    __    _______    ________    __________
    Acceleration     0.36613     1    0.36613    0.042618       0.83692
    Weight            1827.7     1     1827.7      212.75    2.5314e-25
    Model_Year        777.81     2      388.9      45.269    2.7408e-14
    Error             764.59    89      8.591
Input Arguments
Data Types: single | double
Data Types: single | double
Output Arguments
_p_-value for the _F_-test, returned as a numeric value in the range [0,1].
Value of the test statistic for the _F_-test, returned as a numeric value.
Numerator degrees of freedom for the _F_-test, returned as a positive integer. The _F_-statistic has r degrees of freedom in the numerator and mdl.DFE degrees of freedom in the denominator.
Algorithms
The _p_-value, _F_-statistic, and numerator degrees of freedom are valid under these assumptions:
- The data comes from a model represented by the formula in the Formula property of the fitted model.
- The observations are independent, conditional on the predictor values.
Under these assumptions, let _β_ represent the (unknown) coefficient vector of the linear regression. Suppose _H_ is a full-rank numeric index matrix of size _r_-by-_s_, where _r_ is the number of linear combinations of coefficients being tested, and _s_ is the total number of coefficients. Let _c_ be a column vector with _r_ rows. The following is a test statistic for the hypothesis that _Hβ_ = _c_:
$$F = \frac{(H\hat{\beta} - c)' \, (H V H')^{-1} \, (H\hat{\beta} - c)}{r}$$
Here, β̂ is the estimate of the coefficient vector _β_, stored in the Coefficients property, and _V_ is the estimated covariance of the coefficient estimates, stored in the CoefficientCovariance property. When the hypothesis is true, the test statistic _F_ has an _F_ distribution with _r_ and _u_ degrees of freedom, where _u_ is the degrees of freedom for error, stored in the DFE property.
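The test statistic can be reproduced by hand from the model properties named above. The following is a minimal sketch, assuming Statistics and Machine Learning Toolbox. It is not a description of how coefTest is implemented internally, and the names b, V, u, d, F_manual, and p_manual are hypothetical. It reuses the joint test of the two Model_Year coefficients, for which the anova table in the example reports F of about 45.269 and p of about 2.7408e-14.

```matlab
% Sketch only: compute F = (H*b - c)' * inv(H*V*H') * (H*b - c) / r by hand
% and compare it with the output of coefTest.
load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);
mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight');

H = [0 0 0 1 0; 0 0 0 0 1];        % select Model_Year_76 and Model_Year_82
c = zeros(size(H,1),1);            % hypothesized values, one per row of H

b = mdl.Coefficients.Estimate;     % coefficient estimates
V = mdl.CoefficientCovariance;     % estimated covariance of the estimates
r = size(H,1);                     % numerator degrees of freedom (H full rank)
u = mdl.DFE;                       % error (denominator) degrees of freedom

d = H*b - c;
F_manual = (d' * ((H*V*H') \ d)) / r;    % backslash avoids an explicit inverse
p_manual = fcdf(F_manual,r,u,'upper');   % upper tail of the F distribution

[p,F,rOut] = coefTest(mdl,H,c);
fprintf('manual:   F = %.4f, p = %.4e\n',F_manual,p_manual);
fprintf('coefTest: F = %.4f, p = %.4e, r = %d\n',F,p,rOut);
```

The 'upper' option of fcdf computes the tail probability directly, which is more accurate for very small _p_-values than subtracting the lower-tail value from 1.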
Alternative Functionality
- The values of commonly used test statistics are available in the Coefficients property of a fitted model.
- anova provides tests for each model predictor and groups of predictors.
Extended Capabilities
Version History
Introduced in R2012a