coefTest - Linear hypothesis test on linear regression model coefficients - MATLAB

Linear hypothesis test on linear regression model coefficients

Syntax

Description

p = coefTest(mdl) computes the _p_-value for an _F_-test that all coefficient estimates in mdl, except for the intercept term, are zero.
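For instance, here is a minimal sketch with simulated data (X, y, and the fitted mdl are hypothetical, not part of the carsmall example later on this page):

rng default
X = randn(50,2);                  % two predictors
y = 3 + 2*X(:,1) + randn(50,1);   % only the first predictor has a real effect
mdl = fitlm(X,y);                 % coefficients: (Intercept), x1, x2
p = coefTest(mdl)                 % F-test that the x1 and x2 coefficients are jointly zero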


p = coefTest(mdl,H) performs an _F_-test that H × B = 0, where B represents the coefficient vector. Use H to specify the coefficients to include in the _F_-test.
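Continuing the hypothetical simulated model above, H has one row for each linear combination being tested and one column for each coefficient, in the order given by mdl.CoefficientNames:

H = [0 0 1];          % columns correspond to (Intercept), x1, x2
p = coefTest(mdl,H)   % tests whether the x2 coefficient is zero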


p = coefTest(mdl,H,C) performs an _F_-test that H × B = C.
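For example, again with the hypothetical model above, you could test whether the x1 coefficient equals 2 rather than 0:

H = [0 1 0];            % selects the x1 coefficient
C = 2;                  % hypothesized value
p = coefTest(mdl,H,C)   % tests whether the x1 coefficient equals 2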

[p,F] = coefTest(___) also returns the _F_-test statistic F using any of the input argument combinations in previous syntaxes.


[p,F,r] = coefTest(___) also returns the numerator degrees of freedom r for the test.


Examples


Fit a linear regression model and test the coefficients of the fitted model to see if they are zero.

Load the carsmall data set and create a table in which the Model_Year predictor is categorical.

load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Weight,Model_Year);

Fit a linear regression model of mileage as a function of the weight, weight squared, and model year.

mdl = fitlm(tbl,'MPG ~ Model_Year + Weight^2')

mdl = 
Linear regression model:
    MPG ~ 1 + Weight + Model_Year + Weight^2

Estimated Coefficients:
                      Estimate         SE         tStat       pValue  
                     __________    __________    _______    __________

    (Intercept)          54.206        4.7117     11.505    2.6648e-19
    Weight            -0.016404     0.0031249    -5.2493    1.0283e-06
    Model_Year_76        2.0887       0.71491     2.9215     0.0044137
    Model_Year_82        8.1864       0.81531     10.041    2.6364e-16
    Weight^2         1.5573e-06    4.9454e-07      3.149     0.0022303

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.78
R-squared: 0.885, Adjusted R-Squared: 0.88
F-statistic vs. constant model: 172, p-value = 5.52e-41

The last line of the model display shows the _F_-statistic value of the regression model and the corresponding _p_-value. The small _p_-value indicates that the model fits significantly better than a degenerate model consisting of only an intercept term. You can return these two values by using coefTest.
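For example, calling coefTest with only the model as input (a sketch; the output display is not reproduced here) should return these same two values:

[p,F] = coefTest(mdl)   % p and F match the last line of the model display (about 5.52e-41 and 172)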

Fit a linear regression model and test the significance of a specified coefficient in the fitted model by using coefTest. You can also use anova to test the significance of each predictor in the model.

Load the carsmall data set and create a table in which the Model_Year predictor is categorical.

load carsmall
Model_Year = categorical(Model_Year);
tbl = table(MPG,Acceleration,Weight,Model_Year);

Fit a linear regression model of mileage as a function of the acceleration, weight, and model year.

mdl = fitlm(tbl,'MPG ~ Acceleration + Model_Year + Weight')

mdl = 
Linear regression model:
    MPG ~ 1 + Acceleration + Weight + Model_Year

Estimated Coefficients:
                      Estimate         SE          tStat       pValue  
                     __________    __________    ________    __________

    (Intercept)          40.523        2.5293      16.021    5.8302e-28
    Acceleration      -0.023438       0.11353    -0.20644       0.83692
    Weight           -0.0066799    0.00045796     -14.586    2.5314e-25
    Model_Year_76        1.9898       0.80696      2.4657      0.015591
    Model_Year_82        7.9661       0.89745      8.8763    6.7725e-14

Number of observations: 94, Error degrees of freedom: 89
Root Mean Squared Error: 2.93
R-squared: 0.873, Adjusted R-Squared: 0.867
F-statistic vs. constant model: 153, p-value = 5.86e-39

The model display includes the _p_-value for the _t_-statistic for each coefficient to test the null hypothesis that the corresponding coefficient is zero.

You can examine the significance of the coefficient using coefTest. For example, test the significance of the Acceleration coefficient. According to the model display, Acceleration is the second predictor. Specify the coefficient by using a numeric index vector.

[p_Acceleration,F_Acceleration,r_Acceleration] = coefTest(mdl,[0 1 0 0 0])

p_Acceleration is the _p_-value corresponding to the _F_-statistic value F_Acceleration, and r_Acceleration is the numerator degrees of freedom for the _F_-test. The returned _p_-value indicates that Acceleration is not statistically significant in the fitted model. Note that p_Acceleration is equal to the _p_-value of the _t_-statistic (tStat) in the model display, and F_Acceleration is the square of tStat.
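As a quick check (a sketch based on the model above), you can confirm this relationship directly from the Coefficients table:

tStat_Acc = mdl.Coefficients{'Acceleration','tStat'};   % t-statistic for Acceleration
F_check = tStat_Acc^2;                                   % equals F_Acceleration (about 0.0426)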

Test the significance of the categorical predictor Model_Year. Instead of testing Model_Year_76 and Model_Year_82 separately, you can perform a single test for the categorical predictor Model_Year. Specify Model_Year_76 and Model_Year_82 by using a numeric index matrix.

[p_Model_Year,F_Model_Year,r_Model_Year] = coefTest(mdl,[0 0 0 1 0; 0 0 0 0 1])

p_Model_Year = 2.7408e-14

The returned _p_-value indicates that Model_Year is statistically significant in the fitted model.

You can also return these values by using anova.

anova(mdl)

ans=4×5 table
                     SumSq      DF    MeanSq        F          pValue  
                    _______    __    _______    ________    __________

    Acceleration    0.36613     1    0.36613    0.042618       0.83692
    Weight           1827.7     1     1827.7      212.75    2.5314e-25
    Model_Year       777.81     2      388.9      45.269    2.7408e-14
    Error            764.59    89      8.591                          

Input Arguments


mdl - Linear regression model, specified as a LinearModel object created by using fitlm or stepwiselm.

H - Hypothesis matrix, specified as a full-rank numeric index matrix with one row for each linear combination of coefficients to test and one column for each coefficient in the model.

Data Types: single | double

C - Hypothesized value for testing the null hypothesis H × B = C, specified as a numeric vector with one element for each row of H.

Data Types: single | double

Output Arguments


p - _p_-value for the _F_-test, returned as a numeric value in the range [0,1].

F - Value of the test statistic for the _F_-test, returned as a numeric value.

r - Numerator degrees of freedom for the _F_-test, returned as a positive integer. The _F_-statistic has r degrees of freedom in the numerator and mdl.DFE degrees of freedom in the denominator.

Algorithms

The _p_-value, _F_-statistic, and numerator degrees of freedom are valid under the standard assumptions of the linear regression model; in particular, the model is correctly specified and the errors are independent and normally distributed with mean zero and constant variance.

Under these assumptions, let β represent the (unknown) coefficient vector of the linear regression. Suppose H is a full-rank numeric index matrix of size _r_-by-_s_, where r is the number of linear combinations of coefficients being tested, and s is the total number of coefficients. Let c be a column vector with r rows. The following is a test statistic for the hypothesis that _Hβ_ = c:
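Assuming the usual linear-hypothesis form of this statistic, it can be written as

$$F = \frac{(H\hat{\beta} - c)^{\top}\,(H V H^{\top})^{-1}\,(H\hat{\beta} - c)}{r}.$$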

Here, β̂ is the estimate of the coefficient vector β, stored in the Coefficients property, and V is the estimated covariance of the coefficient estimates, stored in the CoefficientCovariance property. When the hypothesis is true, the test statistic F has an _F_-distribution with r and u degrees of freedom, where u is the degrees of freedom for error, stored in the DFE property.
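As a rough illustration (a sketch only, reusing the Model_Year hypothesis from the example above), you can evaluate this statistic directly from the model properties and compare the result with coefTest:

H = [0 0 0 1 0; 0 0 0 0 1];            % hypothesis matrix for the Model_Year terms
c = zeros(size(H,1),1);                % hypothesized values (zero)
b = mdl.Coefficients.Estimate;         % estimated coefficient vector (beta-hat)
V = mdl.CoefficientCovariance;         % estimated covariance of the coefficient estimates
r = size(H,1);                         % numerator degrees of freedom
F = (H*b - c)'/(H*V*H')*(H*b - c)/r;   % test statistic
p = 1 - fcdf(F,r,mdl.DFE)              % p-value; compare with coefTest(mdl,H)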

Alternative Functionality

Extended Capabilities

Version History

Introduced in R2012a