R: Regression Diagnostics
lm.influence {stats} | R Documentation |
---|
Description
This function provides the basic quantities which are used in forming a wide variety of diagnostics for checking the quality of regression fits.
Usage
influence(model, ...)
## S3 method for class 'lm'
influence(model, do.coef = TRUE, ...)
## S3 method for class 'glm'
influence(model, do.coef = TRUE, ...)
lm.influence(model, do.coef = TRUE)
qr.influence(qr, res, tol = 10 * .Machine$double.eps)
Arguments
model | an object as returned by lm or glm. |
---|---|
do.coef | logical indicating if the changed coefficients (see below) are desired. These need O(n^2 p) computing time. |
... | further arguments passed to or from other methods. |
qr | typically the result of qr(), a list of class "qr". |
res | numerical vector of model residuals. |
tol | non-negative numerical tolerance. |
Details
The [influence.measures](../../stats/help/influence.measures.html)() and other functions listed in See Also provide a more user-oriented way of computing a variety of regression diagnostics. These all build on lm.influence. Note that for GLMs (other than the Gaussian family with identity link) these are based on one-step approximations which may be inadequate if a case has high influence.
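As a small illustration of the remark about GLMs, the following sketch checks that a Gaussian GLM with identity link gives the same hat values as the corresponding lm fit, the case where no one-step approximation is involved. The model formula and the object names (fit_lm, fit_glm, hat_lm, hat_glm) are illustrative choices, not part of the original page.

```r
## Illustrative sketch: a Gaussian/identity GLM is an ordinary linear model,
## so its influence quantities should match those of the lm fit.
fit_lm  <- lm(sr ~ pop15 + ddpi, data = LifeCycleSavings)
fit_glm <- glm(sr ~ pop15 + ddpi, data = LifeCycleSavings,
               family = gaussian())

hat_lm  <- lm.influence(fit_lm)$hat
hat_glm <- influence(fit_glm)$hat

## Expected to agree up to numerical error
all.equal(hat_lm, hat_glm)
```

For other families the same calls work, but the drop-one quantities are then approximations and may differ from an actual refit.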
An attempt is made to ensure that computed hat values that are probably one are treated as one, and the corresponding rows in sigma and coefficients are NaN. (Dropping such a case would normally result in a variable being dropped, so it is not possible to give simple drop-one diagnostics.)
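A hat value of one arises, for example, when a factor level occurs in a single case, so that case is fitted exactly. The sketch below uses a small made-up data frame (d, y and g are hypothetical names and values) to produce such a case. The entries of sigma and coefficients for that case can then be inspected directly.

```r
## Illustrative sketch with made-up data: group "c" has a single observation,
## so that observation determines its own fitted value.
d <- data.frame(
  y = c(1.2, 0.8, 2.1, 1.9, 5.0),
  g = factor(c("a", "a", "b", "b", "c"))
)
fit_g  <- lm(y ~ g, data = d)
infl_g <- lm.influence(fit_g)

infl_g$hat            # the last element is the hat value of the singleton case
infl_g$sigma          # inspect the entry for case 5
infl_g$coefficients   # inspect row 5
```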
[naresid](../../stats/help/naresid.html) is applied to the results and so will fill in with NAs if the fit had na.action = na.exclude.
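The effect of the na.action setting on the length of the results can be seen in a minimal sketch like the following. The missing values are introduced artificially, and the names d_na, fit_omit and fit_exclude are illustrative.

```r
## Illustrative sketch: the same data fitted with two na.action settings.
d_na <- LifeCycleSavings
d_na$ddpi[c(3, 7)] <- NA        # introduce two missing values (artificial)

fit_omit    <- lm(sr ~ pop15 + ddpi, data = d_na, na.action = na.omit)
fit_exclude <- lm(sr ~ pop15 + ddpi, data = d_na, na.action = na.exclude)

len_omit    <- length(lm.influence(fit_omit)$wt.res)     # cases used in the fit
len_exclude <- length(lm.influence(fit_exclude)$wt.res)  # all rows of 'd_na'
c(omit = len_omit, exclude = len_exclude)
```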
qr.influence() is a low level interface to parts of lm.influence(*, do.coef = FALSE) provided for cases where speed is more important than user safety.
Value
A list containing the following components of the same length or number of rows n, which is the number of non-zero weights. Cases omitted in the fit are omitted unless a [na.action](../../stats/help/na.action.html) method was used (such as [na.exclude](../../stats/help/na.exclude.html)) which restores them.
hat | a vector containing the diagonal of the ‘hat’ matrix. |
---|---|
coefficients | (unless do.coef is false) a matrix whose i-th row contains the change in the estimated coefficients which results when the i-th case is dropped from the regression. Note that aliased coefficients are not included in the matrix. |
sigma | a vector whose i-th element contains the estimate of the residual standard deviation obtained when the i-th case is dropped from the regression. (The approximations needed for GLMs can result in this being NaN.) |
wt.res | a vector of weighted (or for class glm rather deviance) residuals. |
qr.influence() returns a list with the two components hat and sigma, as above but without [names](../../base/html/names.html).
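The components described above can be checked against their definitions for a linear model. The sketch below is one way to do so on the LifeCycleSavings data. The case index i and the object names are arbitrary, illustrative choices.

```r
## Illustrative sketch: inspect the returned components for an lm fit.
fit  <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
infl <- lm.influence(fit)

n <- nrow(LifeCycleSavings)
p <- length(coef(fit))

names(infl)
dim(infl$coefficients)   # n rows, one column per estimated coefficient
sum(infl$hat)            # trace of the hat matrix; p for a full-rank fit

## Leave-one-out check of sigma for a single case (i is an arbitrary choice)
i <- 10
fit_drop <- lm(formula(fit), data = LifeCycleSavings[-i, ])
all.equal(unname(infl$sigma[i]), summary(fit_drop)$sigma)
```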
Note
The coefficients returned by the R version of lm.influence differ from those computed by S. Rather than returning the coefficients which result from dropping each case, we return the changes in the coefficients. This is more directly useful in many diagnostic measures.
Since these need O(n^2 p) computing time, they can be omitted by do.coef = FALSE.
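To see what "changes in the coefficients" means in practice, one row of the coefficients matrix can be compared with an explicit refit that leaves the case out. This is a minimal sketch. The case index i is arbitrary, and printing both rows side by side also shows the sign convention used.

```r
## Illustrative sketch: compare one row of 'coefficients' with a drop-one refit.
fit  <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
infl <- lm.influence(fit)

i <- 49                                   # arbitrary case (illustrative choice)
fit_drop <- lm(formula(fit), data = LifeCycleSavings[-i, ])

## Difference between the full-data and the drop-one coefficient estimates
refit_change <- coef(fit) - coef(fit_drop)

## Side-by-side comparison with the i-th row returned by lm.influence()
rbind(lm.influence = infl$coefficients[i, ], refit = refit_change)

## The cheaper call omits the coefficient changes
names(lm.influence(fit, do.coef = FALSE))
```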
Note that cases with weights == 0 are dropped (contrary to the situation in S).
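The dropping of zero-weight cases shows up in the length of the result. A minimal sketch, in which the weight column w and the object names are hypothetical additions to the LifeCycleSavings data:

```r
## Illustrative sketch: one case is given weight zero.
d_w <- LifeCycleSavings
d_w$w <- 1
d_w$w[5] <- 0                   # zero weight for case 5 (illustrative)

fit_w  <- lm(sr ~ pop15 + ddpi, data = d_w, weights = w)
infl_w <- lm.influence(fit_w)

nrow(d_w)                 # number of rows in the data
length(infl_w$hat)        # number of cases with non-zero weight
```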
If a model has been fitted with na.action = na.exclude (see [na.exclude](../../stats/help/na.exclude.html)), cases excluded in the fit _are_ considered here.
References
See the list in the documentation for [influence.measures](../../stats/help/influence.measures.html).
Chambers JM (1992). “Linear Models.” In Chambers JM, Hastie TJ (eds.), Statistical Models in S, chapter 4. Wadsworth & Brooks/Cole.
See Also
[summary.lm](../../stats/help/summary.lm.html) for [summary](../../base/html/summary.html) and related methods; [influence.measures](../../stats/help/influence.measures.html), [hat](../../stats/help/hat.html) for the hat matrix diagonals, [dfbetas](../../stats/help/dfbetas.html), [dffits](../../stats/help/dffits.html), [covratio](../../stats/help/covratio.html), [cooks.distance](../../stats/help/cooks.distance.html), [lm](../../stats/help/lm.html).
Examples
## Analysis of the life-cycle savings data
## given in Belsley, Kuh and Welsch.
summary(lm.SR <- lm(sr ~ pop15 + pop75 + dpi + ddpi,
                    data = LifeCycleSavings),
        correlation = TRUE)
utils::str(lmI <- lm.influence(lm.SR))
qRes <- qr(lm.SR) # == lm.SR $ qr
qrI <- qr.influence(qRes, residuals(lm.SR))
strip <- function(x) lapply(lapply(x, unname), drop)
stopifnot(identical(strip(qrI),
                    strip(lmI[c("hat", "sigma")])))
## For more "user level" examples, use example(influence.measures)
[Package _stats_ version 4.6.0 Index]