Leverage, Influence, and the Jackknife in Clustered Regression Models: Reliable Inference Using summclust

Abstract: We introduce a new Stata package called summclust that summarizes the cluster structure of the dataset for linear regression models with clustered disturbances. The key unit of observation for such a model is the cluster. We therefore propose cluster-level measures of leverage, partial leverage, and influence and show how to compute them quickly in most cases. The measures of leverage and partial leverage can be used as diagnostic tools to identify datasets and regression designs in which cluster-robust inference is likely to be challenging. The measures of influence can provide valuable information about how the results depend on the data in the various clusters. We also show how to calculate two jackknife variance matrix estimators efficiently as a byproduct of our other computations. These estimators, which are already available in Stata, are generally more conservative than conventional variance matrix estimators. The summclust package computes all the quantities that we discuss.
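
To make the quantities named in the abstract concrete, here is a minimal Python sketch for the simplest possible case: one regressor and no intercept, so every matrix reduces to a scalar. It is not the summclust implementation and does not reproduce its output; the function name `cluster_summary`, the toy data, and the choice to center the jackknife at the full-sample estimate are all assumptions made for illustration. Cluster leverage is taken here as the cluster's share of the sum of squared regressor values, and influence is shown through the estimate obtained when each cluster is dropped in turn.

```python
from collections import defaultdict


def cluster_summary(x, y, cluster):
    """Cluster-level leverage, delete-one-cluster estimates, and a cluster
    jackknife variance for y = beta * x + u (single regressor, no intercept).

    Hypothetical helper for illustration only; not part of summclust.
    """
    sxx = defaultdict(float)  # per-cluster sum of x^2
    sxy = defaultdict(float)  # per-cluster sum of x*y
    for xi, yi, g in zip(x, y, cluster):
        sxx[g] += xi * xi
        sxy[g] += xi * yi

    total_xx = sum(sxx.values())
    total_xy = sum(sxy.values())
    beta = total_xy / total_xx  # full-sample OLS estimate
    n_clusters = len(sxx)

    # Leverage of cluster g: its share of the total sum of squares of x.
    # With one regressor these shares sum to 1.
    leverage = {g: sxx[g] / total_xx for g in sxx}

    # Estimate with cluster g omitted, reusing the per-cluster sums.
    # This divides by zero if one cluster holds all the variation in x.
    beta_del = {
        g: (total_xy - sxy[g]) / (total_xx - sxx[g]) for g in sxx
    }

    # Cluster jackknife variance, centered at the full-sample estimate
    # (an assumption of this sketch).
    jack_var = (n_clusters - 1) / n_clusters * sum(
        (b - beta) ** 2 for b in beta_del.values()
    )
    return beta, leverage, beta_del, jack_var


# Toy data: three clusters of two observations each.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2]
cluster = ["a", "a", "b", "b", "c", "c"]

beta, leverage, beta_del, jack_var = cluster_summary(x, y, cluster)
```

In this toy design cluster "c" carries most of the leverage because it contains the largest regressor values, so dropping it moves the estimate more than dropping "a" does. Reusing the per-cluster sums means no regression has to be re-run for the omitted-cluster estimates.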

Submission history

From: Morten Ørregaard Nielsen
[v1] Fri, 6 May 2022 15:14:29 UTC (120 KB)
[v2] Tue, 13 Jun 2023 07:10:21 UTC (200 KB)
[v3] Thu, 23 Nov 2023 14:50:18 UTC (201 KB)