Statistics with Estimated Parameters

2005

Abstract: This paper studies a general problem of making inferences for functions of two sets of parameters where, when the first set is given, there exists a statistic with a known distribution. We study the distribution of this statistic when the first set of parameters is unknown and is replaced by an estimator. We show that under mild conditions the variance of the statistic is inflated when the unconstrained maximum likelihood estimator (MLE) is used, but deflated when the constrained MLE is used. The results are useful in hypothesis testing and confidence-interval construction, providing simpler and improved inference methods compared with the standard large-sample likelihood theories. We provide three applications of our theory, namely Box-Cox regression, dynamic regression, ...

1. We thank Peter Robinson, Anil Bera, W. K. Li and a referee for their helpful comments, which have led to improvements in this paper. Thanks are also due to seminar participants at the U...

On the Asymptotic Effect of Substituting Estimators for Nuisance Parameters in Inferential Statistics

This paper studies the general problem of making inferences for a set of parameters θ in the presence of another set of (nuisance) parameters λ, based on the statistic T(y; λ̂, θ), where y = {y_1, y_2, ..., y_n} represents the data, λ̂ is an estimator of λ, and the limiting distribution of T(y; λ, θ) is known. We provide general methods for finding the limiting distributions of T(y; λ̂, θ) when λ̂ is either a constrained estimator (given θ) or an unconstrained estimator. The methods facilitate hypothesis testing as well as confidence-interval construction. We also extend the results to cases where inferences concern a general function of all parameters (θ and λ) and/or some weakly exogenous variables. Applications of the theory to testing serial correlation in regression models and to confidence-interval construction in Box-Cox regressions are given.

Local maximum likelihood estimation and inference

Journal of the Royal Statistical Society: Series B (Statistical Methodology), 1998

Local maximum likelihood estimation is a nonparametric counterpart of the widely used parametric maximum likelihood technique. It extends the scope of the parametric maximum likelihood method to a much wider class of parametric spaces. Associated with this nonparametric estimation scheme is the issue of bandwidth selection and bias and variance assessment. This paper provides a unified approach to selecting a bandwidth and constructing confidence intervals in local maximum likelihood estimation. The approach is then applied to least squares nonparametric regression and to nonparametric logistic regression. Our experiences in these two settings show that the general idea outlined here is powerful and encouraging.

Local likelihood method: a bridge over parametric and nonparametric regression

Journal of Nonparametric Statistics, 2003

This paper discusses the local likelihood method for estimating a regression function in a setting which includes generalized linear models. The local likelihood function is constructed by first considering a parametric model for the regression function. It is defined as a locally weighted log-likelihood with weights determined by a kernel function and a bandwidth. When a large bandwidth is chosen, the resulting estimator is close to the fully parametric maximum likelihood estimator, so a large bandwidth is a relevant choice when the true regression function is near the parametric family. On the other hand, when a small bandwidth is chosen, the performance of the resulting estimator does not depend much on the assumed parametric model, so a small bandwidth is desirable when the parametric model is largely misspecified. In this paper, we detail the way in which the risk of the local likelihood estimator is affected by bandwidth selection and model misspecification. We derive explicit formulas for the bias and variance of the local likelihood estimator for both large and small bandwidths. We look into higher-order asymptotic expansions for the risk of the local likelihood estimator in the case where the bandwidth is large, which enables us to determine the optimal size of the bandwidth depending on the degree of model misspecification.
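To make the kernel-weighted log-likelihood idea concrete: under a Gaussian working model, maximizing the locally weighted log-likelihood reduces to kernel-weighted least squares on a local linear expansion. The sketch below is our own illustration of that special case (the function name, kernel choice, and sine-curve check are assumptions, not taken from the paper):

```python
import numpy as np

def local_linear_fit(x0, x, y, h):
    """Local likelihood estimate of m(x0) under a Gaussian working model.

    Maximizing the kernel-weighted Gaussian log-likelihood of a local
    linear expansion m(x) ~ b0 + b1 (x - x0) is equivalent to weighted
    least squares with kernel weights; the intercept b0 estimates m(x0).
    """
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)       # Gaussian kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]

# Small-bandwidth fit tracks the true curve even though the "parametric
# model" here (a local line) is globally misspecified for a sine curve.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 3.0, 500)
y = np.sin(x) + 0.1 * rng.normal(size=500)
m_hat = local_linear_fit(1.5, x, y, h=0.2)
```

Choosing a larger `h` would pull the fit toward the global least-squares line, mirroring the paper's large-bandwidth regime.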

Evaluating Statistical Hypotheses Using Weakly-Identifiable Estimating Functions

Scandinavian Journal of Statistics, 2013

Many statistical models arising in applications contain non- and weakly identified parameters. Due to identifiability concerns, tests concerning the parameters of interest may not be covered by conventional theory, and it may not be clear how to assess statistical significance. This paper extends the literature by developing a testing procedure that can be used to evaluate hypotheses under non- and weakly identifiable semiparametric models. The test statistic is constructed from a general estimating function of a finite-dimensional parameter representing the population characteristics of interest, while other characteristics, which may be described by infinite-dimensional parameters and viewed as nuisance, are left completely unspecified. We derive the limiting distribution of this statistic and propose theoretically justified resampling approaches to approximate its asymptotic distribution. The methodology's practical utility is illustrated in simulations and an analysis of quality-of-life outcomes from a longitudinal study on breast cancer.

Auxiliary Information and a Priori Values in Construction of Improved Estimators

2014

Some ratio estimators for estimating the population mean of the variable under study, which make use of information regarding the population proportion possessing a certain attribute, are proposed. Under the simple random sampling without replacement (SRSWOR) scheme, the expressions of bias and mean squared error (MSE) up to the first order of approximation are derived. The results obtained are illustrated numerically using some empirical populations considered in the literature.
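A generic textbook ratio-type form conveys the idea: scale the sample mean of the study variable by the ratio of the known population proportion P of the attribute to its sample proportion p. The function below is our own illustrative sketch of that generic form, not one of the paper's proposed estimators:

```python
import numpy as np

def ratio_estimator_attribute(y_sample, a_sample, P):
    """Ratio-type estimator of the population mean of y using a known
    population proportion P of an auxiliary attribute (generic form)."""
    y_bar = np.mean(y_sample)
    p = np.mean(a_sample)        # sample proportion possessing the attribute
    return y_bar * (P / p)       # correct y_bar by the attribute imbalance

# Illustrative SRSWOR check on a synthetic population where the study
# variable is positively associated with the attribute.
rng = np.random.default_rng(2)
N = 10_000
a = (rng.random(N) < 0.4).astype(float)               # attribute indicator
y_pop = 2.0 + 3.0 * a + rng.normal(scale=0.5, size=N)
P = a.mean()                                          # known proportion
idx = rng.choice(N, size=500, replace=False)          # SRSWOR draw
est = ratio_estimator_attribute(y_pop[idx], a[idx], P)
```

When the attribute and the study variable are positively correlated, the factor P/p offsets samples that over- or under-represent the attribute, which is the mechanism behind the MSE gains the paper quantifies.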

Extending the scope of empirical likelihood

Annals of Statistics, 2009

This article extends the scope of empirical likelihood methodology in three directions: to allow for plug-in estimates of nuisance parameters in estimating equations, slower-than-√n rates of convergence, and settings in which there is a relatively large number of estimating equations compared to the sample size. Calibrating empirical likelihood confidence regions with plug-ins is sometimes intractable due to the complexity of the asymptotics, so we introduce a bootstrap approximation that can be used in such situations. We provide a range of examples from survival analysis and nonparametric statistics to illustrate the main results.

The Indirect Method: Inference Based on Intermediate Statistics – A Synthesis and Examples

Statistical Science, 2004

This paper presents an exposition and synthesis of the theory and some applications of the so-called "indirect" method of inference. These ideas have been exploited in the field of econometrics, but less so in other fields such as biostatistics and epidemiology. In the indirect method, statistical inference is based on an intermediate statistic, which typically follows an asymptotic normal distribution, but is not necessarily a consistent estimator of the parameter of interest. This intermediate statistic can be a naive estimator based on a convenient but misspecified model, a sample moment, or a solution to an estimating equation. We review a procedure of indirect inference based on the generalized method of moments, which involves adjusting the naive estimator to be consistent and asymptotically normal. The objective function of this procedure is shown to be interpretable as an "indirect likelihood" based on the intermediate statistic. Many properties of the ordinary likelihood function can be extended to this indirect likelihood. This method is often more convenient computationally than maximum likelihood estimation when handling such model complexities as random effects and measurement error; it can also serve as a basis for robust inference and model selection, with less stringent assumptions on the data-generating mechanism. Many familiar estimation techniques can be viewed as examples of this approach. We describe applications to measurement error, omitted covariates, and recurrent events. A data set concerning prevention of mammary tumors in rats is analyzed using a Poisson regression model with overdispersion. A second data set from an epidemiological study is analyzed using a logistic regression model with mismeasured covariates. A third data set of exam scores is used to illustrate robust covariance selection in graphical models.

Bias reduction in the estimation of parameters of rare events

Theory of Stochastic Processes

In this paper we consider a class of consistent semi-parametric estimators of a positive tail index γ, parametrized in two tuning or control parameters α and δ. Such control parameters enable us to have access, for any available sample, to an estimator of γ with a null dominant component of asymptotic bias, and with a reasonably flat mean squared error pattern, as a function of k, the number of top order statistics considered. Those control parameters depend on a second-order parameter ρ, which needs to be adequately estimated so that we may achieve high efficiency relative to the classical Hill estimator. We then obviously need access to a larger number of top order statistics than the number needed for optimal estimation through the Hill estimator.
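The classical Hill estimator that serves as the benchmark above has a simple closed form: the average of the log-excesses of the k largest order statistics over the (k+1)-th largest. A minimal sketch (the function name and the Pareto check are our own illustration, not the paper's bias-reduced estimators):

```python
import numpy as np

def hill_estimator(x, k):
    """Classical Hill estimator of a positive tail index gamma,
    computed from the k largest order statistics of the sample x."""
    xs = np.sort(x)                                   # ascending order statistics
    log_excesses = np.log(xs[-k:]) - np.log(xs[-k - 1])
    return log_excesses.mean()

# Illustrative check on an exact Pareto sample with gamma = 1/2
# (shifted numpy Lomax draws are standard Pareto on [1, inf)).
rng = np.random.default_rng(0)
sample = rng.pareto(2.0, size=100_000) + 1.0
gamma_hat = hill_estimator(sample, k=1_000)
```

For an exact Pareto sample the bias term vanishes; for distributions with a nonzero second-order component the estimate drifts with k, which is the sensitivity the paper's bias-corrected class is designed to flatten.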