Further asymptotic properties of the generalized information criterion
Related papers
A comparative study of information criteria for model selection
2006
To build good models, we need to know the appropriate model size. To address this problem, a variety of information criteria have been proposed, each with a different background. In this paper, we consider the problem of model selection and investigate the performance of a number of proposed information criteria, and whether the assumption used to derive their formulae, namely that the fitting errors are normally distributed, holds under various conditions (different numbers of data points and noise levels). The results show that although the application of information criteria prevents over-fitting and under-fitting in most cases, there are cases where neither can be avoided even with many data points and low noise levels, i.e., under ideal conditions. The results also show that the fitting errors are not always normally distributed even when the observational noise is Gaussian, which contradicts an assumption of the information criteria.
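A minimal sketch of this kind of experiment (illustrative only; the data-generating curve, sample size, and noise level below are assumptions, not the paper's setup): fit polynomials of increasing order to noisy data, select the order with the lowest Gaussian AIC, and test the normality of the resulting fitting errors.

```python
# Illustrative sketch: AIC-based model-size selection plus a normality check
# on the fitting errors.  The true curve, n, and sigma are assumed values.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, sigma = 50, 0.2                              # sample size and noise level (assumed)
x = np.linspace(-1, 1, n)
y = np.sin(2 * x) + rng.normal(0, sigma, n)     # true curve + Gaussian observational noise

def gaussian_aic(residuals, k):
    """AIC under the Gaussian-error assumption: m * log(RSS / m) + 2k."""
    m = len(residuals)
    rss = np.sum(residuals ** 2)
    return m * np.log(rss / m) + 2 * k

results = []
for order in range(1, 9):                       # candidate model sizes
    coeffs = np.polyfit(x, y, order)
    resid = y - np.polyval(coeffs, x)
    results.append((gaussian_aic(resid, order + 1), order, resid))

best_aic, best_order, best_resid = min(results, key=lambda t: t[0])
print("selected polynomial order:", best_order)

# Check the normality assumption on the fitting errors (Shapiro-Wilk test).
stat, p = stats.shapiro(best_resid)
print("Shapiro-Wilk p-value for the fitting errors:", p)
```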
Noise derived information criterion for model selection
2008
This paper proposes a new complexity-penalization model selection strategy derived from the minimum risk principle and from the behavior of candidate models under noisy conditions. The strategy appears to be robust for small sample sizes and tends to the AIC criterion as the sample size grows. The simulation study at the end of the paper shows that the proposed criterion is highly competitive with other state-of-the-art criteria.
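The paper's criterion itself is not reproduced here; as an analogous illustration of a small-sample penalty that tends to the AIC penalty as the sample size grows, the sketch below uses the familiar AICc correction.

```python
# Not the paper's criterion: AICc's penalty contains a correction term that
# vanishes as n -> infinity, so the criterion converges to AIC for large samples.
def aic_penalty(k):
    return 2 * k

def aicc_penalty(k, n):
    # 2k plus a small-sample correction term
    return 2 * k + 2 * k * (k + 1) / (n - k - 1)

k = 5
for n in (20, 50, 200, 1000, 10000):
    print(n, aic_penalty(k), round(aicc_penalty(k, n), 3))
```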
Model selection in regression linear: a simulation based on Akaike's information criterion
Journal of Physics: Conference Series, 2019
Akaike's Information Criterion (AIC) was first announced by Akaike in 1971. In linear regression modelling, AIC is proposed as a model selection criterion since it estimates the quality of each model relative to the other models. In this paper we demonstrate the use of the AIC criterion to estimate p, the number of selected variables in a linear regression model, through a simulation study. We simulate two particular cases, namely the orthogonal and non-orthogonal cases. The orthogonal case is run where there is no correlation between any independent variable and the dependent variable, whereas the non-orthogonal case is run where there is correlation between some independent variables and the dependent variable. The simulation results are used to investigate the overestimation of the number of independent variables selected in the model in the two cases. Although both cases produce an overestimated number of independent variables, most of the time the orthogonal case still provides less o...
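A minimal sketch in the spirit of such a simulation (an assumed setup, not the paper's exact design): generate data in which either no predictor or only two predictors are related to the response, select variables by all-subsets Gaussian AIC, and record how many variables are chosen, which makes any overestimation visible.

```python
# Illustrative sketch of AIC over-selection in linear regression; the design,
# coefficients, and replication count are assumptions for demonstration only.
import itertools
import numpy as np

rng = np.random.default_rng(1)

def simulate(n=100, p=5, n_active=0):
    X = rng.normal(size=(n, p))                 # independent predictor columns
    beta = np.zeros(p)
    beta[:n_active] = 1.0                       # predictors actually related to y
    y = X @ beta + rng.normal(0, 1, n)
    return X, y

def gaussian_aic(X, y, subset):
    n = len(y)
    if subset:
        Xs = X[:, list(subset)]
        beta_hat, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta_hat
    else:
        resid = y                               # intercept-free null model
    rss = np.sum(resid ** 2)
    return n * np.log(rss / n) + 2 * (len(subset) + 1)

def selected_size(X, y, p=5):
    subsets = [s for r in range(p + 1) for s in itertools.combinations(range(p), r)]
    best = min(subsets, key=lambda s: gaussian_aic(X, y, s))
    return len(best)

for n_active, label in [(0, "no related predictors"), (2, "two related predictors")]:
    sizes = [selected_size(*simulate(n_active=n_active)) for _ in range(50)]
    print(label, "- mean selected size:", np.mean(sizes), "(true size:", n_active, ")")
```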
Controlling the error probabilities of model selection information criteria using bootstrapping
Journal of Applied Statistics, 2019
The Akaike Information Criterion (AIC) and related information criteria are powerful and increasingly popular tools for comparing multiple, non-nested models without the specification of a null model. However, existing procedures for information-theoretic model selection do not provide explicit and uniform control over error rates for the choice between models, a key feature of classical hypothesis testing. We show how to extend notions of Type-I and Type-II error to more than two models without requiring a null. We then present the Error Control for Information Criteria (ECIC) method, a bootstrap approach to controlling Type-I error using Difference of Goodness of Fit (DGOF) distributions. We apply ECIC to empirical and simulated data in time series and regression contexts to illustrate its value for parametric Neyman-Pearson classification. An R package implementing the bootstrap method is publicly available.
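A minimal sketch of the bootstrap idea (not the authors' R package or exact procedure; the candidate models, DGOF definition, and noise level below are illustrative assumptions): bootstrap the difference-of-goodness-of-fit statistic under the simpler fitted model and use its upper quantile as a critical value, so the richer model is selected only when the observed difference would be unlikely under the simpler one.

```python
# Illustrative parametric-bootstrap control of the Type-I error for a choice
# between a linear and a quadratic regression model.
import numpy as np

rng = np.random.default_rng(2)
n = 80
x = np.linspace(0, 1, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.3, n)        # assumed data-generating model

def fit_rss(x, y, degree):
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    return np.sum(resid ** 2), coeffs

def dgof(x, y):
    # DGOF taken here as the AIC difference between the linear and quadratic fits
    m = len(y)
    rss1, _ = fit_rss(x, y, 1)
    rss2, _ = fit_rss(x, y, 2)
    aic1 = m * np.log(rss1 / m) + 2 * 2
    aic2 = m * np.log(rss2 / m) + 2 * 3
    return aic1 - aic2                           # large => quadratic fits much better

observed = dgof(x, y)

# Bootstrap the DGOF distribution under the fitted linear (simpler) model.
rss1, coeffs1 = fit_rss(x, y, 1)
sigma_hat = np.sqrt(rss1 / n)
boot = [dgof(x, np.polyval(coeffs1, x) + rng.normal(0, sigma_hat, n))
        for _ in range(500)]

critical = np.quantile(boot, 0.95)               # aims at ~5% Type-I error
print("observed DGOF:", round(observed, 2), "critical value:", round(critical, 2))
print("prefer quadratic" if observed > critical else "retain linear")
```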
Key Concepts in Model Selection: Performance and Generalizability
What is model selection? What are the goals of model selection? What are the methods of model selection, and how do they work? Which methods perform better than others, and in what circumstances? These questions rest on a number of key concepts in a relatively underdeveloped field. The aim of this essay is to explain some background concepts, highlight some of the results in this special issue, and to add my own. The standard methods of model selection include classical hypothesis testing, maximum likelihood, Bayes method, minimum description length, cross-validation and Akaike's information criterion. They all provide an implementation of Occam's razor, in which parsimony or simplicity is balanced against goodness-of-fit. These methods primarily take account of the sampling errors in parameter estimation, although their relative success at this task depends on the circumstances. However, the aim of model selection should also include the ability of a model to generalize to predictions in a different domain. Errors of extrapolation, or generalization, are different from errors of parameter estimation. So, it seems that simplicity and parsimony may be an additional factor in managing these errors, in which case the standard methods of model selection are incomplete implementations of Occam's razor.
Parametric or nonparametric? A parametricness index for model selection
The Annals of Statistics, 2011
In the model selection literature, two classes of criteria perform well asymptotically in different situations: the Bayesian information criterion (BIC) (as a representative) is consistent in selection when the true model is finite dimensional (parametric scenario), whereas Akaike's information criterion (AIC) is asymptotically efficient when the true model is infinite dimensional (nonparametric scenario). However, little work addresses whether, and how, one can detect which scenario a specific model selection problem falls into. In this work, we differentiate the two scenarios theoretically under some conditions. We develop a measure, the parametricness index (PI), to assess whether a model selected by a potentially consistent procedure can be practically treated as the true model, which also hints at whether AIC or BIC is better suited to the data for the goal of estimating the regression function. A consequence is that, by switching between AIC and BIC based on the PI, the resulting regression estimator is simultaneously asymptotically efficient in both the parametric and nonparametric scenarios. In addition, we systematically investigate the behavior of PI in simulations and on real data and show its usefulness.
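A minimal sketch of the switching rule only; the PI estimator itself is not reproduced here, and the threshold below is an assumed illustrative value rather than the paper's recommendation.

```python
# Illustrative switching rule: a large PI suggests a parametric scenario (use
# BIC for consistency), a small PI suggests a nonparametric one (use AIC for
# efficiency).  How the PI is computed is left to the paper.
import numpy as np

def aic(n, rss, k):
    # Gaussian AIC for a regression model with k parameters
    return n * np.log(rss / n) + 2 * k

def bic(n, rss, k):
    # BIC replaces the 2k penalty with log(n) * k
    return n * np.log(rss / n) + np.log(n) * k

def criterion_from_pi(pi_value, threshold=1.0):
    """Pick the criterion according to the parametricness index (threshold assumed)."""
    return bic if pi_value > threshold else aic
```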
An asymptotic property of model selection criteria
IEEE Transactions on Information Theory, 1998
Probability models are estimated by use of penalized log-likelihood criteria related to AIC and MDL. The accuracies of the density estimators are shown to be related to the tradeoff between three terms: the accuracy of approximation, the model dimension, and the descriptive complexity of the model classes. The asymptotic risk is determined under conditions on the penalty term, and is shown to be minimax optimal for some cases. As an application, we show that the optimal rate of convergence is simultaneously achieved for log-densities in Sobolev spaces W_2^s(U) without knowing the smoothness parameter s and norm parameter U in advance. Applications to neural network models and sparse density function estimation are also provided.
Akaike's information criterion and recent developments in information complexity
Journal of Mathematical Psychology, 2000
In this paper we briefly study the basic idea of Akaike's (1973) information criterion (AIC). Then, we present some recent developments on a new entropic or information complexity (ICOMP) criterion of Bozdogan (1988a, 1988b, 1990, 1994d, 1996, 1998a, 1998b) for model selection. A rationale for ICOMP as a model selection criterion is that it combines a badness-of-fit term (such as minus twice the maximum log-likelihood) with a measure of model complexity differently than AIC or its variants, by taking into account the interdependencies of the parameter estimates as well as the dependencies of the model residuals. We operationalize the general form of ICOMP by quantifying the overall model complexity in terms of the estimated inverse Fisher information matrix. This approach results in an approximation to the sum of two Kullback-Leibler distances. Using the correlational form of the complexity, we further provide yet another form of ICOMP that takes into account the interdependencies (i.e., correlations) among the parameter estimates of the model. Finally, we illustrate the practical utility and the importance of this new model selection criterion with several real as well as Monte Carlo simulation examples and compare its performance against AIC or its variants.
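A minimal sketch of the ICOMP(IFIM) idea for Gaussian linear regression, using the commonly cited form -2 log L + 2 C1(inverse Fisher information), where C1 is the maximal information complexity; treat it as an illustration under these assumptions rather than a faithful reproduction of Bozdogan's exact variants.

```python
# Illustrative ICOMP for linear regression: badness of fit (-2 log L) plus twice
# the C1 complexity of the estimated covariance of the parameter estimates.
import numpy as np

def c1_complexity(cov):
    """C1(Sigma) = (s/2) * log(tr(Sigma)/s) - 0.5 * log|Sigma| (maximal complexity)."""
    s = cov.shape[0]
    return 0.5 * s * np.log(np.trace(cov) / s) - 0.5 * np.linalg.slogdet(cov)[1]

def icomp_linear_regression(X, y):
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    sigma2 = np.sum(resid ** 2) / n
    # -2 * maximized Gaussian log-likelihood
    neg2_loglik = n * np.log(2 * np.pi * sigma2) + n
    # Estimated inverse Fisher information: block-diagonal covariance of (beta_hat, sigma2_hat)
    cov = np.zeros((p + 1, p + 1))
    cov[:p, :p] = sigma2 * np.linalg.inv(X.T @ X)
    cov[p, p] = 2 * sigma2 ** 2 / n
    return neg2_loglik + 2 * c1_complexity(cov)

# Tiny usage example on simulated data (assumed coefficients).
rng = np.random.default_rng(3)
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(0, 1, 100)
print("ICOMP (full model):      ", round(icomp_linear_regression(X, y), 2))
print("ICOMP (drop last column):", round(icomp_linear_regression(X[:, :2], y), 2))
```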