Elvezio Ronchetti - Academia.edu
Papers by Elvezio Ronchetti
SSRN Electronic Journal, 1995
In this paper we compute the influence function (IF) of a general class of estimators for grouped data, namely the class of MPE. We find that this IF can be large although it is bounded. Therefore, we propose a more general class of estimators, the MGP-estimators, which includes the class of estimators based on the power divergence statistic and makes it possible to define robust estimators. By analogy with Hampel's theorem, we define optimal bounded-IF estimators and show in a simulation study that, under small model contaminations, they are much more stable than the classical estimators for grouped data. Finally, we apply our results to a real example.
Journal of the American Statistical Association, 1986
Robust Statistics
Institute of Mathematical Statistics Lecture Notes-Monograph Series, Volume 13: Small Sample Asymptotics, by Christopher Field (Dalhousie University) and Elvezio Ronchetti (University of Geneva).
Journal of the Royal Statistical Society: Series B (Methodological), 1994
In this paper, we investigate the use of the empirical distribution function in place of the underlying distribution function F to construct an empirical saddlepoint approximation to the density f_n of a general multivariate M-estimator. We obtain an explicit form for the error term in the approximation, investigate the effect of renormalizing the estimator, carry out some numerical comparisons and discuss the regression problem.
Studies in Computational Intelligence, 2017
We first review the basic ideas of robust statistics and define the main tools used to formalize the problem and to construct new robust statistical procedures. In particular, we focus on the influence function, the Gâteaux derivative of a functional in the direction of a point mass, which can be used both to study the local stability properties of a statistical procedure and to construct new robust procedures. In the second part we show how these principles can be used to carry out a robustness analysis in the model of [13] and how to construct robust versions of Heckman’s two-stage estimator. These are central tools for the statistical analysis of data based on non-random samples from a population.
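For reference, the influence function described in this abstract is usually written as below. The notation is the standard one and is not taken from the abstract itself: T is the functional defining the procedure, F the model distribution, and \delta_x the point mass at x.

```latex
\mathrm{IF}(x;\,T,\,F)
  \;=\; \lim_{\varepsilon \downarrow 0}
        \frac{T\bigl((1-\varepsilon)F + \varepsilon\,\delta_x\bigr) - T(F)}{\varepsilon}
```

A bounded IF in x means that a small fraction of contamination at any single point can move the estimate only by a bounded amount.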
The asymptotic distribution of the pairwise log likelihood ratio is a linear combination of independent chi-square random variables with coefficients depending on the elements of the Godambe information. Adjusted versions of the pairwise log likelihood statistic have been proposed, but they still depend on the Godambe information matrix. Approximate p-values for testing a composite hypothesis may be obtained by referring the observed value of such statistics to a critical value. The asymptotic theory can be used to approximate the desired quantile, but the approximation may be inaccurate. In this work we provide a nonparametric saddlepoint statistic derived from the pairwise score function. This statistic enjoys some desirable properties: it is asymptotically chi-square distributed and the approximation has a relative error of second order. Our proposal thus achieves a high level of accuracy with no need to estimate the Godambe information.
Data Segmentation and Model Selection for Computer Vision, 2000
Robust statistics deals with approximate statistical models and develops statistical techniques that are resistant and reliable in the presence of small deviations from assumed models. This chapter provides an overview of basic concepts and tools of robust statistics. In the first part we focus on regression models and discuss the most important classes of robust procedures for estimation and inference, which have been developed in the past two decades. The aim is not to provide a complete list of techniques but rather to highlight the basic ideas and discuss the statistical and computational properties of the most important robust methods for regression.
Computational Statistics & Data Analysis, 2014
The class of composite likelihood functions provides a flexible and powerful toolkit to carry out approximate inference for complex statistical models when the full likelihood is either impossible to specify or infeasible to compute. However, the strength of the composite likelihood approach is dimmed when considering hypothesis testing about a multidimensional parameter because the finite sample behavior of likelihood ratio, Wald, and score-type test statistics is tied to the Godambe information matrix. Consequently, inaccurate estimates of the Godambe information translate into inaccurate p-values. In this paper it is shown how accurate inference can be obtained by using a fully nonparametric saddlepoint test statistic derived from the composite score functions. The proposed statistic is asymptotically chi-square distributed up to a relative error of second order and does not depend on the Godambe information. The validity of the method is demonstrated through simulation studies.
Maîtriser l’aléatoire, 2009
Maîtriser l’aléatoire, 2009
From this chapter on, we leave the world of probability and enter the world of statistics, where results from probability are an indispensable tool.
Statistics and Computing, 2001
We discuss the effects of model misspecifications on higher-order asymptotic approximations of the distribution of estimators and test statistics. In particular we show that small deviations from the model can wipe out the nominal improvements of the accuracy obtained at the model by second-order approximations of the distribution of classical statistics. Although there is no guarantee that the first-order robustness properties of robust estimators and tests will carry over to second-order in a neighbourhood of the model, the behaviour of robust procedures in terms of second-order accuracy is generally more stable and reliable than that of their classical counterparts. Finally, we discuss some related work on robust adjustments of the profile likelihood and outline the role of computer algebra in this type of research.
Statistical Modelling, 2011
We adapt Breiman’s non-negative garrote method to perform variable selection in non-parametric additive models. The technique avoids methods of testing for which no general reliable distributional theory is available. In addition, it removes the need for a full search of all possible models, something which is computationally intensive, especially when the number of variables is moderate to high. The method has the advantages of being conceptually simple and computationally fast. It provides accurate predictions and is effective at identifying the variables generating the model. To illustrate our procedure, we analyse logbook data on blue sharks (Prionace glauca) from the US pelagic longline fishery. In addition, we compare our proposal to a series of available alternatives by simulation. The results show that in all cases our methods perform as well as or better than these alternatives.
Journal of the American Statistical Association, 1996
Saddlepoint approximations of marginal densities and tail probabilities of general nonlinear statistics are derived. They are based on the expansion of the statistic up to the second order. Their accuracy is shown in a variety of examples, including logit and probit models and rank estimators for regression.
Journal of the American Statistical Association, 1994
We present a robust version of Mallows’ $C_P$ [see C. L. Mallows, Technometrics 15, 661-675 (1973; Zbl 0269.62061)] for regression models. It is defined by $RC_P = W_P/\hat{\sigma}^2 - (U_P - V_P)$, where $W_P = \sum_i \hat{w}_i^2 r_i^2$ is a weighted residual sum of squares computed from a robust fit of model $P$, $\hat{\sigma}^2$ is a robust and consistent estimator of $\sigma^2$ in the full model, and $U_P$ and $V_P$ are constants depending on the weight function and the number of parameters in model $P$. Good subset models are those with $RC_P$ close to $V_P$ or smaller than $V_P$. When the weights are identically 1, $W_P$ becomes the residual sum of squares of a least squares fit, and $RC_P$ reduces to Mallows’ $C_P$. The robust model selection procedure based on $RC_P$ allows us to choose the models that fit the majority of the data by taking into account the presence of outliers and possible departures from the normality assumption on the error distribution. Together with the classical $C_P$, the robust version suggests several models from which we can choose.
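The definition in this abstract can be illustrated with a minimal Python sketch. It only evaluates the stated formula from quantities the caller already has. The function name `robust_cp`, the toy residuals, and the value of `sigma2_hat` are hypothetical, and the constants U_P and V_P are passed in rather than derived, since the abstract does not give their form. For the unit-weight check, U_P = n - p and V_P = p are an assumption chosen so that the formula matches the classical C_P = RSS/sigma^2 - n + 2p.

```python
def robust_cp(weights, residuals, sigma2_hat, u_p, v_p):
    """Evaluate RC_P = W_P / sigma2_hat - (U_P - V_P).

    W_P is the weighted residual sum of squares sum_i w_i^2 r_i^2.
    u_p and v_p are supplied by the caller (their form is not derived here).
    """
    w_p = sum((w * r) ** 2 for w, r in zip(weights, residuals))
    return w_p / sigma2_hat - (u_p - v_p)


# Toy residuals from a hypothetical fit with p parameters (illustrative only).
residuals = [0.5, -1.0, 0.25, 1.5, -0.75]
n, p = len(residuals), 2
sigma2_hat = 1.2  # hypothetical scale estimate from the full model

# Unit weights: with the assumed U_P = n - p and V_P = p, RC_P should
# coincide with the classical Mallows' C_P = RSS / sigma2_hat - n + 2p.
rc_p = robust_cp([1.0] * n, residuals, sigma2_hat, u_p=n - p, v_p=p)
mallows_cp = sum(r * r for r in residuals) / sigma2_hat - n + 2 * p

print(rc_p, mallows_cp)
```

In a robust fit the weights would come from the weight function of the chosen estimator, so downweighted outliers contribute less to W_P than they would to an ordinary residual sum of squares.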
Journal of the American Statistical Association, 2001
By starting from a natural class of robust estimators for generalized linear models based on the notion of quasi-likelihood, we define robust deviances that can be used for stepwise model selection as in the classical framework. We derive the asymptotic distribution of tests based on robust deviances and we investigate the stability of their asymptotic level under contamination. The binomial and Poisson models are treated in detail. Two applications to real data and a sensitivity analysis show that the inference obtained by means of the new techniques is more reliable than that obtained by classical estimation and testing procedures.
Journal of the American Statistical Association, 2005
This paper studies the local robustness of estimators and tests for the conditional location and scale parameters in a strictly stationary time series model. We first derive optimal bounded-influence estimators for such settings under a conditionally Gaussian reference model. Based on these results, optimal bounded-influence versions of the classical likelihood-based tests for parametric hypotheses are obtained. We propose a feasible and efficient algorithm for the computation of our robust estimators, which makes use of analytical Laplace approximations to estimate the auxiliary recentering vectors ensuring Fisher consistency in robust estimation. This strongly reduces the necessary computation time by avoiding the simulation of multidimensional integrals, a task that typically has to be addressed in the robust estimation of nonlinear models for time series. In some Monte Carlo simulations of an AR(1)-ARCH(1) process we show that our robust procedures maintain a very high efficiency under ideal model conditions and at the same time perform very satisfactorily under several forms of departure from conditional normality. By contrast, classical Pseudo Maximum Likelihood inference procedures are found to be highly inefficient under such local model misspecifications. These patterns are confirmed by an application to robust testing for ARCH.
Journal of the American Statistical Association, 1997
Journal of the American Statistical Association, 1997
Journal of Statistical Planning and Inference, 1997
We first briefly review some basic approaches to robust inference and discuss the role and the place of some key concepts (influence function, breakdown point, robustness versus efficiency, etc.). We then discuss in some detail recent results on robust testing in general multivariate parametric models. Recent applications include inference in logistic regression and testing for non-nested hypotheses.