Peter Bühlmann - Academia.edu

Papers by Peter Bühlmann

Pattern alternating maximization algorithm for missing data in high-dimensional problems

Journal of Machine Learning Research, 2014

We propose a novel and efficient algorithm for maximizing the observed log-likelihood of a multivariate normal data matrix with missing values. We show that our procedure, based on iteratively regr...

One Modern Culture of Statistics: Comments on Statistical Modeling: The Two Cultures (Breiman, 2001b)

A Look at Robustness and Stability of ℓ1- versus ℓ0-Regularization: Discussion of Papers by Bertsimas et al. and Hastie et al.

Invariant Causal Prediction for Sequential Data

Journal of the American Statistical Association

Kernel-based tests for joint independence

Journal of the Royal Statistical Society: Series B (Statistical Methodology)

Assessing statistical significance in multivariable genome wide association analysis

Marginal integration for nonparametric causal inference

Electronic Journal of Statistics, 2015

Boosting

Confidence Intervals for Maximin Effects in Inhomogeneous Large-Scale Data

One challenge of large-scale data analysis is that the assumption of an identical distribution for all samples is often not realistic. An optimal linear regression might, for example, be markedly different for distinct groups of the data. Maximin effects have been proposed as a computationally attractive way to estimate effects that are common across all data without fitting a mixture distribution explicitly. So far just point estimators of the common maximin effects have been proposed in Meinshausen and Bühlmann (2014). Here we propose asymptotically valid confidence regions for these effects.

Maximin effects in inhomogeneous large-scale data

The Annals of Statistics, 2015

Robust Statistics

Selected Works in Probability and Statistics, 2012

A sequential rejection testing method for high-dimensional regression with correlated variables

Hierarchical testing in the high-dimensional setting with correlated variables

On asymptotically optimal confidence regions and tests for high-dimensional models

The Annals of Statistics, 2014

Stable solutions

Springer Series in Statistics, 2011

Estimation of discrete structure such as in variable selection or graphical modeling is notoriously difficult, especially for high-dimensional data. Subsampling or bootstrapping have the potential to substantially increase the stability of high-dimensional selection algorithms and to quantify their uncertainties. Stability via subsampling or bootstrapping has been introduced by Breiman (1996) in the context of prediction. Here, the focus is different: the resampling scheme can provide finite sample control for certain error rates of false discoveries and hence a transparent principle to choose a proper amount of regularization for structure estimation. We discuss methodology and theory for very general settings which include variable selection in linear or generalized linear models or graphical modeling from Chapter 13. For the special case of variable selection in linear models, the theoretical properties (developed here) for consistent selection using stable solutions based on subsampling or bootstrapping require slightly stronger assumptions and are less refined than, say, for the adaptive or thresholded Lasso.

Rejoinder: ℓ1-penalization for mixture regression models

ℓ1-penalization for mixture regression models

Discussion of Big Bayes Stories and BayesBag

Statistical Science, 2014

Bootstraps for Time Series

Statistical Science, 2002
