The ensemble conditional variance estimator for sufficient dimension reduction

Ensemble Conditional Variance Estimator for Sufficient Dimension Reduction

arXiv (Cornell University), 2021

Ensemble Conditional Variance Estimation (ECVE) is a novel sufficient dimension reduction (SDR) method in regressions with continuous response and predictors. ECVE applies to general non-additive error regression models. It operates under the assumption that the predictors can be replaced by a lower dimensional projection without loss of information. It is a semiparametric, exhaustive SDR estimation method based on a forward regression model, and it is shown to be consistent under mild assumptions. It is shown to outperform central subspace minimum average variance estimation (csMAVE), its main competitor, under several simulation settings and in a benchmark data set analysis.

Conditional variance estimator for sufficient dimension reduction

Bernoulli

Conditional Variance Estimation (CVE) is a novel sufficient dimension reduction (SDR) method for additive error regressions with continuous predictors and link function. It operates under the assumption that the predictors can be replaced by a lower dimensional projection without loss of information. In contrast to the majority of moment based sufficient dimension reduction methods, Conditional Variance Estimation is fully data driven, does not require the restrictive linearity and constant variance conditions, and is not based on inverse regression. CVE is shown to be consistent and its objective function to be uniformly convergent. CVE outperforms minimum average variance estimation (MAVE), its main competitor, in several simulation settings and remains on par in others, while it always outperforms the usual inverse regression based linear SDR methods, such as Sliced Inverse Regression.
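The assumption stated in this abstract, that a lower dimensional projection of the predictors loses no information, is commonly written as a conditional independence statement. The notation below (the matrix B, the link g, the error ε, and the dimensions p and k) is assumed for illustration and is not taken from the abstract; it sketches the usual SDR formulation and an additive error special case of the kind the abstract describes.

```latex
% General SDR assumption: Y depends on X only through k linear combinations
Y \perp\!\!\!\perp X \mid B^{\top} X,
\qquad B \in \mathbb{R}^{p \times k}, \quad k < p .

% Additive error special case with an unknown link function g
% (assumed here: the error has conditional mean zero given X)
Y = g\!\left(B^{\top} X\right) + \varepsilon,
\qquad \mathbb{E}\left(\varepsilon \mid X\right) = 0 .
```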

A selective review of sufficient dimension reduction for multivariate response regression

2022

We review sufficient dimension reduction (SDR) estimators with multivariate response in this paper. A wide range of SDR methods are characterized as inverse regression SDR estimators or forward regression SDR estimators. The inverse regression family includes pooled marginal estimators, projective resampling estimators, and distance-based estimators. Ordinary least squares, partial least squares, and semiparametric SDR estimators, on the other hand, are discussed as estimators from the forward regression family.

Dimension estimation in sufficient dimension reduction: A unifying approach

Journal of Multivariate Analysis, 2011

Sufficient Dimension Reduction (SDR) in regression comprises the estimation of the dimension of the smallest (central) dimension reduction subspace and its basis elements. For SDR methods based on a kernel matrix, such as SIR and SAVE, the dimension estimation is equivalent to the estimation of the rank of a random matrix which is the sample based estimate of the kernel. A test for the rank of a random matrix amounts to testing how many of its eigen or singular values are equal to zero. We propose two tests based on the smallest eigen or singular values of the estimated matrix: an asymptotic weighted chi-square test and a Wald-type asymptotic chi-square test. We also provide an asymptotic chi-square test for assessing whether elements of the left singular vectors of the random matrix are zero. These methods together constitute a unified approach for all SDR methods based on a kernel matrix that covers estimation of the central subspace and its dimension, as well as assessment of variable contribution to the lower-dimensional predictor projections with variable selection, a special case. A small power simulation study shows that the proposed and existing tests, specific to each SDR method, perform similarly with respect to power and achievement of the nominal level. Also, the importance of the choice of the number of slices as a tuning parameter is further exhibited.
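The link between dimension estimation and the rank of a kernel matrix can be illustrated with a minimal sketch of the sample SIR kernel. The function name `sir_kernel`, the simulated single-index model, and the slice count are assumptions made for this illustration. The code only inspects eigenvalues and does not implement the chi-square tests proposed in the paper.

```python
import numpy as np

def sir_kernel(X, y, n_slices=5):
    """Sample SIR kernel: weighted outer products of slice means of standardized X."""
    n, p = X.shape
    # Standardize predictors: Z = (X - mean) @ Sigma^{-1/2}
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ inv_sqrt
    # Slice the sorted response into groups of roughly equal size
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return M

# Hypothetical single-index model: only the first predictor matters,
# so the population kernel has rank one.
rng = np.random.default_rng(0)
n, p = 2000, 6
X = rng.standard_normal((n, p))
y = X[:, 0] + 0.1 * rng.standard_normal(n)

M = sir_kernel(X, y)
eigvals = np.sort(np.linalg.eigvalsh(M))[::-1]
print(eigvals)
```

In this simulated example one eigenvalue is clearly separated from the rest, which are close to zero. A formal rank test asks whether those small eigenvalues are consistent with being exactly zero in the population.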

A New Covariance Estimator for Sufficient Dimension Reduction in High-Dimensional and Undersized Sample Problems

2019

The application of standard sufficient dimension reduction methods for reducing the dimension of the predictor space without losing regression information requires inverting the covariance matrix of the predictors. This poses a number of challenges, especially when analyzing high-dimensional data sets in which the number of predictors p is much larger than the number of samples n (n ≪ p). This work proposes a new covariance estimator, called the Maximum Entropy Covariance (MEC), that uses the Maximum Entropy (ME) principle to address the loss of covariance information when similar covariance matrices are linearly combined. By benefitting naturally from slicing or discretizing the range of the response variable y into H non-overlapping categories, h_1, ..., h_H, MEC first combines covariance matrices arising from samples in each y slice h ∈ H and then selects the one that maximizes entropy under the principle of maximum uncertainty. The MEC estimator is then formed from a convex mixture of such ent...

Sufficient dimension reduction and prediction in regression: Asymptotic results

Journal of Multivariate Analysis, 2019

In this article, a new method named the cumulative slicing principal fitted component (CUPFC) model is proposed to conduct sufficient dimension reduction and prediction in regression. Based on the classical PFC methods, the CUPFC avoids selecting some parameters such as the specific basis function form or the number of slices in slicing estimation. We develop the estimator of the central subspace in the CUPFC method under three error-term structures and establish its consistency. The simulations compare the effectiveness of the new method in prediction and reduction estimation with that of competing methods. The results indicate that the proposed method generally outperforms the existing PFC methods regardless of how the predictors are truly related to the response. The application to real data also verifies the validity of the proposed method.

itdr: An R package of Integral Transformation Methods to Estimate the SDR Subspaces in Regression

2022

Sufficient dimension reduction (SDR) is a successful tool in regression models. It offers a feasible way to analyze regression problems that are nonlinear in nature. This paper introduces the itdr R package, which provides several functions based on integral transformation methods to estimate the SDR subspaces in a comprehensive and user-friendly manner. In particular, the itdr package includes the Fourier method (FM) and the convolution method (CM) for estimating SDR subspaces such as the central mean subspace (CMS) and the central subspace (CS). In addition, the itdr package facilitates the recovery of the CMS and the CS by using the iterative Hessian transformation (IHT) method and the Fourier transformation approach for inverse dimension reduction method (invFM), respectively. Moreover, the use of the package is illustrated by three datasets. Furthermore, this is the first package that implements integral transformation methods to estimate SDR subspaces. Hence, the itdr package may provide a huge contribution to research in the SDR field.

Sufficient Dimension Reduction using Conditional Variance Estimation and related concepts

2021

In regression, one studies the conditional distribution of the response variable given the predictors, for example to obtain predictions. Regression is one of the most studied and applied areas of statistics. Modeling high-dimensional data, especially under a nonlinear relationship, is challenging when the number of predictors (p) is large. Sufficient dimension reduction (SDR) replaces the high-dimensional predictor vector with a lower-dimensional projection without losing information about the response variable. This thesis develops new SDR approaches, the conditional variance estimator and the ensemble conditional variance estimator, for the identification and estimation of the linear sufficient reduction for both the conditional mean and the conditional distribution function of the response variable given the high-dimensional predictors. Consistency is proven for both estimators. Furthermore, a new estimator that combines sufficient dimension reduction and neural networks is presented. All three estimators are competitive with current state-of-the-art SDR estimators.

On expectile-assisted inverse regression estimation for sufficient dimension reduction

Journal of Statistical Planning and Inference, 2021

Moment-based sufficient dimension reduction methods such as sliced inverse regression may not work well in the presence of heteroscedasticity. We propose to first estimate the expectiles through kernel expectile regression, and then carry out dimension reduction based on random projections of the regression expectiles. Several popular inverse regression methods in the literature are extended under this general framework. The proposed expectile-assisted methods outperform existing moment-based dimension reduction methods in both numerical studies and an analysis of the Big Mac data.
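For readers unfamiliar with expectiles, the following minimal sketch computes an unconditional sample expectile by iteratively reweighted averaging (asymmetric least squares). It is not the kernel expectile regression used in the paper; the function name `expectile` and the toy data are assumptions made for illustration.

```python
import numpy as np

def expectile(y, tau, tol=1e-10, max_iter=200):
    """Sample tau-expectile: minimizer of sum_i |tau - 1{y_i <= e}| * (y_i - e)^2."""
    y = np.asarray(y, dtype=float)
    e = y.mean()  # the 0.5-expectile is the mean, a natural starting value
    for _ in range(max_iter):
        # Observations above the current value get weight tau, the rest 1 - tau
        w = np.where(y > e, tau, 1.0 - tau)
        e_new = np.sum(w * y) / np.sum(w)
        if abs(e_new - e) < tol:
            return e_new
        e = e_new
    return e

y = np.array([0.0, 1.0, 2.0, 3.0, 10.0])  # toy data with one large value
low, mid, high = (expectile(y, t) for t in (0.1, 0.5, 0.9))
print(low, mid, high)
```

Expectiles generalize the mean in the way quantiles generalize the median, so a family of expectiles across different values of tau describes more of the conditional distribution than the mean alone. This is why they can help when the variance depends on the predictors.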

Bias and variance reduction in estimation of model dimension

Proceedings of the American Mathematical Society, 1994

The problem of estimating the number of regressors to include in a linear regression model is considered. Estimators based on the final prediction error and Akaike's criterion frequently have large positive bias. Shrinkage correction factors and bootstrapping are used to produce new estimators with reduced bias. The asymptotic bias and mean-squared errors of these estimators are derived analytically. Finite-sample estimates are obtained by simulation.
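As a rough illustration of the criteria named in this abstract, the sketch below selects the number of regressors in a nested sequence of linear models using Akaike's criterion and the final prediction error. The function name `select_by_aic_fpe`, the simulated data, and the nested ordering of the regressors are assumptions made for this example. The shrinkage and bootstrap corrections proposed in the paper are not implemented.

```python
import numpy as np

def select_by_aic_fpe(X, y):
    """Return the model sizes minimizing AIC and FPE over nested models 1..p."""
    n, p = X.shape
    aic, fpe = [], []
    for k in range(1, p + 1):
        # Least squares fit using the first k columns only
        coef, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
        rss = np.sum((y - X[:, :k] @ coef) ** 2)
        aic.append(n * np.log(rss / n) + 2 * k)
        fpe.append(rss / n * (n + k) / (n - k))
    return 1 + int(np.argmin(aic)), 1 + int(np.argmin(fpe))

# Hypothetical data: only the first three of ten regressors are active
rng = np.random.default_rng(1)
n, p, k_true = 200, 10, 3
X = rng.standard_normal((n, p))
y = X[:, :k_true] @ np.array([2.0, -1.5, 1.0]) + rng.standard_normal(n)

k_aic, k_fpe = select_by_aic_fpe(X, y)
print(k_aic, k_fpe)
```

Repeating the simulation over many seeds and averaging the selected sizes minus `k_true` gives a finite-sample estimate of the bias that the abstract refers to.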