MWPCR: Multiscale Weighted Principal Component Regression for High-Dimensional Prediction

Early diagnosis of Alzheimer’s disease on ADNI data using novel longitudinal score based on functional principal component analysis

Journal of Medical Imaging, 2021

Purpose: Alzheimer's disease (AD) is an age-related neurodegenerative disease that is prevalent worldwide and has no available cure. Early prognosis is therefore crucial for planning proper clinical intervention. This is especially true for people diagnosed with mild cognitive impairment (MCI), for whom predicting whether and when disease onset will occur is particularly valuable. However, such prognostic prediction has proven challenging, and previous studies have achieved only limited success. Approach: In this study, we seek to extract the principal component of the longitudinal disease progression trajectory in the early stage of AD, measured as the MRI-derived structural volume, to predict the onset of AD in patients with MCI two years ahead. Results: Cross-validation results of LASSO regression using the longitudinal FPC features show significantly improved predictive power compared with training on the baseline volume 12 months before AD conversion (AUC of 0.802 versus 0.732) as well as 24 months before AD conversion (AUC of 0.816 versus 0.717). Conclusions: We present a novel framework that uses functional principal component analysis (FPCA) to extract features from MRI-derived information collected at multiple timepoints. The results demonstrate that population-based longitudinal FPC features predict disease onset better than cross-sectional volumetric features extracted from a single timepoint.
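The feature-extraction idea in this abstract can be illustrated with a minimal sketch. It assumes every subject is measured on the same regular visit grid, in which case FPCA reduces to ordinary PCA of the mean-centered trajectories. Real longitudinal cohorts usually have irregular visits, which this sketch does not handle. The function name `fpca_scores` and the synthetic volume trajectories are hypothetical and are not taken from the paper.

```python
import numpy as np

def fpca_scores(curves, n_components=2):
    """FPC scores for curves sampled on a shared, regular time grid.

    curves: array of shape (n_subjects, n_timepoints).
    Returns (scores, components, explained_variance_ratio).
    """
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are the discretized eigenfunctions of the sample covariance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return scores, components, explained

# Synthetic example (assumption): 40 subjects, 5 visits over two years,
# each trajectory a noisy line with a subject-specific rate of volume change.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 5)
slopes = rng.normal(-1.0, 0.5, size=40)
curves = 10.0 + slopes[:, None] * t[None, :] + rng.normal(0.0, 0.05, size=(40, 5))

scores, components, explained = fpca_scores(curves, n_components=2)
```

In this toy setting the first FPC score tracks each subject's rate of change, which a single baseline measurement cannot capture. The scores could then be passed to an L1-penalized classifier, as the abstract describes for LASSO.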

Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds

Pattern Recognition, 2011

We propose "Supervised Principal Component Analysis (Supervised PCA)", a generalization of PCA that is uniquely effective for regression and classification problems with high-dimensional input data. It works by estimating a sequence of principal components that have maximal dependence on the response variable. The proposed Supervised PCA is solvable in closed-form, and has a dual formulation that significantly reduces the computational complexity of problems in which the number of predictors greatly exceeds the number of observations (such as DNA microarray experiments). Furthermore, we show how the algorithm can be kernelized, which makes it applicable to non-linear dimensionality reduction tasks. Experimental results on various visualization, classification and regression problems show significant improvement over other supervised approaches both in accuracy and computational efficiency.
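A closed-form solution of this kind can be sketched as an eigendecomposition. The sketch below assumes the dependence measure is the Hilbert-Schmidt independence criterion with a linear kernel on the response, so the projection directions are the leading eigenvectors of X^T H L H X, where H is the centering matrix and L is the response kernel. The function name `supervised_pca` and the synthetic data are hypothetical. The dual formulation for the case of many more predictors than observations, and the kernelized variant, are not shown.

```python
import numpy as np

def supervised_pca(X, y, n_components=1):
    """Directions of X with maximal (HSIC-style) dependence on y.

    X: (n_samples, n_features); y: (n_samples,) or (n_samples, k).
    Returns (U, Z): projection matrix and projected data.
    """
    n = X.shape[0]
    Y = np.asarray(y, dtype=float).reshape(n, -1)
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    L = Y @ Y.T                              # linear kernel on the response (assumption)
    Q = X.T @ H @ L @ H @ X                  # symmetric, positive semidefinite
    eigvals, eigvecs = np.linalg.eigh(Q)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    U = eigvecs[:, order]
    return U, X @ U

# Synthetic example (assumption): only feature 0 drives the response, while
# features 1 to 4 have larger variance and would dominate ordinary PCA.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) * np.array([1.0, 3.0, 3.0, 3.0, 3.0])
y = X[:, 0] + 0.1 * rng.normal(size=500)

U, Z = supervised_pca(X, y, n_components=1)
```

Here the leading direction loads mostly on feature 0 despite its smaller variance, which is the contrast with unsupervised PCA that the abstract emphasizes.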

Alzheimer's disease prediction using machine learning techniques and principal component analysis (PCA)

Science Direct, 2022

Alzheimer's disease (AD) is a neurodegenerative disease of the human brain that affects neurotransmitters, tissue, and neurons, impairing the senses, memory, and behavior. There is still no cure for Alzheimer's disease, although prescribed drugs can help slow its progression. Early detection is therefore essential for treatment and further research. The major difficulties in early diagnosis of Alzheimer's disease with classification methods are the very limited number of training samples and the high dimensionality of the feature descriptions. In this article, we propose an early diagnostic method for Alzheimer's disease that uses structural magnetic resonance (sMR) imaging to discriminate AD, mild cognitive impairment (MCI), and healthy control (HC) participants with an Import Vector Machine (IVM), a Regularized Extreme Learning Machine (RELM), and a Support Vector Machine (SVM). A greedy score-based strategy is used to select the essential feature vectors. Furthermore, a discriminative, kernel-based method is adopted to handle dynamic data transformations. We compare the performance of these classification models on volumetric sMR scan data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) repository. The experimental study on the ADNI dataset reveals that RELM, together with the feature selection methodology, can greatly enhance the accuracy of classifying AD from MCI as well as HC individuals.

Prediction by supervised principal components

2006

In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors that are selected based on their association with the outcome. Supervised principal components can be applied to regression and generalized regression problems such as survival analysis. It compares favorably to other techniques for this type of problem, and can also account for the effects of other covariates and help identify which predictor variables are most important. We also provide asymptotic consistency results to help support our empirical findings. These methods could become important tools for DNA microarray data, where they may be used to more accurately diagnose and treat cancer.
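The procedure described here (screen predictors by their association with the outcome, then run principal components on the survivors and regress on the leading component) can be sketched as follows. The sketch assumes absolute Pearson correlation as the screening score and a fixed threshold. In practice the threshold would be tuned, for example by cross-validation. The function name `supervised_pc_fit` and the synthetic data are hypothetical.

```python
import numpy as np

def supervised_pc_fit(X, y, threshold):
    """Screen features by |correlation| with y, then regress y on the first PC.

    Returns (keep, direction, beta): kept feature indices, first principal
    direction of the kept features, and the regression slope on that component.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Univariate association score per feature (assumption: Pearson correlation).
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    keep = np.flatnonzero(scores > threshold)
    if keep.size == 0:
        raise ValueError("no feature passes the threshold")
    _, _, vt = np.linalg.svd(Xc[:, keep], full_matrices=False)
    direction = vt[0]
    z = Xc[:, keep] @ direction          # first supervised principal component
    beta = (z @ yc) / (z @ z)            # least-squares slope of y on z
    return keep, direction, beta

# Synthetic example (assumption): 50 samples, 200 predictors, of which only
# the first 10 share a latent factor that also drives the outcome.
rng = np.random.default_rng(0)
latent = rng.normal(size=50)
X = rng.normal(size=(50, 200))
X[:, :10] += 3.0 * latent[:, None]
y = 2.0 * latent + 0.1 * rng.normal(size=50)

keep, direction, beta = supervised_pc_fit(X, y, threshold=0.7)
fitted = beta * ((X - X.mean(axis=0))[:, keep] @ direction)
```

Screening before the decomposition keeps the leading component from being shaped by the many predictors unrelated to the outcome, which is the difference from conventional principal components regression.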

Structured Multivariate Pattern Classification to Detect MRI Markers for an Early Diagnosis of Alzheimer's Disease

2011

Multiple kernel learning (MKL) provides flexibility by considering multiple data views and by searching for the best data representation through a combination of kernels. Clinical applications of neuroimaging have seen a recent upsurge in the use of multivariate machine learning methods to predict clinical status. However, these methods usually do not model structured information, such as cerebral spatial and functional networking, which could improve the predictive capacity of the model and be more meaningful for further neuroscientific interpretation. In this study, we applied an MKL-based approach to predict the prodromal stage of Alzheimer's disease (i.e., the early phase of the illness) with prior structured knowledge about the brain's spatial neighborhood structure and the brain functional circuits linked to the cognitive decline of AD. Compared with a set of classical multivariate linear classifiers, each highlighting a specific strategy, the smooth MKL-SVM method (i.e., Lp MKL-SVM) appeared to be the most powerful at distinguishing both very mild and mild AD patients from healthy subjects.
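The "combination of kernels" at the core of MKL can be illustrated with a small sketch. It builds one RBF Gram matrix per data view and mixes them with nonnegative weights, which always yields a valid kernel. The weights are fixed by hand here, whereas an MKL solver would learn them jointly with the classifier. The function names, the two synthetic views, and the weight values are all hypothetical.

```python
import numpy as np

def rbf_gram(X, gamma):
    """RBF Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))   # clip tiny negative round-off

def combine_kernels(grams, weights):
    """Convex combination of Gram matrices (weights normalized to sum to 1)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be nonnegative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, grams))

# Synthetic example (assumption): two feature views of the same 30 subjects,
# e.g. measurements from two groups of brain regions.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(30, 8))
view_b = rng.normal(size=(30, 12))

K_a = rbf_gram(view_a, gamma=0.1)
K_b = rbf_gram(view_b, gamma=0.05)
K = combine_kernels([K_a, K_b], weights=[0.7, 0.3])
```

The combined matrix `K` can be passed to any kernel method that accepts a precomputed Gram matrix. Grouping features into views by anatomical or functional structure is one way to encode the kind of prior knowledge the abstract mentions.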

Multiple Kernel Learning and Automatic Subspace Relevance Determination for High-dimensional Neuroimaging Data

arXiv (Cornell University), 2017

Alzheimer's disease is a major cause of dementia. Its diagnosis requires accurate biomarkers that are sensitive to disease stages. In this respect, we regard probabilistic classification as a method of designing a probabilistic biomarker for disease staging. Probabilistic biomarkers naturally support the interpretation of decisions and evaluation of uncertainty associated with them. In this paper, we obtain probabilistic biomarkers via Gaussian Processes. Gaussian Processes enable probabilistic kernel machines that offer flexible means to accomplish Multiple Kernel Learning. Exploiting this flexibility, we propose a new variation of Automatic Relevance Determination and tackle the challenges of high dimensionality through multiple kernels. Our research results demonstrate that the Gaussian Process models are competitive with or better than the well-known Support Vector Machine in terms of classification performance even in the cases of single kernel learning. Extending the basic scheme towards the Multiple Kernel Learning, we improve the efficacy of the Gaussian Process models and their interpretability in terms of the known anatomical correlates of the disease. For instance, the disease pathology starts in and around the hippocampus and entorhinal cortex. Through the use of Gaussian Processes and Multiple Kernel Learning, we have automatically and efficiently determined those portions of neuroimaging data. In addition to their interpretability, our Gaussian Process models are competitive with recent deep learning solutions under similar settings.

Using high-dimensional machine learning methods to estimate an anatomical risk factor for Alzheimer's disease across imaging databases

NeuroImage, 2018

The main goal of this work is to investigate the feasibility of estimating an anatomical index that can be used as an Alzheimer's disease (AD) risk factor in the Women's Health Initiative Magnetic Resonance Imaging Study (WHIMS-MRI) using MRI data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), a well-characterized imaging database of AD patients and cognitively normal subjects. We called this index AD Pattern Similarity (AD-PS) scores. To demonstrate the construct validity of the scores, we investigated their associations with several AD risk factors. The ADNI and WHIMS imaging databases were collected with different goals, populations and data acquisition protocols: it is important to demonstrate that the approach to estimating AD-PS scores can bridge these differences. MRI data from both studies were processed using high-dimensional warping methods. High-dimensional classifiers were then estimated using the ADNI MRI data. Next, the classifiers were applie...

A hybrid manifold learning algorithm for the diagnosis and prognostication of Alzheimer’s disease

Proceedings / AMIA ... Annual Symposium. AMIA Symposium

The diagnosis of Alzheimer’s disease (AD) requires a variety of medical tests, which leads to huge amounts of multivariate heterogeneous data. Such data are difficult to compare, visualize, and analyze due to the heterogeneous nature of medical tests. We present a hybrid manifold learning framework, which embeds the feature vectors in a subspace preserving the underlying pairwise similarity structure, i.e. similar/dissimilar pairs. Evaluation tests are carried out using the neuroimaging and biological data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) in a three-class (normal, mild cognitive impairment, and AD) classification task using support vector machine (SVM). Furthermore, we make extensive comparison with standard manifold learning algorithms, such as Principal Component Analysis (PCA), Multidimensional Scaling (MDS), and isometric feature mapping (Isomap). Experimental results show that our proposed algorithm yields an ov...