Fully integrative data analysis of NMR metabolic fingerprints with comprehensive patient data: a case report based on the German Chronic Kidney Disease (GCKD) study
Related papers
Scientific Reports, 2019
Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To this end, omics data are integrated with other data types, e.g., clinical, phenotypic, and demographic parameters of categorical or continuous nature. We exemplify this data integration issue for a chronic kidney disease (CKD) study, comprising complex clinical, demographic, and one-dimensional 1H nuclear magnetic resonance metabolic variables. Routine analysis screens for associations of single metabolic features with clinical parameters while accounting for confounders typically chosen by expert knowledge. This knowledge can be incomplete or unavailable. We introduce a framework for data integration that intrinsically adjusts for confounding variables. We give its mathematical and algorithmic foundation, provide a state-of-the-art implementation, and evaluate its performance by sanity checks and predictive performance assessment on...
Metabolic Profiling of 1H NMR Spectra in Chronic Kidney Disease with Local Predictive Modeling
2015
Metabolic profiling, the study of changes in the concentration of the metabolites in the organism induced by biological differences within subpopulations, has to deal with a very large amount of complex data. It therefore requires the use of powerful data processing and machine learning methods. To overcome over-fitting, a common concern in metabolic profiling where the number of features is often much larger than the number of observations, many predictive analyses have combined dimension reduction techniques with multivariate predictive linear modeling. Moreover, they built a global model that identifies biomarkers predictive of the output of interest and reports their overall trends. However, this fails to capture local biological phenomena underlying subgroups of subjects. More recently, local exploration methods based on decision-tree approaches have been applied in metabolomics, but they only explore random parts of the feature space. In this study, we used a supervised rule-m...
NMR-Based Metabolomics in Differential Diagnosis of Chronic Kidney Disease (CKD) Subtypes
Metabolites
Chronic Kidney Disease (CKD) is considered a major public health problem, as it can lead to end-stage kidney failure, which requires replacement therapy. A prompt and accurate diagnosis, along with the appropriate treatment, can significantly delay CKD’s progression. Herein, we sought to determine whether CKD etiology can be reflected in urine metabolomics during its early stage. To this end, we analyzed the urine metabolic fingerprint of 108 CKD patients by means of Nuclear Magnetic Resonance (NMR) spectroscopy. We report the first NMR metabolomics data regarding the three most common etiologies of CKD: Chronic Glomerulonephritis (IgA and Membranous Nephropathy), Diabetic Nephropathy (DN) and Hypertensive Nephrosclerosis (HN). The analysis yielded a moderate clustering of glomerulonephritis and characterized the metabolic fluctuations between the CKD subtypes and control disease. The urine metabolome of IgA Nephropathy reveals a specifi...
2005
We describe here the implementation of the statistical total correlation spectroscopy (STOCSY) analysis method for aiding the identification of potential biomarker molecules in metabonomic studies based on NMR spectroscopic data. STOCSY takes advantage of the multicollinearity of the intensity variables in a set of spectra (in this case 1H NMR spectra) to generate a pseudo-two-dimensional NMR spectrum that displays the correlation among the intensities of the various peaks across the whole sample. This method is not limited to the usual connectivities that are deducible from more standard two-dimensional NMR spectroscopic methods, such as TOCSY. Moreover, two or more molecules involved in the same pathway can also present high intermolecular correlations because of biological covariance or can even be anticorrelated. This combination of STOCSY with supervised pattern recognition and particularly orthogonal projection on latent structure-discriminant analysis (O-PLS-DA) offers a new powerful framework for analysis of metabonomic data. In a first step O-PLS-DA extracts the part of NMR spectra related to discrimination. This information is then cross-combined with the STOCSY results to help identify the molecules responsible for the metabolic variation. To illustrate the applicability of the method, it has been applied to 1H NMR spectra of urine from a metabonomic study of a model of insulin resistance based on the administration of a carbohydrate diet to three different mouse strains (C57BL/6Oxjr, BALB/cOxjr, and 129S6/SvEvOxjr) in which a series of metabolites of biological importance can be conclusively assigned and identified by use of the STOCSY approach.
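The correlation idea behind STOCSY can be illustrated with a minimal sketch. It assumes the spectra are already binned into a samples-by-variables table and computes only a one-dimensional trace: the Pearson correlation of one chosen "driver" variable with every other variable. The function names (pearson, stocsy_trace) and the toy intensities are illustrative assumptions, not the authors' implementation, and the O-PLS-DA step is not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def stocsy_trace(spectra, driver):
    """Correlate the driver variable with every variable across all samples.

    spectra: list of samples, each a list of intensities (one per spectral variable).
    driver:  column index of the peak of interest.
    """
    columns = list(zip(*spectra))  # transpose: one tuple of intensities per variable
    driver_column = columns[driver]
    return [pearson(driver_column, column) for column in columns]


# Toy data (made up): variables 0 and 2 rise together, variable 1 falls as
# variable 0 rises, and variable 3 is a flat baseline.
spectra = [
    [1.0, 5.0, 2.0, 1.0],
    [2.0, 4.0, 4.0, 1.0],
    [3.0, 3.0, 6.0, 1.0],
    [4.0, 2.0, 8.0, 1.0],
    [5.0, 1.0, 10.0, 1.0],
]
trace = stocsy_trace(spectra, driver=0)
```

In this toy case, variables that co-vary with the driver (as two peaks of one molecule would) get a correlation near +1, the anticorrelated variable gets a value near -1, and the flat baseline is reported as 0.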
Background: Diabetes is among the most prevalent diseases worldwide. A significant proportion of affected individuals remains undiagnosed because of a lack of specific symptoms early in the disorder and inadequate diagnostics. Diabetes and its associated sequelae, i.e., comorbidities, are associated with microvascular and macrovascular complications. As diabetes is characterized by an altered metabolism of key metabolites and regulatory pathways, metabolic phenotyping can provide a better understanding of the unique set of regulatory perturbations that predispose to diabetes and its associated comorbidities. Methodology: The present study uses NMR spectroscopy coupled with Random Forest statistical analysis to identify the metabolites that discriminate between diabetes (DB), diabetes-related comorbidity (DC), and healthy control (HC) subjects. A combined and pairwise analysis was performed, between the serum samples of ...
Journal of proteome research, 2016
Large-scale metabolomics studies involving thousands of samples present multiple challenges in data analysis, particularly when an untargeted platform is used. Studies with multiple cohorts and analysis platforms exacerbate existing problems such as peak alignment and normalization. Therefore, there is a need for robust processing pipelines which can ensure reliable data for statistical analysis. The COMBI-BIO project incorporates serum from approximately 8000 individuals, in 3 cohorts, profiled by 6 assays in 2 phases using both 1H-NMR and UPLC-MS. Here we present the COMBI-BIO NMR analysis pipeline and demonstrate its fitness for purpose using representative quality control (QC) samples. NMR spectra were first aligned and normalized. After eliminating interfering signals, outliers identified using Hotelling's T² were removed and a cohort/phase adjustment was applied, resulting in two NMR datasets (CPMG and NOESY). Alignment of the NMR data was shown to increase the correla...
Metabolic biomarker data quantified by nuclear magnetic resonance (NMR) spectroscopy has recently become available in UK Biobank. Here, we describe procedures for quality control and removal of technical variation for this biomarker data, comprising 249 circulating metabolites, lipids, and lipoprotein sub-fractions on approximately 121,000 participants. We identify and characterise technical and biological factors associated with individual biomarkers and find that linear effects on individual biomarkers can combine in a non-linear fashion for 61 composite biomarkers and 81 biomarker ratios. We create an R package, ukbnmr, for extracting and normalising the metabolic biomarker data, then use ukbnmr to remove unwanted variation from the UK Biobank data. We make available code for re-deriving the 61 composite biomarkers and 81 ratios, and for further derivation of 76 additional biomarker ratios of potential biological significance. Finally, we demonstrate that removal of technical var...
1H NMR metabonomics approach to the disease continuum of diabetic complications and premature death
Molecular Systems Biology, 2008
Subtle metabolic changes precede and accompany chronic vascular complications, which are the primary causes of premature death in diabetes. To obtain a multimetabolite characterization of these high-risk individuals, we measured proton nuclear magnetic resonance (1H NMR) data from the serum of 613 patients with type I diabetes and a diverse spread of complications. We developed a new metabonomics framework to visualize and interpret the data and to link the metabolic profiles to the underlying diagnostic and biochemical variables. Our results indicate complex interactions between diabetic kidney disease, insulin resistance and the metabolic syndrome. We illustrate how a single 1H NMR protocol is able to identify the polydiagnostic metabolite manifold of type I diabetes and how its alterations translate to clinical phenotypes, clustering of micro- and macrovascular complications, and mortality during several years of follow-up. This work demonstrates the diffuse nature of complex vascular diseases and the limitations of single diagnostic biomarkers. However, it also promises cost-effective solutions through high-throughput analytics and advanced computational methods, as applied here in a case that is representative of the real clinical situation.
Personalized Metabolic Profile by Synergic Use of NMR and HRMS
Molecules, 2021
A new strategy that takes advantage of the synergism between NMR and UHPLC–HRMS yields accurate concentrations of a high number of compounds in biofluids to delineate a personalized metabolic profile (SYNHMET). Metabolite identification and quantification by this method result in a higher accuracy compared to the use of the two techniques separately, even in urine, one of the most challenging biofluids to characterize due to its complexity and variability. We quantified a total of 165 metabolites in the urine of healthy subjects, patients with chronic cystitis, and patients with bladder cancer, with a minimum number of missing values. This result was achieved without the use of analytical standards and calibration curves. A patient’s personalized profile can be mapped out from the final dataset’s concentrations by comparing them with known normal ranges. This detailed picture has potential applications in clinical practice to monitor a patient’s health status and disease progression.
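The last step described above, comparing a patient's concentrations with known normal ranges, might look like the following minimal sketch. The function name flag_profile, the metabolite labels, the values, and the reference ranges are all hypothetical placeholders chosen for illustration; they are not clinical reference values and are not taken from the SYNHMET work.

```python
def flag_profile(concentrations, reference_ranges):
    """Label each measured metabolite relative to its (low, high) reference range."""
    flags = {}
    for name, value in concentrations.items():
        if name not in reference_ranges:
            flags[name] = "no reference"  # nothing to compare against
            continue
        low, high = reference_ranges[name]
        if value < low:
            flags[name] = "below"
        elif value > high:
            flags[name] = "above"
        else:
            flags[name] = "within"
    return flags


# Placeholder inputs in arbitrary units (assumptions, not real data).
reference_ranges = {
    "metabolite_a": (10.0, 50.0),
    "metabolite_b": (0.5, 2.0),
    "metabolite_c": (100.0, 300.0),
}
patient = {
    "metabolite_a": 72.0,
    "metabolite_b": 1.1,
    "metabolite_c": 80.0,
    "metabolite_d": 3.3,
}
flags = flag_profile(patient, reference_ranges)
```

Keeping a separate "no reference" label avoids silently treating metabolites without a known range as normal.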
Journal of proteome research, 2018
Metabolism is altered by genetics, diet, disease status, environment, and many other factors. Modeling any one of these is often done without considering the effects of the other covariates. Attributing differences in metabolic profile to one of these factors needs to be done while controlling for the metabolic influence of the rest. We describe here a data analysis framework and novel confounder-adjustment algorithm for multivariate analysis of metabolic profiling data. Using simulated data, we show that similar numbers of true associations and significantly fewer false positives are found compared to other commonly used methods. Covariate-adjusted projections to latent structures (CA-PLS) are exemplified here using a large-scale metabolic phenotyping study of two Chinese populations at different risks for cardiovascular disease. Using CA-PLS, we find that some previously reported differences are actually associated with external factors and discover a number of previously unrepo...
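The general idea of controlling for a covariate can be sketched with plain linear residualization: regress a metabolic variable on a single confounder by ordinary least squares and keep the residuals for downstream analysis. This is a deliberately simpler, swapped-in technique and not the CA-PLS algorithm; the function name residualize and the toy numbers are assumptions made for illustration.

```python
def residualize(values, covariate):
    """Remove the linear effect of one covariate from a variable (simple OLS)."""
    n = len(values)
    mean_x = sum(covariate) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        # A constant covariate explains nothing; just centre the values.
        return [y - mean_y for y in values]
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Toy data (made up): the metabolite level rises exactly linearly with age,
# so nothing should remain once age is adjusted for.
age = [30.0, 40.0, 50.0, 60.0]
metabolite = [1.0, 1.5, 2.0, 2.5]
adjusted = residualize(metabolite, age)
```

In this toy case the adjusted values are all (numerically) zero, meaning the apparent variation in the metabolite is fully attributable to the covariate. With several confounders, a multiple regression would take the place of the single-covariate fit.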