ASICS User’s Guide (original) (raw)
This user’s guide provides an overview of the package ASICS
. ASICS
is a fully automated procedure to identify and quantify metabolites in \(^1\)H 1D-NMR spectra of biological mixtures (Tardivel et al., 2017). It will enable empowering NMR-based metabolomics by quickly and accurately helping experts to obtain metabolic profiles. In addition to the quantification method, several functions allowing spectrum preprocessing or statistical analyses of quantified metabolites are available.
library(ASICS)
library(ASICSdata)
Dataset
In this user’s guide, a subset of the public datasets from Salek et al. (2007) is used. The experiment has been designed to improve the understanding of early stage of type 2 diabetes mellitus (T2DM) development. In the dataset used, \(^1\)H-NMR human metabolome was obtained from 25 healthy volunteers and 25 T2DM patients. Raw 1D Bruker spectral data files were found in the MetaboLights database (https://www.ebi.ac.uk/metabolights/, study MTBLS1).
Parallel environment
For most time consumming functions, a parallel implementation is available for unix-like OS using the BiocParallel package of Bioconductor. The number of used cores is set with the option ncores
of the corresponding functions (default to 1
, no parallel environment).
Analysis on relative quantifications
Some analysis functions are also available in ASICS
.
First, a design data frame is imported. In this data frame, the first column needs to correspond to sample names of all spectra.
design <- read.table(system.file("extdata", "design_diabete_example.txt",
package = "ASICSdata"), header = TRUE)
Then, a preprocessing is performed on relative quantifications: metabolites with more than 75% of null quantifications are removed as well as two samples that are considered as outliers.
analysis_data <- formatForAnalysis(getQuantification(ASICS_results),
design = design, zero.threshold = 75,
zero.group = "condition",
outliers = c("ADG10003u_007",
"ADG19007u_163"))
To explore results of ASICS quantification, a PCA can be performed on results of preprocessing with:
resPCA <- pca(analysis_data)
## Warning: 'info.txtC = NULL' argument value is deprecated; use 'info.txtC =
## 'none'' instead.
## Warning: 'fig.pdfC = NULL' argument value is deprecated; use 'fig.pdfC = 'none''
## instead.
plot(resPCA, graph = "ind", col.ind = "condition")
plot(resPCA, graph = "var")
It is also possible to find differences between two conditions with an OPLS-DA (Thevenot et al, 2015) or with Kruskall-Wallis tests:
resOPLSDA <- oplsda(analysis_data, condition = "condition", orthoI = 1)
resOPLSDA
## OPLS-DA performed on quantifications
## Cross validation error: 0.12
##
## Variable with the higher VIP:
## Control Group diabetes mellitus VIP influential
## L-Citrulline 1.577558e-03 6.348160e-04 2.293155 TRUE
## Galactitol 1.062704e-03 2.947166e-04 2.259247 TRUE
## D-Glucose-6-Phosphate 1.718698e-03 2.042602e-03 2.046632 TRUE
## Trimethylamine 0.000000e+00 3.390938e-05 1.885807 TRUE
## Uracil 1.290575e-03 4.742019e-04 1.862658 TRUE
## 3-PhenylPropionicAcid 6.515734e-04 8.718475e-04 1.818105 TRUE
## D-GluconicAcid 7.105618e-04 1.667304e-03 1.797962 TRUE
## 2-Oxobutyrate 4.440598e-05 5.164097e-04 1.754901 TRUE
## UrocanicAcid 1.578936e-05 1.932797e-04 1.713738 TRUE
## Levoglucosan 7.772079e-04 3.861891e-04 1.683468 TRUE
## [...]
plot(resOPLSDA)
Results of Kruskall-Wallis tests and Benjamini-Hochberg correction:
resTests <- kruskalWallis(analysis_data, "condition")
resTests
## Kruskal-Wallis tests performed on quantifications
## Variable with the lower adjusted p-value:
##
## Feature Adjusted.p.value
## 1 L-Citrulline 9.789748e-05
## 2 Galactitol 2.703854e-04
## 3 Glycerol 3.268214e-03
## 4 Ethanolamine 1.624328e-02
## 5 Uracil 6.168396e-02
## 6 D-Glucose-6-Phosphate 6.568338e-02
## 7 D-GluconicAcid 9.634158e-02
## 8 Inosine 1.117938e-01
## 9 Trigonelline 1.148400e-01
## 10 2-Oxobutyrate 1.617243e-01
## [...]
plot(resTests)
Analysis on buckets
An analysis on buckets can also be performed. An alignment is required before the spectrum bucketing:
spectra_align <- alignSpectra(spectra_norm)
spectra_bucket <- binning(spectra_align)
Alignment visualization:
spectra_obj_align <- createSpectra(spectra_align)
plotAlignment(spectra_obj, xlim = c(3.5,4))
plotAlignment(spectra_obj_align, xlim = c(3.5,4))
Then, a SummarizedExperiment
object is created with the formatForAnalysis
function as for quantification:
analysis_data_bucket <- formatForAnalysis(spectra_bucket, design = design,
zero.threshold = 75)
Finally, all analyses can be carried out on this object with the parametertype.data
set to buckets
. For example, the OPLS-DA is performed with:
resOPLSDA_buckets <- oplsda(analysis_data_bucket, condition = "condition",
type.data = "buckets")
resOPLSDA_buckets
## OPLS-DA performed on buckets
## Cross validation error: 0.12
##
## Variable with the higher VIP:
## Control Group diabetes mellitus VIP influential
## 8.935 -1.133369e-05 3.615169e-07 2.240782 TRUE
## 4.115 1.014203e-03 1.317695e-03 2.164346 TRUE
## 3.785 4.080651e-03 4.902450e-03 2.144845 TRUE
## 8.625 -1.267405e-05 6.430714e-06 2.093072 TRUE
## 3.685 8.273601e-03 5.333275e-03 2.090887 TRUE
## 4.245 7.322459e-04 4.692211e-04 2.074871 TRUE
## 9.735 -8.501084e-06 2.194731e-06 2.063978 TRUE
## 6.335 1.075097e-05 3.006047e-05 2.036788 TRUE
## 5.255 2.055731e-04 3.849152e-04 2.032645 TRUE
## 8.635 -7.321572e-06 7.141929e-06 1.968756 TRUE
## [...]
Moreover, another plot with the median spectrum and OPLS-DA results can be produced with the option graph = "buckets"
:
plot(resOPLSDA_buckets, graph = "buckets")
References
Lefort G., Liaubet L., Marty-Gasset N., Canlet C., Vialaneix N., Servien R. . 2020. Pre-print, https://www.biorxiv.org/content/10.1101/2020.10.08.331090v1.
Tardivel P., Canlet C., Lefort G., Tremblay-Franco M., Debrauwer L., Concordet D., Servien R. (2017). ASICS: an automatic method for identification and quantification of metabolites in complex 1D 1H NMR spectra. Metabolomics,13(10), 109. https://doi.org/10.1007/s11306-017-1244-5
Salek, R. M., Maguire, M. L., Bentley, E., Rubtsov, D. V., Hough, T., Cheeseman, M., … & Connor, S. C. (2007). A metabolomic comparison of urinary changes in type 2 diabetes in mouse, rat, and human. Physiological genomics,29(2), 99-108.
Wang, K. C., Wang, S. Y., Kuo, C. H., Tseng, Y. J. (2013). Distribution-based classification method for baseline correction of metabolomic 1D proton nuclear magnetic resonance spectra. Analytical Chemistry, 85(2), 1231–1239.
Vu, T. N., Valkenborg, D., Smets, K., Verwaest, K. A., Dommisse, R., Lemiere, F., … & Laukens, K. (2011). An integrated workflow for robust alignment and simplified quantitative analysis of NMR spectrometry data.BMC bioinformatics, 12(1), 405.
Thevenot, E.A., Roux, A., Xu, Y., Ezan, E., Junot, C. 2015. Analysis of the human adult urinary metabolome variations with age, body mass index and gender by implementing a comprehensive workflow for univariate and OPLS statistical analyses. Journal of Proteome Research. 14, 3322-3335.
Session information
This user’s guide has been created with the following system configuration:
sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
##
## locale:
## [1] LC_CTYPE=fr_FR.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=fr_FR.UTF-8 LC_COLLATE=fr_FR.UTF-8
## [5] LC_MONETARY=fr_FR.UTF-8 LC_MESSAGES=fr_FR.UTF-8
## [7] LC_PAPER=fr_FR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ASICSdata_1.8.0 ASICS_2.6.1 BiocStyle_2.16.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.5 mvtnorm_1.1-1
## [3] lattice_0.20-41 zoo_1.8-8
## [5] ropls_1.20.0 glmnet_4.0-2
## [7] digest_0.6.27 foreach_1.5.1
## [9] R6_2.5.0 GenomeInfoDb_1.24.2
## [11] plyr_1.8.6 stats4_4.0.3
## [13] evaluate_0.14 ggplot2_3.3.2
## [15] pillar_1.4.6 zlibbioc_1.34.0
## [17] rlang_0.4.8 nloptr_1.2.2.2
## [19] S4Vectors_0.26.1 Matrix_1.2-18
## [21] rmarkdown_2.5 labeling_0.4.2
## [23] splines_4.0.3 BiocParallel_1.22.0
## [25] stringr_1.4.0 RCurl_1.98-1.2
## [27] munsell_0.5.0 DelayedArray_0.14.1
## [29] compiler_4.0.3 xfun_0.18
## [31] pkgconfig_2.0.3 BiocGenerics_0.34.0
## [33] shape_1.4.5 htmltools_0.5.0
## [35] tidyselect_1.1.0 SummarizedExperiment_1.18.2
## [37] tibble_3.0.4 gridExtra_2.3
## [39] GenomeInfoDbData_1.2.3 bookdown_0.21
## [41] quadprog_1.5-8 IRanges_2.22.2
## [43] codetools_0.2-16 matrixStats_0.57.0
## [45] crayon_1.3.4 dplyr_1.0.2
## [47] MASS_7.3-53 bitops_1.0-6
## [49] grid_4.0.3 gtable_0.3.0
## [51] lifecycle_0.2.0 magrittr_1.5
## [53] PepsNMR_1.6.1 scales_1.1.1
## [55] stringi_1.5.3 farver_2.0.3
## [57] XVector_0.28.0 reshape2_1.4.4
## [59] ptw_1.9-15 ellipsis_0.3.1
## [61] generics_0.1.0 vctrs_0.3.4
## [63] RColorBrewer_1.1-2 iterators_1.0.13
## [65] tools_4.0.3 Biobase_2.48.0
## [67] glue_1.4.2 purrr_0.3.4
## [69] parallel_4.0.3 survival_3.2-7
## [71] yaml_2.2.1 colorspace_1.4-1
## [73] BiocManager_1.30.10 GenomicRanges_1.40.0
## [75] knitr_1.30