An optimized library for reference-based deconvolution of whole-blood biospecimens assayed using the Illumina HumanMethylationEPIC BeadArray
Lucas A Salas et al. Genome Biol. 2018.
Abstract
Genome-wide methylation arrays are powerful tools for assessing cell composition of complex mixtures. We compare three approaches to select reference libraries for deconvoluting neutrophil, monocyte, B-lymphocyte, natural killer, and CD4+ and CD8+ T-cell fractions based on blood-derived DNA methylation signatures assayed using the Illumina HumanMethylationEPIC array. The IDOL algorithm identifies a library of 450 CpGs, resulting in an average R² = 99.2 across cell types when applied to EPIC methylation data collected on artificial mixtures constructed from the above cell types. Of the 450 CpGs, 69% are unique to EPIC. This library has the potential to reduce unintended technical differences across array platforms.
Keywords: Adults; B-cells; Cytotoxic T-lymphocytes; DNA methylation; Epigenetics; Helper T-cells; Leukocytes; Monocytes; Natural killer cells; Neutrophils.
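The reference-based deconvolution summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the usual constrained-projection formulation, in which a sample's methylation values are modeled as a non-negative mixture of cell-type reference profiles whose weights sum to at most 1, and it solves that problem with plain projected gradient descent rather than a quadratic programming solver. The names `REFERENCE`, `TRUE_FRACTIONS`, `project_capped_simplex`, and `deconvolve`, and all numeric values, are hypothetical toy choices (three cell types, six CpGs) rather than data from the paper.

```python
# Toy constrained-projection deconvolution (illustrative sketch only).
# Rows are CpGs, columns are cell types; all values are hypothetical.
REFERENCE = [
    [0.90, 0.10, 0.10],
    [0.10, 0.90, 0.10],
    [0.10, 0.10, 0.90],
    [0.80, 0.20, 0.20],
    [0.20, 0.80, 0.10],
    [0.10, 0.20, 0.85],
]
TRUE_FRACTIONS = [0.6, 0.3, 0.1]  # hypothetical mixture weights


def project_capped_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) <= 1}."""
    clipped = [max(x, 0.0) for x in v]
    if sum(clipped) <= 1.0:
        return clipped
    # Otherwise the sum constraint is active: project onto the simplex.
    theta, cumulative = 0.0, 0.0
    for j, u_j in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def deconvolve(reference, beta, n_iter=2000):
    """Minimize ||reference @ w - beta||^2 over the capped simplex."""
    n_types = len(reference[0])
    w = [1.0 / n_types] * n_types
    # 1 / squared Frobenius norm is a safe (conservative) step size.
    step = 1.0 / sum(x * x for row in reference for x in row)
    for _ in range(n_iter):
        residual = [
            sum(r * wj for r, wj in zip(row, w)) - b
            for row, b in zip(reference, beta)
        ]
        gradient = [
            sum(row[j] * res for row, res in zip(reference, residual))
            for j in range(n_types)
        ]
        w = project_capped_simplex(
            [wj - step * g for wj, g in zip(w, gradient)]
        )
    return w


# Build a noise-free artificial mixture and recover its fractions.
mixture = [sum(r * f for r, f in zip(row, TRUE_FRACTIONS)) for row in REFERENCE]
estimated = deconvolve(REFERENCE, mixture)
print([round(x, 4) for x in estimated])
```

In this noise-free toy case the estimated weights should match the mixing fractions closely; with real array data the estimates depend on which CpGs make up the reference library, which is the selection problem the abstract addresses.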
Conflict of interest statement
Ethics approval and consent to participate
Cells used in these experiments were obtained commercially. All donors are anonymous. All the subjects provided written informed consent before donation to the commercial houses which provided the commercial cells.
Competing interests
The authors declare that they have no competing interests.
Figures
Fig. 1
Comparison of L-DMR libraries between automatic selection in minfi and optimization with the IDOL algorithm. a Reinius reference dataset [13] probes from the 450 K array (n = 600 CpGs). b Probes selected from the new reference samples measured with the EPIC array (n = 600 CpGs). c L-DMR library derived from IDOL using the EPIC array (n = 450 CpGs). d Overlap of the probes selected by the three methods. DHS: DNase hypersensitive sites
Fig. 2
Comparison of estimated cell proportions using constrained projection/quadratic programming (CP/QP) versus the reconstructed (true) DNA fraction in the artificial DNA mixtures using the EPIC IDOL method. a Cell-specific DNA proportions per sample included in the two mixture reconstruction methods (methods A and B). b R² and RMSE using the EPIC IDOL method and the two reconstruction methods
Fig. 3
Observed estimates of absolute error by deconvolution method per cell type (top panel) and global per method (bottom panel)
Fig. 4
Comparison of the longitudinal assessment of cell type proportions and cell ratio changes using DNA methylation data and two different reference L-DMR libraries (EPIC IDOL and 450 K)
Fig. 5
Comparison of the estimated cell proportions using constrained projection/quadratic programming (CP/QP) versus the FACS-measured fraction in EPIC and 450 K platforms. a Whole blood cell samples arrayed using the EPIC platform with known (FACS) fractions for the six main cell subtypes. Cell estimates were obtained using the EPIC IDOL method. b Whole blood cell samples arrayed using the Illumina 450 K platform with known (FACS) fractions for the six main cell subtypes. Cell estimates were obtained using the EPIC IDOL 450 K legacy method. c Five out of 11 observations in the longitudinal dataset run with EPIC had FACS information
Fig. 6
Examples of critical CpGs for cell deconvolution selected by IDOL
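The figure captions above report agreement between estimated and reference cell fractions as R², RMSE, and absolute error. The sketch below shows one way such metrics could be computed for a single cell type. It assumes R² is the squared Pearson correlation between reference and estimated fractions, which the captions do not specify, and the function names `r_squared`, `rmse`, and `absolute_errors` as well as the example values are hypothetical.

```python
import math


def r_squared(reference, estimated):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_est = sum(estimated) / n
    sxy = sum((x - mean_ref) * (y - mean_est) for x, y in zip(reference, estimated))
    sxx = sum((x - mean_ref) ** 2 for x in reference)
    syy = sum((y - mean_est) ** 2 for y in estimated)
    return (sxy * sxy) / (sxx * syy)


def rmse(reference, estimated):
    """Root mean squared error, in the same units as the inputs."""
    n = len(reference)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(reference, estimated)) / n)


def absolute_errors(reference, estimated):
    """Per-sample absolute difference between estimate and reference."""
    return [abs(x - y) for x, y in zip(reference, estimated)]


# Hypothetical percentages for one cell type across three mixtures.
reference_pct = [10.0, 20.0, 30.0]
estimated_pct = [12.0, 18.0, 33.0]

print(r_squared(reference_pct, estimated_pct))
print(rmse(reference_pct, estimated_pct))
print(absolute_errors(reference_pct, estimated_pct))
```

Reporting both R² and RMSE is useful because a high correlation can coexist with a systematic offset, which RMSE and absolute error would reveal.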
References
- Breton CV, Marsit CJ, Faustman E, Nadeau K, Goodrich JM, Dolinoy DC, et al. Small-magnitude effect sizes in epigenetic end points are important in children’s environmental health studies: the Children’s Environmental Health and Disease Prevention Research Center’s Epigenetics Working Group. Environ Health Perspect. 2017;125:511–526. doi: 10.1289/EHP595.
Grants and funding
- P50 CA097257/CA/NCI NIH HHS/United States
- R01 CA207110/CA/NCI NIH HHS/United States
- R01 CA052689/CA/NCI NIH HHS/United States
- P20 GM108189/GM/NIGMS NIH HHS/United States
- P20 GM104416/GM/NIGMS NIH HHS/United States
- P20 GM103418/GM/NIGMS NIH HHS/United States
- R01 CA216265/CA/NCI NIH HHS/United States
- R01 DE022772/DE/NIDCR NIH HHS/United States
- F32 GM108189/GM/NIGMS NIH HHS/United States
- R01 CA207360/CA/NCI NIH HHS/United States