The epimutacions User’s Guide (original) (raw)
Introduction
Background
Epimutations are rare alterations in the methylation pattern at specific loci. Have been demonstrated that epimutations could be the causative factor of some genetic diseases. For example, epimutations can lead to cancers, such as Lynch syndrome, rare diseases such as Prader-Willi syndrome, and are associated with common disorders, such as autism. Nonetheless, no standard methods are available to detect and quantify these alterations. Two methods for epimutations detection on methylation microarray data have been reported: (1) based on identifying CpGs with outlier values and then cluster them in epimutations (Barbosa et al. 2018); (2) define candidate regions with bumphunter and test their statistical significance with a MANOVA (Aref-Eshghi et al. 2019). However, the implementation of these methods is not publicly available, and these approaches have not been compared.
We have developed epimutacions
R/Bioconductor package. We have implemented those two methods (called quantile
and manova
, respectively). We implemented four additional approaches, using a different distribution to detect CpG outliers (beta
), or a different model to assess region significance (mlm
, mahdist
, and iForest
).
The package epimutacions
provides tools to raw DNA methylation microarray intensities normalization and epimutations identification, visualization and annotation.
The full epimutacions
user´s guide is available in this vignette.
The main function to estimate the epimutations is called epimutations()
.
The name of the package is epimutacions
(pronounced ɛ pi mu ta 'sj ons
) which means epimutations in Catalan, a language from the northeast of Spain.
Methodology
The epimutacions
package computes a genome-wide DNA methylation analysis to identify the epimutations to be considered as biomarkers for case samples with a suspected genetic disease. The function epimutations()
infers epimutations on a case-control design. It compares a case sample against a reference panel (healthy individuals). The package also includes leave-one-out approach (epimutations_one_leave_out()
). It compares individual methylation profiles of a single sample (regardless if they are cases or controls) against all other samples from the same cohort.
The epimutacions
package includes 6 outlier detection approaches (figure 1): (1) Multivariate Analysis of variance (manova
), (2) Multivariate Linear Model (mlm
), (3) isolation forest (iForest
), (4) robust mahalanobis distance (mahdist
) (5) quantile
and (6) beta
.
The approaches manova
, mlm
, iForest
and mahdist
define the candidate regions (differentially methylated regions (DMRs)) using bumphunter method (Jaffe et al. 2012). Then, those DMRs are tested to identify regions with CpGs being outliers when comparing with the reference panel.quantile
uses the sliding window approach to individually compare the methylation value in each proband against the reference panel and then cluster them in epimutations.Beta
utilizes beta distribution to identify outlier CpGs.
We have defined an epimutation as a consecutive window of a minimum of 3 outliers CpGs with a maximum distance of 1kb between them (Barbosa et al. 2018).
Figure 1: Implementation of each outlier detection method
Setup
Installing the packages
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("epimutacions")
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("epimutacionsData")
Loading libraries
library(epimutacions)
Quick start
The workflow in figure 2 explains the main analysis in the epimutacions
package.
The package allows two different types of inputs:
- Case samples
IDAT
files (raw microarray intensities) together withRGChannelSet
object as reference panel. The reference panel can be supplied by the user or can be selected through the example datasets that the package provides (section 3).
- Case samples
GenomicRatioSet
object containing case and control samples.
The normalization (epi_preprocess()
) converts the raw microarray intensities into usable methylation measurement (\(\beta\) values at CpG locus level). As a result, we obtain a GenomicRatioSet
object, which can be used as epimutations()
function input. The data should contain information about values of CpG sites, phenotype and feature data.
Figure 2: Allowed data formats, normalization and input types
Datasets
The package contains 3 example datasets adapted from Gene Expression Omnibus (GEO):
- 4 case samples IDAT files (GEO: GSE131350)
reference_panel
: aRGChannelSet
class object containing 22 healthy individuals(GEO: GSE127824)
methy
: aGenomicRatioSet
object which includes 49 controls (GEO: GSE104812)and 3 cases (GEO: GSE97362).
We also included a dataset specifying the 40,408 candidate regions in Illumina 450K array which could be epimutations (see section 3.2).
We created the epimutacionsData
package in ExperimentHub
. It contains the reference panel, methy and the candidate epimutations datasets. The package includes the IDAT files as external data. To access the datasets we need to install the packages by running the following commands:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("ExperimentHub")
Then, we need to load the package and create an ExperimentHub
object:
library(ExperimentHub)
eh <- ExperimentHub()
query(eh, c("epimutacionsData"))
ExperimentHub with 3 records
# snapshotDate(): 2025-04-11
# $dataprovider: GEO, Illumina 450k array
# $species: Homo sapiens
# $rdataclass: RGChannelSet, GenomicRatioSet, GRanges
# additional mcols(): taxonomyid, genome, description,
# coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
# rdatapath, sourceurl, sourcetype
# retrieve records with, e.g., 'object[["EH6690"]]'
title
EH6690 | Control and case samples
EH6691 | Reference panel
EH6692 | Candidate epimutations
IDAT files and reference panel
IDAT files directory in epimutacionsData
package:
baseDir <- system.file("extdata", package = "epimutacionsData")
The reference panel and methy
dataset can be found inEH6691
and EH6690
record of the eh
object:
reference_panel <- eh[["EH6691"]]
methy <- eh[["EH6690"]]
Candidate regions
In Illumina 450K array, probes are unequally distributed along the genome, limiting the number of regions that can fulfil the requirements to be considered an epimutation. Thus, we have computed a dataset containing all the regions that could be candidates to become an epimutation.
We used the clustering approach from bumphunter to define the candidate epimutations. We defined a primary dataset with all the CpGs from the Illumina 450K array. Then, we run bumphunter and selected those regions with at least 3 CpGs and a maximum distance of 1kb between them. As a result, we found 40,408 candidate epimutations. The function epimutation()
filters the identified epimutations using these candidate regions.
The following is the code used to identify the candidate epimutations in Illumina 450K array:
library(minfi)
# Regions 450K
library("IlluminaHumanMethylation450kanno.ilmn12.hg19")
data(Locations)
### Select CpGs (names starting by cg) in autosomic chromosomes
locs.450 <- subset(Locations, grepl("^cg", rownames(Locations)) & chr %in% paste0("chr", 1:22))
locs.450GR <- makeGRangesFromDataFrame(locs.450,
start.field = "pos",
end.field = "pos",
strand = "*")
locs.450GR <- sort(locs.450GR)
mat <- matrix(0, nrow = length(locs.450GR), ncol = 2,
dimnames = list(names(locs.450GR), c("A", "B")))
## Set sample B to all 1
mat[, 2] <- 1
## Define model matrix
pheno <- data.frame(var = c(0, 1))
model <- model.matrix(~ var, pheno)
## Run bumphunter
bumps <- bumphunter(mat, design = model, pos = start(locs.450GR),
chr = as.character(seqnames(locs.450GR)),
cutoff = 0.05)$table
bumps.fil <- subset(bumps, L >= 3)
The candidate regions can be found in EH6692
record of the eh
object:
candRegsGR <- eh[["EH6692"]]
Preprocessing
The epi_preprocess()
function allows calling the 6 preprocessing methods from minfi
package:
Each normalization approach has some unique parameters which can be modified through norm_parameters()
function:
We can obtain the default settings for each method by invoking the function norm_parameters()
with no arguments:
norm_parameters()
$illumina <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi><mi>l</mi><mi>l</mi><mi>u</mi><mi>m</mi><mi>i</mi><mi>n</mi><mi>a</mi></mrow><annotation encoding="application/x-tex">illumina</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">ll</span><span class="mord mathnormal">u</span><span class="mord mathnormal">mina</span></span></span></span>bg.correct
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi><mi>l</mi><mi>l</mi><mi>u</mi><mi>m</mi><mi>i</mi><mi>n</mi><mi>a</mi></mrow><annotation encoding="application/x-tex">illumina</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">ll</span><span class="mord mathnormal">u</span><span class="mord mathnormal">mina</span></span></span></span>normalize
[1] "controls" "no"
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi><mi>l</mi><mi>l</mi><mi>u</mi><mi>m</mi><mi>i</mi><mi>n</mi><mi>a</mi></mrow><annotation encoding="application/x-tex">illumina</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">ll</span><span class="mord mathnormal">u</span><span class="mord mathnormal">mina</span></span></span></span>reference
[1] 1
$quantile <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>fixOutliers
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>removeBadSamples
[1] FALSE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>badSampleCutoff
[1] 10.5
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>quantileNormalize
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>stratified
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>mergeManifest
[1] FALSE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>sex
NULL
$noob <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mi>o</mi><mi>o</mi><mi>b</mi></mrow><annotation encoding="application/x-tex">noob</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">n</span><span class="mord mathnormal">oo</span><span class="mord mathnormal">b</span></span></span></span>offset
[1] 15
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mi>o</mi><mi>o</mi><mi>b</mi></mrow><annotation encoding="application/x-tex">noob</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">n</span><span class="mord mathnormal">oo</span><span class="mord mathnormal">b</span></span></span></span>dyeCorr
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>n</mi><mi>o</mi><mi>o</mi><mi>b</mi></mrow><annotation encoding="application/x-tex">noob</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">n</span><span class="mord mathnormal">oo</span><span class="mord mathnormal">b</span></span></span></span>dyeMethod
[1] "single" "reference"
$funnorm <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>u</mi><mi>n</mi><mi>n</mi><mi>o</mi><mi>r</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">funnorm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">u</span><span class="mord mathnormal">nn</span><span class="mord mathnormal" style="margin-right:0.02778em;">or</span><span class="mord mathnormal">m</span></span></span></span>nPCs
[1] 2
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>u</mi><mi>n</mi><mi>n</mi><mi>o</mi><mi>r</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">funnorm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">u</span><span class="mord mathnormal">nn</span><span class="mord mathnormal" style="margin-right:0.02778em;">or</span><span class="mord mathnormal">m</span></span></span></span>sex
NULL
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>u</mi><mi>n</mi><mi>n</mi><mi>o</mi><mi>r</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">funnorm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">u</span><span class="mord mathnormal">nn</span><span class="mord mathnormal" style="margin-right:0.02778em;">or</span><span class="mord mathnormal">m</span></span></span></span>bgCorr
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>u</mi><mi>n</mi><mi>n</mi><mi>o</mi><mi>r</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">funnorm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">u</span><span class="mord mathnormal">nn</span><span class="mord mathnormal" style="margin-right:0.02778em;">or</span><span class="mord mathnormal">m</span></span></span></span>dyeCorr
[1] TRUE
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>f</mi><mi>u</mi><mi>n</mi><mi>n</mi><mi>o</mi><mi>r</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">funnorm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.10764em;">f</span><span class="mord mathnormal">u</span><span class="mord mathnormal">nn</span><span class="mord mathnormal" style="margin-right:0.02778em;">or</span><span class="mord mathnormal">m</span></span></span></span>keepCN
[1] FALSE
However, we can modify the parameter(s) as the following example for illumina
approach:
parameters <- norm_parameters(illumina = list("bg.correct" = FALSE))
parameters$illumina$bg.correct
[1] FALSE
We are going to preprocess the IDAT files and reference panel (3). We need to specify the IDAT files directory and the reference panel in RGChannelSet
format. As a result, we will obtain a GenomicRatioSet
object containing the control and case samples:
GRset <- epi_preprocess(baseDir,
reference_panel,
pattern = "SampleSheet.csv")
Epimutations
Epimutations detection
The function epimutations()
includes 6 methods for epimutation identification: (1) Multivariate Analysis of variance (manova
), (2) Multivariate Linear Model (mlm
), (3) isolation forest (iForest
), (4) robust mahalanobis distance (mahdist
) (5) quantile
and (6) beta
.
To illustrate the following examples we are going to use the dataset methy
(section 3). We need to separate the case and control samples:
case_samples <- methy[,methy$status == "case"]
control_samples <- methy[,methy$status == "control"]
We can specify the chromosome or region to analyze which helps to reduce the execution time:
epi_mvo <- epimutations(case_samples,
control_samples,
method = "manova")
The function epimutations_one_leave_out()
compared individual methylation profiles of a single sample (regardless if they are cases or controls) against all other samples from the same cohort. To use this function we do not need to split the dataset. To ilustrate this example we are going to use the GRset
dataset available in epimutacions
package:
#manova (default method)
data(GRset)
epi_mva_one_leave_out<- epimutations_one_leave_out(GRset)
Unique parameters
The epi_parameters()
function is useful to set the individual parameters for each outliers detection approach. The following table describes the arguments:
epi_parameters()
with no arguments, returns a list of the default settings for each method:
epi_parameters()
$manova <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mi>a</mi><mi>n</mi><mi>o</mi><mi>v</mi><mi>a</mi></mrow><annotation encoding="application/x-tex">manova</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em;"></span><span class="mord mathnormal">man</span><span class="mord mathnormal">o</span><span class="mord mathnormal" style="margin-right:0.03588em;">v</span><span class="mord mathnormal">a</span></span></span></span>pvalue_cutoff
[1] 0.05
$mlm <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mi>l</mi><mi>m</mi></mrow><annotation encoding="application/x-tex">mlm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">m</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">m</span></span></span></span>pvalue_cutoff
[1] 0.05
$iForest <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi><mi>F</mi><mi>o</mi><mi>r</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow><annotation encoding="application/x-tex">iForest</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.13889em;">F</span><span class="mord mathnormal">ores</span><span class="mord mathnormal">t</span></span></span></span>outlier_score_cutoff
[1] 0.7
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi><mi>F</mi><mi>o</mi><mi>r</mi><mi>e</mi><mi>s</mi><mi>t</mi></mrow><annotation encoding="application/x-tex">iForest</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em;"></span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.13889em;">F</span><span class="mord mathnormal">ores</span><span class="mord mathnormal">t</span></span></span></span>ntrees
[1] 100
$mahdist <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>m</mi><mi>a</mi><mi>h</mi><mi>d</mi><mi>i</mi><mi>s</mi><mi>t</mi></mrow><annotation encoding="application/x-tex">mahdist</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">mah</span><span class="mord mathnormal">d</span><span class="mord mathnormal">i</span><span class="mord mathnormal">s</span><span class="mord mathnormal">t</span></span></span></span>nsamp
[1] "deterministic"
$quantile <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>window_sz
[1] 1000
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>offset_abs
[1] 0.15
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>qsup
[1] 0.995
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mi>u</mi><mi>a</mi><mi>n</mi><mi>t</mi><mi>i</mi><mi>l</mi><mi>e</mi></mrow><annotation encoding="application/x-tex">quantile</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8889em;vertical-align:-0.1944em;"></span><span class="mord mathnormal" style="margin-right:0.03588em;">q</span><span class="mord mathnormal">u</span><span class="mord mathnormal">an</span><span class="mord mathnormal">t</span><span class="mord mathnormal">i</span><span class="mord mathnormal" style="margin-right:0.01968em;">l</span><span class="mord mathnormal">e</span></span></span></span>qinf
[1] 0.005
$beta <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi><mi>e</mi><mi>t</mi><mi>a</mi></mrow><annotation encoding="application/x-tex">beta</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">b</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span></span></span></span>pvalue_cutoff
[1] 1e-06
<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>b</mi><mi>e</mi><mi>t</mi><mi>a</mi></mrow><annotation encoding="application/x-tex">beta</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em;"></span><span class="mord mathnormal">b</span><span class="mord mathnormal">e</span><span class="mord mathnormal">t</span><span class="mord mathnormal">a</span></span></span></span>diff_threshold
[1] 0.1
The set up of any parameter can be done as the following example for manova
:
parameters <- epi_parameters(manova = list("pvalue_cutoff" = 0.01))
parameters$manova$pvalue_cutoff
[1] 0.01
Results description
The epimutations
function returns a data frame (tibble) containing all the epimutations identified in the given case sample. If no epimutation is found, the function returns a row containing the case sample information and missing values for all other arguments. The following table describes each argument:
If no outliers are detected in the region, valures for sz
, cpg_n
, cpg_ids
,outlier_score
, outlier_direction
, pvalue
, adj_pvalue
and epi_region_id
are set to NA
As an example we are going to visualize the obtained results with MANOVA method (epi_mvo
):
dim(epi_mvo)
[1] 51 16
class(epi_mvo)
[1] "tbl_df" "tbl" "data.frame"
head(as.data.frame(epi_mvo), 1)
epi_id sample chromosome start end sz cpg_n
1 epi_manova_1 GSM2562699 chr19 12777736 12777903 167 7
cpg_ids
1 cg20791841,cg25267526,cg03641858,cg25441478,cg14132016,cg23954461,cg03143365
outlier_score outlier_direction pvalue
1 302.558383959446/0.981008923520812 hypermethylation 3.441085e-33
adj_pvalue delta_beta epi_region_id CRE
1 2.236705e-31 0.2960902 chr19_12776725 EH38E1939817,EH38E1939818,EH38E1939819
CRE_type
1 pELS;pELS,CTCF-bound;PLS,CTCF-bound
Epimutations annotations
The annotate_epimutations()
function enriches the identified epimutations. It includes information about GENECODE gene names, description of the regulatory feature provided by methylation consortium, the location of the CpG relative to the CpG island, OMIM accession and description number and Ensembl region id, coordinates, type and tissue:
rst_mvo <- annotate_epimutations(epi_mvo, omim = TRUE)
As an example we are going to visualize different annotations
kableExtra::kable(
rst_mvo[ c(27,32) ,c("epi_id", "cpg_ids", "Relation_to_Island")],
row.names = FALSE) %>%
column_spec(1:3, width = "4cm")
kableExtra::kable(
rst_mvo[ c(4,8,22) , c("epi_id", "OMIM_ACC", "OMIM_DESC")],
row.names = FALSE ) %>%
column_spec(1:3, width = "4cm")
kableExtra::kable(
rst_mvo[ c(1:5), c("epi_id", "ensembl_reg_id", "ensembl_reg_type")],
row.names = FALSE ) %>%
column_spec(1:3, width = "4cm")
Epimutation visualization
The plot_epimutations()
function locates the epimutations along the genome. It plots the methylation values of the case sample in red, the control samples in dashed black lines and population mean in blue:
plot_epimutations(as.data.frame(epi_mvo[1,]), methy)
If we set the argument gene_annot == TRUE
the plot includes the gene annotations:
p <- plot_epimutations(as.data.frame(epi_mvo[1,]),
methy = methy,
genes_annot = TRUE)
plot(p)
To plot the chromatin marks H3K4me3, H3K27me3 and H3K27ac we need to specify the argument regulation = TRUE
:
- H3K4me3: commonly associated with the activation of transcription of nearby genes.
- H3K27me3: is used in epigenetics to look for inactive genes.
- H3K27ac: is associated with the higher activation of transcription and therefore defined as an active enhancer mark
p <- plot_epimutations(as.data.frame(epi_mvo[1,]),
methy = methy,
regulation = TRUE)
plot(p)
Acknowledgements
The authors would like to thank the team who collaborated in the initial design of the package in the European BioHackathon 2020: Lordstrong Akano, James Baye, Alejandro Caceres, Pavlo Hrab, Raquel Manzano and Margherita Mutarelli. The authors also want to thank the organization of European BioHackathon 2020 for its support.
All the team members of Project #5 for the contribution to this package:
Session Info
sessionInfo()
R version 4.5.0 beta (2025-04-02 r88102)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.2 LTS
Matrix products: default
BLAS: /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB LC_COLLATE=C
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: system (glibc)
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] IlluminaHumanMethylationEPICanno.ilm10b2.hg19_0.6.0
[2] kableExtra_1.4.0
[3] minfi_1.55.0
[4] bumphunter_1.51.0
[5] locfit_1.5-9.12
[6] iterators_1.0.14
[7] foreach_1.5.2
[8] Biostrings_2.77.0
[9] XVector_0.49.0
[10] SummarizedExperiment_1.39.0
[11] Biobase_2.69.0
[12] MatrixGenerics_1.21.0
[13] matrixStats_1.5.0
[14] GenomicRanges_1.61.0
[15] GenomeInfoDb_1.45.0
[16] IRanges_2.43.0
[17] S4Vectors_0.47.0
[18] ExperimentHub_2.17.0
[19] AnnotationHub_3.17.0
[20] BiocFileCache_2.17.0
[21] dbplyr_2.5.0
[22] BiocGenerics_0.55.0
[23] generics_0.1.3
[24] epimutacions_1.13.0
[25] epimutacionsData_1.11.0
[26] BiocStyle_2.37.0
loaded via a namespace (and not attached):
[1] ProtGenerics_1.41.0
[2] bitops_1.0-9
[3] isotree_0.6.1-4
[4] httr_1.4.7
[5] RColorBrewer_1.1-3
[6] tools_4.5.0
[7] doRNG_1.8.6.2
[8] backports_1.5.0
[9] R6_2.6.1
[10] HDF5Array_1.37.0
[11] lazyeval_0.2.2
[12] Gviz_1.53.0
[13] rhdf5filters_1.21.0
[14] withr_3.0.2
[15] prettyunits_1.2.0
[16] gridExtra_2.3
[17] base64_2.0.2
[18] preprocessCore_1.71.0
[19] cli_3.6.4
[20] labeling_0.4.3
[21] sass_0.4.10
[22] robustbase_0.99-4-1
[23] readr_2.1.5
[24] genefilter_1.91.0
[25] askpass_1.2.1
[26] Rsamtools_2.25.0
[27] systemfonts_1.2.2
[28] foreign_0.8-90
[29] siggenes_1.83.0
[30] illuminaio_0.51.0
[31] svglite_2.1.3
[32] rentrez_1.2.3
[33] dichromat_2.0-0.1
[34] scrime_1.3.5
[35] BSgenome_1.77.0
[36] limma_3.65.0
[37] rstudioapi_0.17.1
[38] RSQLite_2.3.9
[39] BiocIO_1.19.0
[40] dplyr_1.1.4
[41] interp_1.1-6
[42] Matrix_1.7-3
[43] abind_1.4-8
[44] lifecycle_1.0.4
[45] yaml_2.3.10
[46] rhdf5_2.53.0
[47] SparseArray_1.9.0
[48] grid_4.5.0
[49] blob_1.2.4
[50] crayon_1.5.3
[51] lattice_0.22-7
[52] GenomicFeatures_1.61.0
[53] annotate_1.87.0
[54] KEGGREST_1.49.0
[55] magick_2.8.6
[56] pillar_1.10.2
[57] knitr_1.50
[58] beanplot_1.3.1
[59] rjson_0.2.23
[60] codetools_0.2-20
[61] glue_1.8.0
[62] data.table_1.17.0
[63] vctrs_0.6.5
[64] png_0.1-8
[65] IlluminaHumanMethylationEPICmanifest_0.3.0
[66] gtable_0.3.6
[67] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
[68] cachem_1.1.0
[69] xfun_0.52
[70] S4Arrays_1.9.0
[71] mime_0.13
[72] survival_3.8-3
[73] tinytex_0.57
[74] statmod_1.5.0
[75] nlme_3.1-168
[76] bit64_4.6.0-1
[77] progress_1.2.3
[78] filelock_1.0.3
[79] bslib_0.9.0
[80] nor1mix_1.3-3
[81] IlluminaHumanMethylation450kmanifest_0.4.0
[82] rpart_4.1.24
[83] colorspace_2.1-1
[84] DBI_1.2.3
[85] Hmisc_5.2-3
[86] nnet_7.3-20
[87] tidyselect_1.2.1
[88] bit_4.6.0
[89] compiler_4.5.0
[90] curl_6.2.2
[91] httr2_1.1.2
[92] htmlTable_2.4.3
[93] h5mread_1.1.0
[94] xml2_1.3.8
[95] DelayedArray_0.35.0
[96] bookdown_0.43
[97] rtracklayer_1.69.0
[98] checkmate_2.3.2
[99] scales_1.3.0
[100] DEoptimR_1.1-3-1
[101] quadprog_1.5-8
[102] rappdirs_0.3.3
[103] stringr_1.5.1
[104] digest_0.6.37
[105] rmarkdown_2.29
[106] GEOquery_2.77.0
[107] htmltools_0.5.8.1
[108] pkgconfig_2.0.3
[109] jpeg_0.1-11
[110] base64enc_0.1-3
[111] sparseMatrixStats_1.21.0
[112] fastmap_1.2.0
[113] ensembldb_2.33.0
[114] rlang_1.1.6
[115] htmlwidgets_1.6.4
[116] UCSC.utils_1.5.0
[117] DelayedMatrixStats_1.31.0
[118] farver_2.1.2
[119] jquerylib_0.1.4
[120] jsonlite_2.0.0
[121] BiocParallel_1.43.0
[122] mclust_6.1.1
[123] VariantAnnotation_1.55.0
[124] RCurl_1.98-1.17
[125] magrittr_2.0.3
[126] Formula_1.2-5
[127] GenomeInfoDbData_1.2.14
[128] Rhdf5lib_1.31.0
[129] munsell_0.5.1
[130] Rcpp_1.0.14
[131] stringi_1.8.7
[132] MASS_7.3-65
[133] plyr_1.8.9
[134] ggrepel_0.9.6
[135] deldir_2.0-4
[136] splines_4.5.0
[137] multtest_2.65.0
[138] hms_1.1.3
[139] rngtools_1.5.2
[140] reshape2_1.4.4
[141] biomaRt_2.65.0
[142] BiocVersion_3.22.0
[143] XML_3.99-0.18
[144] evaluate_1.0.3
[145] latticeExtra_0.6-30
[146] biovizBase_1.57.0
[147] BiocManager_1.30.25
[148] tzdb_0.5.0
[149] tidyr_1.3.1
[150] openssl_2.3.2
[151] purrr_1.0.4
[152] reshape_0.8.9
[153] ggplot2_3.5.2
[154] xtable_1.8-4
[155] restfulr_0.0.15
[156] AnnotationFilter_1.33.0
[157] viridisLite_0.4.2
[158] tibble_3.2.1
[159] memoise_2.0.1
[160] AnnotationDbi_1.71.0
[161] GenomicAlignments_1.45.0
[162] cluster_2.1.8.1
References
Aref-Eshghi, Erfan, Eric G. Bend, Samantha Colaiacovo, Michelle Caudle, Rana Chakrabarti, Melanie Napier, Lauren Brick, et al. 2019. “Diagnostic Utility of Genome-Wide Dna Methylation Testing in Genetically Unsolved Individuals with Suspected Hereditary Conditions.” The American Journal of Human Genetics. https://doi.org/https://doi.org/10.1016/j.ajhg.2019.03.008.
Barbosa, Mafalda, Ricky S Joshi, Paras Garg, Alejandro Martin-Trujillo, Nihir Patel, Bharati Jadhav, Corey T Watson, et al. 2018. “Identification of Rare de Novo Epigenetic Variations in Congenital Disorders.” Nature Communications 9 (1): 1–11.
Jaffe, Andrew E, Peter Murakami, Hwajin Lee, Jeffrey T Leek, M Daniele Fallin, Andrew P Feinberg, and Rafael A Irizarry. 2012. “Bump Hunting to Identify Differentially Methylated Regions in Epigenetic Epidemiology Studies.” International Journal of Epidemiology 41 (1): 200–209.