Infinium Methylation BeadChips Annotation
All files are gzipped, tab-delimited plain text files. Some older .gz files appear to be double-compressed when downloaded with Firefox; if so, decompress them twice.
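One defensive way to handle the double-compression issue is to keep decompressing while the payload still starts with the gzip magic bytes (0x1f 0x8b). The following is a minimal sketch under that assumption; the function names `gunzip_all` and `read_tsv_rows` and the `max_layers` limit are illustrative and not part of any tool distributed with these files.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gunzip_all(data: bytes, max_layers: int = 3) -> bytes:
    """Decompress repeatedly while the payload still looks gzipped.

    max_layers is an arbitrary safety limit (an assumption of this sketch).
    """
    layers = 0
    while data[:2] == GZIP_MAGIC and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data


def read_tsv_rows(data: bytes) -> list:
    """Return the tab-delimited content as a list of rows (lists of fields)."""
    text = gunzip_all(data).decode("utf-8")
    return [line.split("\t") for line in text.splitlines() if line]
```

Passing the raw bytes of a downloaded file to `read_tsv_rows` yields the same rows whether the file was compressed once or twice.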
Human MSA, EPICv2, EPIC+, EPIC, HM450, HM27 Array
GRCh38 / hg38
- Manifest with mapping information (MSA,EPICv2,EPIC+,EPIC,HM450)
Column header: (header description)
CpG_chrm CpG_beg CpG_end address_A address_B target nextBase channel Probe_ID mapFlag_A mapChrm_A mapPos_A mapQ_A mapCigar_A AlleleA_ProbeSeq mapNM_A mapAS_A mapYD_A mapFlag_B mapChrm_B mapPos_B mapQ_B mapCigar_B AlleleB_ProbeSeq mapNM_B mapAS_B mapYD_B type
Previous versions: (EPIC,HM450,HM27,header description)
- Mask information (MSA,EPICv2,EPIC+)
Column header: (header description); see the M_general column for the recommended masking.
Probe_ID mask maskUniq M_general
Previous versions: (EPIC,HM450,HM27,header description)
Previous population-specific SNP masking: (EPIC,HM450,HM27,header description)
- Design (MSA)
Column header:
Probe_ID Trait_Associations
Header description: (2) Trait_Associations: a comma-delimited string of trait associations. EWAS hits have the following format: EWAS_hit:[trait]:[PMID]:[q-value of association]. NA is used when the information is missing.
- Trait associations (MSA)
Column header:
Probe_ID Trait_Associations CoordID
Header description: (2) Trait_Associations: a string of trait associations in the following format: [trait_group]:[trait]:[PMID]:[q-value of association]. NA is used when the information is missing.
- Gene annotation (GENCODEv41) and promoters (MSA,EPICv2)
Column header: (header description)
CpG_chrm CpG_beg CpG_end probe_strand Probe_ID genesUniq geneNames transcriptTypes transcriptIDs distToTSS
Previous versions: GENCODEv36 (EPICv2,EPIC,HM450,HM27) GENCODEv22 (EPIC,HM450,HM27)
- Functional annotations (CGI, enhancer, TF binding, chromatin state, ...) (MSA,EPICv2,EPIC,HM450,HM27)
Column header: (header description)
- SNP annotations (MSA,EPICv2,HM450)
Column header: (header description)
chrm beg end strand rs designType U REF ALT Probe_ID
This is for converting probes to genotype VCFs.
Previous version: rs probes (EPIC,header description)
Previous version: channel-switching Infinium-I probes (EPIC,header description)
- Bisulfite non-uniqueness of 3'-subsequence of 10-50 bases in length (EPIC,HM450,HM27)
Column header: (header description)
probeID copy_10 copy_11 copy_12 copy_13 ... copy_49 copy_50
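The Trait_Associations format described for the MSA files, [trait_group]:[trait]:[PMID]:[q-value of association], can be split into structured records. The sketch below assumes that entries are comma-delimited and that neither trait names nor groups contain a colon or comma; the names `TraitHit` and `parse_trait_associations` and the example values are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class TraitHit(NamedTuple):
    group: str                # e.g. "EWAS_hit"
    trait: str
    pmid: str
    q_value: Optional[float]  # None when the q-value is missing or not numeric


def parse_trait_associations(field: str, sep: str = ",") -> List[TraitHit]:
    """Parse one Trait_Associations cell into a list of TraitHit records."""
    if field in ("", "NA"):   # NA marks missing information
        return []
    hits = []
    for entry in field.split(sep):
        parts = entry.strip().split(":")
        if len(parts) != 4:
            continue          # skip entries that do not match the 4-part form
        group, trait, pmid, q = parts
        try:
            q_value = float(q)
        except ValueError:
            q_value = None
        hits.append(TraitHit(group, trait, pmid, q_value))
    return hits


# Made-up example cell, not taken from the real annotation file
example = "EWAS_hit:smoking:12345678:1e-8,EWAS_hit:BMI:23456789:NA"
hits = parse_trait_associations(example)
```

Entries that do not split into exactly four parts are skipped rather than guessed at, so real files should be checked for such cases before relying on the result.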
GRCh37 / hg19
(Links to all archived platforms)
Other Genomes
(Click here if you are using arrays on non-target species.)
Mouse/MM285 Array (see here for working with the mouse array)
GRCm38 / mm10
- Manifest with mapping information (MM285)
Column header: (header description)
CpG_chrm CpG_beg CpG_end address_A address_B target nextBase channel Probe_ID mapFlag_A mapChrm_A mapPos_A mapQ_A mapCigar_A AlleleA_ProbeSeq mapNM_A mapAS_A mapYD_A mapFlag_B mapChrm_B mapPos_B mapQ_B mapCigar_B AlleleB_ProbeSeq mapNM_B mapAS_B mapYD_B type
Note: This annotation is based on the design paper (N=296,070). The paper describes a slightly different probe set that corrects the Illumina A2 manifest (N=287,692).
See Manifest comparison for details. SeSAMe's default preprocessing is based on the N=296,070 version.
Previous versions:
Illumina A2 manifest (MM285) (N=287,692)
LEGX manifest (MM285,header description)
- Mouse array design groups (MM285)
Column header:
Header description:
(2) design: contains the annotation of the probes. For the syntenic EPIC probe mapping, search for the EPIC prefix in the design column.
- Mask information (MM285)
Column header: (header description); see the M_general column for the recommended masking.
Probe_ID mask maskUniq M_general
Header description: (2) mask: a boolean indicating whether a probe is recommended to be masked in data preprocessing. Probes masked by default include control probes, multi-mapping probes (mapQ < 30 for either allele A or B) and non-informative probes (uk probes).
Previous versions: Illumina A2 manifest (MM285), LEGX manifest (MM285)
- Gene annotation (GENCODEvM25) and promoters (MM285)
Column header: (header description)
CpG_chrm CpG_beg CpG_end probe_strand probeID genesUniq geneNames transcriptTypes transcriptIDs distToTSS
Previous versions: Illumina A2 manifest (MM285)
- Functional annotations (CGI, enhancer, TF binding, chromatin state, ...) (MM285)
Column header: (header description)
Note: This annotation is based on the design paper (N=296,070). The paper describes a slightly different probe set that corrects the Illumina A2 manifest (N=287,692).
See Manifest comparison for details. SeSAMe's default preprocessing is based on the N=296,070 version.
- Illustration of the new array ID system
The mouse array employs an improved ID system on top of the traditional "cg" numbers. The new ID system uniquely specifies the design details and is designed to accommodate more flexible probe designs, such as replicates and opposite-strand designs.
- ChromHMM annotation of 66 samples (MM285)
Column header: (header description)
chrm beg end probeID ENCFF005IEW_forebrain_embryonic_12.5 ENCFF014LBF_hindbrain_postnatal_0 ENCFF023ETX_liver_embryonic_14.5 ENCFF065PNO_midbrain_embryonic_16.5 ENCFF072LNA_liver_embryonic_11.5 ...
- Imprinting ICR/DMR annotation (MM285; also see ICR annotation, and a comparison of different ICR/DMR evidence).
- PhastCons Evolutionary Conservation (MM285)
- PhyloP Evolutionary Conservation (MM285)
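The mask description for the mouse array mentions that multi-mapping probes are those with mapQ < 30 for either allele A or B. As a rough illustration of that one rule only (control and uk probes are not handled here), the sketch below checks the mapQ_A and mapQ_B manifest columns of a single row. Treating a missing or non-numeric mapQ value as "not failing" is an assumption of this sketch, as are the function names; the shipped mask files remain the authoritative source.

```python
from typing import Optional


def parse_mapq(value) -> Optional[int]:
    """Return the mapping quality as an int, or None when it is missing."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def is_multi_mapping(row: dict, threshold: int = 30) -> bool:
    """True when either allele maps with quality below the threshold.

    `row` is one manifest record keyed by column name. Missing values
    (for example NA, or an absent allele B) are ignored by assumption.
    """
    for col in ("mapQ_A", "mapQ_B"):
        q = parse_mapq(row.get(col))
        if q is not None and q < threshold:
            return True
    return False
```

For example, a row with mapQ_A of 60 and mapQ_B of NA is not flagged, while a row with mapQ_A of 60 and mapQ_B of 12 is.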
GRCm39 / mm39
- Manifest with mapping information (MM285)
Column header: (header description)
CpG_chrm CpG_beg CpG_end address_A address_B target nextBase channel Probe_ID mapFlag_A mapChrm_A mapPos_A mapQ_A mapCigar_A AlleleA_ProbeSeq mapNM_A mapAS_A mapYD_A mapFlag_B mapChrm_B mapPos_B mapQ_B mapCigar_B AlleleB_ProbeSeq mapNM_B mapAS_B mapYD_B type
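Since the manifests share the same tab-delimited layout with a header row, a small lookup from Probe_ID to CpG coordinates can be built with the standard library. This is a minimal sketch that assumes the file has already been decompressed into a string and that unmapped probes carry a non-numeric value such as NA in CpG_beg or CpG_end; the function name `index_manifest` and the example rows are hypothetical.

```python
import csv
import io


def index_manifest(tsv_text: str) -> dict:
    """Map Probe_ID -> (CpG_chrm, CpG_beg, CpG_end) from manifest text."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    index = {}
    for row in reader:
        try:
            beg, end = int(row["CpG_beg"]), int(row["CpG_end"])
        except (TypeError, ValueError):
            continue  # skip probes without numeric coordinates (assumed NA)
        index[row["Probe_ID"]] = (row["CpG_chrm"], beg, end)
    return index


# Made-up rows using a subset of the documented columns
example = (
    "CpG_chrm\tCpG_beg\tCpG_end\tProbe_ID\n"
    "chr1\t100\t102\tprobe_a\n"
    "NA\tNA\tNA\tprobe_b\n"
)
coords = index_manifest(example)
```

The coordinates are stored exactly as given in the file; the sketch makes no claim about whether they are 0-based or 1-based, which is covered by the header description.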
Other genomes
- 310 species manifest from Ensembl v101 (MM285)
Column header: (header description)
CpG_chrm CpG_beg CpG_end address_A address_B target nextBase channel Probe_ID mapFlag_A mapChrm_A mapPos_A mapQ_A mapCigar_A AlleleA_ProbeSeq mapNM_A mapAS_A mapYD_A mapFlag_B mapChrm_B mapPos_B mapQ_B mapCigar_B AlleleB_ProbeSeq mapNM_B mapAS_B mapYD_B type
HorvathMammal40/Mammal40 Array
GRCh38 / hg38
- Manifest with mapping information (Mammal40)
Column header: (header description)
CpG_chrm CpG_beg CpG_end address_A address_B target nextBase channel Probe_ID mapFlag_A mapChrm_A mapPos_A mapQ_A mapCigar_A AlleleA_ProbeSeq mapNM_A mapAS_A mapYD_A mapFlag_B mapChrm_B mapPos_B mapQ_B mapCigar_B AlleleB_ProbeSeq mapNM_B mapAS_B mapYD_B type
Previous versions: Old manifest (Mammal40,header description)
- KnowYourCG annotations (CGI, enhancer, TF binding, chromatin state, ...) (Mammal40)
Column header: (header description)
Other genomes
- 310 species manifest from Ensembl v101 (Mammal40)
Column header: (header description)
CpG_chrm CpG_beg CpG_end address_A address_B target nextBase channel Probe_ID mapFlag_A mapChrm_A mapPos_A mapQ_A mapCigar_A AlleleA_ProbeSeq mapNM_A mapAS_A mapYD_A mapFlag_B mapChrm_B mapPos_B mapQ_B mapCigar_B AlleleB_ProbeSeq mapNM_B mapAS_B mapYD_B type
Release Notes
Update Mar-06-2024:
- Added Methylation Screening Array (MSA) manifest and annotations
Update Apr-27-2023:
- Added links to KYCG annotation files
Update Nov-10-2022:
- Added EPICv2 manifest and annotations
Update Sep-03-2022:
- Updated Ensembl manifests, mouse manifests and human annotation
- Updated the documentation of column headers
Update Jun-15-2021:
- Updated gene annotation to GENCODE v36 to be consistent with GDC. The CGI column is updated to contain CGI only when CGIposition is not NA.
- Updated mouse array gene model
Update Apr-14-2021:
- Updated gene annotation to GENCODE v37
Update Jul-4-2020:
- More detailed annotation of SNP and channel-switching probes
Release Sep-9-2018:
- Added gene_HGNC column for gene symbols corrected for HGNC compatibility
- Updated SNP masking with dbSNP build 151
- All maskings were redone on hg19. Previously, the hg19 manifest file was "borrowing" hg38 maskings.
- CCS probes have shrunk in number based on AF filtering (>1%).
- See Updates Summary for more details.
Release Aug-8-2018:
- Fix to MASK_mapping: some mappings with NM>0 are now masked as well
- Added NM columns and gene annotation to the manifest file.
- Previously, masking of mapping issues was merged from hg19 and hg38. Masking on hg19 and hg38 is now rebuilt entirely independently. This change slightly decreases the number of masked probes, while the added NM-based masking (included in MASK_mapping) slightly increases it.
- See Updates Summary for more details.
Release Mar-4-2018:
- Fix to hg19 decoy mapping inconsistency.
Release Jan-4-2018:
- Updated strand information.
Release Nov-23-2017:
- Changed to RDS from RData following R's recommendation.
- Update to missing value representation.
- Update to table headers.
- Added gene annotation in short forms.
- Added mapping to genome both including and excluding alt-chromosomes.
- Made consistent naming of columns with underscores.
Release Mar-13-2017:
- Fix to relative position in SNP masking of type-I probes.
References
Human MSA - Goldberg et al. Scalable Screening of Ternary-Code DNA Methylation Dynamics Associated with Human Traits, Cell Genomics 2025
mLiftOver - Chen and Zhou. mLiftOver: Harmonizing Data Across Infinium DNA Methylation Platforms, Bioinformatics 2024
Human EPICv2 - Kaur and Lee et al. Comprehensive evaluation of the Infinium human MethylationEPIC v2 BeadChip, Epigenetics Communications 2023
Mammalian and nonhuman species - Ding et al. Comparative epigenome analysis using Infinium DNA methylation BeadChips, Briefings in Bioinformatics 2023
Mouse MM285 - Zhou W et al. DNA methylation dynamics and dysregulation delineated by high-throughput profiling in the mouse, Cell Genomics 2022
Human EPIC, HM450 - Zhou W, Laird PW and Shen H, Comprehensive characterization, annotation and innovative use of Infinium DNA Methylation BeadChip probes, Nucleic Acids Research 2017
Contact
Questions regarding this annotation can be addressed to wanding.zhou@pennmedicine.upenn.edu.