Annotation of functional variation in personal genomes using RegulomeDB
doi: 10.1101/gr.137323.112.
Alan P Boyle, Eurie L Hong, Manoj Hariharan, Yong Cheng, Marc A Schaub, Maya Kasowski, Konrad J Karczewski, Julie Park, Benjamin C Hitz, Shuai Weng, J Michael Cherry, Michael Snyder
- PMID: 22955989
- PMCID: PMC3431494
Alan P Boyle et al. Genome Res. 2012 Sep.
Abstract
As the sequencing of healthy and disease genomes becomes more commonplace, detailed annotation provides interpretation for individual variation responsible for normal and disease phenotypes. Current approaches focus on direct changes in protein coding genes, particularly nonsynonymous mutations that directly affect the gene product. However, most individual variation occurs outside of genes and, indeed, most markers generated from genome-wide association studies (GWAS) identify variants outside of coding segments. Identification of potential regulatory changes that perturb these sites will lead to a better localization of truly functional variants and interpretation of their effects. We have developed a novel approach and database, RegulomeDB, which guides interpretation of regulatory variants in the human genome. RegulomeDB includes high-throughput, experimental data sets from ENCODE and other sources, as well as computational predictions and manual annotations to identify putative regulatory potential and identify functional variants. These data sources are combined into a powerful tool that scores variants to help separate functional variants from a large pool and provides a small set of putative sites with testable hypotheses as to their function. We demonstrate the applicability of this tool to the annotation of noncoding variants from 69 fully sequenced genomes as well as that of a personal genome, where thousands of functionally associated variants were identified. Moreover, we demonstrate a GWAS where the database is able to quickly identify the known associated functional variant and provide a hypothesis as to its function. Overall, we expect this approach and resource to be valuable for the annotation of human genome sequences.
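The abstract's core idea, scoring a variant by the regulatory evidence it overlaps, can be illustrated with a minimal Python sketch. This is not RegulomeDB's actual scoring scheme or API: the `Feature` class, the feature kinds, the coordinates, and the `toy_score` rule (a plain count of distinct evidence types) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """A hypothetical regulatory annotation on a half-open interval [start, end)."""
    chrom: str
    start: int
    end: int
    kind: str  # e.g. "chipseq_peak", "dnase_peak", "motif" (illustrative labels)


def overlapping_kinds(chrom, pos, features):
    """Return the set of feature kinds whose interval contains the position."""
    return {f.kind for f in features if f.chrom == chrom and f.start <= pos < f.end}


def toy_score(kinds):
    """Assumed scoring rule: count distinct evidence types (NOT RegulomeDB's categories)."""
    return len(kinds)


def rank_variants(variants, features):
    """Rank (name, chrom, pos) variants from most to least supporting evidence."""
    scored = [(toy_score(overlapping_kinds(c, p, features)), name)
              for name, c, p in variants]
    # Sort by descending score, then by name so ties are deterministic.
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Made-up annotations and variants; coordinates are not real genome positions.
features = [
    Feature("chr6", 100, 200, "chipseq_peak"),
    Feature("chr6", 150, 250, "dnase_peak"),
    Feature("chr6", 160, 170, "motif"),
    Feature("chr1", 100, 200, "dnase_peak"),
]
variants = [
    ("varA", "chr6", 165),
    ("varB", "chr6", 120),
    ("varC", "chr6", 300),
    ("varD", "chr1", 150),
]

ranking = rank_variants(variants, features)
print(ranking)
```

In this toy setup, a variant that sits inside a ChIP-seq peak, a DNase peak, and a motif match ranks above one supported by a single evidence type, which mirrors the abstract's goal of reducing a large pool of variants to a short list of candidates with testable hypotheses. A linear scan over features is used for clarity; a real annotation set would need an interval index.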
Figures
Figure 1.
A SNV (rs9261424) overlapping many regulatory features. (A) This SNV falls within peak regions for many ChIP-seq factors as well as DNase-seq peaks from multiple cell lines. (B) The same SNV overlaps a motif match to the NFKB motif and has been shown to alter binding. The signal tracks represent ChIP-seq peaks of NFKB at the SNV site for three individuals: homozygous to reference allele (G), heterozygous, and homozygous to alternate allele (C) (Kasowski et al. 2010).
Figure 2.
Incidence of SNVs in features and categories. Average percent count of SNVs in each genomic feature (A) and in each RegulomeDB category (B). Although the differences between homozygous and heterozygous SNV counts are small, they are nevertheless significant (P < 5 × 10^-15). Actual SNV count in features (C) and categories for the cell line GM12878 (D).
Figure 3.
Protein coding and noncoding SNVs can be classified as potentially functional by Polyphen-2 and RegulomeDB, respectively. Heterozygous, damaging coding SNVs can act in conjunction with a heterozygous regulatory SNV on the opposite allele to create a compound heterozygote and loss of function on both alleles (one regulatory, the other coding).
Figure 4.
_TNFAIP3_-associated SNV. (A) RegulomeDB results for rs117480515 which is likely a functional variant associated with systemic lupus erythematosus. (B) This SNV was the most likely to be functional in the associated region but might be missed in a standard study because it lies >20 kb downstream from its target. (C) An enlargement of the region around rs117480515 (red line) shows the overlap with a large number of functional elements (NFKB, purple; BCL, light blue; and DNase, green) as well as the motif for BCL.