GENCODE reference annotation for the human and mouse genomes

Nucleic Acids Res. 2019 Jan 8;47(D1):D766-D773.

doi: 10.1093/nar/gky955.

Adam Frankish, Mark Diekhans, Anne-Maud Ferreira, Rory Johnson, Irwin Jungreis, Jane Loveland, Jonathan M Mudge, Cristina Sisu, James Wright, Joel Armstrong, If Barnes, Andrew Berry, Alexandra Bignell, Silvia Carbonell Sala, Jacqueline Chrast, Fiona Cunningham, Tomás Di Domenico, Sarah Donaldson, Ian T Fiddes, Carlos García Girón, Jose Manuel Gonzalez, Tiago Grego, Matthew Hardy, Thibaut Hourlier, Toby Hunt, Osagie G Izuogu, Julien Lagarde, Fergal J Martin, Laura Martínez, Shamika Mohanan, Paul Muir, Fabio C P Navarro, Anne Parker, Baikang Pei, Fernando Pozo, Magali Ruffier, Bianca M Schmitt, Eloise Stapleton, Marie-Marthe Suner, Irina Sycheva, Barbara Uszczynska-Ratajczak, Jinuri Xu, Andrew Yates, Daniel Zerbino, Yan Zhang, Bronwen Aken, Jyoti S Choudhary, Mark Gerstein, Roderic Guigó, Tim J P Hubbard, Manolis Kellis, Benedict Paten, Alexandre Reymond, Michael L Tress, Paul Flicek

Adam Frankish et al. Nucleic Acids Res. 2019.

Abstract

The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high-quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference-quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use all publicly available data and analysis, along with the research literature, to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl BioMart, the Ensembl Perl and REST APIs, as well as https://www.gencodegenes.org.

Figures

Figure 1.

New and updated manually annotated genes and transcripts from July 2016 to June 2018. For both human (left) and mouse (right), the panels show the numbers of completely new genes and transcripts, of updated genes and transcripts, and the total number of manually added or edited genes and transcripts for each of four broad categories of annotation. A new gene annotation can represent either a completely de novo locus with no overlap with pre-existing annotation, or the reclassification of an existing complex locus into multiple loci to better represent the biology of the locus inferred from transcriptomic and/or proteomic data. A new transcript represents the annotation of a unique exon-intron structure, including novel alternative splicing at an annotated locus. Updated genes and transcripts represent pre-existing loci or transcript models that have been edited to improve the representation of biotype (e.g. changed from lncRNA to protein-coding) or structure (e.g. by extension or addition of novel exons).

Figure 2.

Annotation statistics for human and mouse GENCODE releases from July 2016 to June 2018, encompassing human releases GENCODE 25–28 and mouse releases M10 to M18. The panels on the left show the total number of genes by broad biotype (protein-coding, lncRNA, pseudogene and sncRNA) for each release, for human and mouse respectively, and the panels on the right show the total numbers of genes and transcripts of all biotypes.
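
As a rough illustration of how per-biotype gene counts like those in Figure 2 could be tallied, the sketch below counts `gene` records by their `gene_type` attribute in GTF-formatted lines. It assumes the usual GTF layout of nine tab-separated columns with `key "value";` attributes in the last column. The function name `count_genes_by_type` and the sample records are hypothetical and are not taken from any GENCODE release, and no attempt is made to group fine-grained types into the four broad categories used in the figure.

```python
import re
from collections import Counter

# Matches key "value" pairs in the GTF attribute column.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def count_genes_by_type(lines):
    """Count 'gene' records per gene_type in an iterable of GTF lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only well-formed gene-level records are counted.
        if len(fields) != 9 or fields[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        counts[attrs.get("gene_type", "unknown")] += 1
    return counts


# Made-up records for illustration only (not real GENCODE entries).
SAMPLE_LINES = [
    "##description: illustrative sample",
    "\t".join(["chr1", "SRC", "gene", "100", "900", ".", "+", ".",
               'gene_id "GENE_A"; gene_type "protein_coding";']),
    "\t".join(["chr1", "SRC", "transcript", "100", "900", ".", "+", ".",
               'gene_id "GENE_A"; transcript_id "TX_A1"; gene_type "protein_coding";']),
    "\t".join(["chr1", "SRC", "gene", "2000", "2600", ".", "-", ".",
               'gene_id "GENE_B"; gene_type "lncRNA";']),
    "\t".join(["chr2", "SRC", "gene", "50", "700", ".", "+", ".",
               'gene_id "GENE_C"; gene_type "protein_coding";']),
]

if __name__ == "__main__":
    for gene_type, n in sorted(count_genes_by_type(SAMPLE_LINES).items()):
        print(gene_type, n)
```

To apply the same tally to a downloaded annotation file, the function could be passed an open text file handle in place of `SAMPLE_LINES`, since it only iterates over lines.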
