ENCODE data in the UCSC Genome Browser: year 5 update

Nucleic Acids Res. 2013 Jan;41(Database issue):D56-63.

doi: 10.1093/nar/gks1172. Epub 2012 Nov 27.

Kate R Rosenbloom, Cricket A Sloan, Venkat S Malladi, Timothy R Dreszer, Katrina Learned, Vanessa M Kirkup, Matthew C Wong, Morgan Maddren, Ruihua Fang, Steven G Heitner, Brian T Lee, Galt P Barber, Rachel A Harte, Mark Diekhans, Jeffrey C Long, Steven P Wilder, Ann S Zweig, Donna Karolchik, Robert M Kuhn, David Haussler, W James Kent

Kate R Rosenbloom et al. Nucleic Acids Res. 2013 Jan.

Abstract

The Encyclopedia of DNA Elements (ENCODE), http://encodeproject.org, has completed its fifth year of scientific collaboration to create a comprehensive catalog of functional elements in the human genome, and its third year of investigations in the mouse genome. Since the last report in this journal, the ENCODE human data repertoire has grown by 898 new experiments (totaling 2886), accompanied by a major integrative analysis. In the mouse genome, results from 404 new experiments became available this year, bringing the total collected over the course of the project to 583. The University of California, Santa Cruz, makes these data available on the public Genome Browser, http://genome.ucsc.edu, for visual browsing and data mining. Downloads of both raw and processed data files are supported. The ENCODE portal provides specialized tools and information about the ENCODE data sets.

Figures

Figure 1.

ENCODE data displayed in the UCSC Genome Browser together with annotations from the ENCODE Analysis Hub in the region of the nucleoporin gene NUP133 demonstrate the power this diversity of data provides for visual interpretation. The GENCODE Basic gene set shows this gene having four protein-coding splice variants and three smaller non-coding transcripts nearby. The proteogenomics track shows support for many of the coding exons, with protein localized in the nucleus, but not in plasma membrane or mitochondria. The long polyA RNA signal shows strong peaks over the exons and low intron signal in the cytosol, with greater intron signal in the nucleus. This is expected because nuclear mRNAs are not all completely spliced. The Combined Genome Segmentation integrates signal from many histone modifications and classifies regions into those with characteristics of promoters (red), enhancers (yellow), insulators (blue), transcribed regions (green) and repressed regions (gray). Below are signal tracks from four of the eight histone modifications used as input to the segmentation. The promoter and transcribed regions agree with the RNA evidence and, like the RNA data, show no evidence of transcription of the non-coding gene to the right of NUP133. Underneath the GM12878 histone signals is a track that overlays one of the histone signals, H3K27Ac, in seven different cell lines (with GM12878 shown in red). A peak in H3K27Ac appears at the enhancer, but as is often the case with enhancers, this appears to be relatively cell-type specific, in contrast to the larger peak near the promoter, where the black coloration indicates the peak is shared by many cell types. The DNase hypersensitivity and transcription factor tracks also provide evidence for both promoter and enhancer. Finally, the mappability track marks regions where short reads are not uniquely mappable, meaning the data there are incomplete and therefore harder to interpret. Although most of this region is mappable, there are many small regions throughout and one larger region on the right where mapping is problematic. Overall, the ENCODE data in this region show strong evidence that this is a nuclear-localized protein-coding gene with a promoter that is used in a wide variety of cell types, and is likely to be regulated by tissue-specific enhancers as well.

Figure 2.

The ENCODE Analysis Hub at the EBI hosts over 2800 ENCODE data sets, organized in six tracks controlled via the track menu shown here.

Figure 3.

All three screens of the Experiment Matrix for mouse are shown overlaid. The Data Summary screen lists experiments by data type and links to the two matrix screens, which organize the data by assay and cell type. Clicking the appropriate table row or matrix cell launches a Track or File search tool (based on the Track/File selector control) that allows further refinement of the selection for browsing or download.
