Tissue resolved, gene structure refined equine transcriptome
T A Mansour et al. BMC Genomics. 2017.
Abstract
Background: Transcriptome interpretation relies on a good-quality reference transcriptome for accurate quantification of gene expression as well as functional analysis of genetic variants. The current annotation of the horse genome lacks the specificity and sensitivity necessary to assess gene expression, especially at the isoform level, and suffers from insufficient annotation of untranslated region (UTR) usage. We built an annotation pipeline for horse and used it to integrate 1.9 billion reads from multiple RNA-seq data sets into a new refined transcriptome.
Results: This equine transcriptome integrates eight different tissues from 59 individuals and improves gene structure and isoform resolution, while providing considerable tissue-specific information. We utilized four levels of transcript filtration in our pipeline, aimed at producing several transcriptome versions that are suitable for different downstream analyses. Our most refined transcriptome includes 36,876 genes and 76,125 isoforms, with 6474 candidate transcriptional loci novel to the equine transcriptome.
Conclusions: We have employed a variety of descriptive statistics and figures that demonstrate the quality and content of the transcriptome. The equine transcriptomes provided by this pipeline show the best tissue-specific resolution of any equine transcriptome to date and are flexible enough for several downstream analyses. We encourage the integration of further equine transcriptomes with our annotation pipeline to continue improving the equine transcriptome.
Keywords: Equine transcriptome; RNA-seq; Tissue-specificity.
Figures
Fig. 1
An outline of the workflow used to generate each version of the transcriptome. Transcriptome products are in ovals. Programs used to perform various steps are indicated in parentheses. All transcriptome versions and the pipeline scripts are publicly available.
Fig. 2
Comparison of our refined transcriptome to current equine annotations. The degree of similarity between our refined transcriptome and current annotations is shown in (a). The annotation of MUTYH in the refined transcriptome adds several isoforms (α, β, and γ) that are also seen in human (b). The annotation of CYP7A1 in the refined transcriptome also includes an extended alternative first exon not seen in other species (c).
Fig. 3
Tissue-specific gene and isoform composition of the transcriptome. A heatmap of genes with high expression and substantial expression differences across tissues (a). A bar graph showing isoforms uniquely present (red-outlined bars above the x-axis) or solely absent (blue-outlined bars extending below the x-axis) in each tissue; the green trendline corresponds to the cumulative TPM of the uniquely present transcripts (b). A stacked bar graph showing the percentage of transcription from mitochondrial genes versus nuclear-encoded genes (c). Emb. is short for embryo.
Fig. 4
Novel gene analysis and classification. A bar graph comparing all the novel genes against the current equine annotations (a). The three categories of novel genes are supported novel genes (Category I); unsupported but conserved novel genes (Category II); and unsupported, unconserved novel genes with an ORF (Category III). Stacked bar graphs of transcript counts for all three categories of novel genes, showing exonic composition (b) and cumulative TPM in a tissue-specific manner (c).
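The per-tissue counts summarized in Fig. 3b (isoforms uniquely present in one tissue, isoforms absent from only one tissue, and the cumulative TPM of the uniquely present ones) can be illustrated with a minimal sketch. This is not the authors' pipeline code: the function name, the 1.0 TPM expression cutoff, the tissue names, and the toy values are all assumptions made for illustration.

```python
def tissue_unique_isoforms(tpm, min_tpm=1.0):
    """Summarize tissue-specific isoform presence.

    tpm     -- dict mapping isoform ID -> dict of tissue -> TPM value
    min_tpm -- hypothetical cutoff above which an isoform counts as expressed
    """
    tissues = sorted({t for row in tpm.values() for t in row})
    summary = {
        t: {"unique_present": 0, "unique_absent": 0, "cumulative_tpm": 0.0}
        for t in tissues
    }
    for row in tpm.values():
        expressed = [t for t in tissues if row.get(t, 0.0) >= min_tpm]
        if len(expressed) == 1:
            # Expressed in exactly one tissue: uniquely present there.
            only = expressed[0]
            summary[only]["unique_present"] += 1
            summary[only]["cumulative_tpm"] += row[only]
        elif len(expressed) == len(tissues) - 1:
            # Expressed everywhere except one tissue: solely absent there.
            missing = next(t for t in tissues if t not in expressed)
            summary[missing]["unique_absent"] += 1
    return summary


# Toy data with made-up tissue names and TPM values (not from the paper).
toy_tpm = {
    "iso1": {"brain": 12.0, "muscle": 0.0, "skin": 0.2},
    "iso2": {"brain": 5.0, "muscle": 8.0, "skin": 0.0},
    "iso3": {"brain": 3.0, "muscle": 4.0, "skin": 6.0},
    "iso4": {"brain": 0.0, "muscle": 0.5, "skin": 2.5},
}
summary = tissue_unique_isoforms(toy_tpm)
for tissue, stats in summary.items():
    print(tissue, stats)
```

The cutoff strongly affects the counts, so any real analysis would need to justify the threshold and account for differences in sequencing depth between tissues.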