AnaLysis of Expression on human chromosome 21, ALE-HSA21: a pilot integrated web resource
Margherita Scarpato et al. Database (Oxford). 2014.
Abstract
Transcriptome studies have shown the pervasive nature of transcription, demonstrating that almost all genes undergo alternative splicing. Accurately annotating all transcripts of a gene is crucial: it is needed to understand the impact of mutations on phenotypes, to shed light on the genetic and epigenetic regulation of mRNAs and, more generally, to widen our knowledge of cell functionality and tissue diversity. RNA sequencing (RNA-Seq) and other applications of next-generation sequencing provide valuable data for improving annotation accuracy, while also creating issues related to the variety, complexity and size of the data produced. In this scenario, the lack of user-friendly resources that are easily accessible to researchers with limited bioinformatics skills makes it difficult to retrieve complete information about one or a few genes without browsing a jungle of databases. At the same time, the increasing amount of data from 'omics' technologies calls for integrated databases that merge different data formats from distinct but complementary sources. In light of these considerations, and given the wide interest in studying Down syndrome, a genetic condition caused by trisomy of human chromosome 21 (HSA21), we developed an integrated relational database and a web interface, named ALE-HSA21 (AnaLysis of Expression on HSA21), accessible at http://bioinfo.na.iac.cnr.it/ALE-HSA21. For all coding and noncoding transcripts of chromosome 21, this comprehensive and user-friendly web resource integrates existing gene annotations and transcripts identified de novo through RNA-Seq analysis with predictive computational analysis of regulatory sequences. Given the role of noncoding RNAs and untranslated regions of coding genes in key regulatory mechanisms, ALE-HSA21 is also an interesting web-based platform for investigating such processes.
The 'transcript-centric' and easily accessible nature of ALE-HSA21 makes this resource a valuable tool for rapidly retrieving data at the isoform level rather than at the gene level, which is useful for investigating any disease, molecular pathway or cell process involving chromosome 21 genes. Database URL: http://bioinfo.na.iac.cnr.it/ALE-HSA21/.
Figures
Figure 1.
Schematic overview of the data collected in ALE-HSA21 and of the computational analysis. Panel (A) lists the open-access databases used to retrieve information and shows a cartoon of the computational workflow used to analyze the data. Data derived from these sources were collected and integrated into our relational database and its web interface, represented on the right by the ALE-HSA21 Homepage. The left part of panel (B) schematically illustrates the computational approach used to analyze the RNA-Seq data sets. The right part depicts the workflow of the in silico analysis performed on the regulatory sequences of both coding and noncoding transcripts of chromosome 21. Green boxes indicate data files; orange indicates the computational tools used to perform the analysis; blue indicates the ‘features’ of interest; white indicates the databases and the regulatory data sets retrieved from them.
Figure 2.
Screenshots from the ALE-HSA21 web resource. Panel (A) shows the Homepage with the Navigation Bar; panel (B) shows the list of HSA21 transcripts in the ‘Coding genes’ section in tabular format. The official gene symbol, ID, genomic coordinates, sense of transcription, number of exons and UniProt IDs are reported. The black arrow and circle indicate an example of a clickable item (the SOD1 gene in this example). Clicking it takes users to the Gene Description page, depicted in panel (C). An interactive 3D graphical representation of each transcript is embedded in this web page. Each gene element is linked to the results of the in silico analysis. Colored circles (red for ‘Promoter’, green for ‘exons’, light blue for ‘introns’ and gray for ‘3′ UTRs’) correspond to the clickable elements of the 3D images. The same color scheme is used in panel (D) to indicate the results of the computational analyses for those elements.
Figure 3.
Example of the data provided for miRNAs in the ALE-HSA21 web portal. Panels (A) and (B) show the results of the computational prediction of MiTGs in tabular format and as Venn diagrams, respectively. These data are accessible by clicking the ‘Target genes’ button embedded within the miRNA web pages. ‘Validated’, ‘predicted’ and ‘co-expressed’ correspond to the target genes according to the miRWalk and CoMeTa databases. Panel (C) shows a prediction of the pre-miRNA secondary structure obtained with RNAfold. Mature miRNA sequences are indicated by black brackets.