Construction of microRNA functional families by a mixture model of position weight matrices

Functional microRNA targets in protein coding sequences

Bioinformatics, 2012

Motivation: Experimental evidence has accumulated showing that microRNA (miRNA) binding sites within protein coding sequences (CDSs) are functional in controlling gene expression. Results: Here we report a computational analysis of such miRNA target sites, based on features extracted from existing mammalian high-throughput immunoprecipitation and sequencing data. The analysis is performed independently for the CDS and the 3′ untranslated regions (3′-UTRs) and reveals different sets of features and models for the two regions. The two models are combined into a novel computational model for miRNA target genes, DIANA-microT-CDS, which achieves higher sensitivity compared with other popular programs and the model that uses only the 3′-UTR target sites. Further analysis indicates that genes with shorter 3′-UTRs are preferentially targeted in the CDS, suggesting that evolutionary selection might favor additional sites on the CDS in cases where there is restricted space on the 3′-UTR. Availability: The results of DIANA-microT-CDS are available at
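As a rough illustration of scanning both regions of a transcript for candidate sites, the sketch below looks for canonical 7-mer seed matches (the reverse complement of miRNA nucleotides 2-8) in a CDS and a 3′-UTR separately. This is an assumption made for illustration only: it is not the DIANA-microT-CDS model, whose features and region-specific models are not described in the abstract, and the function names and toy sequences are hypothetical.

```python
# Minimal sketch: locate canonical 7-mer seed matches in a CDS and a 3'-UTR.
# NOT the DIANA-microT-CDS model; seed matching is an illustrative assumption.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the mRNA 7-mer complementary to miRNA nucleotides 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_sites(mirna: str, region: str) -> list[int]:
    """Return 0-based start positions of every seed match in a region."""
    site = seed_site(mirna)
    region = region.upper().replace("T", "U")
    hits = []
    start = region.find(site)
    while start != -1:
        hits.append(start)
        start = region.find(site, start + 1)  # allow overlapping matches
    return hits


def scan_transcript(mirna: str, cds: str, utr3: str) -> dict[str, list[int]]:
    """Scan the two regions independently and report hits per region."""
    return {"CDS": find_sites(mirna, cds), "3'-UTR": find_sites(mirna, utr3)}


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # illustrative 22-nt miRNA sequence
    cds = "AUGCUACCUCAAA"              # toy CDS fragment (hypothetical)
    utr3 = "GGGCUACCUCUUCUACCUC"       # toy 3'-UTR fragment (hypothetical)
    print(scan_transcript(mirna, cds, utr3))
```

Keeping the two regions separate in the output mirrors the abstract's point that CDS and 3′-UTR sites are analyzed independently before being combined.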

miROrtho: computational survey of microRNA genes

Nucleic Acids Research, 2009

MicroRNAs (miRNAs) are short, non-protein coding RNAs that direct the widespread phenomenon of post-transcriptional regulation of metazoan genes. The mature ~22-nt-long RNA molecules are processed from genome-encoded stem-loop structured precursor genes. Hundreds of such genes have been experimentally validated in vertebrate genomes, yet their discovery remains challenging, and substantially higher numbers have been estimated. The miROrtho database (http://cegg.unige.ch/mirortho) presents the results of a comprehensive computational survey of miRNA gene candidates across the majority of sequenced metazoan genomes. We designed and applied a three-tier analysis pipeline: (i) an SVM-based ab initio screen for potent hairpins, plus homologs of known miRNAs, (ii) an orthology delineation procedure and (iii) an SVM-based classifier of the ortholog multiple sequence alignments. The web interface provides direct access to putative miRNA annotations, ortholog multiple alignments, RNA secondary structure conservation, and sequence data. The miROrtho data are conceptually complementary to the miRBase catalog of experimentally verified miRNA sequences, providing a consistent comparative genomics perspective as well as identifying many novel miRNA genes with strong evolutionary support.

Computational Identification of Putative miRNAs

2015

MicroRNAs are a class of small noncoding RNAs, approximately 20-23 nt long, that are evolutionarily conserved and play a vital role in various biological processes by either degrading mRNA or repressing its translation. The published Felis catus (cat) genome sequence reported only the number of miRNAs in the genome, without any further details on these miRNAs. This paper describes an in silico comparative approach that uses all known vertebrate pre-miRNA sequences as queries, and reports 405 putative miRNAs from the cat genome. We determine the identity values of pre-miRNAs and mature miRNAs as well as statistical sequence characteristics. Interestingly, among the 405 miRNAs, 90, 53 and 50 showed 100% identity to cattle, human and dog, respectively. Further, we validated 6 miRNAs whose identity to the query sequence is <85% using the MiPred algorithm. We also identify 25 miRNA clusters in cat based on their homologs in other vertebrates. Most importantly, based on identities among pre-miRNAs, mature miRNAs, miRNA families and clusters, we observe that cat miRNAs are more similar to those of cattle than to those of humans. Our results may therefore add a new dimension to studies of the evolution of the cat.
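The identity values mentioned above can be illustrated with a small sketch that computes percent identity between two already aligned sequences of equal length. The abstract does not say how identity was calculated, so the column-wise definition, the gap handling, and the function name here are assumptions.

```python
# Minimal sketch: percent identity between two pre-aligned sequences.
# Assumption: identity = matching non-gap columns / alignment length.


def percent_identity(query: str, hit: str) -> float:
    """Return the percentage of alignment columns where both residues match."""
    if len(query) != len(hit):
        raise ValueError("sequences must be aligned to the same length")
    if not query:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for q, h in zip(query.upper(), hit.upper())
        if q == h and q != "-"  # a gap aligned to a gap is not a match
    )
    return 100.0 * matches / len(query)


if __name__ == "__main__":
    # Illustrative sequences only; not taken from the study.
    query = "UGAGGUAGUAGGUUGUAUAGUU"
    candidate = "UGAGGUAGUAGGUUGUAUGGUU"
    print(f"{percent_identity(query, candidate):.1f}% identity")
```

Under this definition, a candidate would fall into the "<85%" group whenever fewer than 85 of every 100 aligned columns match the query.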

A Uniform System for the Annotation of Vertebrate microRNA Genes and the Evolution of the Human microRNAome

Annual Review of Genetics, 2015

Although microRNAs (miRNAs) are among the most intensively studied molecules of the past 20 years, determining what is and what is not a miRNA has not been straightforward. Here, we present a uniform system for the annotation and nomenclature of miRNA genes. We show that fewer than a third of the 1,881 human miRBase entries, and only approximately 16% of the 7,095 metazoan miRBase entries, are robustly supported as miRNA genes. Furthermore, we show that the human repertoire of miRNAs has been shaped by periods of intense miRNA innovation, and that mature gene products show a very different tempo and mode of sequence evolution than star products. We establish a new open access database, MirGeneDB (http://mirgenedb.org), to catalog this set of robustly supported miRNAs, which complements the efforts of miRBase, but differs from it by annotating the mature versus star products, and by imposing an evolutionary hierarchy upon this curated and consistently named repertoire.

Conserved Expression Patterns Predict microRNA Targets

PLoS Computational Biology, 2009

microRNAs (miRNAs) are major regulators of gene expression and thereby modulate many biological processes. Computational methods have been instrumental in understanding how miRNAs bind to mRNAs to induce their repression but have proven inaccurate. Here we describe a novel method that combines expression data from human and mouse to discover conserved patterns of expression between orthologous miRNAs and mRNA genes. This method allowed us to predict thousands of putative miRNA targets. Using the luciferase reporter assay, we confirmed 4 out of 6 of our predictions. In addition, this method predicted many miRNAs that act as expression enhancers. We show that many miRNA enhancer effects are mediated through the repression of negative transcriptional regulators and that this effect could be as common as the widely reported repression activity of miRNAs. Our findings suggest that the indirect enhancement of gene expression by miRNAs could be an important component of miRNA regulation that has been widely neglected to date.
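The idea of a conserved expression pattern can be sketched as follows: correlate a miRNA profile with an mRNA profile in each species, then keep the pair only if the two species agree. The use of Pearson correlation, the 0.8 threshold, the labels, and the function names are all assumptions for illustration; the abstract does not specify the statistic or cutoffs the authors used.

```python
# Minimal sketch: call a miRNA-mRNA expression pattern "conserved" when the
# correlation has the same strong sign in human and mouse.
# Pearson correlation and the 0.8 threshold are illustrative assumptions.
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def conserved_pattern(human_mirna, human_mrna, mouse_mirna, mouse_mrna,
                      threshold: float = 0.8) -> str:
    """Classify an orthologous miRNA-mRNA pair by its cross-species pattern."""
    r_human = pearson(human_mirna, human_mrna)
    r_mouse = pearson(mouse_mirna, mouse_mrna)
    if r_human <= -threshold and r_mouse <= -threshold:
        return "repression-like"    # anticorrelated in both species
    if r_human >= threshold and r_mouse >= threshold:
        return "enhancement-like"   # positively correlated in both species
    return "not conserved"


if __name__ == "__main__":
    # Toy expression profiles across four tissues (hypothetical values).
    print(conserved_pattern([1, 2, 3, 4], [4, 3, 2, 1],
                            [1, 2, 3, 4], [8, 6, 4, 2]))
```

The "enhancement-like" label corresponds to the positively correlated pairs that the abstract interprets as indirect enhancement, for example through repression of a negative transcriptional regulator.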

Where we stand, where we are moving: Surveying computational techniques for identifying miRNA genes and uncovering their regulatory role

Journal of Biomedical Informatics, 2013

Traditional biology was forced to restate some of its principles when microRNA (miRNA) genes and their regulatory role were first discovered. Typically, miRNAs are small non-coding RNA molecules that can bind to the 3′ untranslated region (UTR) of their mRNA target genes for cleavage or translational repression. Existing experimental techniques for their identification and for the prediction of their target genes share some important limitations, such as low coverage, time-consuming experiments and high-cost reagents. Hence, many computational methods have been proposed to overcome these limitations. Recently, many researchers have emphasized the development of computational approaches to predict the participation of miRNA genes in regulatory networks and to analyze their transcription mechanisms. All these approaches have certain advantages and disadvantages, which are described in the present survey. Our work differs from existing review papers by updating the list of methodologies and emphasizing the computational issues that arise in miRNA data analysis. Furthermore, in the present survey, the various miRNA data analysis steps are treated as an integrated procedure whose aim is to uncover the regulatory role and mechanisms of miRNA genes. This integrated view of the miRNA data analysis steps may be useful to all researchers, even those who work on just a single step.

Computational prediction of the localization of microRNAs within their pre-miRNA

Nucleic Acids Research, 2013

MicroRNAs (miRNAs) are short RNA species derived from hairpin-forming miRNA precursors (pre-miRNA) and acting as key posttranscriptional regulators. Most computational tools labeled as miRNA predictors are in fact pre-miRNA predictors and provide no information about the putative miRNA location within the pre-miRNA. Sequence and structural features that determine the location of the miRNA, and the extent to which these properties vary from species to species, are poorly understood. We have developed miRdup, a computational predictor for the identification of the most likely miRNA location within a given pre-miRNA or the validation of a candidate miRNA. MiRdup is based on a random forest classifier trained with experimentally validated miRNAs from miRBase, with features that characterize the miRNA-miRNA* duplex. Because we observed that miRNAs have sequence and structural properties that differ between species, mostly in terms of duplex stability, we trained various clade-specific miRdup models and obtained increased accuracy. MiRdup self-trains on the most recent version of miRBase and is easy to use. Combined with existing pre-miRNA predictors, it will be valuable for both de novo mapping of miRNAs and filtering of large sets of candidate miRNAs obtained from transcriptome sequencing projects. MiRdup is open source under the GPLv3 and available at

Identification of microRNA-mRNA modules using microarray data

BMC Genomics, 2011

Background: MicroRNAs (miRNAs) are post-transcriptional regulators of mRNA expression and are involved in numerous cellular processes. Consequently, miRNAs are an important component of gene regulatory networks and an improved understanding of miRNAs will further our knowledge of these networks. There is a many-to-many relationship between miRNAs and mRNAs because a single miRNA targets multiple mRNAs and a single mRNA is targeted by multiple miRNAs. However, most of the current methods for the identification of regulatory miRNAs and their target mRNAs ignore this biological observation and focus on miRNA-mRNA pairs. Results: We propose a two-step method for the identification of many-to-many relationships between miRNAs and mRNAs. In the first step, we obtain miRNA and mRNA clusters using a combination of miRNA-target mRNA prediction algorithms and microarray expression data. In the second step, we determine the associations between miRNA clusters and mRNA clusters based on changes in miRNA and mRNA expression profiles. We consider the miRNA-mRNA clusters with statistically significant associations to be potentially regulatory and, therefore, of biological interest. Conclusions: Our method reduces the interactions between several hundred miRNAs and several thousand mRNAs to a few miRNA-mRNA groups, thereby facilitating a more meaningful biological analysis and a more targeted experimental validation.

An Overview of Computational Approaches for Prediction of miRNA Genes and their Targets

Current Bioinformatics, 2011

MicroRNAs (miRNAs) are a recently identified class of cellular non-coding RNAs that regulate protein expression and the growth of a biological system during different stages of life. The active, mature miRNAs are 17-24 bases long, single-stranded RNA molecules expressed in both eukaryotic and prokaryotic cells and are known to affect the translation or stability of target messenger RNAs (mRNAs). Each miRNA is believed to regulate multiple genes, and more than one third of all human genes may be regulated by miRNA molecules. In this review, we focus on the roles of these tiny molecules in different bioprocesses and on the prediction of miRNAs and their targets from a bioinformatics point of view.