Rituparno Sen | Helmholtz Centre for Infection Research
Papers by Rituparno Sen
Proceedings of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies
Machine learning (ML) methods are often used to identify members of non-coding RNA classes such as microRNAs or snoRNAs. However, ML methods have not been successfully used for homology search tasks. A systematic evaluation of ML in homology search requires large, controlled, and known ground truth test sets, and thus, methods to construct large realistic artificial data sets. Here we describe a method for producing sets of arbitrarily large and diverse snoRNA sequences based on artificial evolution. These are then used to evaluate supervised ML methods (Support Vector Machine, Artificial Neural Network, and Random Forest) for snoRNA detection in a chordate genome. Our results indicate that ML approaches can indeed be competitive also for homology search.
(a) lnCeDB is searched by the gene symbol MCL1. (b) An intermediate page shows the different transcripts of the gene MCL1. (c) Upon choosing a transcript, the results page shows the potential lncRNAs working as ceRNA for the chosen transcript of MCL1. (d) By clicking on a lncRNA or mRNA id in the ceRNA table, the user views all miRNA targets on the chosen transcript. (e) By clicking on a serial number in the ceRNA table, the user views the expression heatmap and bar chart for the ceRNA pair along with co-expressed shared miRNAs in 22 human tissues.
HumanViCe: host ceRNA network in virus infected cells in human
Non-Coding RNA, 2021
Long non-coding RNAs (lncRNAs) are widely recognized as important regulators of gene expression. Their molecular functions range from miRNA sponging to chromatin-associated mechanisms, leading to effects in disease progression and establishing them as diagnostic and therapeutic targets. Still, only a few representatives of this diverse class of RNAs are well studied, while the vast majority is poorly described beyond the existence of their transcripts. In this review we survey common in silico approaches for lncRNA annotation. We focus on the well-established sets of features used for classification and discuss their specific advantages and weaknesses. While the available tools perform very well for the task of distinguishing coding sequence from other RNAs, we find that current methods are not well suited to distinguish lncRNAs or parts thereof from other non-protein-coding input sequences. We conclude that the distinction of lncRNAs from intronic sequences and untranslated regions...
PLOS ONE, 2014
(a) Fraction of GENCODE 19 lncRNA transcripts with putative miRNA targets. (b) Fraction of mRNAs with predicted miRNA targets. (c) Fraction of lncRNAs with predicted ceRNA function compared to all lncRNAs with putative miRNA targets. (d) Fraction of mRNAs with predicted ceRNA function compared to all mRNAs with putative miRNA targets.
Experimentally verified lncRNA ceRNAs in lnCeDB.
Non-Coding RNA, 2017
Long non-coding RNAs (lncRNAs) form a substantial component of the transcriptome and are involved in a wide variety of regulatory mechanisms. Compared to protein-coding genes, they are often expressed at low levels and are restricted to a narrow range of cell types or developmental stages. As a consequence, the diversity of their isoforms is still far from being recorded and catalogued in its entirety, and the debate is ongoing about what fraction of non-coding RNAs truly conveys biological function rather than being "junk". Here, using a collection of more than 100 transcriptomes from related B cell lymphoma, we show that lncRNA loci produce a very defined set of splice variants. While some of them are so rare that they become recognizable only in the superposition of dozens or hundreds of transcriptome datasets and not infrequently include introns or exons that have not been included in available genome annotation data, there is still a very limited number of processing products for any given locus. The combined depth of our sequencing data is large enough to effectively exhaust the isoform diversity: the overwhelming majority of splice junctions that are observed at all are represented by multiple junction-spanning reads. We conclude that the human transcriptome produces virtually no background of RNAs that are processed at effectively random positions, but is, under normal circumstances, confined to a well-defined set of splice variants.
Scientific Reports, 2016
Some earlier studies have reported an alternative mode of microRNA-target interaction. We detected target regions within mRNA transcripts from AGO PAR-CLIP that did not contain any conventional microRNA seed pairing but only had non-conventional binding sites with the microRNA 3′ end. Our analysis of 7 data sets that measured global protein fold change after microRNA transfection pointed towards an association of target protein fold change with 6-mer and 7-mer target sites involving the microRNA 3′ end. We developed a model to predict the degree of microRNA target regulation in terms of protein fold changes from the number of different conventional and non-conventional target sites present in the target, and found significant correlation of its output with protein expression changes. We validated the effect of non-conventional interactions with the target by modulating the abundance of microRNA in the human breast cancer cell line MCF-7. The validation was done using luciferase assay and immuno...
The Scientific World Journal, 2014
Competing endogenous RNAs (ceRNAs) vie with messenger RNAs (mRNAs) for microRNAs (miRNAs) through shared miRNA response elements (MREs) and act as modulators of miRNA by influencing the available level of miRNA. It has recently been discovered that, apart from protein-coding ceRNAs, pseudogenes, long noncoding RNAs (lncRNAs), and circular RNAs act as miRNA “sponges” by sharing common MREs, inhibiting normal miRNA targeting activity on mRNA. These MRE-sharing elements form the posttranscriptional ceRNA network to regulate mRNA expression. ceRNAs are widely implicated in many biological processes. Recent studies have identified ceRNAs associated with a number of diseases including cancer. This brief review focuses on the molecular mechanism of ceRNA as part of the complex post-transcriptional regulatory circuit in the cell and the impact of ceRNAs in development and disease.
Frontiers in Genetics, 2013
Frontiers in Genetics, 2014
Theory in Biosciences
Many small nucleolar RNAs and many of the hairpin precursors of miRNAs are processed from long non-protein-coding host genes. In contrast to their highly conserved and heavily structured payload, the host genes feature poorly conserved sequences. Nevertheless, there is mounting evidence that the host genes have biological functions beyond their primary task of carrying a ncRNA as payload. So far, no connections between the function of the host genes and the function of their payloads have been reported. Here we investigate whether there is evidence for an association of host gene function or mechanisms with the type of payload. To assess this hypothesis we test whether the miRNA host genes (MIRHGs), snoRNA host genes (SNHGs), and other lncRNA host genes can be distinguished based on sequence and/or structure features unrelated to their payload. A positive answer would imply a functional and mechanistic correlation between host genes and their payload, provided the classification doe...
Host-virus interaction via host cellular components has been an important field of research in recent times. RNA interference mediated by short interfering RNAs and microRNAs (miRNAs) is a widespread anti-viral defense strategy. Importantly, viruses also encode their own miRNAs. In recent times miRNAs were identified as key players in host-virus interaction. Furthermore, viruses were shown to exploit the host miRNA networks to suit their own needs. The complex cross-talk between host and viral miRNAs and their cellular and viral targets forms the environment for viral pathogenesis. Apart from protein-coding mRNAs, non-coding RNAs may also be targeted by host or viral miRNAs in virus-infected cells, and viruses can exploit the host miRNA-mediated gene regulatory network via the competing endogenous RNA effect. A recent report showed that viral U-rich non-coding RNAs called HSUR, expressed in T cells infected with the primate virus herpesvirus saimiri (HVS), were able to bind to three host miRNAs, causing significant alteration in the cellular level of one of the miRNAs. We have predicted protein-coding and non-protein-coding targets for viral and human miRNAs in virus-infected cells. We identified viral miRNA targets within host non-coding RNA loci from AGO-interacting regions in three different virus-infected cells. Gene ontology (GO) and pathway enrichment analysis of the genes comprising the ceRNA networks in the virus-infected cells revealed enrichment of key cellular signaling pathways related to cell fate decisions and gene transcription, like Notch and Wnt signaling pathways, as well as pathways related to viral entry, replication and virulence. We identified a vast number of non-coding transcripts acting as potential ceRNAs to immune response-associated genes, e.g., APOBEC family genes, in some virus-infected cells. All this information is compiled in HumanViCe (http://gyanxet-beta.com/humanvice), a comprehensive database that provides the potential ceRNA networks in virus-infected human cells.
Long noncoding RNA (lncRNA) influences post-transcriptional regulation by interfering with the microRNA (miRNA) pathways, acting as competing endogenous RNA (ceRNA). These lncRNAs contain miRNA responsive elements (MREs) and control the endogenous miRNAs available for binding to their target mRNAs, thus reducing the repression of these mRNAs. lnCeDB provides a database of human lncRNAs (from GENCODE version 19) that can potentially act as ceRNAs. The putative mRNA targets of human miRNAs and the targets mapped to AGO-clipped regions are collected from TargetScan and StarBase, respectively. The lncRNA targets of human miRNAs (up to GENCODE 11) are downloaded from the miRCode database. miRNA targets on the rest of the GENCODE 19 lncRNAs are predicted by our algorithm for finding seed-matched target sites. These putative miRNA-lncRNA interactions are mapped to the Ago-interacting regions within lncRNAs. To estimate the likelihood that an lncRNA-mRNA pair actually acts as ceRNAs, we use two methods. First, a ceRNA score is calculated as the ratio of the number of MREs shared between the pair to the total number of MREs of the individual candidate gene. Second, the P-value for each ceRNA pair is determined by a hypergeometric test using the number of miRNAs shared between the ceRNA pair against the number of miRNAs interacting with the individual RNAs. Typically, in a pair of RNAs targeted by common miRNA(s), there should be a correlation of expression, so that an increase in the level of one ceRNA results in an increased level of the other ceRNA. Near-equimolar concentration of the competing RNAs is associated with a more profound ceRNA effect. In lnCeDB one can not only browse for lncRNA-mRNA pairs having common targeting miRNAs, but also compare the expression of the pair in 22 human tissues to estimate the chances of the pair actually being ceRNAs. Availability: Downloadable freely from http://gyanxet-beta.com/lncedb/.
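The two measures described in the lnCeDB abstract can be illustrated with a minimal Python sketch. This is not the lnCeDB implementation: the function names, the choice of denominator for the score, and the use of the total number of miRNAs considered as the hypergeometric population are all assumptions made for illustration.

```python
from math import comb


def cerna_score(shared_mres: int, total_mres_candidate: int) -> float:
    """Ratio of MREs shared by the pair to all MREs of the candidate gene.

    Assumption: the denominator is the MRE count of one candidate
    transcript, as the abstract's wording suggests.
    """
    if total_mres_candidate <= 0:
        raise ValueError("candidate gene must have at least one MRE")
    if not 0 <= shared_mres <= total_mres_candidate:
        raise ValueError("shared MREs must lie between 0 and the total")
    return shared_mres / total_mres_candidate


def hypergeom_pvalue(shared: int, n_lnc: int, n_mrna: int, population: int) -> float:
    """Upper-tail hypergeometric probability P(X >= shared).

    shared     : miRNAs targeting both RNAs
    n_lnc      : miRNAs targeting the lncRNA
    n_mrna     : miRNAs targeting the mRNA
    population : all miRNAs considered (hypothetical background size)
    """
    if max(n_lnc, n_mrna) > population or shared < 0:
        raise ValueError("inconsistent miRNA counts")
    upper = min(n_lnc, n_mrna)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_lnc, i) * comb(population - n_lnc, n_mrna - i)
        for i in range(shared, upper + 1)
    )
    return tail / comb(population, n_mrna)


if __name__ == "__main__":
    # Hypothetical counts, not taken from lnCeDB.
    print(cerna_score(3, 12))
    print(hypergeom_pvalue(4, 10, 15, 500))
```

A small P-value here would indicate that the pair shares more miRNAs than expected by chance under the assumed background; how lnCeDB defines that background is not stated in the abstract.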