Mario Lauria - Academia.edu
Papers by Mario Lauria
Scientific Reports, Mar 5, 2024
Scientific Reports, Sep 2, 2016
Among the genetic factors known to increase the risk of late onset Alzheimer's disease (AD), the presence of the apolipoprotein E4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer's disease has not yet been clearly elucidated. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer's disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer's disease. Interestingly, PSEN1 and APP, genes whose mutations are known to be linked to early onset Alzheimer's disease, are closely linked to this pathway. In conclusion, the role of APOE4 in inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer's disease. Alzheimer's disease (AD) is the most common cause of dementia, characterized clinically by a decline in cognitive function and by distinctive brain pathology with neuronal loss and the formation of amyloid plaques and neurofibrillary tangles. Early onset AD is rare and is caused by mutations in specific genes such as amyloid precursor protein (APP), presenilin 1 (PSEN1) and presenilin 2 (PSEN2).
Late onset AD is the most common form and, although several putative susceptibility genes have been reported, APOE, coding for apolipoprotein E, is the most robust susceptibility gene known to date. Three common isoforms of APOE have been recognized: APOE2 (cys112, cys158), APOE3 (cys112, arg158) and APOE4 (arg112, arg158). The presence of the alleles coding for the APOE4 isoform is associated with an increased risk (up to tenfold in homozygous cases [1]) of late onset AD when compared to the most common APOE3 allele or to APOE2, a rarer allele that instead appears to have a protective effect [2,3]. APOE is a multifunctional glycosylated protein with a major role in lipid transport and atherosclerosis pathogenesis, and it is expressed in several organs, with the highest expression in the liver and brain. In the central nervous system, although neurons can produce APOE under certain conditions, non-neuronal cells, mainly astrocytes and to some extent microglia, are the major cell types that express APOE [4,5]. Numerous mechanisms by which APOE influences AD pathogenesis have been proposed, including a role in the clearance of amyloid β [6,7], but how this influences the pathogenic molecular processes remains to be clarified.
Nucleic Acids Research, May 9, 2015
SCUDO (Signature-based ClUstering for DiagnOstic purposes) is an online tool for the analysis of gene expression profiles for diagnostic and classification purposes. The tool is based on a new method for the clustering of profiles based on a subject-specific, as opposed to disease-specific, signature. Our approach relies on the construction of a reference map of transcriptional signatures, from both healthy and affected subjects, derived from their respective mRNA or miRNA profiles. A diagnosis for a new individual can then be performed by determining the position of the individual's transcriptional signature on the map. The diagnostic power of our method has been convincingly demonstrated in an open scientific competition (SBV Improver Diagnostic Signature Challenge), scoring second place overall and first place in one of the sub-challenges.
Cells, 2020
The Negr1 gene has been significantly associated with major depression in genetic studies. Negr1 encodes a cell adhesion molecule cleaved by the protease Adam10, thus activating Fgfr2 and promoting neuronal spine plasticity. We investigated whether antidepressants modulate the expression of genes belonging to the Negr1-Fgfr2 pathway in Flinders sensitive line (FSL) rats, in a corticosterone-treated mouse model of depression, and in mouse primary neurons. Negr1 and Adam10 were the genes most affected by antidepressant treatment, and in opposite directions. Negr1 was down-regulated by escitalopram in the hypothalamus of FSL rats, by fluoxetine in the hippocampal dentate gyrus of corticosterone-treated mice, and by nortriptyline in hippocampal primary neurons. Adam10 mRNA was increased by nortriptyline administration in the hypothalamus, by escitalopram in the hippocampus of FSL rats, and by fluoxetine in mouse dorsal dentate gyrus. Similarly, nortriptyline increased Adam10 expressio...
Scientific Reports, Jan 23, 2018
In longitudinal clinical studies, methodologies available for the analysis of multivariate data with multivariate methods are relatively limited. Here, we present Consensus Clustering (CClust), a new computational method based on clustering of time profiles and posterior identification of correlation between clusters and predictors. Subjects are first clustered in groups according to a response variable temporal profile, using a robust consensus-based strategy. To discover which of the remaining variables are associated with the resulting groups, a non-parametric hypothesis test is performed between groups at every time point, and then the results are aggregated according to the Fisher method. Our approach is tested through its application to the EarlyBird cohort database, which contains temporal variations of clinical, metabolic, and anthropometric profiles in a population of 150 children followed up annually from age 5 to age 16. Our results show that our consensus-based method is ...
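The aggregation step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that "the Fisher method" refers to Fisher's combined probability test, in which k p-values are combined through the statistic X = -2 Σ ln(p_i), compared against a chi-square distribution with 2k degrees of freedom. The function name `fisher_combine` and the example p-values are hypothetical and are not taken from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine p-values with Fisher's combined probability test.

    Assumption: one p-value per time point, each in (0, 1].
    Returns the combined p-value.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Half of Fisher's statistic: X / 2 = -sum(ln p_i).
    half_stat = -sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has the closed form
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


# Hypothetical per-time-point p-values for one predictor.
per_time_point = [0.05, 0.05]
combined = fisher_combine(per_time_point)
print(combined)
```

Fisher's test assumes the combined p-values come from independent tests. Repeated measurements on the same subjects may violate that assumption, so this sketch shows only the arithmetic of the aggregation, not the paper's full procedure.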
Cell Systems, Jan 4, 2017
We report the results of a DREAM challenge designed to predict relative genetic essentialities based on a novel dataset testing 98,000 shRNAs against 149 molecularly characterized cancer cell lines. We analyzed the results of over 3,000 submissions over a period of 4 months. We found that algorithms combining essentiality data across multiple genes demonstrated increased accuracy; gene expression was the most informative molecular data type; the identity of the gene being predicted was far more important than the modeling strategy; well-predicted genes and selected molecular features showed enrichment in functional categories; and frequently selected expression features correlated with survival in primary tumors. This study establishes benchmarks for gene essentiality prediction, presents a community resource for future comparison with this benchmark, and provides insights into factors influencing the ability to predict gene essentiality from functional genetic screens. This study a...
Journal of Industrial and Production Engineering, 2017
This paper proposes methods for forward and inverse system modeling using Bayesian and least squares regression. These methods are based on both space-filling design criteria for multiple response problems and linear optimality criteria focusing on D-optimality. Modeling with and without the constant term is considered, motivated by the case study application of genetic network modeling. We propose extended one-factor-at-a-time experimentation followed by augmentation of the next-stage design, which offers biologists simplicity. Results are illustrated with numerical examples, a test problem from the literature, and a case study motivated by real-world biological research related to genetic network modeling.
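As a rough illustration of the D-optimality criterion named in the abstract above, the sketch below scores a design by det(XᵀX), where X is the model matrix of a first-order linear model. The two four-run designs, the coded levels -1 and +1, and the function names are all assumptions made for illustration and do not reproduce the designs studied in the paper.

```python
def information_matrix(design, intercept=True):
    """Return X^T X for a first-order model built from the design runs."""
    rows = [([1.0] if intercept else []) + list(run) for run in design]
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)]
            for i in range(p)]


def determinant(matrix):
    """Determinant by cofactor expansion (adequate for tiny matrices)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for j, pivot in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * pivot * determinant(minor)
    return total


def d_criterion(design, intercept=True):
    """D-criterion: larger det(X^T X) means a more informative design."""
    return determinant(information_matrix(design, intercept))


# Hypothetical two-factor designs in coded units, four runs each.
full_factorial = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]
# One-factor-at-a-time: baseline, change x1, change x2, repeat baseline.
one_factor_at_a_time = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]

print(d_criterion(full_factorial))        # 64.0
print(d_criterion(one_factor_at_a_time))  # 32.0
```

With the constant term in the model, these particular one-factor-at-a-time runs score lower than the factorial runs. Calling `d_criterion(..., intercept=False)` gives 16.0 for both, which is one way to see why the choice between modeling with and without the constant term matters when comparing designs.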
2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015
We describe a new signature definition and analysis method to be used as a biomarker for blood-based diagnosis of tuberculosis. Our new approach is based on the construction of a reference map of transcriptional signatures of both healthy and affected individuals using circulating miRNA from a large number of subjects. Once such a map is available, the diagnosis for a new patient can be performed by observing the relative position on the map of his/her transcriptional signature. To demonstrate its efficacy for this specific application we report the results of the application of our method to published data sets of circulating miRNA. Two crucial features make this method an ideal candidate for large-scale applications such as a mass screening tool, or for point-of-care diagnostics. Specifically, our method is minimally invasive because it works well with profiles of circulating miRNA. More importantly, it is robust with respect to lab-to-lab protocol variability, measurement errors and batch effects because it requires that only the relative ranking of miRNA species in a profile be accurate, not their absolute values.
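The robustness argument at the end of the abstract above rests on a simple property: a signature built only from the ranking of features does not change when every value in a profile undergoes the same monotonically increasing transformation, such as a lab-specific scale and offset. The sketch below illustrates that property only. The signature definition used here (the n highest-ranked and n lowest-ranked features), the miRNA names, and the values are hypothetical and are not the authors' actual signature definition.

```python
def rank_signature(profile, n):
    """Return the n top-ranked and n bottom-ranked feature names.

    profile maps a feature name to its measured expression value.
    """
    ordered = sorted(profile, key=profile.get, reverse=True)
    return ordered[:n], ordered[-n:]


# Hypothetical circulating miRNA profile for one subject.
profile = {
    "mir_a": 120.0,
    "mir_b": 15.0,
    "mir_c": 980.0,
    "mir_d": 47.0,
    "mir_e": 3.0,
    "mir_f": 310.0,
}

# The same subject as it might be measured in another lab: every value
# is rescaled and shifted, which preserves the ordering of the features.
rescaled = {name: 2.5 * value + 10.0 for name, value in profile.items()}

print(rank_signature(profile, 2))
print(rank_signature(rescaled, 2))
```

Both calls return the same pair of lists, because the transformation changes absolute values but not the order of the features.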
High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on
As Data and Computational Grids grow in size and complexity, the crucial task of identifying, monitoring and utilizing available resources in an efficient manner is becoming increasingly difficult. The design of monitoring systems that are scalable both in the number of sources being monitored and in the number of clients served is a challenging issue. In this paper we investigate the trade-offs of different polling strategies that can be used to monitor resource availability on machines in a distributed environment. We show how adaptive polling protocols can substantially increase scalability with a less than proportional loss of precision, and how these protocols can be personalized for different types of resource usage patterns.
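One common way to make polling adaptive, shown here purely as an assumed example and not as the protocol evaluated in the paper, is to lengthen the polling interval while a monitored value is stable and shorten it when the value changes quickly. The doubling and halving rule, the threshold, the interval bounds, and the sample readings below are all hypothetical.

```python
def next_interval(interval, previous, current,
                  threshold=0.1, min_interval=1.0, max_interval=60.0):
    """Choose the next polling interval in seconds.

    A change larger than the threshold halves the interval, trading
    traffic for precision. A smaller change doubles it, trading
    precision for fewer polls.
    """
    if abs(current - previous) > threshold:
        return max(min_interval, interval / 2)
    return min(max_interval, interval * 2)


def poll_schedule(readings, start_interval=4.0):
    """Return the interval chosen after each successive pair of readings."""
    interval = start_interval
    schedule = []
    for previous, current in zip(readings, readings[1:]):
        interval = next_interval(interval, previous, current)
        schedule.append(interval)
    return schedule


# Hypothetical CPU-load readings: stable, then a burst, then stable again.
readings = [0.20, 0.21, 0.22, 0.80, 0.30, 0.31]
print(poll_schedule(readings))  # [8.0, 16.0, 8.0, 4.0, 8.0]
```

The threshold and the interval bounds are the knobs that would be tuned for different resource usage patterns: a bursty resource favors a small threshold and a short maximum interval, while a slowly varying one tolerates much longer intervals.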
Methodological Advances in the Culture, Manipulation and Utilization of Embryonic Stem Cells for Basic and Practical Applications, 2011
We describe a new signature definition and analysis method to be used as a biomarker for early cancer detection. Our new approach is based on the construction of a reference map of transcriptional signatures of both healthy and cancer-affected individuals using circulating miRNA from a large number of subjects. Once such a map is available, the diagnosis for a new patient can be performed by observing the relative position on the map of his/her transcriptional signature. To demonstrate its efficacy for this specific application we report the results of the application of our method to published datasets of circulating miRNA, and we quantify its performance compared to current state-of-the-art methods. A number of additional features make this method an ideal candidate for large-scale use, for example, as a mass screening tool for early cancer detection or for at-home diagnostics. Specifically, our method is minimally invasive (because it works well with circulating miRNA), it is robust with respect to lab-to-lab protocol variability and batch effects (it requires that only the relative ranking of expression values of miRNA in a profile be accurate, not their absolute values), and it is scalable to a large number of subjects. Finally, we discuss the need for HPC capability in a widespread application of our or similar methods.
Genome Biology, 2010
Background: Dosage imbalance is responsible for several genetic diseases, among which Down syndrome is caused by the trisomy of human chromosome 21. Results: To elucidate the extent to which the dosage imbalance of specific human chromosome 21 genes perturbs distinct molecular pathways, we developed the first mouse embryonic stem (ES) cell bank of human chromosome 21 genes. The human chromosome 21-mouse ES cell bank includes, in triplicate clones, 32 human chromosome 21 genes, which can be overexpressed in an inducible manner. Each clone was transcriptionally profiled in inducing versus noninducing conditions. Analysis of the transcriptional response yielded results that were consistent with the perturbed gene's known function. Comparison between mouse ES cells containing the whole human chromosome 21 (trisomic mouse ES cells) and mouse ES cells overexpressing single human chromosome 21 genes allowed us to evaluate the contribution of single genes to the trisomic mouse ES cell transcriptome. In addition, for the clones overexpressing the Runx1 gene, we compared the transcriptome changes with the corresponding protein changes by mass spectroscopy analysis. Conclusions: We determined that only a subset of genes produces a strong transcriptional response when overexpressed in mouse ES cells and that this effect can be predicted taking into account the basal gene expression level and the protein secondary structure. We showed that the human chromosome 21-mouse ES cell bank is an important resource, which may be instrumental towards a better understanding of Down syndrome and other human aneuploidy disorders.
Nucleic Acids Research, 2012
Gene expression profiles can be used to infer previously unknown transcriptional regulatory interactions among thousands of genes, via systems biology 'reverse engineering' approaches. We 'reverse engineered' an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression ('hubs'). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central 'hub' of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neuron marker Vglut2 and a strong down-regulation of the GABAergic neuron marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells, possibly by epigenetic-mediated transcriptional regulation.
Bioinformatics, 2013
Motivation: More than a decade after microarrays were first used to predict the phenotype of biological samples, real-life applications for disease screening and identification of patients who would best benefit from treatment are still emerging. The interest of the scientific community in identifying the best approaches to develop such prediction models was reaffirmed in a competition-style international collaboration called the IMPROVER Diagnostic Signature Challenge, whose results we describe herein. Results: Fifty-four teams used public data to develop prediction models in four disease areas including multiple sclerosis, lung cancer, psoriasis and chronic obstructive pulmonary disease, and made predictions on blinded new data that we generated. Teams were scored using three metrics that captured various aspects of the quality of predictions, and best performers were awarded. This article presents the challenge results and introduces to the community the approaches of the best overall three p...
Scientific Reports, 2019
Evidence is accumulating that the main chronic diseases of aging, Alzheimer’s disease (AD) and type-2 diabetes mellitus (T2DM), share common pathophysiological mechanisms. This study aimed at applying systems biology approaches to increase the knowledge of the shared molecular pathway underpinnings of AD and T2DM. We analysed transcriptomic data of post-mortem AD and T2DM human brains to obtain disease signatures of AD and T2DM and combined them with protein-protein interaction information to construct two disease-specific networks. The overlapping AD/T2DM network proteins were then used to extract the most representative Gene Ontology biological process terms. The expression of genes identified as relevant was studied in two AD models, 3xTg-AD and ApoE3/ApoE4 targeted replacement mice. The present transcriptomic data analysis revealed a principal role for autophagy in the molecular basis of both AD and T2DM. Our experimental validation in mouse AD models confirmed the role of autoph...
PLoS ONE, 2013
Alzheimer's disease is the most common cause of dementia worldwide, affecting the elderly population. It is characterized by the hallmark pathology of amyloid-β deposition, neurofibrillary tangle formation, and extensive neuronal degeneration in the brain. A wealth of data related to Alzheimer's disease has been generated to date; nevertheless, the molecular mechanism underlying the etiology and pathophysiology of the disease is still unknown. Here we describe a method for the combined analysis of multiple types of genome-wide data aimed at revealing convergent evidence of interest that would not be captured by a standard molecular approach. Lists of Alzheimer-related genes (seed genes) were obtained from different sets of data on gene expression, SNPs, and molecular targets of drugs. Network analysis was applied for identifying the regions of the human protein-protein interaction network showing a significant enrichment in seed genes, and ultimately, in genes associated with Alzheimer's disease, due to the cumulative effect of different combinations of the starting data sets. The functional properties of these enriched modules were characterized, effectively considering the role of both Alzheimer-related seed genes and genes that closely interact with them. This approach allowed us to present evidence in favor of one of the competing theories about the processes underlying AD, specifically evidence supporting a predominant role of metabolism-associated biological process terms, including autophagy, insulin and fatty acid metabolic processes, in Alzheimer's disease, with a focus on AMP-activated protein kinase. This central regulator of cellular energy homeostasis regulates a series of brain functions altered in Alzheimer's disease and could link genetic perturbation with neuronal transmission and energy regulation, representing a potential candidate to be targeted by therapy.
Cancers, 2021
High-throughput technologies make it possible to produce a large amount of data representing different biological layers, examples of which are genomics, proteomics, metabolomics and transcriptomics. Omics data have been individually investigated to understand the molecular bases of various diseases, but this may not be sufficient to fully capture the molecular mechanisms and the multilayer regulatory processes underlying complex diseases, especially cancer. To overcome this problem, several multi-omics integration methods have been introduced but a commonly agreed standard of analysis is still lacking. In this paper, we present MOUSSE, a novel normalization-free pipeline for unsupervised multi-omics integration. The main innovations are the use of rank-based subject-specific signatures and the use of such signatures to derive subject similarity networks. A separate similarity network was derived for each omics, and the resulting networks were then carefully merged in a way that con...
Systems Biomedicine, 2013
Desktop grids have already been used to perform some of the largest computations in the world and have the potential to grow by several more orders of magnitude. However, current approaches to utilizing desktop resources require either centralized servers or extensive knowledge of the underlying system, limiting their scalability. We propose a biologically inspired and fully decentralized approach to the organization of computation that is based on the autonomous scheduling of strongly mobile agents on a peer-to-peer network. In a radical departure from current models, we envision large-scale desktop grids in which agents autonomically organize themselves so as to maximize resource utilization. We demonstrate this concept with a reduced-scale proof-of-concept implementation that executes a data-intensive parameter sweep application on a set of heterogeneous geographically distributed machines. We present a detailed exploration of the design space of our system and a performance evalu...
Scientific reports, Mar 5, 2024
Scientific Reports, Sep 2, 2016
Among the genetic factors known to increase the risk of late onset Alzheimer's diseases (AD), the... more Among the genetic factors known to increase the risk of late onset Alzheimer's diseases (AD), the presence of the apolipoproteine e4 (APOE4) allele has been recognized as the one with the strongest effect. However, despite decades of research, the pathogenic role of APOE4 in Alzheimer's disease has not been clearly elucidated yet. In order to investigate the pathogenic action of APOE4, we applied a systems biology approach to the analysis of transcriptomic and genomic data of APOE44 vs. APOE33 allele carriers affected by Alzheimer's disease. Network analysis combined with a novel technique for biomarker computation allowed the identification of an alteration in aging-associated processes such as inflammation, oxidative stress and metabolic pathways, indicating that APOE4 possibly accelerates pathological processes physiologically induced by aging. Subsequent integration with genomic data indicates that the Notch pathway could be the nodal molecular mechanism altered in APOE44 allele carriers with Alzheimer's disease. Interestingly, PSEN1 and APP, genes whose mutation are known to be linked to early onset Alzheimer's disease, are closely linked to this pathway. In conclusion, APOE4 role on inflammation and oxidation through the Notch signaling pathway could be crucial in elucidating the risk factors of Alzheimer's disease. Alzheimer's disease (AD) is the most common cause of dementia, characterized clinically by a decline in cognitive function and by distinctive brain pathology with neuronal loss and the formation of amyloid plaques and neurofibrillary tangles. Early onset AD is rare and is caused by mutations in specific genes such as amyloid precursor protein (APP), presenilin 1 (PSEN1) and presenilin 2 (PSEN2). 
Late onset AD is the most common form but, although several putative susceptibility genes have been reported, APOE, coding for the Apolipoprotein E, is the most robust susceptibility gene known to date. Three common isoforms of APOE have been recognized: APOE2 (cys112, cys158), APOE3 (cys112, arg158) and APOE4 (arg112, arg158) and the presence of the alleles coding for the APOE4 isoform are associated with an increased risk (up to tenfold in homozygous cases 1) of late onset AD when compared to the most common APOE3 allele or APOE2, a rarer allele, that appears to have, instead, a protective effect 2,3. APOE is a multifunctional glycosylated protein with a major role in lipid transport and atherosclerosis pathogenesis and it is expressed in several organs, with the highest expression in the liver and brain. In the central nervous system, although neurons can produce APOE under certain conditions, non-neuronal cells, mainly astrocytes and to some extent microglia, are the major cell types that express APOE in the brain 4,5. Numerous mechanisms by which APOE influences AD pathogenesis have been proposed, including a role in the clearance of Amyloid β 6,7 , but how this influences the pathogenic molecular processes remains to be clarified.
Nucleic Acids Research, May 9, 2015
SCUDO (Signature-based ClUstering for DiagnOstic purposes) is an online tool for the analysis of ... more SCUDO (Signature-based ClUstering for DiagnOstic purposes) is an online tool for the analysis of gene expression profiles for diagnostic and classification purposes. The tool is based on a new method for the clustering of profiles based on a subject-specific, as opposed to disease-specific, signature. Our approach relies on construction of a reference map of transcriptional signatures, from both healthy and affected subjects, derived from their respective mRNA or miRNA profiles. A diagnosis for a new individual can then be performed by determining the position of the individual's transcriptional signature on the map. The diagnostic power of our method has been convincingly demonstrated in an open scientific competition (SBV Improver Diagnostic Signature Challenge), scoring second place overall and first place in one of the sub-challenges.
Cells, 2020
The Negr1 gene has been significantly associated with major depression in genetic studies. Negr1 ... more The Negr1 gene has been significantly associated with major depression in genetic studies. Negr1 encodes for a cell adhesion molecule cleaved by the protease Adam10, thus activating Fgfr2 and promoting neuronal spine plasticity. We investigated whether antidepressants modulate the expression of genes belonging to Negr1-Fgfr2 pathway in Flinders sensitive line (FSL) rats, in a corticosterone-treated mouse model of depression, and in mouse primary neurons. Negr1 and Adam10 were the genes mostly affected by antidepressant treatment, and in opposite directions. Negr1 was down-regulated by escitalopram in the hypothalamus of FSL rats, by fluoxetine in the hippocampal dentate gyrus of corticosterone-treated mice, and by nortriptyline in hippocampal primary neurons. Adam10 mRNA was increased by nortriptyline administration in the hypothalamus, by escitalopram in the hippocampus of FSL rats, and by fluoxetine in mouse dorsal dentate gyrus. Similarly, nortriptyline increased Adam10 expressio...
Scientific reports, Jan 23, 2018
In longitudinal clinical studies, methodologies available for the analysis of multivariate data w... more In longitudinal clinical studies, methodologies available for the analysis of multivariate data with multivariate methods are relatively limited. Here, we present Consensus Clustering (CClust) a new computational method based on clustering of time profiles and posterior identification of correlation between clusters and predictors. Subjects are first clustered in groups according to a response variable temporal profile, using a robust consensus-based strategy. To discover which of the remaining variables are associated with the resulting groups, a non-parametric hypothesis test is performed between groups at every time point, and then the results are aggregated according to the Fisher method. Our approach is tested through its application to the EarlyBird cohort database, which contains temporal variations of clinical, metabolic, and anthropometric profiles in a population of 150 children followed-up annually from age 5 to age 16. Our results show that our consensus-based method is ...
Cell systems, Jan 4, 2017
We report the results of a DREAM challenge designed to predict relative genetic essentialities ba... more We report the results of a DREAM challenge designed to predict relative genetic essentialities based on a novel dataset testing 98,000 shRNAs against 149 molecularly characterized cancer cell lines. We analyzed the results of over 3,000 submissions over a period of 4 months. We found that algorithms combining essentiality data across multiple genes demonstrated increased accuracy; gene expression was the most informative molecular data type; the identity of the gene being predicted was far more important than the modeling strategy; well-predicted genes and selected molecular features showed enrichment in functional categories; and frequently selected expression features correlated with survival in primary tumors. This study establishes benchmarks for gene essentiality prediction, presents a community resource for future comparison with this benchmark, and provides insights into factors influencing the ability to predict gene essentiality from functional genetic screens. This study a...
Journal of Industrial and Production Engineering, 2017
Abstract This paper proposes methods for forward and inverse system modeling using Bayesian and l... more Abstract This paper proposes methods for forward and inverse system modeling using Bayesian and least squares regression. These methods are based on both space-filling design criteria for multiple response problems and linear optimality criteria focusing on D-optimality. Modeling with and without the constant term is considered motivated by the case study application of genetic network modeling. We propose extended one-factor-at-a-time experimentation followed by augmentation of next stage design which offers biologists simplicity. Results are illustrated both numerical examples, a test problem from the literature, and a case study motivated by an real world biological research related to genetic network modeling.
2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015
We describe a new signature definition and analysis method to be used as biomarker for blood-base... more We describe a new signature definition and analysis method to be used as biomarker for blood-based diagnosis of tuberculosis. Our new approach is based on the construction of a reference map of transcriptional signatures of both healthy and affected individuals using circulating miRNA from a large number of subjects. Once such a map is available, the diagnosis for a new patient can be performed by observing the relative position on the map of his/her transcriptional signature. To demonstrate its efficacy for this specific application we report the results of the application of our method to published data sets of circulating miRNA. Two crucial features make this method an ideal candidate for large scale applications such as a mass screening tool, or for point-of-care diagnostics. Specifically, our method is minimally invasive because it works well with profiles of circulating miRNA. More importantly, it is robust with respect to lab-to-lab protocol variability, measurement errors and batch effects because it requires that only the relative ranking of miRNA species in a profile be accurate, not their absolute values.
High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on
As Data and Computational Grids grow in size and complexity, the crucial task of identifying, monitoring and utilizing available resources in an efficient manner is becoming increasingly difficult. The design of monitoring systems that are scalable both in the number of sources being monitored and in the number of clients served is a challenging issue. In this paper we investigate the trade-offs of different polling strategies that can be used to monitor resource availability on machines in a distributed environment. We show how adaptive polling protocols can substantially increase scalability with a less than proportional loss of precision, and how these protocols can be personalized for different types of resource usage patterns.
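The trade-off between polling cost and precision described in this abstract can be made concrete with a toy example. The sketch below uses a generic back-off policy chosen for illustration, not any of the protocols evaluated in the paper: the polling interval doubles while the observed value stays within a tolerance and resets to the minimum when it changes. The function name, parameters, and the synthetic load trace are all assumptions.

```python
def adaptive_poll(readings, min_interval=1, max_interval=8, tolerance=0.05):
    """Poll a sequence of per-tick readings with an adaptive interval.

    Assumed policy (not taken from the paper): double the interval when
    the observed value moved by no more than `tolerance` since the last
    poll, and reset it to the minimum otherwise.
    Returns the list of ticks at which a poll was issued.
    """
    polls = []
    interval = min_interval
    tick = 0
    last = None
    while tick < len(readings):
        value = readings[tick]
        polls.append(tick)
        if last is not None and abs(value - last) <= tolerance:
            interval = min(interval * 2, max_interval)
        else:
            interval = min_interval
        last = value
        tick += interval
    return polls


# Synthetic load trace: low and stable, then a jump to a high, stable level.
trace = [0.10] * 10 + [0.90] * 10
ticks = adaptive_poll(trace)
print(ticks, f"{len(ticks)} polls instead of {len(trace)}")
```

On this trace the monitor issues 7 polls rather than 20, but the jump that occurs at tick 10 is first observed at tick 15: fewer messages are paid for with a delayed view of the resource.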
Methodological Advances in the Culture, Manipulation and Utilization of Embryonic Stem Cells for Basic and Practical Applications, 2011
We describe a new signature definition and analysis method to be used as biomarker for early cancer detection. Our new approach is based on the construction of a reference map of transcriptional signatures of both healthy and cancer affected individuals using circulating miRNA from a large number of subjects. Once such a map is available, the diagnosis for a new patient can be performed by observing the relative position on the map of his/her transcriptional signature. To demonstrate its efficacy for this specific application we report the results of the application of our method to published datasets of circulating miRNA, and we quantify its performance compared to current state-of-the-art methods. A number of additional features make this method an ideal candidate for large-scale use, for example, as a mass screening tool for early cancer detection or for at-home diagnostics. Specifically, our method is minimally invasive (because it works well with circulating miRNA), it is robust with respect to lab-to-lab protocol variability and batch effects (it requires that only the relative ranking of the expression values of miRNA in a profile be accurate, not their absolute values), and it is scalable to a large number of subjects. Finally, we discuss the need for HPC capability in a widespread application of our or similar methods.
Genome Biology, 2010
Background: Dosage imbalance is responsible for several genetic diseases, among which Down syndrome is caused by the trisomy of human chromosome 21. Results: To elucidate the extent to which the dosage imbalance of specific human chromosome 21 genes perturbs distinct molecular pathways, we developed the first mouse embryonic stem (ES) cell bank of human chromosome 21 genes. The human chromosome 21-mouse ES cell bank includes, in triplicate clones, 32 human chromosome 21 genes, which can be overexpressed in an inducible manner. Each clone was transcriptionally profiled in inducing versus noninducing conditions. Analysis of the transcriptional response yielded results that were consistent with the perturbed gene's known function. Comparison between mouse ES cells containing the whole human chromosome 21 (trisomic mouse ES cells) and mouse ES cells overexpressing single human chromosome 21 genes allowed us to evaluate the contribution of single genes to the trisomic mouse ES cell transcriptome. In addition, for the clones overexpressing the Runx1 gene, we compared the transcriptome changes with the corresponding protein changes by mass spectroscopy analysis. Conclusions: We determined that only a subset of genes produces a strong transcriptional response when overexpressed in mouse ES cells and that this effect can be predicted taking into account the basal gene expression level and the protein secondary structure. We showed that the human chromosome 21-mouse ES cell bank is an important resource, which may be instrumental towards a better understanding of Down syndrome and other human aneuploidy disorders.
Nucleic Acids Research, 2012
Gene expression profiles can be used to infer previously unknown transcriptional regulatory interactions among thousands of genes, via systems biology 'reverse engineering' approaches. We 'reverse engineered' an embryonic stem (ES)-specific transcriptional network from 171 gene expression profiles, measured in ES cells, to identify master regulators of gene expression ('hubs'). We discovered that E130012A19Rik (E13), highly expressed in mouse ES cells as compared with differentiated cells, was a central 'hub' of the network. We demonstrated that E13 is a protein-coding gene implicated in regulating the commitment towards the different neuronal subtypes and glia cells. The overexpression and knock-down of E13 in ES cell lines, undergoing differentiation into neurons and glia cells, caused a strong up-regulation of the glutamatergic neurons marker Vglut2 and a strong down-regulation of the GABAergic neurons marker GAD65 and of the radial glia marker Blbp. We confirmed E13 expression in the cerebral cortex of adult mice and during development. By immuno-based affinity purification, we characterized protein partners of E13, involved in the Polycomb complex. Our results suggest a role of E13 in regulating the division between glutamatergic projection neurons and GABAergic interneurons and glia cells, possibly by epigenetic-mediated transcriptional regulation.
Bioinformatics, 2013
Motivation: More than a decade after microarrays were used to predict the phenotype of biological samples, real-life applications for disease screening and identification of patients who would best benefit from treatment are still emerging. The interest of the scientific community in identifying the best approaches to develop such prediction models was reaffirmed in a competition-style international collaboration called IMPROVER Diagnostic Signature Challenge, whose results we describe herein. Results: Fifty-four teams used public data to develop prediction models in four disease areas including multiple sclerosis, lung cancer, psoriasis and chronic obstructive pulmonary disease, and made predictions on blinded new data that we generated. Teams were scored using three metrics that captured various aspects of the quality of predictions, and best performers were awarded. This article presents the challenge results and introduces to the community the approaches of the best overall three p...
Scientific Reports, 2019
Evidence is accumulating that the main chronic diseases of aging, Alzheimer’s disease (AD) and type-2 diabetes mellitus (T2DM), share common pathophysiological mechanisms. This study aimed at applying systems biology approaches to increase the knowledge of the shared molecular pathway underpinnings of AD and T2DM. We analysed transcriptomic data of post-mortem AD and T2DM human brains to obtain disease signatures of AD and T2DM and combined them with protein-protein interaction information to construct two disease-specific networks. The overlapping AD/T2DM network proteins were then used to extract the most representative Gene Ontology biological process terms. The expression of genes identified as relevant was studied in two AD models, 3xTg-AD and ApoE3/ApoE4 targeted replacement mice. The present transcriptomic data analysis revealed a principal role for autophagy in the molecular basis of both AD and T2DM. Our experimental validation in mouse AD models confirmed the role of autoph...
PLoS ONE, 2013
Alzheimer's disease is the most common cause of dementia worldwide, affecting the elderly population. It is characterized by the hallmark pathology of amyloid-β deposition, neurofibrillary tangle formation, and extensive neuronal degeneration in the brain. A wealth of data related to Alzheimer's disease has been generated to date; nevertheless, the molecular mechanism underlying the etiology and pathophysiology of the disease is still unknown. Here we describe a method for the combined analysis of multiple types of genome-wide data aimed at revealing convergent evidence of interest that would not be captured by a standard molecular approach. Lists of Alzheimer-related genes (seed genes) were obtained from different sets of data on gene expression, SNPs, and molecular targets of drugs. Network analysis was applied to identify the regions of the human protein-protein interaction network showing a significant enrichment in seed genes and, ultimately, in genes associated with Alzheimer's disease, due to the cumulative effect of different combinations of the starting data sets. The functional properties of these enriched modules were characterized, effectively considering the role of both Alzheimer-related seed genes and genes that closely interact with them. This approach allowed us to present evidence in favor of one of the competing theories about the processes underlying AD, specifically evidence supporting a predominant role of metabolism-associated biological process terms, including autophagy, insulin and fatty acid metabolic processes, in Alzheimer's disease, with a focus on AMP-activated protein kinase. This central regulator of cellular energy homeostasis regulates a series of brain functions altered in Alzheimer's disease and could link genetic perturbation with neuronal transmission and energy regulation, representing a potential candidate to be targeted by therapy.
Cancers, 2021
High-throughput technologies make it possible to produce a large amount of data representing different biological layers, examples of which are genomics, proteomics, metabolomics and transcriptomics. Omics data have been individually investigated to understand the molecular bases of various diseases, but this may not be sufficient to fully capture the molecular mechanisms and the multilayer regulatory processes underlying complex diseases, especially cancer. To overcome this problem, several multi-omics integration methods have been introduced but a commonly agreed standard of analysis is still lacking. In this paper, we present MOUSSE, a novel normalization-free pipeline for unsupervised multi-omics integration. The main innovations are the use of rank-based subject-specific signatures and the use of such signatures to derive subject similarity networks. A separate similarity network was derived for each omics, and the resulting networks were then carefully merged in a way that con...
Nucleic Acids Research, 2015
SCUDO (Signature-based ClUstering for DiagnOstic purposes) is an online tool for the analysis of gene expression profiles for diagnostic and classification purposes. The tool is based on a new method for the clustering of profiles based on a subject-specific, as opposed to disease-specific, signature. Our approach relies on the construction of a reference map of transcriptional signatures, from both healthy and affected subjects, derived from their respective mRNA or miRNA profiles. A diagnosis for a new individual can then be performed by determining the position of the individual's transcriptional signature on the map. The diagnostic power of our method has been convincingly demonstrated in an open scientific competition (SBV Improver Diagnostic Signature Challenge), scoring second place overall and first place in one of the sub-challenges.
Systems Biomedicine, 2013
Desktop grids have already been used to perform some of the largest computations in the world and have the potential to grow by several more orders of magnitude. However, current approaches to utilizing desktop resources require either centralized servers or extensive knowledge of the underlying system, limiting their scalability. We propose a biologically inspired and fully decentralized approach to the organization of computation that is based on the autonomous scheduling of strongly mobile agents on a peer-to-peer network. In a radical departure from current models, we envision large-scale desktop grids in which agents autonomically organize themselves so as to maximize resource utilization. We demonstrate this concept with a reduced-scale proof-of-concept implementation that executes a data-intensive parameter sweep application on a set of heterogeneous, geographically distributed machines. We present a detailed exploration of the design space of our system and a performance evalu...