Using Bayesian network inference algorithms to recover molecular genetic regulatory networks
Related papers
Using fuzzy logic inference algorithm to recover molecular genetic regulatory networks
IEEE Annual Meeting of the Fuzzy Information, 2004. Processing NAFIPS '04., 2004
Recent advances in high-throughput molecular biology have motivated the use, in the field of bioinformatics, of network inference algorithms to predict causal models of molecular networks from correlational data. However, it is extremely difficult to evaluate the effectiveness of these algorithms because we possess neither the knowledge of the correct biological networks nor the ability to experimentally validate the hundreds of predicted gene interactions within a reasonable amount of time. Here, we apply a new approach developed by Smith et al. (2002) that tests the ability of network inference algorithms to accurately and efficiently recover network structures based on gene expression data taken from a simulated biological pathway in which the structure is known a priori. We simulated a genetic regulatory network and used the resultant sampled data to test variations in the design of a Bayesian network inference algorithm, as well as variations in total quantity of available data, length of sampling interval, method of data discretization, and presence of interpolated data between observed data points. We also advanced the inference algorithm by developing a heuristic influence score that infers the strength and sign of regulation (up or down) between genes. In these experiments, we found that our inference algorithm worked best when presented with data discretized into three categories, when using a greedy search algorithm with random restarts, and when evaluating networks using the BDe scoring metric. Under these conditions, the algorithm was both accurate and efficient in recovering the simulated molecular network when the sampled data sets were large. Under more biologically reasonable small amounts of sampled data, the algorithm worked best only when interpolated data were included, but had difficulty recovering relationships describing genes with more than one regulatory influence.
These results suggest that network inference algorithms and sampling methods must be carefully designed and tested before they can be used to recover biological genetic pathways, especially in the context of highly limited quantities of data.
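As a rough illustration of the three-category discretization mentioned in the abstract above, the following Python sketch bins one gene's expression values into low, medium and high levels. Tertile-based cut points are an assumption made here for illustration; the abstract does not say which discretization methods were compared, and the function name `discretize_three` is hypothetical.

```python
import bisect
import statistics

def discretize_three(values):
    """Map each value to 0 (low), 1 (medium) or 2 (high).

    Assumption: cut points are the tertiles of this gene's own samples.
    """
    # statistics.quantiles with n=3 returns the two tertile cut points.
    cuts = statistics.quantiles(values, n=3)
    # bisect_right counts how many cut points lie at or below each value.
    return [bisect.bisect_right(cuts, v) for v in values]

levels = discretize_three([0.1, 0.4, 0.5, 0.9, 1.3, 2.0])
print(levels)
```

Discretized values like these are what a discrete Bayesian network scoring metric would consume; the choice of cut points is one of the design variables the abstract reports testing.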
Analyzing the Effect of Prior Knowledge in Genetic Regulatory Network Inference
Lecture Notes in Computer Science, 2005
Inferring the metabolic pathways that control the cell cycle is a challenging task. Its importance for understanding living organisms has motivated the development of several models to infer gene regulatory networks from DNA microarray data. In recent years, many works have added biological information to those models to improve the results obtained. In this work, we add prior biological knowledge to a Bayesian network model with nonparametric regression and analyze the effects of such information on the results.
Current Bioinformatics, 2014
In the post-genome era, designing and conducting novel experiments have become increasingly common for modern researchers. However, the major challenge faced by researchers is, surprisingly, not the complexity of designing new experiments or obtaining the data they generate, but the huge amount of data to be processed and analyzed in the quest to produce meaningful information and knowledge. Gene regulatory network (GRN) inference from gene expression data is a common example of such a challenge. Over the years, GRN inference has witnessed a number of transitions, and an increasing number of new computational and statistical methods have been applied to automate the procedure. One of the widely used approaches for GRN inference is the dynamic Bayesian network (DBN). In this review paper, we first discuss the evolution of molecular biology research from reductionism to holism. This is followed by a brief overview of various computational and statistical methods used in GRN inference before focusing on reviewing the current development and applications of DBN-based methods.
Chai et al.
Category: Inference model
Logical models: Boolean networks; probabilistic Boolean networks [30, 31]; Bayesian networks
Continuous models: continuous linear models [32]; dynamic Bayesian networks; ordinary differential equations; regulated flux balance analysis [33]
Single-molecule level: stochastic simulation algorithm [34]
Inferring Gene Regulatory Networks
Biosystems, 2018
The study of biological systems at a system level has become a reality due to increasingly powerful computational approaches able to handle increasingly large datasets. Uncovering the dynamic nature of gene regulatory networks in order to attain a system-level understanding and improve the predictive power of biological models is an important research field in systems biology. The task itself presents several challenges, since the problem is combinatorial in nature and depends heavily on several biological constraints and on the intended application. Given the intrinsically interdisciplinary nature of gene regulatory network inference, we present a review of the currently available approaches, their challenges and limitations. We propose guidelines to select the most appropriate method considering the underlying assumptions and fundamental biological and data constraints.
BMC Systems Biology, 2011
Background: Reverse engineering in systems biology entails inference of gene regulatory networks from observational data. These data typically include gene expression measurements of wild-type and mutant cells in response to a given stimulus. It has been shown that accuracy is higher when more than one type of experiment is used in the network inference process. Therefore, the development of generally applicable and effective methodologies that embed multiple sources of information in a single computational framework is a worthwhile objective.
Results: This paper presents a new method for network inference, which uses multi-objective optimisation (MOO) to integrate multiple inference methods and experiments. We illustrate the potential of the methodology by combining ODE and correlation-based network inference procedures as well as time course and gene inactivation experiments. Here we show that our methodology is effective for a wide spectrum of data sets and method integration strategies.
Conclusions: The approach we present in this paper is flexible and can be used in any scenario that benefits from integration of multiple sources of information and modelling procedures in the inference process. Moreover, the application of this method to two case studies representative of bacteria and vertebrate systems has shown potential in identifying key regulators of important biological processes.
Data- and knowledge-based modeling of gene regulatory networks: an update
EXCLI journal, 2015
Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods, focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of next-generation sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss its application to network inference in detail. Furthermore, we present progress for large-scale or even full-genomic network inference as well as for small-scale condensed network inference, and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook for the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions.
Increasing feasibility of optimal gene network estimation
2004
Disentangling networks of regulation of gene expression is a major challenge in the field of computational biology. Harvesting the information contained in microarray data sets is a promising approach towards this challenge. We propose an algorithm for the optimal estimation of Bayesian networks from microarray data, which reduces the CPU time and memory consumption of previous algorithms. We prove that the space complexity can be reduced from $O(n^2 \cdot 2^n)$ to $O(2^n)$, and that the expected calculation time can be reduced from $O(n^2 \cdot 2^n)$ to $O(n \cdot 2^n)$, where $n$ is the number of genes. We make intrinsic use of a limitation of the maximal number of regulators of each gene, which has biological as well as statistical justifications. The improvements are significant for some applications in research.
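To give a sense of scale for the bounds quoted in the abstract above, this short Python sketch evaluates the three expressions n^2 * 2^n, n * 2^n and 2^n for a few gene counts. It only does arithmetic on the stated asymptotic terms, with constant factors ignored; it is not the authors' algorithm, and the helper name `bounds` is hypothetical.

```python
def bounds(n):
    """Evaluate the asymptotic terms from the abstract for n genes."""
    return {
        "n^2*2^n": n**2 * 2**n,  # previous space and time term
        "n*2^n": n * 2**n,       # improved expected time term
        "2^n": 2**n,             # improved space term
    }

for n in (10, 20, 30):
    b = bounds(n)
    # The ratios show the saving factors: n^2 for space, n for time.
    print(n, b, "space ratio:", b["n^2*2^n"] // b["2^n"],
          "time ratio:", b["n^2*2^n"] // b["n*2^n"])
```

Even with the improved bounds the 2^n factor remains, which is consistent with the abstract's cautious claim that the gains matter for some applications rather than for arbitrarily large networks.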
Inferring Gene Networks: Dream or Nightmare?
Annals of the New York Academy of Sciences, 2009
We describe several algorithms with winning performance in the Dialogue for Reverse Engineering Assessments and Methods (DREAM2) Reverse Engineering Competition 2007. After the gold standards for the challenges were released, the performance of the algorithms could be thoroughly evaluated under different parameters or alternative ways of solving systems of equations. For the analysis of Challenge 4, the "In-silico" challenges, we employed methods to explicitly deal with perturbation data and time-series data. We show that original methods used to produce winning submissions could easily be altered to substantially improve performance. For Challenge 5, the genome-scale Escherichia coli network, we evaluated a variety of measures of association. These data are troublesome, and no good solutions could be produced, either by us or by any other teams. Our best results were obtained when analyzing subdatasets instead of considering the dataset as a whole.
A survey of models for inference of gene regulatory networks
Nonlinear Analysis: Modelling and Control, 2013
In this article, I present the biological background of microarray, ChIP-chip and ChIP-Seq technologies and the application of computational methods in reverse engineering of gene regulatory networks (GRNs). The most commonly used GRN models, based on Boolean networks, Bayesian networks, relevance networks, and differential and difference equations, are described. A novel model for integrating prior biological knowledge into GRN inference is also presented. The advantages and disadvantages of the described models are compared. The GRN validation criteria are described. Current trends and further directions for GRN inference using prior knowledge are given at the end of the paper.