Analyzing the Effect of Prior Knowledge in Genetic Regulatory Network Inference
Related papers
2012
Enabled by recent advances in bioinformatics, the inference of gene regulatory networks (GRNs) from gene expression data has garnered much interest from researchers. This is due to the need of researchers to understand the dynamic behavior of the networks and uncover the vast information lying hidden within them. In this regard, the dynamic Bayesian network (DBN) is extensively used to infer GRNs due to its ability to handle time-series microarray data and model feedback loops. However, the efficiency of DBN in inferring GRNs is often hampered by missing values in expression data, and by excessive computation time due to the large search space, whereby DBN treats all genes as potential regulators for a target gene. In this paper, we propose a DBN-based model with missing-value imputation to improve inference efficiency, and potential-regulator detection, which aims to lessen computation time by limiting potential regulators based on expression changes. The performance of the proposed model is assessed using time-series expression data of the yeast cell cycle. The experimental results...
Bayesian Network Approach to Estimate Gene Networks
Concepts, Methodologies, Tools, and Applications
In cells, genes interact with each other, and this system can be viewed as a directed graph. A gene network is a graphical representation of transcriptional relations between genes, and the problem of estimating gene networks from genome-wide data, such as DNA microarray gene expression data, is one of the important issues in bioinformatics and systems biology. Here, we present a statistical method based on Bayesian networks to estimate gene networks from microarray data and other biological data. Because microarray data are measured as continuous variables and the relationships between genes are usually nonlinear, we combine Bayesian networks and nonparametric regression to handle continuous variables and nonlinear relations. Most parts of gene networks are still unknown, and we need to estimate them from observational data. This problem is equivalent to the structural learning of Bayesian networks, and we solve it using a Bayesian approach. The main difficulty of gene network estimatio...
Statistical Applications in Genetics and Molecular Biology, 2007
There have been various attempts to reconstruct gene regulatory networks from microarray expression data in the past. However, owing to the limited amount of independent experimental conditions and noise inherent in the measurements, the results have been rather modest so far. For this reason it seems advisable to include biological prior knowledge, related, for instance, to transcription factor binding locations in promoter regions or partially known signalling pathways from the literature. In the present paper, we consider a Bayesian approach to systematically integrate expression data with multiple sources of prior knowledge. Each source is encoded via a separate energy function, from which a prior distribution over network structures in the form of a Gibbs distribution is constructed. The hyperparameters associated with the different sources of prior knowledge, which measure the influence of the respective prior relative to the data, are sampled from the posterior distribution with MCMC. We have evaluated the proposed scheme on the yeast cell cycle and the Raf signalling pathway. Our findings quantify to what extent the inclusion of independent prior knowledge improves the network reconstruction accuracy, and the values of the hyperparameters inferred with the proposed scheme were found to be close to optimal with respect to minimizing the reconstruction error.
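The Gibbs-distribution prior described in this abstract can be illustrated with a minimal sketch. The abstract does not give the energy function, so the form used here (sum of absolute differences between the adjacency matrix and a matrix of prior edge confidences in [0, 1]) is an assumption, as are the names `energy`, `gibbs_prior`, the matrix `B`, and the fixed value of `beta`. The sketch normalizes by brute-force enumeration, which is only feasible for tiny networks; the abstract instead samples the hyperparameters with MCMC.

```python
import itertools
import math


def energy(graph, prior_matrix):
    """Assumed energy: total absolute deviation between the adjacency
    entries of `graph` and the prior confidences (self-loops ignored)."""
    n = len(graph)
    return sum(abs(prior_matrix[i][j] - graph[i][j])
               for i in range(n) for j in range(n) if i != j)


def gibbs_prior(prior_matrix, beta):
    """Return {graph: P(graph | beta)} with P proportional to exp(-beta * E),
    enumerating every directed graph without self-loops."""
    n = len(prior_matrix)
    edges = [(i, j) for i in range(n) for j in range(n) if i != j]
    weights = {}
    for bits in itertools.product((0, 1), repeat=len(edges)):
        graph = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(edges, bits):
            graph[i][j] = bit
        key = tuple(tuple(row) for row in graph)
        weights[key] = math.exp(-beta * energy(graph, prior_matrix))
    z = sum(weights.values())  # partition function Z(beta)
    return {g: w / z for g, w in weights.items()}


# Hypothetical prior knowledge for three genes: edge 0->1 is well supported,
# edge 1->0 is unlikely, everything else is uninformative (0.5).
B = [[0.5, 0.9, 0.5],
     [0.1, 0.5, 0.5],
     [0.5, 0.5, 0.5]]
prior = gibbs_prior(B, beta=2.0)
best = max(prior, key=prior.get)
```

With `beta = 0` the prior is uniform over graphs, so the hyperparameter controls how strongly the prior knowledge is weighted relative to the data, which matches the role the abstract assigns to it.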
Current Bioinformatics, 2014
In the post-genome era, designing and conducting novel experiments have become increasingly common for modern researchers. However, the major challenge faced by researchers is, surprisingly, not the complexity of designing new experiments or obtaining the data generated from them, but the huge amount of data to be processed and analyzed in the quest to produce meaningful information and knowledge. Gene regulatory network (GRN) inference from gene expression data is a common example of such a challenge. Over the years, GRN inference has witnessed a number of transitions, and an increasing number of new computational and statistical methods have been applied to automate the procedure. One of the widely used approaches for GRN inference is the dynamic Bayesian network (DBN). In this review paper, we first discuss the evolution of molecular biology research from reductionism to holism. This is followed by a brief overview of various computational and statistical methods used in GRN inference before focusing on reviewing the current development and applications of DBN-based methods.
Category and inference models: Logical models: Boolean networks, probabilistic Boolean networks [30, 31], Bayesian networks. Continuous models: continuous linear models [32], dynamic Bayesian networks, ordinary differential equations, regulated flux balance analysis [33]. Single-molecule level: stochastic simulation algorithm [34].
A Bayesian regression approach to the inference of regulatory networks from gene expression data
Bioinformatics/computer Applications in The Biosciences, 2005
Motivation: There is currently much interest in reverse-engineering regulatory relationships between genes from microarray expression data. We propose a new algorithmic method for inferring such interactions between genes using data from gene knockout experiments. The algorithm we use is the Sparse Bayesian regression algorithm of Tipping and Faul. This method is highly suited to this problem as it does not require the data to be discretized, overcomes the need for an explicit topology search and, most importantly, requires no heuristic thresholding of the discovered connections. Results: Using simulated expression data, we are able to show that this algorithm outperforms a recently published correlation-based approach. Crucially, it does this without the need to set any ad hoc threshold on possible connections. Availability: Matlab code which allows all experimental results to be reproduced is available at
Using Bayesian network inference algorithms to recover molecular genetic regulatory networks
… Conference on Systems …, 2002
Recent advances in high-throughput molecular biology have motivated the use, in the field of bioinformatics, of network inference algorithms to predict causal models of molecular networks from correlational data. However, it is extremely difficult to evaluate the effectiveness of these algorithms because we possess neither the knowledge of the correct biological networks nor the ability to experimentally validate the hundreds of predicted gene interactions within a reasonable amount of time. Here, we apply a new approach developed by Smith et al. (2002) that tests the ability of network inference algorithms to accurately and efficiently recover network structures based on gene expression data taken from a simulated biological pathway in which the structure is known a priori. We simulated a genetic regulatory network and used the resultant sampled data to test variations in the design of a Bayesian network inference algorithm, as well as variations in the total quantity of available data, length of sampling interval, method of data discretization, and presence of interpolated data between observed data points. We also advanced the inference algorithm by developing a heuristic influence score that infers the strength and sign of regulation (up or down) between genes. In these experiments, we found that our inference algorithm worked best when presented with data discretized into three categories, when using a greedy search algorithm with random restarts, and when evaluating networks using the BDe scoring metric. Under these conditions, the algorithm was both accurate and efficient in recovering the simulated molecular network when the sampled data sets were large. Under more biologically reasonable small amounts of sampled data, the algorithm worked best only when interpolated data were included, but had difficulty recovering relationships describing genes with more than one regulatory influence.
These results suggest that network inference algorithms and sampling methods must be carefully designed and tested before they can be used to recover biological genetic pathways, especially in the context of highly limited quantities of data.
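The three-category discretization mentioned in this abstract could look roughly like the sketch below. The abstract does not say how the category boundaries were chosen, so the equal-frequency (tercile) cut points used here are an assumption, and the function name `discretize_three_levels` and the sample values are hypothetical.

```python
def discretize_three_levels(values):
    """Map each value to 0 (low), 1 (medium) or 2 (high) using
    equal-frequency cut points taken from the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = ordered[n // 3]          # boundary between low and medium
    high_cut = ordered[(2 * n) // 3]   # boundary between medium and high
    return [0 if v < low_cut else 1 if v < high_cut else 2 for v in values]


# Hypothetical expression time series for one gene.
expression = [0.1, 2.3, 1.1, 0.4, 3.0, 1.8, 0.2, 2.9, 1.2]
levels = discretize_three_levels(expression)
print(levels)  # [0, 2, 1, 0, 2, 1, 0, 2, 1]
```

Discrete levels of this kind are what a multinomial score such as BDe operates on; other binning rules (for example fixed thresholds) would fit the same interface.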
Gene networks inference using dynamic Bayesian networks
2003
This article deals with the identification of gene regulatory networks from experimental data using a statistical machine learning approach. A stochastic model of gene interactions capable of handling missing variables is proposed. It can be described as a dynamic Bayesian network particularly well suited to tackle the stochastic nature of gene regulation and gene expression measurement. Parameters of the model are learned through penalized likelihood maximization, implemented through an extended version of the EM algorithm.
Biosystems, 2018
The study of biological systems at a system level has become a reality due to increasingly powerful computational approaches able to handle increasingly large datasets. Uncovering the dynamic nature of gene regulatory networks in order to attain a system-level understanding and improve the predictive power of biological models is an important research field in systems biology. The task itself presents several challenges, since the problem is combinatorial in nature and depends heavily on several biological constraints and on the intended application. Given the intrinsically interdisciplinary nature of gene regulatory network inference, we present a review of the currently available approaches, their challenges, and their limitations. We propose guidelines for selecting the most appropriate method considering the underlying assumptions and fundamental biological and data constraints.
Using Bayesian Networks to Construct Gene Regulatory Networks from Microarray Data
Jurnal teknologi, 2012
In this research, a Bayesian network is proposed as the model for constructing gene regulatory networks from the Saccharomyces cerevisiae cell-cycle gene expression dataset and an Escherichia coli dataset, owing to its capability of handling microarray datasets with missing values. The goal of this research is to study and understand the framework of Bayesian networks, and then to construct gene regulatory networks from these two datasets by developing Bayesian networks using a hill-climbing algorithm and Efron’s bootstrap approach. The performance of the constructed gene networks of Saccharomyces cerevisiae is then evaluated and compared with the sub-networks previously constructed by Dejori [14]. The gene networks constructed for Saccharomyces cerevisiae not only achieved a high True Positive Rate (more than 90%) but also discovered more potential interactions between genes. Therefore, it can be concluded that the gene regulatory networks constructed using Bayesian networks in this research perform better because they reveal more gene relationships.
Journal of Multimedia, 2007
Reverse engineering of genetic regulatory networks from time-series microarray data is investigated. We propose a dynamic Bayesian network (DBN) model and a fully Bayesian learning scheme. The proposed DBN directly models the continuous expression levels and is associated with parameters that indicate the degree as well as the type of regulation. To learn the network from data, we propose a reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. The RJMCMC algorithm can provide not only more accurate inference results than the deterministic alternative algorithms but also an estimate of the a posteriori probabilities (APPs) of the network topology. The estimated APPs provide useful information on the confidence of the inferred results and can also be used for efficient Bayesian data integration. The proposed approach is tested on yeast cell cycle microarray data, and the results are compared with the KEGG pathway map.