Combining Probabilistic Graphical Model-based and Knowledge-based Methods for Automatic Reconstruction of Metabolic Pathways

Reconstruction of metabolic pathways by combining probabilistic graphical model-based and knowledge-based methods

BMC proceedings, 2014

Automatic reconstruction of metabolic pathways for an organism from genomics and transcriptomics data has been a challenging and important problem in bioinformatics. Traditionally, known reference pathways are mapped onto organism-specific ones based on the organism's genome annotation and protein homology. However, this simple knowledge-based mapping method might produce incomplete pathways and generally cannot predict unknown new relations and reactions. In contrast, ab initio metabolic network construction methods can predict novel reactions and interactions, but their accuracy tends to be low, leading to many false positives. Here we combine existing pathway knowledge with a new ab initio Bayesian probabilistic graphical model in a novel fashion to improve automatic reconstruction of metabolic networks. Specifically, we built a knowledge database containing known, individual gene/protein interactions and metabolic reactions extracted from existing reference pathways. Known...

Prediction of metabolic pathways from genome-scale metabolic networks

Bio Systems, 2011

Metabolic Pathway Extraction Using Combined Probabilistic Models

International Journal of Bio-Science and Bio-Technology, 2012

Extracting the metabolic pathways that dictate a specific biological response from microarray gene expression data is currently one of the important disciplines in systems biology research. However, the complexity of the global metabolic network and the need to maintain biological structure make this a considerable challenge. Previous methods have successfully identified such pathways, but because they do not consider the genetic effects of the genes or the relationships among them, their representation of the underlying structure is imprecise and cannot be shown to be biologically significant. In this article, probabilistic models capable of identifying the significant pathways through metabolic networks related to a specific biological response are implemented. The article combines two probabilistic models to address the limitations of previous methods, with annotation against a pathway database to ensure that each pathway is biologically plausible.

Reconstruction of Metabolic Networks Using Incomplete Information

1995

This paper describes an approach that uses methods for automated sequence analysis and multiple databases accessed through an object+attribute view of the data (Baehr et al. 1992), together with metabolic pathways, reaction equations, and compounds parsed into a logical representation from the Enzyme and Metabolic Pathway Database (Selkov, Yunus, et al. 1994), as the sources of data for automatically reconstructing a weighted partial metabolic network for a prokaryotic organism. Additional information can be provided interactively by the expert user to guide reconstruction.

Bayesian Integrative Modeling of Genome-Scale Metabolic and Regulatory Networks

Informatics

The integration of high-throughput data to build predictive computational models of cellular metabolism is a major challenge of systems biology. These models are needed to predict cellular responses to genetic and environmental perturbations. Typically, this response involves both metabolic regulations related to the kinetic properties of enzymes and a genetic regulation affecting their concentrations. Thus, the integration of the transcriptional regulatory information is required to improve the accuracy and predictive ability of metabolic models. Integrative modeling is of primary importance to guide the search for various applications such as discovering novel potential drug targets to develop efficient therapeutic strategies for various diseases. In this paper, we propose an integrative predictive model based on techniques combining semantic web, probabilistic modeling, and constraint-based modeling methods. We applied our approach to human cancer metabolism to predict in silico ...

Inference of pathways from metabolic networks by subgraph extraction

… international workshop on …, 2007

In this work, we present different algorithmic approaches to the inference of metabolic pathways from metabolic networks. Metabolic pathway inference can be applied to uncover the biological function of sets of co-expressed, enzyme-coding genes. We compare the kWalks algorithm based on random walks and an alternative approach relying on k-shortest paths. We study the influence of various parameters on the pathway inference accuracy, which we measure on a set of 71 reference metabolic pathways. The results illustrate that kWalks is significantly faster and has a higher sensitivity but the positive predictive value is better for the pair-wise k-shortest path algorithm. This finding motivated the design of a hybrid approach, which reaches an average accuracy of 72% for the given set of reference pathways.
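The two measures named in this abstract can be illustrated with a minimal sketch. It assumes an inferred pathway and its reference pathway are each given as a set of node identifiers; the function name and the identifiers `R1` to `R5` are hypothetical. The abstract does not state how the two values are combined into the reported accuracy, so that step is not attempted here.

```python
def sensitivity_ppv(inferred, reference):
    """Return (sensitivity, positive predictive value) for one pathway.

    Both arguments are iterables of node identifiers (assumed format).
    """
    inferred, reference = set(inferred), set(reference)
    tp = len(inferred & reference)   # reference nodes that were recovered
    fn = len(reference - inferred)   # reference nodes that were missed
    fp = len(inferred - reference)   # inferred nodes not in the reference
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, ppv


# Hypothetical example: 2 of 4 reference nodes recovered, 1 spurious node.
sn, ppv = sensitivity_ppv({"R1", "R2", "R5"}, {"R1", "R2", "R3", "R4"})
```

A method that returns large subgraphs tends to raise sensitivity at the cost of PPV, which is the trade-off the abstract reports between kWalks and the pairwise k-shortest path algorithm.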

Substructure Analysis of Metabolic Pathways by Graph-Based Relational Learning

Biomedical Data and Applications, 2009

Systems biology has become a major field of post-genomic bioinformatics research. A biological network containing various objects and their relationships is a fundamental way to represent a bio-system. A graph consisting of vertices and edges between these vertices is a natural data structure to represent biological networks. Substructure analysis of metabolic pathways by graph-based relational learning provides us with biologically meaningful substructures for system-level understanding of organisms. This chapter presents a graph representation of metabolic pathways to describe all features of metabolic pathways and describes the application of graph-based relational learning for structure analysis on metabolic pathways in both supervised and unsupervised scenarios. We show that the learned substructures can not only distinguish between two kinds of biological networks and generate hierarchical clusters for better understanding of them, but also have important biological meaning.

A new algorithm for Predicting Metabolic Pathways

The reconstruction of the metabolic network of an organism based on its genome sequence is a key challenge in systems biology. The aim of the work described here is to develop a new algorithm to predict pathway classes and individual pathways for a previously unknown query molecule. The main idea is to use a dense graph in which compounds are represented as vertices and enzymes as edges, with weights assigned to the edges according to previously known pathways. A shortest-path algorithm is applied for each missing enzyme in a pathway. A pathway is considered to belong to an organism if the total cost between the initial and final compound is higher than a threshold. Validation experiments show that the suggested algorithm is capable of classifying more than 90% of pathways correctly.
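The graph representation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: Dijkstra's algorithm stands in for the unspecified shortest-path method, and the compound names, enzyme labels, and edge weights are hypothetical. The threshold test on total cost is left out because the abstract does not say how the threshold is chosen.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra's algorithm on a compound graph.

    graph maps a compound to a list of (neighbor, enzyme, weight) edges.
    Returns (total_cost, enzymes_along_path); cost is inf if unreachable.
    """
    dist = {source: 0.0}
    prev = {}  # compound -> (previous compound, enzyme used to reach it)
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, enzyme, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = (node, enzyme)
                heapq.heappush(heap, (nd, neighbor))
    if target not in dist:
        return float("inf"), []
    enzymes = []
    node = target
    while node != source:
        node, enzyme = prev[node]
        enzymes.append(enzyme)
    enzymes.reverse()
    return dist[target], enzymes


# Hypothetical compounds A, B, C connected by enzymes E1, E2, E3.
graph = {
    "A": [("B", "E1", 1.0), ("C", "E2", 4.0)],
    "B": [("C", "E3", 1.5)],
    "C": [],
}
cost, path = shortest_path(graph, "A", "C")
```

Here the two-step route through B (cost 2.5, via E1 then E3) is preferred over the direct edge E2 (cost 4.0), so the returned enzyme list names the candidate enzymes that would fill the gap between A and C.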

Predicting Metabolic Pathways by Sub-network Extraction

Methods in molecular biology (Clifton, NJ), 2012

Various methods result in groups of functionally related genes obtained from genomes (operons, regulons, synteny groups, and phylogenetic profiles), transcriptomes (co-expression groups) and proteomes (modules of interacting proteins). When such groups contain two or more enzyme-coding genes, graph analysis methods can be applied to extract a metabolic pathway that interconnects them.

An Algorithm to Assemble Gene-Protein-Reaction Associations for Genome-Scale Metabolic Model Reconstruction

Lecture Notes in Computer Science, 2012

The considerable growth in the number of sequenced genomes and recent advances in the fields of Bioinformatics and Systems Biology have provided several genome-scale metabolic models (GSMs) that have been used to provide phenotype simulation methods. Given their importance in biomedical research and biotechnology applications (e.g. in Metabolic Engineering efforts), several workflows and computational platforms have been proposed for GSM reconstruction. One of the challenges of these methods is the assignment of gene-protein-reaction (GPR) associations, which make it possible to add transcriptional/translational information to GSMs, a task typically addressed through manual literature curation. This work proposes a novel algorithm to create a set of GPR rules, based on the integration of the information provided by the genome annotation with information on protein composition and function (protein complexes, sub-units, iso-enzymes, etc.) provided by the UniProt database. The methods are validated by using two state-of-the-art models for E. coli and S. cerevisiae, with competitive results.
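As an illustration of what a GPR rule encodes, the sketch below uses the common Boolean convention in which sub-units of a protein complex are joined with "and" and alternative iso-enzymes with "or". It is not the algorithm proposed in the paper and does not query UniProt; the function names and the gene identifiers `geneA`, `geneB`, and `geneC` are hypothetical.

```python
def build_gpr(isoenzymes):
    """Build a Boolean GPR rule string.

    isoenzymes: list of alternatives, each a list of gene ids whose
    products form one functional enzyme (a complex if more than one).
    """
    alternatives = [sorted(set(subunits)) for subunits in isoenzymes if subunits]
    terms = []
    for genes in alternatives:
        term = " and ".join(genes)
        # Parenthesize complexes only when they sit next to other alternatives.
        if len(genes) > 1 and len(alternatives) > 1:
            term = f"({term})"
        terms.append(term)
    return " or ".join(terms)


def reaction_active(isoenzymes, present_genes):
    """True if at least one alternative has all of its genes present."""
    return any(
        bool(subunits) and all(g in present_genes for g in subunits)
        for subunits in isoenzymes
    )


# Hypothetical reaction: a two-subunit complex or a single iso-enzyme.
alternatives = [["geneA", "geneB"], ["geneC"]]
rule = build_gpr(alternatives)
active = reaction_active(alternatives, {"geneA", "geneC"})
```

With these inputs the rule reads `(geneA and geneB) or geneC`, and the reaction is considered active because `geneC` alone satisfies the second alternative even though the complex is incomplete.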