Constraining the metabolic genotype-phenotype relationship using a phylogeny of in silico methods

Review

Nathan E Lewis et al. Nat Rev Microbiol. 2012.

Abstract

Reconstructed microbial metabolic networks facilitate a mechanistic description of the genotype-phenotype relationship through the deployment of constraint-based reconstruction and analysis (COBRA) methods. As reconstructed networks leverage genomic data for insight and phenotype prediction, the development of COBRA methods has accelerated following the advent of whole-genome sequencing. Here, we describe a phylogeny of COBRA methods that has rapidly evolved from the few early methods, such as flux balance analysis and elementary flux mode analysis, into a repertoire of more than 100 methods. These methods have enabled genome-scale analysis of microbial metabolism for numerous basic and applied uses, including antibiotic discovery, metabolic engineering and modelling of microbial community behaviour.

Figures

Figure 1. Fundamentals of the genome-scale metabolic genotype-phenotype relationship

The COBRA approach is based on three fundamental concepts: network constraints (a–d), objective functions (e), and the association of reactions with the genome (f). (a) A complex mixture of molecules (red) can react to yield end products (blue). (b) The stoichiometry of this reaction network is described mathematically in a stoichiometric matrix, with each column representing the stoichiometry of a reaction. Negative and positive values represent reactants and products, respectively. Reaction flux is limited by thermodynamics and catalytic capacities (Vm=Vmax), described by upper and lower bounds on flux for each reaction (green). (c) Reaction constraints result in a “solution space” that contains all feasible flux distributions. Additional constraints (e.g., mass balance, the steady-state assumption, and measured metabolite consumption rates) reduce the space of feasible flux distributions, as shown by the pink line. (d) In vivo biochemical networks involve additional complexity. Gene regulation can change the abundance of catalysts (e.g., for the transformation of D to E). Components are also often localized in different organelles (e.g., E and F), which blocks reactions between them. (e) The biomass objective function describes an evolutionary pressure for microbial growth and the metabolic demands of making the basic metabolite building blocks for all cellular components (e.g., membranes, macromolecules, and ATP). (f) Metabolism is associated with the genome by mathematically linking the genome to transcripts, proteins, and chemical reactions. The gene-protein-reaction schema describes gene associations in the models and provides an interface for the integration of high-throughput data.
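
The constraints described in panels (b) and (c) can be illustrated with a small sketch. The three-reaction network, the bound values, and the function names below are hypothetical and are not taken from the figure; the code only checks whether a candidate flux vector satisfies steady-state mass balance and the per-reaction bounds.

```python
# Hypothetical toy network (not from the figure):
#   R1: -> A      (uptake)
#   R2: A -> B
#   R3: B ->      (secretion)
# Rows are metabolites (A, B); columns are reactions (R1, R2, R3).
# Negative entries are reactants, positive entries are products.
S = [
    [1, -1, 0],   # A: produced by R1, consumed by R2
    [0, 1, -1],   # B: produced by R2, consumed by R3
]

# Assumed lower and upper flux bounds for each reaction.
lower = [0.0, 0.0, 0.0]
upper = [10.0, 10.0, 10.0]


def mass_balance(S, v):
    """Return S * v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_feasible(S, v, lower, upper, tol=1e-9):
    """True if v is at steady state (S * v = 0) and within its bounds."""
    balanced = all(abs(x) <= tol for x in mass_balance(S, v))
    in_bounds = all(lo - tol <= v_j <= hi + tol
                    for lo, v_j, hi in zip(lower, v, upper))
    return balanced and in_bounds


print(is_feasible(S, [2.0, 2.0, 2.0], lower, upper))     # balanced, in bounds
print(is_feasible(S, [2.0, 1.0, 1.0], lower, upper))     # A accumulates
print(is_feasible(S, [12.0, 12.0, 12.0], lower, upper))  # exceeds upper bound
```

Every flux vector that passes both checks lies in the solution space sketched in panel (c); adding measured uptake rates would simply tighten the corresponding entries of `lower` and `upper`.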

Figure 2. The “phylogeny” of constraint-based modeling methods

In recent years, the constraint-based modeling community has expanded rapidly. Because of the versatility and scalability of these models, more than 100 methods have been developed for their modeling and analysis, all based on the analysis of the underlying metabolic network structure (i.e., the stoichiometric matrix). A phylogenetic tree depicts the similarities in the application and use of the methods and, for many of them, in the underlying algorithms. See Supplementary Table 1 for a more complete list of methods and their descriptions.

Figure 3. Flux balance analysis (FBA)

(a) In FBA, a cellular objective (e.g., biomass production) is optimized. This provides the predicted flux for each reaction in the network. (b) FBA solutions are typically not unique, i.e., there are alternate optimal solutions that use different pathways to achieve the same objective value (e.g., growth rate). (c) Additional constraints can be applied to reduce the size of the solution space, and may remove competing optimal solutions, or (d) change the optimal solution. If the optimal solution is moved, then the choice of the new optimal solution may depend on the solver and/or algorithm, as shown for the MOMA method. (e) The addition of constraints can enhance predictions. For example, when constraints on molecular crowding are added, the model-predicted order of substrate metabolism is consistent with experimental observation. Panel e reproduced from, Copyright 2007, National Academy of Sciences, USA. NTPs, nucleoside triphosphates; AAs, amino acids; FVA, flux variability analysis; v, reaction flux; μmax, predicted maximum growth rate.
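
Panels (a) and (b) can be sketched as a small linear program. The example below is a minimal illustration, assuming a hypothetical four-reaction network with two parallel routes from A to B and using SciPy's general-purpose `linprog` solver rather than a dedicated COBRA package; the helper names `fba` and `fva` are introduced here for illustration only.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not from the figure):
#   R0: -> A (uptake, limited to 10), R1: A -> B, R2: A -> B (parallel route),
#   R3: B -> (stand-in for the biomass reaction)
S = np.array([
    [1, -1, -1,  0],   # metabolite A
    [0,  1,  1, -1],   # metabolite B
], dtype=float)
bounds = [(0, 10), (0, 1000), (0, 1000), (0, 1000)]
objective = np.array([0.0, 0.0, 0.0, 1.0])   # maximize flux through R3


def fba(S, bounds, objective):
    """Maximize objective . v subject to S v = 0 and the flux bounds."""
    res = linprog(-objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")   # linprog minimizes
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun, res.x


def fva(S, bounds, objective, reaction, tol=1e-9):
    """Min and max flux of one reaction while the objective stays optimal."""
    optimum, _ = fba(S, bounds, objective)
    # Keep objective . v >= optimum - tol, written as an upper-bound row.
    A_ub = -objective.reshape(1, -1)
    b_ub = np.array([-(optimum - tol)])
    c = np.zeros(S.shape[1])
    c[reaction] = 1.0
    common = dict(A_ub=A_ub, b_ub=b_ub, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    low = linprog(c, **common)
    high = linprog(-c, **common)
    return low.fun, -high.fun


growth, fluxes = fba(S, bounds, objective)
r1_min, r1_max = fva(S, bounds, objective, reaction=1)
print("optimal objective:", growth)
print("R1 range at optimum:", r1_min, r1_max)
```

In this toy case the optimal objective equals the uptake limit, while the flux through R1 can vary across its whole range without changing it, which is the non-uniqueness shown in panel (b). Which of the alternate optima `fba` returns depends on the solver.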

Figure 4. Principles of model-guided strain design

(a) Non-growth-coupled production strains show a decrease in product yield over time, whereas growth-coupled strains can enhance product yield. (b) Growth-coupled strain designs are predicted to force product secretion when the strain grows optimally. Several methods have been developed to predict growth-coupled production strains by modeling reaction deletion, gene deletion, or reaction addition. Different reaction deletion algorithms, such as OptKnock, Objective tilting, and RobustKnock, can provide different optimal growth-coupled strain designs because of algorithmic differences. (d) Many algorithms predict the set of reactions that must be blocked to obtain a desired product. However, methods like OptGene and GDLS provide a more realistic view by modeling genetic modifications, since some gene products catalyze multiple reactions and other reactions are spontaneous.
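
The distinction in panel (d) between deleting reactions and deleting genes can be sketched with gene-protein-reaction rules. The rule table, gene names, and function below are hypothetical and do not reproduce OptGene or GDLS; they only show how a gene deletion maps onto the set of reactions that can no longer carry flux.

```python
# Hypothetical gene-protein-reaction (GPR) rules, for illustration only.
# Each reaction maps to a list of enzyme complexes, and each complex is the
# set of genes it requires. Any intact complex is sufficient (isozymes, OR);
# every gene in a complex is required (AND). An empty list marks a
# spontaneous reaction that has no gene association.
GPR = {
    "R1": [{"geneA"}],
    "R2": [{"geneA"}],              # geneA's product catalyzes R1 and R2
    "R3": [{"geneB"}, {"geneC"}],   # two isozymes
    "R4": [{"geneD", "geneE"}],     # two-subunit complex
    "R5": [],                       # spontaneous reaction
}


def blocked_reactions(gpr, deleted_genes):
    """Return the reactions that lose every catalyzing complex."""
    deleted = set(deleted_genes)
    blocked = set()
    for reaction, complexes in gpr.items():
        if not complexes:
            continue  # spontaneous reactions cannot be removed genetically
        # A complex is broken if any of its genes is deleted.
        if all(complex_genes & deleted for complex_genes in complexes):
            blocked.add(reaction)
    return blocked


print(blocked_reactions(GPR, ["geneA"]))            # both R1 and R2 are lost
print(blocked_reactions(GPR, ["geneB"]))            # isozyme geneC covers R3
print(blocked_reactions(GPR, ["geneB", "geneC"]))   # now R3 is lost
```

A strain design expressed as reaction deletions is only realizable if some gene set produces exactly that blocked set, which is why single-reaction knockouts of R1 alone, or any knockout of R5, are not available in this toy table.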

Figure 5. Refining thermodynamic constraints

Thermodynamic constraints in COBRA models can be refined. (a) For example, when a metabolic network is not adequately constrained, metabolites can cycle infinitely in loops. By analogy with Kirchhoff’s loop law for electrical circuits, such cycling is thermodynamically infeasible. (b) Thus, methods like ll-FVA, which applies the loopless-COBRA constraints to flux variability analysis, can systematically remove these loops by adding a constraint that limits flux to the regions of the solution space that are not involved in these loops.
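
The loops in panel (a) correspond to nonzero flux vectors that balance every metabolite using internal reactions alone. The sketch below assumes a hypothetical three-reaction cycle and finds such vectors as the null space of the internal stoichiometric matrix with NumPy. It only detects candidate loops; it is not the loopless-COBRA formulation, which adds constraints to an optimization problem to exclude them.

```python
import numpy as np

# Hypothetical internal cycle (no exchange reactions):
#   R1: A -> B, R2: B -> C, R3: C -> A
S_int = np.array([
    [-1,  0,  1],   # A
    [ 1, -1,  0],   # B
    [ 0,  1, -1],   # C
], dtype=float)


def null_space_basis(S, tol=1e-10):
    """Columns span {v : S v = 0}, computed from the SVD of S."""
    _, singular_values, vt = np.linalg.svd(S)
    rank = int(np.sum(singular_values > tol))
    return vt[rank:].T


loops = null_space_basis(S_int)

# Rescale the single basis vector so that its first entry is 1.
loop = loops[:, 0] / loops[0, 0]
print("number of independent loops:", loops.shape[1])
print("loop flux pattern (R1, R2, R3):", loop)
```

Here equal flux through all three reactions leaves A, B, and C balanced without any uptake or secretion, so flux around the cycle is unbounded unless an extra constraint removes it.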

Figure 6. Incorporating and inferring regulation

(a) Signaling, transcription regulation, and metabolism are interlinked in the cell. Therefore, integrating the networks may provide more holistic modeling of organisms. Two primary paradigms exist in COBRA modeling for integrating transcription regulation and metabolism. (b) Algorithms such as GIMME and MBA use high-throughput data and model simulations to identify which pathways were likely expressed and active in the cells when the data were sampled. This results in a tailored, context-specific representation of the metabolic network. (c) Algorithms such as rFBA, iFBA, and SR-FBA incorporate detailed mathematical representations of the known molecular mechanisms of transcription regulation. These approaches contain binary regulatory logic that dictates, under a specific signal, which metabolic pathways are suppressed and cannot carry flux. (d) Hybrid methods, such as PROM, are emerging, in which transcriptomic data are used to infer the regulatory network. This allows for the elucidation of novel regulatory interactions and their immediate incorporation into model simulations. PROM also uses probabilistic measures to allow for a more continuous regulation of reaction flux. For example, Gene 2 is tightly regulated by a transcription factor (TF). Thus, when the TF is activated by a signal, the flux of the reaction associated with Gene 2 is more tightly constrained than that of Gene 1, which is only loosely regulated.
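
The continuous regulation described in panel (d) can be sketched as a probability-scaled flux bound. The example below is loosely inspired by that idea and is not the published PROM algorithm: the binarized expression samples, the maximum flux value, and the function names are all invented for illustration.

```python
# Hypothetical binarized expression samples (1 = on, 0 = off).
samples = [
    {"TF": 1, "gene1": 1, "gene2": 0},
    {"TF": 1, "gene1": 1, "gene2": 0},
    {"TF": 1, "gene1": 1, "gene2": 0},
    {"TF": 1, "gene1": 0, "gene2": 1},
    {"TF": 0, "gene1": 1, "gene2": 1},
    {"TF": 0, "gene1": 1, "gene2": 1},
]


def probability_on(samples, gene, tf, tf_state):
    """Estimate P(gene is on | TF is in tf_state) from the samples."""
    matching = [s for s in samples if s[tf] == tf_state]
    if not matching:
        raise ValueError("no samples with the requested TF state")
    return sum(s[gene] for s in matching) / len(matching)


def scaled_upper_bound(vmax, probability):
    """Scale a reaction's maximum flux by the probability its gene is on."""
    return probability * vmax


VMAX = 10.0  # assumed unregulated upper bound for both reactions

p_gene1 = probability_on(samples, "gene1", "TF", tf_state=1)
p_gene2 = probability_on(samples, "gene2", "TF", tf_state=1)
ub_gene1 = scaled_upper_bound(VMAX, p_gene1)
ub_gene2 = scaled_upper_bound(VMAX, p_gene2)
print("gene1 reaction upper bound:", ub_gene1)
print("gene2 reaction upper bound:", ub_gene2)
```

In these made-up samples, gene2 is rarely on when the TF is active, so its reaction receives the tighter bound, mirroring the Gene 1 and Gene 2 contrast in the caption. A Boolean method of the kind in panel (c) would instead set the bound to either zero or `VMAX`.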

Figure 7. Integrating COBRA methods to study community interactions

COBRA methods are providing insight into the metabolic interactions in various types of microbial communities. (a) To study the mutualistic behavior of co-dependent mutant E. coli, researchers used MOMA to simulate synergistic growth of pairs of auxotrophic E. coli. (b) Shadow prices from FBA simulations of these pairs were used to compute cooperation efficiencies between strains, which were subsequently compared with measured fitness improvements. (c) Competition in communities was modeled using DMMM to understand how communities of Geobacter and Rhodoferax compete for resources, and how the demographics vary under different nutrient ratios, thereby affecting the efficiency of bioremediation efforts. Host-pathogen interactions between M. tuberculosis and a human macrophage were studied using COBRA. (d) While transcriptomic data were employed to build host-pathogen models at different stages of infection, the cellular objective of internalized M. tuberculosis is not known, so refinements to the objective function were predicted from transcriptomic data to account for changes in required amounts of compounds like lipids and amino acids (AAs). (e) This information was used to compute flux states of internalized M. tuberculosis with MCMC sampling. This demonstrated a suppression of central metabolism and activation of the glyoxylate shunt, represented here by enolase and isocitrate lyase, respectively. The role of communities in evolution has been studied using reductive evolutionary simulation. In particular, this method predicted the minimal set of genes needed for Buchnera to grow in the rich innards of the aphid. The predicted minimal gene sets (f) and temporal order of gene loss (g) were consistent with the gene content and phylogenetic structure of several Buchnera species.
