The outcomes of pathway database computations depend on pathway ontology
M L Green et al. Nucleic Acids Res. 2006.
Abstract
Different biological notions of pathways are used in different pathway databases. Those pathway ontologies significantly impact pathway computations. Computational users of pathway databases will obtain different results depending on the pathway ontology used by the databases they employ, and different pathway ontologies are preferable for different end uses. We explore differences in pathway ontologies by comparing the BioCyc and KEGG ontologies. The BioCyc ontology defines a pathway as a conserved, atomic module of the metabolic network of a single organism, i.e. often regulated as a unit, whose boundaries are defined at high-connectivity stable metabolites. KEGG pathways are on average 4.2 times larger than BioCyc pathways, and combine multiple biological processes from different organisms to produce a substrate-centered reaction mosaic. We compared KEGG and BioCyc pathways using genome context methods, which determine the functional relatedness of pairs of genes. For each method we employed, a pair of genes randomly selected from a BioCyc pathway is more likely to be related by that method than is a pair of genes randomly selected from a KEGG pathway, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.
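The pair-based comparison described in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical input format (a dict of pathway name to gene ids, and a set of gene pairs that one genome context method reports as related) and a simple sampling scheme (pick a pathway uniformly, then a random pair of its genes); neither the function name nor the sampling details are taken from the paper.

```python
import random

def related_pair_fraction(pathways, related_pairs, n_samples=1000, seed=0):
    """Estimate how often a random same-pathway gene pair is related.

    pathways: dict of pathway name -> list of gene ids (hypothetical format)
    related_pairs: set of frozenset({gene_a, gene_b}) reported as related
                   by one genome context method (hypothetical format)
    """
    rng = random.Random(seed)
    # Only pathways with at least two distinct genes can yield a pair.
    eligible = [sorted(set(g)) for g in pathways.values() if len(set(g)) >= 2]
    if not eligible:
        raise ValueError("need at least one pathway with two or more genes")
    hits = 0
    for _ in range(n_samples):
        genes = rng.choice(eligible)   # assumption: pathways sampled uniformly
        a, b = rng.sample(genes, 2)    # two distinct genes from that pathway
        if frozenset((a, b)) in related_pairs:
            hits += 1
    return hits / n_samples

# Toy data: two small modules versus one merged map over the same genes.
related = {frozenset(("a", "b")), frozenset(("c", "d"))}
small_modules = {"p1": ["a", "b"], "p2": ["c", "d"]}
merged_map = {"m": ["a", "b", "c", "d"]}

small_fraction = related_pair_fraction(small_modules, related)
large_fraction = related_pair_fraction(merged_map, related)
print(small_fraction, large_fraction)
```

In this toy setup every pair drawn from the small modules is related, while the merged map also yields unrelated cross-module pairs, so its fraction is lower. This mirrors the direction of the reported BioCyc versus KEGG difference, not its magnitude.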
Figures
Figure 1
(A) Pathway for biosynthesis of pantothenate from 2-keto-isovalerate. (B) Pathway for biosynthesis of coenzyme A from pantothenate. Of the 204 BioCyc organisms, 31 include the pathway for biosynthesis of coenzyme A, but lack the pantothenate biosynthesis pathway. Another 21 organisms include the pathway for pantothenate biosynthesis, but lack the coenzyme A biosynthesis pathway. A total of 108 organisms include both pathways, while 44 lack both pathways.
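As a quick check on the counts in this caption, the sketch below tabulates the four organism groups and summarizes how often the two pathways co-occur. The Jaccard index used here is only an illustrative summary statistic chosen for this example, not a measure reported in the paper.

```python
# Organism counts taken from the Figure 1 caption (204 BioCyc organisms).
both = 108       # pantothenate and coenzyme A biosynthesis both present
coa_only = 31    # coenzyme A biosynthesis present, pantothenate absent
pan_only = 21    # pantothenate biosynthesis present, coenzyme A absent
neither = 44     # both pathways absent

total = both + coa_only + pan_only + neither

# Jaccard index of the two presence/absence profiles (illustrative choice).
jaccard = both / (both + coa_only + pan_only)

# Share of all organisms in which exactly one of the two pathways occurs.
discordant = (coa_only + pan_only) / total

print(total, round(jaccard, 3), round(discordant, 3))
```

The four groups add up to the 204 organisms stated in the caption, and roughly a quarter of them carry only one of the two pathways, which is the kind of independent occurrence the caption points to.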
Figure 2
Predicted pathway for teichoic acid biosynthesis in S.meliloti. Only where enzyme names and gene names are shown has an enzyme catalyzing a reaction in this pathway been identified in S.meliloti. Of the 204 BioCyc organisms, 155 include the branch of the teichoic acid biosynthesis pathway that synthesizes UDP-D-glucose. Only 38 organisms include any of the remaining reactions in the pathway, and only 31 of these include reactions from both branches.
Figure 3
Size distribution of the KEGG and EcoCyc metabolic pathways.
Figure 4
Number of conserved gene neighbors. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.
Figure 5
Number of gene pairs with similar phylogenetic profiles. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.
Figure 6
Number of gene pairs occurring in a predicted gene cluster. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.
Figure 7
Number of gene pairs related by gene fusion events. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.
Figure 8
Number of conserved gene neighbors and similar phylogenetic profiles randomly selected from EcoCyc superpathways compared to standard EcoCyc pathways and KEGG E.coli metabolic maps.
Grants and funding
- R01 RR007861/RR/NCRR NIH HHS/United States
- R33 GM070065/GM/NIGMS NIH HHS/United States
- R01 RR007861-14/RR/NCRR NIH HHS/United States
- GM70065/GM/NIGMS NIH HHS/United States
- RR07861/RR/NCRR NIH HHS/United States
- R33 GM070065-01/GM/NIGMS NIH HHS/United States