The outcomes of pathway database computations depend on pathway ontology


M L Green et al. Nucleic Acids Res. 2006.

Abstract

Different biological notions of pathways are used in different pathway databases. Those pathway ontologies significantly impact pathway computations. Computational users of pathway databases will obtain different results depending on the pathway ontology used by the databases they employ, and different pathway ontologies are preferable for different end uses. We explore differences in pathway ontologies by comparing the BioCyc and KEGG ontologies. The BioCyc ontology defines a pathway as a conserved, atomic module of the metabolic network of a single organism, i.e. often regulated as a unit, whose boundaries are defined at high-connectivity stable metabolites. KEGG pathways are on average 4.2 times larger than BioCyc pathways, and combine multiple biological processes from different organisms to produce a substrate-centered reaction mosaic. We compared KEGG and BioCyc pathways using genome context methods, which determine the functional relatedness of pairs of genes. For each method we employed, a pair of genes randomly selected from a BioCyc pathway is more likely to be related by that method than is a pair of genes randomly selected from a KEGG pathway, supporting the conclusion that the BioCyc pathway conceptualization is closer to a single conserved biological process than is that of KEGG.
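The pair-sampling comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the toy gene identifiers, the toy pathway groupings and the sampling scheme (pick a pathway uniformly, then pick two distinct genes from it) are all assumptions made for illustration, and the set of related pairs stands in for the output of any one genome context method.

```python
import random


def related_pair_fraction(pathways, related_pairs, n_samples=1000, seed=0):
    """Estimate how often two genes drawn from the same pathway are related.

    pathways: dict mapping a pathway name to a set of gene identifiers.
    related_pairs: set of frozensets, each holding two genes that some
        genome context method reports as functionally related.
    """
    rng = random.Random(seed)
    # Only pathways with at least two genes can yield a gene pair.
    eligible = [sorted(genes) for genes in pathways.values() if len(genes) >= 2]
    if not eligible:
        raise ValueError("no pathway contains two or more genes")
    hits = 0
    for _ in range(n_samples):
        genes = rng.choice(eligible)   # assumed scheme: uniform over pathways
        a, b = rng.sample(genes, 2)    # two distinct genes from that pathway
        if frozenset((a, b)) in related_pairs:
            hits += 1
    return hits / n_samples


# Hypothetical toy data: two small, module-like pathways versus one large
# map that merges the same four genes.
small_pathways = {"P1": {"g1", "g2"}, "P2": {"g3", "g4"}}
large_pathways = {"M1": {"g1", "g2", "g3", "g4"}}
related = {frozenset(("g1", "g2")), frozenset(("g3", "g4"))}

small_fraction = related_pair_fraction(small_pathways, related)
large_fraction = related_pair_fraction(large_pathways, related)
print(small_fraction, large_fraction)
```

In this toy setup every pair drawn from the small pathways is related, while only two of the six possible pairs in the merged map are, so the merged map yields a lower fraction. That mirrors the direction of the reported result without reproducing any of its data.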

Figures

Figure 1

(A) Pathway for biosynthesis of pantothenate from 2-keto-isovalerate. (B) Pathway for biosynthesis of coenzyme A from pantothenate. Of the 204 BioCyc organisms, 31 include the pathway for biosynthesis of coenzyme A, but lack the pantothenate biosynthesis pathway. Another 21 organisms include the pathway for pantothenate biosynthesis, but lack the coenzyme A biosynthesis pathway. A total of 108 organisms include both pathways, while 44 lack both pathways.

Figure 2

Predicted pathway for teichoic acid biosynthesis in S.meliloti. Only where enzyme names and gene names are shown has an enzyme catalyzing a reaction in this pathway been identified in S.meliloti. Of the 204 BioCyc organisms, 155 include the branch of the teichoic acid biosynthesis pathway that synthesizes UDP-D-glucose. Only 38 organisms include any of the remaining reactions in the pathway, and only 31 of these include reactions from both branches.

Figure 3

Size distribution of the KEGG and EcoCyc metabolic pathways.

Figure 4

Number of conserved gene neighbors. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.

Figure 5

Number of gene pairs with similar phylogenetic profiles. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.

Figure 6

Number of gene pairs occurring in a predicted gene cluster. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.

Figure 7

Number of gene pairs related by gene fusion events. Each pair of genes is selected randomly from a single KEGG metabolic map or from a single BioCyc metabolic pathway.

Figure 8

Number of conserved gene neighbors and similar phylogenetic profiles randomly selected from EcoCyc superpathways compared to standard EcoCyc pathways and KEGG E.coli metabolic maps.
