Zhang, C. & DeLisi, C. Protein folds: molecular systematics in three dimensions. Cell. Mol. Life Sci.58, 72–79 (2001). ArticleCAS Google Scholar
Rost, B. Did evolution leap to create the protein universe? Curr. Opin. Struct. Biol.12, 409–416 (2002). ArticleCAS Google Scholar
Dayhoff, M. The origin and evolution of protein superfamilies. Fed. Proc.35, 2132–2138 (1976). CASPubMed Google Scholar
Dayhoff, M. O., Barker, W. C. & Hunt, L. T. Establishing homologies in protein sequences. Methods Enzymol.91, 524–545 (1983). ArticleCAS Google Scholar
Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol.247, 536–540 (1995). CASPubMed Google Scholar
Murzin, A. G. Structural classification of proteins: new superfamilies. Curr. Opin. Struct. Biol.6, 386–394 (1996). ArticleCAS Google Scholar
Orengo, C. A. et al. CATH—a hierarchic classification of protein domain structures. Structure5, 1093–1108 (1997). ArticleCAS Google Scholar
Todd, A. E., Orengo, C. A. & Thornton, J. M. Evolution of function in protein superfamilies, from a structural perspective. J. Mol. Biol.307, 1113–1143 (2001). ArticleCAS Google Scholar
Lo Conte, L., Brenner, S. E., Hubbard, T. J., Chothia, C. & Murzin, A. G. SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res.30, 264–267 (2002). ArticleCAS Google Scholar
Orengo, C. A. et al. The CATH protein family database: a resource for structural and functional annotation of genomes. Proteomics2, 11–21 (2002). ArticleCAS Google Scholar
Branden, C.-I & Tooze, J. Introduction to Protein Structure (Garland Publishing, New York, 1999). Google Scholar
Anantharaman, V., Koonin, E. V. & Aravind, L. Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res.30, 1427–1464 (2002). ArticleCAS Google Scholar
Anantharaman, V., Koonin, E. V. & Aravind, L. Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains. J. Mol. Biol.307, 1271–1292 (2001). ArticleCAS Google Scholar
Saraste, M., Sibbald, P. R. & Wittinghofer, A. The P-loop—a common motif in ATP- and GTP-binding proteins. Trends Biochem. Sci.15, 430–434 (1990). Article Google Scholar
Koonin, E. V. A superfamily of ATPases with diverse functions containing either classical or deviant ATP-binding motif. J. Mol. Biol.229, 1165–1174 (1993). ArticleCAS Google Scholar
Aravind, L., Mazumder, R., Vasudevan, S. & Koonin, E. V. Trends in protein evolution inferred from sequence and structure analysis. Curr. Opin. Struct. Biol.12, 392–399 (2002). ArticleCAS Google Scholar
Galperin, M. Y., Walker, D. R. & Koonin, E. V. Analogous enzymes: independent inventions in enzyme evolution. Genome Res.8, 779–790 (1998). ArticleCAS Google Scholar
Martin, A. C. et al. Protein folds and functions. Structure6, 875–884 (1998). ArticleCAS Google Scholar
Fitch, W. M. Distinguishing homologous from analogous proteins. Syst. Zool.19, 99–113 (1970). ArticleCAS Google Scholar
Fitch, W. M. Homology a personal view on some of the problems. Trends Genet.16, 227–231 (2000). ArticleCAS Google Scholar
Tatusov, R. L., Koonin, E. V. & Lipman, D. J. A genomic perspective on protein families. Science278, 631–637 (1997). ArticleADSCAS Google Scholar
Tatusov, R. L., Galperin, M. Y., Natale, D. A. & Koonin, E. V. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res.28, 33–36 (2000). ArticleCAS Google Scholar
Jordan, I. K., Makarova, K. S., Spouge, J. L., Wolf, Y. I. & Koonin, E. V. Lineage-specific gene expansions in bacterial and archaeal genomes. Genome Res.11, 555–565 (2001). ArticleCAS Google Scholar
Remm, M., Storm, C. E. & Sonnhammer, E. L. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J. Mol. Biol.314, 1041–1052 (2001). ArticleCAS Google Scholar
Lespinet, O., Wolf, Y. I., Koonin, E. V. & Aravind, L. The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res.12, 1048–1059 (2002). ArticleCAS Google Scholar
Henikoff, S. et al. Gene families: the taxonomy of protein paralogs and chimeras. Science278, 609–614 (1997). ArticleADSCAS Google Scholar
Alexandrov, N. N. & Go, N. Biological meaning, statistical significance, and classification of local spatial similarities in nonhomologous proteins. Protein Sci.3, 866–875 (1994). ArticleCAS Google Scholar
Orengo, C. A., Jones, D. T. & Thornton, J. M. Protein superfamilies and domain superfolds. Nature372, 631–634 (1994). ArticleADSCAS Google Scholar
Zuckerkandl, E. The appearance of new structures and functions in proteins during evolution. J. Mol. Evol.7, 1–57 (1975). ArticleADSCAS Google Scholar
Chothia, C. One thousand families for the molecular biologist. Nature357, 543–544 (1992). ArticleADSCAS Google Scholar
Zhang, C. T. Relations of the numbers of protein sequences, families and folds. Protein Eng.10, 757–761 (1997). ArticleCAS Google Scholar
Wang, Z. X. A re-estimation for the total numbers of protein folds and superfamilies. Protein Eng.11, 621–626 (1998). ArticleCAS Google Scholar
Zhang, C. & DeLisi, C. Estimating the number of protein folds. J. Mol. Biol.284, 1301–1305 (1998). ArticleCAS Google Scholar
Govindarajan, S., Recabarren, R. & Goldstein, R. A. Estimating the total number of protein folds. Proteins35, 408–414 (1999). ArticleCAS Google Scholar
Wolf, Y. I., Grishin, N. V. & Koonin, E. V. Estimating the number of protein folds and families from complete genome data. J. Mol. Biol.299, 897–905 (2000). ArticleCAS Google Scholar
Coulson, A. F. & Moult, J. A unifold, mesofold, and superfold model of protein fold use. Proteins46, 61–71 (2002). ArticleCAS Google Scholar
Kuznetsov, V. A. in Computational and Statistical Approaches to Genomics (eds Zhang, W. & Shmulevich, I.) 125–171 (Kluwer, Boston, 2002). Google Scholar
Karev, G. P., Wolf, Y. I., Rzhetsky, A. Y., Berezovskaya, F. S. & Koonin, E. V. in Computational Genomics: from Sequence to Function (eds Galperin, M. Y. & Koonin, E. V.) (Horizon, Amsterdam, in the press).
Karev, G. P., Wolf, Y. I., Rzhetsky, A. Y., Berezovskaya, F. S. & Koonin, E. V. Birth and death of protein domains: a simple model of evolution explains power law behavior. BMC Evol. Biol. (in the press).
Huynen, M. A. & van Nimwegen, E. The frequency distribution of gene family sizes in complete genomes. Mol. Biol. Evol.15, 583–589 (1998). ArticleCAS Google Scholar
Qian, J., Luscombe, N. M. & Gerstein, M. Protein family and fold occurrence in genomes: power-law behaviour and evolutionary model. J. Mol. Biol.313, 673–681 (2001). ArticleCAS Google Scholar
Harrison, P. M. & Gerstein, M. Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J. Mol. Biol.318, 1155–1174 (2002). ArticleCAS Google Scholar
Luscombe, N., Qian, J., Zhang, Z., Johnson, T. & Gerstein, M. The dominance of the population by a selected few: power-law behaviour applies to a wide variety of genomic properties. Genome Biol.3, research0040.1–0040.7 (2002). Article Google Scholar
Bilke, S. & Peterson, C. Topological properties of citation and metabolic networks. Phys. Rev. E64, 036106-1–036106-5 (2001). ArticleADS Google Scholar
Barabasi, A. L. Linked: The New Science of Networks (Perseus, New York, 2002). Google Scholar
Albert, R. & Barabasi, A. L. Statistical mechanics of complex networks. Rev. Mod. Phys.74, 47–97 (2002). ArticleADSMathSciNet Google Scholar
Gisiger, T. Scale invariance in biology: coincidence or footprint of a universal mechanism? Biol. Rev. Camb. Phil. Soc.76, 161–209 (2001). ArticleCAS Google Scholar
Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. The large-scale organization of metabolic networks. Nature407, 651–654 (2000). ArticleADSCAS Google Scholar
Zipf, G. K. Human Behaviour and the Principle of Least Effort (Addison-Wesley, Boston, 1949). Google Scholar
Pareto, V. Cours d'Economie Politique (Rouge et Cie, Paris, 1897). Google Scholar
Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabasi, A. L. Hierarchical organization of modularity in metabolic networks. Science297, 1551–1555 (2002). ArticleADSCAS Google Scholar
Jeong, H., Mason, S. P., Barabasi, A. L. & Oltvai, Z. N. Lethality and centrality in protein networks. Nature411, 41–42 (2001). ArticleADSCAS Google Scholar
Li, H., Helling, R., Tang, C. & Wingreen, N. Emergence of preferred structures in a simple model of protein folding. Science273, 666–669 (1996). ArticleADSCAS Google Scholar
Li, H., Tang, C. & Wingreen, N. S. Are protein folds atypical? Proc. Natl Acad. Sci. USA95, 4987–4990 (1998). ArticleADSCAS Google Scholar
Rzhetsky, A. & Gomez, S. M. Birth of scale-free molecular networks and the number of distinct DNA and protein domains per genome. Bioinformatics17, 988–996 (2001). ArticleCAS Google Scholar
Yule, G. U. A mathematical theory of evolution, based on the conclusions of Dr. J.C. Willis, F.R.S. Phil. Trans. R. Soc. Lond. B213, 21–87 (1924). Article Google Scholar
Gould, S. J. The Structure of Evolutionary Theory (Harvard Univ. Press, Cambridge, MA, 2002). Book Google Scholar
Doolittle, W. F. Phylogenetic classification and the universal tree. Science284, 2124–2129 (1999). ArticleCAS Google Scholar
Doolittle, W. F. You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. Trends Genet14, 307–311 (1998). ArticleCAS Google Scholar
Koonin, E. V., Makarova, K. S. & Aravind, L. Horizontal gene transfer in prokaryotes: quantification and classification. Annu. Rev. Microbiol.55, 709–742 (2001). ArticleCAS Google Scholar
Ragan, M. A. Detection of lateral gene transfer among microbial genomes. Curr. Opin. Genet. Dev.11, 620–626 (2001). ArticleCAS Google Scholar
Marcotte, E. M. et al. Detecting protein function and protein-protein interactions from genome sequences. Science285, 751–753 (1999). ArticleCAS Google Scholar
Enright, A. J., Illopoulos, I., Kyrpides, N. C. & Ouzounis, C. A. Protein interaction maps for complete genomes based on gene fusion events. Nature402, 86–90 (1999). ArticleADSCAS Google Scholar
Galperin, M. Y. & Koonin, E. V. Who's your neighbor? New computational approaches for functional genomics. Nature Biotechnol.18, 609–613 (2000). ArticleCAS Google Scholar
Aravind, L. Guilt by association: contextual information in genome analysis. Genome Res.10, 1074–1077 (2000). ArticleCAS Google Scholar
Koonin, E. V., Aravind, L. & Kondrashov, A. S. The impact of comparative genomics on our understanding of evolution. Cell101, 573–576 (2000). ArticleCAS Google Scholar
Lander, E. S. et al. Initial sequencing and analysis of the human genome. Nature409, 860–921 (2001). ArticleADSCAS Google Scholar
Wolf, Y. I., Brenner, S. E., Bash, P. A. & Koonin, E. V. Distribution of protein folds in the three superkingdoms of life. Genome Res.9, 17–26 (1999). CASPubMed Google Scholar
Wuchty, S. Scale-free behavior in protein domain networks. Mol. Biol. Evol.18, 1694–1702 (2001). ArticleCAS Google Scholar
Apic, G., Gough, J. & Teichmann, S. A. An insight into domain combinations. Bioinformatics17 (Suppl. 1), S83–S89 (2001). Article Google Scholar
Bork, P. et al. A superfamily of conserved domains in DNA damage-responsive cell cycle checkpoint proteins. FASEB J.11, 68–76 (1997). ArticleCAS Google Scholar
Derbyshire, D. J. et al. Crystal structure of human 53BP1 BRCT domains bound to p53 tumour suppressor. EMBO J.21, 3863–3872 (2002). ArticleCAS Google Scholar
Vitkup, D., Melamud, E., Moult, J. & Sander, C. Completeness in structural genomics. Nature Struct. Biol.8, 559–566 (2001). ArticleCAS Google Scholar
Marchler-Bauer, A. et al. CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res.30, 281–283 (2002). ArticleCAS Google Scholar