Protein similarity networks reveal relationships among sequence, structure, and function within the Cupin superfamily

Richard Uberto et al. PLoS One. 2013.

Abstract

The cupin superfamily is extremely diverse and includes catalytically inactive seed storage proteins, sugar-binding metal-independent epimerases, and metal-dependent enzymes possessing dioxygenase, decarboxylase, and other activities. Although numerous proteins of this superfamily have been structurally characterized, the functions of many of them have not been experimentally determined. We report the first use of protein similarity networks (PSNs) to visualize trends of sequence and structure in order to make functional inferences in this remarkably diverse superfamily. PSNs provide a way to visualize relatedness of structure and sequence among a given set of proteins. Structure- and sequence-based clustering of cupin members reflects functional clustering. Networks based only on cupin domains and networks based on the whole proteins provide complementary information. Domain-clustering supports phylogenetic conclusions that the N- and C-terminal domains of bicupin proteins evolved independently. Interestingly, although many functionally similar enzymatic cupin members bind the same active site metal ion, the structure and sequence clustering does not correlate with the identity of the bound metal. It is anticipated that the application of PSNs to this superfamily will inform experimental work and influence the functional annotation of databases.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Structures of representative members of the cupin superfamily.

A. oxalate oxidase (PDB code: 2et1), B. oxalate decarboxylase (PDB code: 1uw8), C. seed storage protein Ara h (PDB code: 3s7i), D. NovW, a 4-keto-6-deoxy sugar epimerase (PDB code: 2c0z), E. cysteine dioxygenase (PDB code: 2q4s), F. phosphomannose isomerase (PDB code: 1pmi), G. acireductone dioxygenase (PDB code: 1zrr), H. taurine/alpha-ketoglutarate dioxygenase (PDB code: 1os7), I. hypoxia-inducible factor 1-alpha inhibitor (PDB code: 2y0i), J. lysine-specific demethylase 6B (PDB code: 2xue). β-sheets are shown in green, α-helices are shown in red, and random coils are shown in grey. Spheres represent bound metal ions. Figures were generated using PyMOL (The PyMOL Molecular Graphics System, Schrödinger, LLC).

Figure 2. Metal-binding sites of representative members of the cupin superfamily.

A. Mn ion of oxalate oxidase coordinated by His88, His90, Glu95, and His137 (PDB code: 2et1), B. N-terminal Mn ion of oxalate decarboxylase coordinated by His95, His97, Glu101, and His140 (PDB code: 1uw8), C. C-terminal Mn ion of oxalate decarboxylase coordinated by His273, His275, Glu280, and His319 (PDB code: 1uw8), D. Ni ion of cysteine dioxygenase coordinated by His86, His88, and His140 (PDB code: 2q4s), E. Zn ion of phosphomannose isomerase coordinated by Gln111, His113, Glu138, and His285 (PDB code: 1pmi), F. Ni ion of acireductone dioxygenase coordinated by His96, His98, Glu102, and His140 (PDB code: 1zrr), G. Fe ion of taurine/alpha-ketoglutarate dioxygenase coordinated by His99, Asp101, and His255 (PDB code: 1os7), H. Fe ion of hypoxia-inducible factor 1-alpha inhibitor coordinated by His199, Asp201, and His279 (PDB code: 2y0i), I. Fe ion of lysine-specific demethylase 6B coordinated by His1390, Glu1392, and His1470 (PDB code: 2xue), J. Zn ion of lysine-specific demethylase 6B coordinated by Cys1575, Cys1578, Cys1602, and Cys1605 (not part of the cupin domain) (PDB code: 2xue). β-sheets are shown in green, α-helices are shown in red, and random coils are shown in grey. Spheres represent bound metal ions. Figures were generated using PyMOL (The PyMOL Molecular Graphics System, Schrödinger, LLC).

Figure 3. Structure similarity networks of the cupin protein structures colored by metal ligand.

Pairwise similarities for a non-redundant set of 183 structures from the Pfam cupin clan (CL0029) were calculated using TM-align. Each node represents a structure. Nodes were arranged using the yFiles organic layout of Cytoscape version 3.0. A. Edges between nodes were drawn only if the average TM-score for that edge was >0.53. At this cutoff, the average r.m.s.d. is 2.91 Å with an average of 158.0 Cα atoms aligned. B. Edges between nodes were drawn only if the average TM-score for that edge was >0.65. At this cutoff, the average r.m.s.d. is 2.44 Å with an average of 185.4 Cα atoms aligned.
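The thresholding step described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the "average TM-score" of an edge is the mean of the two TM-scores that TM-align reports for a pair (one normalized by each chain length), and the node names and scores below are invented placeholders, not values from the study.

```python
# Minimal sketch: draw an edge only when the average TM-score exceeds a cutoff,
# then group nodes into connected clusters.
# Assumption: "average TM-score" = mean of the two length-normalized scores.

def build_edges(tm_scores, cutoff):
    """tm_scores maps (node_a, node_b) -> (tm_norm_by_a, tm_norm_by_b)."""
    edges = []
    for (a, b), (tm_a, tm_b) in tm_scores.items():
        average = (tm_a + tm_b) / 2.0
        if average > cutoff:
            edges.append((a, b, average))
    return edges

def clusters(nodes, edges):
    """Return connected components using a simple union-find."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b, _ in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=lambda g: sorted(g))

# Hypothetical toy data (placeholder names and scores).
toy_nodes = ["s1", "s2", "s3", "s4"]
toy_scores = {
    ("s1", "s2"): (0.70, 0.66),
    ("s2", "s3"): (0.58, 0.54),
    ("s1", "s3"): (0.50, 0.48),
    ("s3", "s4"): (0.40, 0.36),
}

loose = clusters(toy_nodes, build_edges(toy_scores, 0.53))
strict = clusters(toy_nodes, build_edges(toy_scores, 0.65))
print(loose)
print(strict)
```

Raising the cutoff from 0.53 to 0.65 removes the weaker edges, so a single cluster at the looser threshold can split into several smaller ones at the stricter threshold, which is the contrast between panels A and B.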

Figure 4. Structure similarity networks of cupin domains colored by metal ligand.

Pairwise similarities for a non-redundant set of 213 domains from the Pfam cupin clan (CL0029) were calculated using TM-align. Each node represents a domain. Nodes were arranged using the yFiles organic layout of Cytoscape version 3.0. A. Edges between nodes were drawn only if the average TM-score for that edge was >0.53. At this cutoff, the average r.m.s.d. is 1.73 Å with an average of 74.2 Cα atoms aligned. B. Edges between nodes were drawn only if the average TM-score for that edge was >0.65. At this cutoff, the average r.m.s.d. is 1.42 Å with an average of 80.1 Cα atoms aligned.

Figure 5. Structure similarity networks colored by species and function.

This is the same network as in Figure 3B, but colored according to A. species and B. function.

Figure 6. Sequence similarity networks colored by metal ligand.

Networks were generated by all-by-all BLAST comparisons of the 183 sequences corresponding to the unique cupin structures shown in Figure 3. Nodes were arranged using the yFiles organic layout of Cytoscape version 3.0. A. Edges between nodes are drawn only if the E-value is better than 1E-3.5. At this cutoff, edges represent alignments with a median 32.1% identity over 93 residues. B. Edges between nodes are drawn only if the E-value is better than 1E-6.0. At this cutoff, edges represent alignments with a median 36.2% identity over 185 residues.
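The E-value filtering described in this caption can be sketched in Python as follows. This is an illustrative reading, not the authors' code. It assumes BLAST tabular output (`-outfmt 6`, where column 11 is the E-value), reads "1E-3.5" as 10^-3.5, treats "better" as a lower E-value, and keeps the best E-value per unordered pair. That last choice is an assumption, since the caption does not say how reciprocal hits were merged. The sequence names and hit values are invented placeholders.

```python
# Minimal sketch: turn all-by-all BLAST hits into undirected network edges.
# Assumptions: tab-separated outfmt 6 lines; best (lowest) E-value kept per pair.

def blast_edges(lines, log10_cutoff):
    """Return {(id_a, id_b): evalue} for pairs with E-value below 10**log10_cutoff."""
    cutoff = 10.0 ** log10_cutoff
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query == subject:
            continue  # ignore self-hits
        pair = tuple(sorted((query, subject)))
        if pair not in best or evalue < best[pair]:
            best[pair] = evalue
    return {pair: e for pair, e in best.items() if e < cutoff}

# Hypothetical toy hits (placeholder identifiers and values).
toy_hits = [
    "p1\tp2\t36.0\t180\t110\t3\t1\t180\t5\t184\t1e-8\t55.0",
    "p2\tp1\t36.0\t180\t110\t3\t5\t184\t1\t180\t2e-8\t54.0",
    "p1\tp3\t31.0\t95\t62\t2\t10\t104\t8\t102\t1e-4\t38.0",
    "p1\tp1\t100.0\t200\t0\t0\t1\t200\t1\t200\t1e-100\t400.0",
    "p2\tp3\t28.0\t60\t41\t1\t20\t79\t15\t74\t0.5\t25.0",
]

edges_a = blast_edges(toy_hits, -3.5)  # looser cutoff, as in panel A
edges_b = blast_edges(toy_hits, -6.0)  # stricter cutoff, as in panel B
print(sorted(edges_a))
print(sorted(edges_b))
```

As with the structure-based networks, the stricter cutoff keeps only the most significant alignments, so fewer edges survive and clusters become smaller.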

Grants and funding

This work was supported by the National Science Foundation (MCB-1041912) to EWM (http://www.nsf.gov). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
