Accurate prediction of secreted substrates and identification of a conserved putative secretion signal for type III secretion systems

Ram Samudrala et al. PLoS Pathog. 2009 Apr.

Abstract

The type III secretion system is an essential component for virulence in many Gram-negative bacteria. Though components of the secretion system apparatus are conserved, its substrates (effector proteins) are not. We have used a novel computational approach to confidently identify new secreted effectors by integrating protein sequence-based features, including evolutionary measures such as the pattern of homologs in a range of other organisms, G+C content, amino acid composition, and the N-terminal 30 residues of the protein sequence. The method was trained on known effectors from the plant pathogen Pseudomonas syringae and validated on a set of effectors from the animal pathogen Salmonella enterica serovar Typhimurium (S. Typhimurium) after eliminating effectors with detectable sequence similarity. We show that this approach can predict known secreted effectors with high specificity and sensitivity. Furthermore, by considering a large set of effectors from multiple organisms, we computationally identify a common putative secretion signal in the N-terminal 20 residues of secreted effectors. This signal can be used to discriminate 46 out of 68 total known effectors from both organisms, suggesting that it is a real, shared signal applicable to many type III secreted effectors. We use the method to make novel predictions of secreted effectors in S. Typhimurium, some of which have been experimentally validated. We also apply the method to predict secreted effectors in the genetically intractable human pathogen Chlamydia trachomatis, identifying the majority of known secreted proteins in addition to providing a number of novel predictions. This approach provides a new way to identify secreted effectors in a broad range of pathogenic bacteria for further experimental characterization and provides insight into the nature of the type III secretion signal.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Accurate identification of type III secreted effectors using sequence data.

The sensitivity (TP/(TP+FN); solid lines) and specificity (TN/(FP+TN); dashed lines) of SIEVE predictions for S. Typhimurium effectors (PSY to STM model; red) and P. syringae effectors (STM to PSY model; blue) were calculated as a function of the SIEVE score threshold (X axis). The results show that both models perform well, providing a maximum sensitivity and specificity of about 90%. For example, 33 of 36 known S. Typhimurium effectors are in the top 10% of predictions.
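
The two quantities defined in this caption can be illustrated with a minimal Python sketch. It is not the SIEVE implementation: the function name, the scores, and the labels are made up for illustration, and it assumes that a protein is called a predicted effector when its score is at or above the threshold.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) at one score threshold.

    Assumption: a score >= threshold counts as a positive prediction.
    labels[i] is True for a known effector, False for a negative example.
    """
    tp = fn = tn = fp = 0
    for score, is_effector in zip(scores, labels):
        predicted = score >= threshold
        if is_effector and predicted:
            tp += 1
        elif is_effector:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    # Sensitivity = TP/(TP+FN); specificity = TN/(FP+TN), as in the caption.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (fp + tn) if (fp + tn) else float("nan")
    return sens, spec


# Hypothetical scores and labels, not data from the paper.
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
```

Evaluating the function over a range of thresholds would give curves of the kind plotted in the figure, one pair of values per threshold.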

Figure 2. Delineating the length of the type III secretion signal.

A. The performance of SIEVE on S. Typhimurium (PSY to STM model; red) and P. syringae (STM to PSY model; blue) was evaluated using the ROC area under the curve metric described in the text (Y axes). Models were trained using the indicated number of residues from the N-termini of the examples (X axis) and tested on the complete testing set (i.e., the entire set of positive and negative examples from the other organism). Maximum performance of both models was at approximately 30 residues (asterisks), suggesting that this might be the maximum length of a secretion signal. B. From the analysis in panel A, we calculated the difference from the maximum ROC value (at 29 residues for the PSY to STM model and 32 for the STM to PSY model) for each sequence length and divided this by the standard error for that sequence length (difference from maximum, Y axis; sequence length, X axis). This shows the significance of each sequence length, with values below 2.0 (grey area) having insignificant differences (as judged using standard error). For S. Typhimurium effectors (PSY to STM model), the longest sequence length that is significantly different from the maximum value is 21 residues, and for the P. syringae effectors (STM to PSY model) it is 16 residues. These lengths agree generally with previous estimates of secretion signal length.
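
The calculation described for panel B can be sketched as follows. This is only an illustration under stated assumptions: the ROC values and standard errors are invented, the function names are hypothetical, and the caption does not say how the standard error was estimated, so it is taken here as a given input.

```python
def difference_from_maximum(auc_by_length, se_by_length):
    """Map each sequence length to (max AUC - AUC) / standard error.

    Both arguments are dicts keyed by the number of N-terminal residues.
    The standard errors are assumed to be supplied by the caller.
    """
    best = max(auc_by_length.values())
    return {
        length: (best - auc) / se_by_length[length]
        for length, auc in auc_by_length.items()
    }


def longest_significant_length(scaled_diff, cutoff=2.0):
    """Longest length whose scaled difference is at or above the cutoff.

    Values below the cutoff (2.0 in the caption) are treated as not
    significantly different from the maximum. Returns None if no length
    qualifies.
    """
    significant = [n for n, value in scaled_diff.items() if value >= cutoff]
    return max(significant) if significant else None


# Hypothetical ROC AUC values and standard errors, not data from the paper.
auc_by_length = {10: 0.80, 16: 0.88, 20: 0.93, 29: 0.95}
se_by_length = {10: 0.02, 16: 0.02, 20: 0.02, 29: 0.02}

scaled = difference_from_maximum(auc_by_length, se_by_length)
longest = longest_significant_length(scaled)
```

With these made-up numbers, lengths 10 and 16 fall above the cutoff and length 20 falls below it, so 16 is reported as the longest length that still differs significantly from the best-performing one.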

Figure 3. Identification of a shared sequence motif in type III secreted effectors.

We identified the features (sequence locations and residue types) with the greatest ability to classify S. Typhimurium and P. syringae secreted effectors (see text and Figure S4). The residue type with the highest positive weight is shown in bold for each position, followed by the other residue types that were also found to be significant. Amino acids with a negative weight are also shown. Positions with an “x” have no representation in the minimal set. Grey background indicates sequence positions where both models agree (for at least one amino acid type). It is important to note that this does not represent a consensus sequence, since there is very little similarity between individual effector signals (see Table S4). Rather, it shows those sequence positions and amino acid types that SIEVE found particularly helpful in discriminating between the secreted effectors and negative examples.
