Multicoil2: predicting coiled coils and their oligomerization states from sequence in the twilight zone

Jason Trigg et al. PLoS One. 2011.

Abstract

The alpha-helical coiled coil can adopt a variety of topologies, among the most common of which are parallel and antiparallel dimers and trimers. We present Multicoil2, an algorithm that predicts both the location and oligomerization state (two versus three helices) of coiled coils in protein sequences. Multicoil2 combines the pairwise correlations of the previous Multicoil method with the flexibility of Hidden Markov Models (HMMs) in a Markov Random Field (MRF). The resulting algorithm integrates sequence features, including pairwise interactions, through multinomial logistic regression to devise an optimized scoring function for distinguishing dimer, trimer and non-coiled-coil oligomerization states; this scoring function is used to produce Markov Random Field potentials that incorporate pairwise correlations localized in sequence. Multicoil2 significantly improves both coiled-coil detection and dimer versus trimer state prediction over the original Multicoil algorithm retrained on a newly constructed database of coiled-coil sequences. The new database, comprising 2,105 sequences containing 124,088 residues, includes reliable structural annotations based on experimental data in the literature. Notably, the enhanced performance of Multicoil2 is evident when tested in stringent leave-family-out cross-validation on the new database, reflecting expected performance on challenging new prediction targets that have minimal sequence similarity to known coiled-coil families. The Multicoil2 program and training database are available for download from http://multicoil2.csail.mit.edu.
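As an illustration of the three-way scoring step described in the abstract, the following is a minimal sketch of multinomial logistic regression in Python. It is not the Multicoil2 implementation: the function names, the feature vector and the coefficient values are all hypothetical, and the actual sequence features and fitted coefficients are defined in the paper's Methods.

```python
import math

# The three states distinguished in the abstract.
CLASSES = ("dimer", "trimer", "non-coiled-coil")


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Multinomial logistic regression: one linear score per class, then softmax.

    weights[k] is a hypothetical coefficient vector for class k;
    biases[k] is the matching intercept.
    """
    scores = [
        bias + sum(w * x for w, x in zip(class_weights, features))
        for class_weights, bias in zip(weights, biases)
    ]
    return dict(zip(CLASSES, softmax(scores)))


# Usage with made-up numbers: the first feature favours the dimer class.
probs = classify(
    features=[1.0, 0.0],
    weights=[[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]],
    biases=[0.0, 0.0, 0.0],
)
print(max(probs, key=probs.get))
```

In Multicoil2 the resulting scoring function is not used directly as a per-residue classifier; according to the abstract it is used to produce the MRF potentials.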

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Flow-chart overview of the Multicoil2 method.

From the training set of labelled dimer, trimer and negative (non-coiled-coil) sequences, we compute the probability of each amino acid at each heptad position in dimer sequences and in trimer sequences. We also compute the probability of each amino acid in negative sequences. From the resulting frequency tables shown in the upper left, along with the training sequences shown in the upper right, we compute sequence features for each training sequence. Running a multinomial logistic regression on these values generates a three-way classifier, which is used in the MRF as described in Methods.
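The frequency-table step in this caption can be sketched as follows. This is an illustrative Python example only: it assumes each positive training sequence comes with a register string of heptad letters (a to g) of the same length, which is a hypothetical input format, and it uses plain relative frequencies without any pseudocounts or smoothing the authors may apply.

```python
from collections import Counter, defaultdict


def heptad_frequencies(labelled):
    """Per-heptad-position amino acid frequencies.

    labelled: iterable of (sequence, register) pairs, where register is a
    string of heptad letters aligned residue by residue with the sequence
    (hypothetical input format).
    Returns {heptad_position: {amino_acid: relative_frequency}}.
    """
    counts = defaultdict(Counter)
    for sequence, register in labelled:
        if len(sequence) != len(register):
            raise ValueError("sequence and register must have equal length")
        for residue, position in zip(sequence, register):
            counts[position][residue] += 1
    return {
        position: {aa: n / sum(c.values()) for aa, n in c.items()}
        for position, c in counts.items()
    }


def background_frequencies(negatives):
    """Amino acid frequencies pooled over all negative sequences."""
    counts = Counter("".join(negatives))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


# Toy usage: one table would be built from dimers and another from trimers.
dimer_table = heptad_frequencies([("LEL", "ada"), ("VKL", "ada")])
print(dimer_table["a"])
```

Calling `heptad_frequencies` once on the dimer sequences and once on the trimer sequences, plus `background_frequencies` on the negatives, yields the three tables the caption refers to.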

Figure 2. Dimer versus trimer recognition.

Multicoil2 and Multicoil ROC curves based on leave-family-out cross-validation for per-residue (a) and per-sequence (b) recognition.

Figure 3. Coiled coil detection.

Multicoil2, retrained Multicoil and Paircoil2 ROC curves based on leave-family-out cross-validation for per-residue (a) and per-sequence (b) detection. Note the very small values on the x-axis for false positive rate.
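For readers unfamiliar with ROC curves such as those in Figures 2 and 3, the sketch below shows one common way to compute the points of such a curve from per-residue or per-sequence scores. It is a generic illustration with made-up scores, not the evaluation code used by the authors.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs.

    scores: predicted scores, higher meaning more likely positive.
    labels: 1 for true positives (e.g. coiled-coil), 0 for negatives.
    The threshold is swept from the highest score to the lowest, and tied
    scores are processed together.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy usage with invented scores.
curve = roc_points([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(curve, auc(curve))
```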

Figure 4. Overview of training of the Multicoil and Multicoil2 algorithms.

(a) The raw scores used to generate the Multicoil Gaussians for each of n-1 families are computed based on frequency tables generated from the other n-2 families. (b) The Multicoil2 sequence features for each of n-1 families are computed based on frequency tables generated from the other n-2 families. Those features are used to find the regression coefficients, which determine the MRF.
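The nested hold-out scheme in this caption can be sketched as a small generator. The example assumes that, of the n families, one is held out for testing, and that each of the remaining n-1 families is scored with tables built from the other n-2. The function name and the family labels are hypothetical.

```python
def nested_family_splits(family_names):
    """Yield (test_family, scored_family, table_families) triples.

    For each held-out test family, every one of the remaining n-1 families
    is scored using frequency tables built only from the other n-2
    families, so neither the test family nor the family being scored
    contributes to its own tables.
    """
    names = sorted(family_names)
    for test_family in names:
        for scored_family in names:
            if scored_family == test_family:
                continue
            table_families = [
                name for name in names
                if name not in (test_family, scored_family)
            ]
            yield test_family, scored_family, table_families


# Toy usage with three invented family labels.
for triple in nested_family_splits(["famA", "famB", "famC"]):
    print(triple)
```

Keeping the scored family out of its own frequency tables means the features used to fit the regression coefficients resemble the features the model will see on an unseen family.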

Figure 5. Graph of allowed transitions.

There are two copies of each state, one for the dimer state and one for the trimer state (the transitions of the trimer states are omitted because they are identical to the dimer transitions). The “0” nodes at the top and bottom of the figure refer to the same non-coiled-coil state.
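The state space described in this caption can be sketched in Python as below. The caption does not list the allowed transitions, so the rules here are assumptions made purely for illustration: coiled-coil states are taken to be the seven heptad positions a to g, a coiled-coil state may only advance to the next heptad position within the same oligomerization state, and any state may enter or leave the shared non-coiled-coil state "0". The transitions in the actual figure may differ.

```python
HEPTAD = "abcdefg"
NON_CC = "0"  # the single shared non-coiled-coil state


def build_states():
    """One non-coiled-coil state plus a dimer and a trimer copy of each
    heptad position (assumed state set)."""
    return [NON_CC] + [
        (oligomer, position)
        for oligomer in ("dimer", "trimer")
        for position in HEPTAD
    ]


def allowed(src, dst):
    """Assumed transition rule, not taken from the paper's figure."""
    if src == NON_CC or dst == NON_CC:
        return True  # assumption: free entry to and exit from state "0"
    (src_oligomer, src_pos), (dst_oligomer, dst_pos) = src, dst
    next_pos = HEPTAD[(HEPTAD.index(src_pos) + 1) % len(HEPTAD)]
    # assumption: no direct dimer-to-trimer switch, heptad order preserved
    return src_oligomer == dst_oligomer and dst_pos == next_pos


def is_valid_path(path):
    """Check that every consecutive pair of hidden states is allowed."""
    return all(allowed(src, dst) for src, dst in zip(path, path[1:]))


# Toy usage: a short trimer segment flanked by non-coiled-coil residues.
path = [NON_CC, ("trimer", "a"), ("trimer", "b"), ("trimer", "c"), NON_CC]
print(len(build_states()), is_valid_path(path))
```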

Figure 6. Example of hidden states corresponding to a given amino acid sequence in the Multicoil2 model.
