Finding nuclear localization signals

M Cokol et al. EMBO Rep. 2000 Nov.

Abstract

A variety of nuclear localization signals (NLSs) are experimentally known although only one motif was available for database searches through PROSITE. We initially collected a set of 91 experimentally verified NLSs from the literature. Through iterated 'in silico mutagenesis' we then extended the set to 214 potential NLSs. This final set matched in 43% of all known nuclear proteins and in no known non-nuclear protein. We estimated that >17% of all eukaryotic proteins may be imported into the nucleus. Finally, we found an overlap between the NLS and DNA-binding region for 90% of the proteins for which both the NLS and DNA-binding regions were known. Thus, evolution seemed to have used part of the existing DNA-binding mechanism when compartmentalizing DNA-binding proteins into the nucleus. However, only 56 of our 214 NLS motifs overlapped with DNA-binding regions. These 56 NLSs enabled a de novo prediction of partial DNA-binding regions for approximately 800 proteins in human, fly, worm and yeast.

Figures

Fig. 1. Simplified scheme for nuclear import. After nuclear proteins are synthesized in the cytoplasm, transport receptors such as members of the importin or transportin families bind to the NLS. The importin/NLS-protein (or transportin/protein) complex is then actively transported into the nucleus through nuclear pores, a process involving the Ran GTPase cycle. Currently, this is the only known mechanism for nuclear import (Mattaj and Englmeier, 1998; Weis, 1998).

Fig. 2. NLS motif also used for DNA binding. Close-up of the interface between DNA and the P55-C-fos proto-oncogene protein [note that the other parts of the amazing crystal structure of the complex, PDB code 1a02 (Chen et al., 1998), are not shown]. The coloured region corresponds to the residues RRERNKMAAAKSRNRRR. This motif is also contained in our data set of potential NLS motifs. Colouring scheme: basic residues are shown in red, all others in yellow. Image created with RASMOL (Sayle and Milner-White, 1995).
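The colouring scheme in Fig. 2 can be reproduced for the motif sequence with a few lines of Python. This is only an illustrative sketch: the function name `colour_residues` is hypothetical, and treating K, R and H as the basic residues is an assumption rather than a definition taken from the paper.

```python
# Assumption: "basic" means lysine (K), arginine (R) and histidine (H).
BASIC_RESIDUES = set("KRH")


def colour_residues(sequence):
    """Pair each residue with 'red' (basic) or 'yellow' (any other residue)."""
    return [
        (residue, "red" if residue in BASIC_RESIDUES else "yellow")
        for residue in sequence
    ]


motif = "RRERNKMAAAKSRNRRR"  # motif highlighted in Fig. 2
colours = colour_residues(motif)
n_basic = sum(1 for _, colour in colours if colour == "red")

print("".join("+" if colour == "red" else "." for _, colour in colours))
print(f"{n_basic} of {len(motif)} residues are basic")
```

The printed mask marks basic positions with `+`, which makes the clustering of basic residues at both ends of the motif easy to see.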

Fig. 3. Scheme for the concept of ‘in silico mutagenesis’. We started the search with the hypothetical motif GNKAKRQRST. We searched the data sets of proteins known to be nuclear and of proteins known to be non-nuclear for the presence of this motif. In this particular example, two nuclear proteins and one non-nuclear protein matched. Because we required 100% accuracy for all motifs, we did not include GNKAKRQRST in our data set of potential motifs. Note that this particular example was one of many failed attempts to generalize an experimental NLS.
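The acceptance test described in Fig. 3 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the toy sequences are invented, and the motif is matched as an exact substring, whereas the actual search may have used more general patterns.

```python
def motif_hits(motif, proteins):
    """Return the names of proteins whose sequence contains the motif."""
    return [name for name, seq in proteins.items() if motif in seq]


def accept_motif(motif, nuclear, non_nuclear):
    """Accept a candidate only if it hits at least one nuclear protein
    and no non-nuclear protein (the 100% accuracy requirement)."""
    nuclear_hits = motif_hits(motif, nuclear)
    non_nuclear_hits = motif_hits(motif, non_nuclear)
    accepted = bool(nuclear_hits) and not non_nuclear_hits
    return accepted, nuclear_hits, non_nuclear_hits


# Hypothetical toy data sets; real sets would come from annotated databases.
nuclear = {
    "nuc_A": "MSTGNKAKRQRSTLLE",
    "nuc_B": "AAGNKAKRQRSTPPK",
    "nuc_C": "MKRPAATKKAGQAKKKK",
}
non_nuclear = {
    "cyt_A": "MLGNKAKRQRSTVV",
    "cyt_B": "MASLLEEGVVA",
}

result = accept_motif("GNKAKRQRST", nuclear, non_nuclear)
print(result)
```

With this toy data the candidate matches two nuclear proteins and one non-nuclear protein, so it is rejected, mirroring the failed example in the figure. Iterating the same test over many mutated variants of an experimental NLS would yield the extended motif set.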
