High-throughput expression of C. elegans proteins

Genome Res. 2004 Oct;14(10B):2102-10.

doi: 10.1101/gr.2520504.

Shihong Qiu, James B Finley, Mike Carson, Rita J Gray, Wenying Huang, David Johnson, Jun Tsao, Jérôme Reboul, Philippe Vaglio, David E Hill, Marc Vidal, Lawrence J Delucas, Ming Luo

Chi-Hao Luan et al. Genome Res. 2004 Oct.

Abstract

Proteome-scale studies of protein three-dimensional structures should provide valuable information for both investigating basic biology and developing therapeutics. Critical for these endeavors is the expression of recombinant proteins. We selected Caenorhabditis elegans as our model organism in a structural proteomics initiative because of the high quality of its genome sequence and the availability of its ORFeome, protein-encoding open reading frames (ORFs), in a flexible recombinational cloning format. We developed a robotic pipeline for recombinant protein expression, applying the Gateway cloning/expression technology and utilizing a stepwise automation strategy on an integrated robotic platform. Using the pipeline, we have carried out heterologous protein expression experiments on 10,167 ORFs of C. elegans. With one expression vector and one Escherichia coli strain, protein expression was observed for 4854 ORFs, and 1536 were soluble. Bioinformatics analysis of the data indicates that protein hydrophobicity is a key determining factor for an ORF to yield a soluble expression product. This protein expression effort has investigated the largest number of genes in any organism to date. The pipeline described here is applicable to high-throughput expression of recombinant proteins for other species, both prokaryotic and eukaryotic, provided that ORFeome resources become available.
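The counts in the abstract can be turned into success rates with simple arithmetic. The sketch below uses only the three numbers reported above; the function and variable names are illustrative and do not come from the paper.

```python
# Counts reported in the abstract.
ORFS_TESTED = 10167
ORFS_EXPRESSED = 4854
ORFS_SOLUBLE = 1536


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Fraction of all tested ORFs that gave detectable expression.
expressed_pct = percent(ORFS_EXPRESSED, ORFS_TESTED)
# Fraction of all tested ORFs that gave a soluble product.
soluble_pct = percent(ORFS_SOLUBLE, ORFS_TESTED)
# Fraction of expressed ORFs whose product was soluble.
soluble_of_expressed_pct = percent(ORFS_SOLUBLE, ORFS_EXPRESSED)

print(f"expressed: {expressed_pct}% of tested ORFs")
print(f"soluble: {soluble_pct}% of tested ORFs")
print(f"soluble: {soluble_of_expressed_pct}% of expressed ORFs")
```

Roughly half of the tested ORFs expressed, and about a third of those yielded soluble protein.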

Figures

Figure 1

Reproducibility of protein expression in 96-well format. The data are from two experiments of protein expression, purification, and solubility profiling for plate 19. The data demonstrate good reproducibility for identifying soluble proteins in the multigene, multistep experiments. The data on the left, in 96-well format, are ELISA readings of OD (optical density) at 405 nm for the supernatant (soluble fraction) of the expression, with each well corresponding to one ORF. The maximum OD reading is 4.0. Color coding describes the level of protein expression: red for high, orange for medium, and yellow for low. On the right, listed with the OD for each soluble protein, is σ, calculated as the ratio of the OD reading for the supernatant (soluble fraction, data shown) to that for the pellet (insoluble fraction, data not shown) from the same expression experiment of an ORF. The OD and σ values combined serve as the solubility signature of a protein.
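The solubility signature described in this caption can be sketched as a small function. This is a minimal illustration, assuming only the definition given above (σ is the supernatant OD divided by the pellet OD); the function name, the error handling, and the example readings are hypothetical and not taken from the paper.

```python
def solubility_signature(od_supernatant, od_pellet):
    """Return (OD, sigma) for one ORF.

    sigma is the ratio of the ELISA OD reading at 405 nm for the
    supernatant (soluble fraction) to that for the pellet (insoluble
    fraction) from the same expression experiment.
    """
    if od_pellet <= 0:
        # The caption does not say how a zero pellet reading is treated;
        # raising an error here is an assumption of this sketch.
        raise ValueError("pellet OD must be positive to compute sigma")
    sigma = od_supernatant / od_pellet
    return od_supernatant, sigma


# Hypothetical readings for a single well (not data from the paper).
od, sigma = solubility_signature(2.0, 0.5)
print(f"OD = {od}, sigma = {sigma}")
```

A σ above 1 means more signal in the soluble fraction than in the pellet for that ORF.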

Figure 2

Temperature dependence of total and soluble protein expression in 96-well format. More soluble proteins were found at 18°C; more proteins were expressed at 37°C. For this plate, the total protein expression was about 40%, whereas the soluble expression was ∼12%. The data are OD readings from 96-well format ELISA for ORFs in plate 51. Color coding is the same as in Figure 1.

Figure 3

Comparison of expression vector pDEST17.1 vs. pET15G for protein expression. Protein expression of the genes in the four plates was carried out using pDEST17.1 (plots on left) and pET15G (plots on right). The number of soluble proteins was increased in all cases by using pET15G. The data are represented as histograms of OD readings at 405 nm from ELISA vs. ORFs in the 96-well plate.

Figure 4

SDS-polyacrylamide gels showing the 1-L scale-up expressions of 12 soluble proteins. The expression levels in the gel are typical for the C. elegans proteins expressed in E. coli. The bands for 76D4, 76F6, 56D11, and 76F10 represent high-level expression, with a yield greater than 10 mg per liter of bacterial cell culture (mg/L). The bands for 37F1, 37G9, and 37C1 represent medium-level expression, from 6 to 10 mg/L, whereas the bands for 37D4, 37B8, and 37D8 represent low-level expression, from 3 to 6 mg/L.

Figure 5

Histograms (top) of GRAVY, molecular weight, and isoelectric point for the 10,167 ORFs studied (white), expressed (yellow), and soluble (pink), and percentage plots (bottom) of the expressed and soluble ORFs over the studied, respectively, vs. the three parameters. Correlation between protein expression and GRAVY is apparent in the plots. The lack of a correlation to isoelectric point and molecular weight is indicated by the flatness of the curves in the percentage plots.
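GRAVY (grand average of hydropathy) is conventionally the sum of per-residue hydropathy values divided by the sequence length. The sketch below assumes the standard Kyte-Doolittle scale; this page does not state which scale the authors used, and the function name and example sequences are illustrative only.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
# Assumption: this is the conventional scale for GRAVY; the source
# page does not say which scale was used in the study.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence."""
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence must not be empty")
    # A KeyError is raised for any non-standard residue letter.
    total = sum(KYTE_DOOLITTLE[aa] for aa in residues)
    return total / len(residues)


# Toy sequences (not ORFs from the study): one hydrophobic, one charged.
print(gravy("LIVALIV"))
print(gravy("KDERKDE"))
```

Higher (more positive) values indicate a more hydrophobic sequence, which is the property the figure relates to expression and solubility.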

Figure 6

Expression data in relation to GRAVY, signal peptide, and transmembrane helices, as demonstrated on 87 ORFs for plate 11041. The genes are listed in two panels, ordered by GRAVY value from low to high in the left panel and continuing in the same manner in the right panel. The break point between the two panels was chosen for ease of presentation. (Right) Higher GRAVY values. The expression data are color coded: gray for no expression, white for expression but not soluble, yellow for low-level soluble, orange for mid-level soluble, and red for high-level soluble. (SP) Signal peptide or anchor; (TM) number of transmembrane helices.

Figure 7

Schematic for the HTP protein expression pipeline. The rectangles represent material, the arrows represent process, and the octagon represents data.
