Generating a synthetic genome by whole genome assembly: phiX174 bacteriophage from synthetic oligonucleotides

Hamilton O Smith et al. Proc Natl Acad Sci U S A. 2003.

Abstract

We have improved upon the methodology and dramatically shortened the time required for accurate assembly of 5- to 6-kb segments of DNA from synthetic oligonucleotides. As a test of this methodology, we have established conditions for the rapid (14-day) assembly of the complete infectious genome of bacteriophage φX174 (5386 bp) from a single pool of chemically synthesized oligonucleotides. The procedure involves three key steps: (i) gel purification of pooled oligonucleotides to reduce contamination with molecules of incorrect chain length, (ii) ligation of the oligonucleotides under stringent annealing conditions (55°C) to select against annealing of molecules with incorrect sequences, and (iii) assembly of ligation products into full-length genomes by polymerase cycling assembly, a nonexponential reaction in which each terminal oligonucleotide can be extended only once to produce a full-length molecule. We observed a discrete band of full-length assemblies upon gel analysis of the polymerase cycling assembly product, without any PCR amplification. PCR amplification was then used to obtain larger amounts of pure full-length genomes for circularization and infectivity measurements. The synthetic DNA had a lower infectivity than natural DNA, indicating approximately one lethal error per 500 bp. However, fully infectious φX174 virions were recovered after electroporation into Escherichia coli. Sequence analysis of several infectious isolates verified the accuracy of these synthetic genomes. One such isolate had exactly the intended sequence. We propose to assemble larger genomes by joining separately assembled 5- to 6-kb segments; approximately 60 such segments would be required for a minimal cellular genome.
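The two numerical claims in the abstract (about one lethal error per 500 bp, and about 60 segments of 5 to 6 kb for a minimal cellular genome) can be put in perspective with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the Poisson model for the error-free fraction is an assumption that the abstract does not state.

```python
import math

GENOME_BP = 5386            # genome length given in the abstract
BP_PER_LETHAL_ERROR = 500   # approximate rate given in the abstract


def expected_lethal_errors(length_bp, bp_per_error=BP_PER_LETHAL_ERROR):
    """Mean number of lethal errors in a molecule of the given length."""
    return length_bp / bp_per_error


def error_free_fraction(length_bp, bp_per_error=BP_PER_LETHAL_ERROR):
    """Fraction of molecules with zero lethal errors.

    ASSUMPTION: errors are independent and Poisson-distributed along
    the molecule; the abstract only reports the approximate rate.
    """
    return math.exp(-expected_lethal_errors(length_bp, bp_per_error))


def joined_genome_size_kb(n_segments=60, segment_kb=(5, 6)):
    """Size range (kb) covered by joining n_segments segments."""
    return n_segments * segment_kb[0], n_segments * segment_kb[1]


if __name__ == "__main__":
    print(f"expected lethal errors per genome: {expected_lethal_errors(GENOME_BP):.2f}")
    print(f"error-free fraction (Poisson assumption): {error_free_fraction(GENOME_BP):.2e}")
    print(f"size range for 60 segments (kb): {joined_genome_size_kb()}")
```

Under these assumptions a full-length molecule carries about 10.8 lethal errors on average, so only on the order of 2 in 100,000 molecules would be error-free, which fits the abstract's report of lower infectivity alongside recoverable infectious virions. Sixty segments of 5 to 6 kb span 300 to 360 kb.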

Figures

Fig. 1.

Schematic diagram of the steps in the global synthesis of infectious φX174 bacteriophage from synthetic oligonucleotides.

Fig. 2.

(A) Computer simulation of the ligation reaction as a function of variation in the concentrations of the input oligonucleotides. Computer simulation parameters are 130 top oligonucleotides (with bottom oligonucleotides saturating) and 2,000 molecules of each top oligonucleotide. Percent variation of oligonucleotide concentrations is determined by the randomly chosen number of molecules of each oligonucleotide. For example, at 10% variation in oligonucleotide concentrations, the number of each oligonucleotide is randomly chosen between 1,800 and 2,000. At each iteration of the program, a random pair of assemblies is selected and the pair is joined together to form a larger assembly if the end coordinate of one assembly is one less than the beginning of the other assembly. The process is iterated until the number of assemblies no longer changes during 1 million pairings. -▪-▪-, average percent length of assemblies; -•-•-, percent of full-length assemblies. (B) The final products of a theoretical PCA reaction that starts with oligonucleotides of uniform size. Only the two terminal oligonucleotides can be extended to full length because polymerization is only in the 5′ to 3′ direction. An assembly is considered “active” if it can be extended by overlapping with another assembly followed by fill-in synthesis. An assembly is “inactive” if it cannot be extended further. Mass increase factor = final mass/beginning mass = sn(n + 2)/(2sn) = (n + 2)/2 ≈ n/2. Fraction of final mass full length = [2(sn + s/2)]/[sn(n + 2)] = [2(n + 1/2)]/[n(n + 2)] ≈ 2/n. s = nucleotide length of oligonucleotide; n = number of oligonucleotides on one strand.
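The simulation described in (A) and the formulas in (B) can be sketched in a few lines of Python. This is a minimal reading of the caption, not the authors' program: the function and parameter names are hypothetical, "percent of full-length assemblies" is interpreted here as the share of final assemblies that span every top oligonucleotide, and the bottom oligonucleotides are assumed never to be limiting, so they are not modeled.

```python
import random


def simulate_ligation(n_oligos=130, molecules=2000, variation=0.10,
                      stall_limit=1_000_000, seed=None):
    """Join randomly paired assemblies until none join in `stall_limit` pairings.

    Returns (average percent length of assemblies,
             percent of final assemblies that are full length).
    """
    rng = random.Random(seed)
    low = round(molecules * (1 - variation))
    # Each assembly is (start, end) in top-oligonucleotide coordinates.
    assemblies = []
    for pos in range(1, n_oligos + 1):
        assemblies.extend([(pos, pos)] * rng.randint(low, molecules))

    stalled = 0
    while stalled < stall_limit and len(assemblies) > 1:
        i, j = rng.sample(range(len(assemblies)), 2)
        a, b = assemblies[i], assemblies[j]
        if a[1] + 1 == b[0]:        # a ends one position before b begins
            merged = (a[0], b[1])
        elif b[1] + 1 == a[0]:      # b ends one position before a begins
            merged = (b[0], a[1])
        else:
            stalled += 1
            continue
        # Store the merged assembly at i and swap-remove the one at j.
        assemblies[i] = merged
        assemblies[j] = assemblies[-1]
        assemblies.pop()
        stalled = 0

    count = len(assemblies)
    total_len = sum(end - start + 1 for start, end in assemblies)
    full = sum(1 for asm in assemblies if asm == (1, n_oligos))
    return 100.0 * total_len / (count * n_oligos), 100.0 * full / count


def pca_mass_increase(n):
    """Mass increase factor from (B): (n + 2) / 2."""
    return (n + 2) / 2


def pca_full_length_fraction(n):
    """Fraction of final mass that is full length, from (B)."""
    return 2 * (n + 0.5) / (n * (n + 2))


if __name__ == "__main__":
    # Small parameters so the sketch finishes quickly.
    print(simulate_ligation(n_oligos=10, molecules=50, variation=0.10,
                            stall_limit=20_000, seed=0))
    print(pca_mass_increase(130), pca_full_length_fraction(130))
```

The default arguments mirror the parameters in the caption, but a pure-Python run at that scale can take a long time, so the usage lines pass much smaller values. For n = 130 the formulas in (B) give a mass increase factor of 66 and a full-length fraction of about 1.5%, close to the 2/n approximation.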

Fig. 3.

PCA of full-length synφX molecules. The first stage of PCA (50-μl reaction volume) was carried out for 35 cycles with 0.2, 0.5, or 1 μl of the Taq ligation product. PCA products were analyzed on a 0.8% E-gel. (A) Lanes 1 and 6, 1-kb ladder; lane 2, 0.5 μl of Taq ligation product; lane 3, 2 μl of the 0.2-μl PCA; lane 4, 2 μl of the 0.5-μl PCA; lane 5, 2 μl of the 1-μl PCA. The second stage of PCA was for an additional 35 cycles in five new 50-μl reactions. For reaction 1, the 0.2-μl first-stage reaction was continued without change for another 35 cycles with the addition of 0.5 μl of fresh HF polymerase mixture. For reactions 2 and 3, 10 and 20 μl of the 0.5-μl first-stage PCA product was used. For reactions 4 and 5, 5 and 10 μl of the 1-μl first-stage PCA product was used. Analysis was on 0.8% E-gels. (B) Formamide-denatured DNA. (C) Native DNA. (B and C) Lanes 1 and 7, 1-kb ladder; lane 2, 2 μl of reaction 1; lanes 3 and 4, 2 μl of reactions 2 and 3; lanes 5 and 6, 2 μl of reactions 4 and 5. (D) PCR amplification of the products of the second set of PCA products as shown in B and C. (E) Taq ligase assembly of 259 oligonucleotides. A 0.5-μl sample of the ligation products was analyzed on a 2% E-gel (Invitrogen) in duplex form (lane N). One microliter of the ligation products was mixed with 20 μl of formamide, heated to 95°C for 2 min, and then analyzed (lane D). Denatured standards run approximately the same as native standards, based on other experiments (data not shown).

Fig. 4.

Plaques of synφX-A. There appear to be several plaque morphologies: small plaques with sharp borders, medium-sized plaques, and large plaques with fuzzy borders.

Fig. 5.

Sequence comparisons of natural φX and synφX genomes. Differences from the Sanger sequence (13) are indicated. A4, A8, B1, and B3 are the synφX described in the text. NEB, φX RF I DNA supplied by NEB (catalog no. N3021S); RF70s, DNA prepared in the late 1970s and stored since then by C.A.H.

Fig. 6.

Sequence differences between the Sanger sequence and more recent sequencing of natural φX DNAs. RF70s, a preparation of φX double-stranded RF from the late 1970s; SS78, a preparation of φX virion single-stranded DNA from 1978; Bull, the sequence of wild-type φX used by Bull et al. (15); G'97, φX RF DNA from 1997; NEB'03, φX RF DNA from NEB in use at the Institute for Biological Energy Alternatives during the φX genome synthesis.

References

    1. Wöhler, F. (1828) Ann. Phys. Chem. 88, 253–256.
    2. Sekiya, T., Takeya, T., Brown, E. L., Belagaje, R., Contreras, R., Fritz, H. J., Gait, M. J., Lees, R. G., Ryan, M. J., Khorana, H. G., et al. (1979) J. Biol. Chem. 254, 5787–5801.
    3. Letsinger, R. L. & Mahadevan, V. (1965) J. Am. Chem. Soc. 87, 3526–3527.
    4. Letsinger, R. L., Ogilvie, K. K. & Miller, P. S. (1969) J. Am. Chem. Soc. 91, 3360–3365.
    5. Matteucci, M. D. & Caruthers, M. H. (1981) J. Am. Chem. Soc. 103, 3185–3191.
