Orientation-dependent and sequence-specific expansions of CTG/CAG trinucleotide repeats in Saccharomyces cerevisiae

Abstract

A quantitative and selective genetic assay was developed to monitor expansions of trinucleotide repeats (TNRs) in yeast. A promoter containing 25 repeats allows expression of a URA3 reporter gene and yields sensitivity to the drug 5-fluoroorotic acid. Expansion of the TNR to 30 or more repeats turns off URA3 and provides drug resistance. When integrated at either of two chromosomal loci, expansion rates were 1 × 10⁻⁵ to 4 × 10⁻⁵ per generation if CTG repeats were replicated on the lagging daughter strand. PCR analysis indicated that 5–28 additional repeats were present in 95% of the expanded alleles. No significant changes in CTG expansion rates occurred in strains deficient in the mismatch repair gene MSH2 or the recombination gene RAD52. The frequent nature of CTG expansions suggests that the threshold number for this repeat is below 25 in this system. In contrast, expansions of the complementary repeat CAG occurred at 500- to 1,000-fold lower rates, similar to a randomized (C,A,G) control sequence. When the reporter plasmid was inverted within the chromosome, switching the leading and lagging strands of replication, frequent expansions were observed only when CTG repeats resided on the lagging daughter strand. Among the rare CAG expansions, the largest gain in tract size was 38 repeats. The control repeats CTA and TAG showed no detectable rate of expansions. The orientation-dependence and sequence-specificity data support the model that expansions of CTG and CAG tracts result from aberrant DNA replication via hairpin-containing Okazaki fragments.


Expansions of endogenous trinucleotide repeats (TNRs) underlie more than 10 human genetic disorders, including Huntington’s disease and myotonic dystrophy (reviewed in refs. 1 and 2). Several lines of evidence indicate that TNR alterations occur differently than for other microsatellites. For example, large increases in TNR number range from 10 to 2,000 repeats among affected kindreds (reviewed in refs. 1 and 2). Also, the propensity to undergo large TNR expansions is unaffected by the mismatch repair status of the cell (3, 4). These and other features suggest that TNR instability results from an unusual mechanism.

Although each triplet repeat disease exhibits unique genetic features (reviewed in ref. 1), there are several common aspects that provide clues to the mechanism of TNR instability. First is sequence specificity; all known TNR diseases result from expansions of the sequences CNG (where N is any nucleotide) or GAA. Second, a single genetic event is sufficient to incorporate many additional TNRs. Since the unaffected allele often is unaltered (5, 6), this suggests that repeats are added de novo by a replicational mechanism. Third, instability often is governed by a threshold number. TNR lengths below the threshold are stable whereas alleles above the threshold are prone to expansions (reviewed in ref. 1).

Examination of TNR structures suggests that DNA hairpins may play an important role in expansions. When single-stranded, a subset of TNRs forms secondary structures in vitro (7, 8). For example, 15-repeat molecules of CTG and CAG form hairpins that melt at 47° and 38°, respectively (summarized in ref. 8). Most other TNRs show no predisposition to expand and are unlikely to form stable secondary structures (7, 8). In a second corollary, duplex DNA containing CTG/CAG and CGG/CCG repeats forms slipped-strand structures upon denaturation and reannealing (9), indicating that slipped structures can form in competition with duplexes. A third parallel is that hairpin formation in vitro is dictated by a threshold number of repeats. Based on modeling studies, a minimum number of repeats is important to stabilize the hairpin (7).

The noteworthy correlations between TNR genetics and formation of secondary structures such as hairpins led a number of authors to suggest that expansions result from aberrant lagging-strand DNA replication (2, 5, 10–14). As diagrammed in Fig. 1A, lagging-strand DNA replication involves synthesis of ≈150- to 300-nt Okazaki fragments that subsequently are processed and ligated together. If TNR sequences such as CTG or CAG are present, hairpins could form on the lagging daughter strand. Extension of these hairpins by DNA synthesis would lead to the presence of extra triplet repeats. If unrepaired, the anomalous strand would template for an expanded second strand in the next round of DNA synthesis. The hairpin model for TNR expansions therefore predicts not only sequence specificity, with expansions arising most often from sequences that readily form hairpins, but also suggests possible orientation effects for complementary pairs such as CTG and CAG. In other words, the presence of the stronger, CTG hairpin-forming sequence on the lagging daughter strand should yield more expansions than the reciprocal case for a CAG tract.
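The strand relationship behind the predicted orientation effect can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch shows nothing more than that a (CTG)25 tract read on one strand is a (CAG)25 tract on the complementary strand, so inverting a tract swaps which of the two sequences a given daughter strand carries.

```python
# Complement table for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tract_after_inversion(tract: str, inverted: bool) -> str:
    """Sequence read 5'->3' on a fixed strand before or after inversion.

    Hypothetical helper: inverting a segment places the reverse
    complement of the original tract on the strand being read.
    """
    return reverse_complement(tract) if inverted else tract


ctg_tract = "CTG" * 25
print(reverse_complement(ctg_tract) == "CAG" * 25)
print(tract_after_inversion(ctg_tract, inverted=True)[:9])
```

The sketch says nothing about hairpin stability itself; it only makes explicit why CTG and CAG must be considered as a complementary pair when assigning sequences to the lagging daughter strand.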

Figure 1.


Hairpin model for TNR expansions. The figure shows the possible behavior of CAG and CTG tracts in our experiments, assuming that (CTG)25 repeats form stable hairpins in vivo more readily than (CAG)25. In A–D, chromosomal DNA replication proceeds from left to right and the lagging strand synthesis is on top. The direction of the URA3 reporter (sense strand, 5′ → 3′) is indicated by the open arrow. In A, CTG sequences on the lagging daughter strand are predicted to form hairpins occasionally. Incorporation of the hairpin into the replicated product and failure to repair this structure ultimately would lead to an expanded allele. B shows an orientation effect, in which CAG sequences occupy the lagging daughter strand. If CAG sequences form less-stable hairpins than CTG, the CAG configuration will exhibit fewer expansions. In C and D, the entire reporter has been inverted to the opposite direction (depicted by the open arrow and the notations 3ARU GTC and 3ARU GAC, respectively). Inversion of the reporter is predicted to affect expansions because the sequences present on the lagging daughter strand will be altered. Inversion of the sequences in A (genetically unstable) results in the situation in D, which should yield fewer expansions. Similarly, inversion of the reporter from B (low rate of expansions) will yield the C scenario and should increase the rate of expansions.

Direct experimental proof for the hairpin model has been elusive. In transgenic mice, triplet repeats are quite stable (summarized in ref. 15). Among the alterations reported, all have been either losses or small gains. Thus, these animals have not revealed much about the mechanism of TNR expansions. In bacteria and yeast, contractions occur much more frequently than expansions (13, 16, 17). Assays that monitor TNRs by nonselective physical techniques (PCR or Southern blotting) tend to lack the sensitivity for facile detection of expansions. Sensitivity can be increased in yeast by the presence of a rad27 mutation that increases the frequency of expansions and contractions (18, 19). Under these conditions, expansions become abundant enough to detect by physical means. However, interpretation of results is complicated by the pleiotropic nature of rad27 mutations (20–22). In other experiments (19), CTG repeats were shown to act as fragile sites in yeast. Unfortunately, chromosome breakage also results in recombinational deletion of the TNR sequences, making it difficult to characterize the nature of expansions.

In this paper we describe a new genetic assay for TNR expansions in yeast that is selective and quantitative. Using this assay, we observe a number of characteristics for CTG and CAG expansions that are consistent with predictions of the hairpin model. To our knowledge, this is the most conclusive in vivo evidence supporting the hairpin-based, replicational model for CTG and CAG expansions.

MATERIALS AND METHODS

Strains.

The bacterial strain GM4257 [F− lacIQ lacZ proAB+/ara thi Δ(gpt-lac) mutS215∷Tn10; ref. 23] was used for cloning and plasmid purifications. The yeast strain MW3317–21A (MATα Δtrp1 ura3–52 ade2 Δade8 hom3–10 his3-KpnI met4 met13; ref. 24) or its isogenic Δleu2 derivative was utilized as host for fluctuation assays. Gene disruptions to yield msh2∷LEU2 (plasmid pBL20, this lab) or rad52∷LEU2 (plasmid pSM20 from L. Symington, Columbia University) derivatives were performed by single-step protocols (25). Integration of plasmids containing TNRs was accomplished by linearization with StuI (for targeting to URA3) or with Bsu36I (for targeting to LYS2) followed by lithium acetate-mediated transformation (26). All constructs were confirmed for single integration events at the desired site by Southern blotting.

Plasmid Constructions.

Plasmid pURA, the vector used for all TNR constructs, was derived from pBL23 (27). Briefly, pURA is an integrating plasmid based on vector pRS303 (28). pURA includes a HIS3 selectable marker for identification of cells harboring the integration. pURA also includes the promoter region of the Schizosaccharomyces pombe adh1 gene [coordinates −806 to −101 (27)] fused to the S. cerevisiae URA3 reporter gene. Introduction of URA3 in this fusion was accomplished by PCR amplification of genomic yeast DNA. Primer oJJ8 is (GCATGCATGCTCTAACCACATACTTAAGATGTCGAAAGCTACATATAAGGAACGT; the primer carries an SphI site, GCATGC, and the ATG preceding TCGAAAGC corresponds to the initiator codon of URA3). Primer oJJ9 is (CTTAAGCAGTTTTTTAGTTTTGCTGGCCGC; nucleotides 7–30 are complementary to positions 810–787 of URA3, relative to the start codon). The 844-bp PCR product was cleaved with SphI and inserted into the SphI and MscI sites of pBL23, placing URA3 under control of the adh1 promoter (Fig. 2). The precise sequence of the fusion, starting at the TATA box and ending at the initiator ATG of URA3, is TATAAATGTGGGCATGCTCTAACCACATACTTAAGATG. The GCATGC sequence is an SphI site, into which TNR-containing oligonucleotide duplexes were cloned. The duplexes contained compatible ends for SphI, resulting in duplication of the site and hence the presence of an out-of-frame ATG codon 23 bp 5′ to the URA3 initiator codon (Fig. 2). A control plasmid was created in which the repeat sequence was randomized. In this case, the oligonucleotide residing on the sense strand (relative to URA3) was CGCTGGCCGCTTGCGTTGCGTCGTTGCTCTTTGGCGGTCCTTGCTCGGCCCGCGTTTCGTTTCGTTGCGTCGCCG. The primary sequence of all plasmids was confirmed by DNA sequence analysis.
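As a quick consistency check on the fusion sequence quoted above, the following Python sketch locates the SphI site, the preferred initiation sequence CCACA (see Fig. 2), and the two ATG codons, and tests whether the ATG inside the SphI site is in frame with the URA3 initiator. The variable names are hypothetical. The offset is counted here from the A of one ATG to the A of the other, which may differ by one from the counting convention behind the 23-bp figure in the text; under either count the offset is not a multiple of three.

```python
# Fusion sequence from the TATA box to the URA3 initiator ATG (from the text).
FUSION = "TATAAATGTGGGCATGCTCTAACCACATACTTAAGATG"

SPHI_SITE = "GCATGC"   # SphI recognition sequence
SITE_I = "CCACA"       # preferred transcription initiation sequence (Fig. 2)

# 0-based positions within the fusion sequence.
sphi_start = FUSION.find(SPHI_SITE)
site_i_start = FUSION.find(SITE_I)

# The ATG embedded in the SphI site, and the URA3 initiator at the 3' end.
upstream_atg = sphi_start + SPHI_SITE.find("ATG")
initiator_atg = len(FUSION) - 3

# Offset from the A of the upstream ATG to the A of the initiator ATG.
offset = initiator_atg - upstream_atg
out_of_frame = offset % 3 != 0

print("SphI site starts at index", sphi_start)
print("site I starts at index", site_i_start)
print("ATG-to-ATG offset:", offset, "out of frame:", out_of_frame)
```

Because the cloned duplexes regenerate the SphI site on both sides of the tract, the downstream copy of the site keeps the same spacing to the initiator codon as in the unmodified fusion, which is what this check relies on.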

Figure 2.


A genetic assay to monitor TNR expansions in yeast. The regulatory region controlling expression of the reporter gene URA3 is shown. The important features include: the TATA box; the 25-repeat triplet, marked with an inverted triangle, where N = A or T; an out-of-frame ATG initiator codon; the preferred transcription initiation site I (CCACA sequence); and the start of the URA3 structural gene. Upper diagram illustrates the starting construct, with anticipated transcription (right-angle arrow) initiating within 55–125 bp (square brackets) from TATA. Initiation at I results in functional expression of URA3 and sensitivity to 5FOA. If the TNR expands to ≥30 repeats (Lower diagram), the window of allowed transcription no longer includes I. Transcription initiating upstream of I will include the out-of-frame ATG, resulting in translational incompetence (indicated by X) and resistance to 5FOA.

Integration at LYS2 was targeted by inclusion of a 1,141-bp portion of LYS2 (coordinates 750–1891, relative to the coding sequence) into pURA. The LYS2 fragment was PCR-amplified from plasmid pRS317 (28) with primers oBL180 (CGCGGATCCGCGCCAGAGAGAACCTGTGTTGT) and oBL181 (CGCGGATCCGCGTCGGCCAAACCACCTGCACG). Each primer carries a BamHI site (GGATCC), with LYS2 sequences starting at the fourth base downstream. The PCR product was cleaved with BamHI and cloned into the corresponding site of pURA. Two clones were obtained, with the LYS2 reading frame either in the same orientation as URA3 (“forward”) or in the opposite orientation (“reverse”). Targeted integration into the genomic LYS2 locus resulted in reporter constructs with the URA3 reporter gene aligned in either the same (forward) or opposite (reverse) direction as the endogenous LYS2 allele (see Fig. 1 and ref. 13).

Fluctuation Analysis.

Rates of TNR instability were measured by the method of the median (29). Briefly, single colonies were resuspended in water and appropriate dilutions plated onto nonselective (YPD) plates. After 34–65 hr of growth at 30° to yield medium (≈10⁶ cells) or large (≈10⁷ cells) colonies, 5–10 colonies were resuspended independently in water and replated onto SC-His + 5FOA (to score expansions) or appropriate dilutions were plated onto YPD (to score total cells). After 2–3 days of growth, colonies were counted. To ensure reproducibility, two to five independent clones were assayed for each test construct.
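The text cites ref. 29 for the method of the median without restating the formula. As a rough sketch, and assuming the estimator meant is the Lea and Coulson median relation r/m − ln(m) = 1.24 (where r is the median number of resistant colonies per culture and m is the estimated number of events per culture), the calculation could look like the Python below. The function names are hypothetical, and dividing m by the total cell count is one common convention, not necessarily the exact normalization the authors used.

```python
import math


def lea_coulson_median_m(median_mutants: float) -> float:
    """Solve r/m - ln(m) - 1.24 = 0 for m by bisection.

    Assumes the Lea-Coulson median estimator; r is the median number
    of resistant colonies observed per culture.
    """
    r = float(median_mutants)
    if r <= 0:
        return 0.0

    def f(m: float) -> float:
        return r / m - math.log(m) - 1.24

    # f decreases monotonically in m; f(lo) > 0 and f(hi) < 0 bracket the root.
    lo, hi = 1e-12, r + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate_per_generation(median_mutants: float, cells_per_culture: float) -> float:
    """Events per cell per generation, taken here as m / N (an assumption)."""
    return lea_coulson_median_m(median_mutants) / cells_per_culture


# Hypothetical numbers for illustration only, not data from the paper.
print(rate_per_generation(median_mutants=40, cells_per_culture=1e7))
```

The median is used rather than the mean because occasional "jackpot" cultures, in which an event occurred early in colony growth, would otherwise dominate the estimate.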

Single-Colony PCR Analysis.

Isolated colonies from the fluctuation tests were picked from the SC-His + 5FOA plates and the parallel YPD plates, resuspended in 100 μl of 50 mM DTT/0.5% Triton X-100, incubated at 37° for 30 min, heated to 95° for 5 min, and chilled on ice for 5 min. A portion (10 μl) of this material was used as a template for PCR analysis by using primers oBL91 (AAACTCGGTTTGACGCCTCCCATG, coordinates −54 to −31 of pBL24) and oBL157 (AGCAACAGGACTAGGATGAGTAGC, complementary to coordinates 53–30 of URA3). After amplification (35 cycles of 1.5 min at 93°, 2 min at 55°, and 3 min at 72° plus a final 10-min incubation at 72°) in the presence of approximately 5 μCi of [α-³²P]dCTP, the products were analyzed on 7% denaturing polyacrylamide gels. PCR product sizes (±2 repeat units) were determined by comparison with a DNA-sequencing ladder. In some cases (Fig. 3), the amplified product was purified on a 6% nondenaturing gel, cleaved with SphI + AflII, and displayed on a sequencing gel (27).

Figure 3.


PCR analysis of CTG expansions. PCR products were generated from individual colonies harboring the (CTG)25 tract at the URA3 locus. A shows the predicted sizes of PCR products. Undigested product should yield a 193-nt product. Digestion with SphI and AflII generates several discrete fragments: 41 and 37 nt from the 5′ flanking region, 79 nt for a repeat of 25 trinucleotides, and 14, 22, 118, and 122 nt from the 3′ flanking region. All size estimates allow for the 4-nt overhanging ends generated by the restriction enzymes. B and C show sequencing gels with the uncut and cut PCR products, respectively. Sizes were deduced from a sequencing ladder (not shown). Lanes 1 and 6 are uncut and cut products from a starting (5FOAS) colony. Lanes 2 and 7, 3 and 8, 4 and 9, and 5 and 10 show products from individual 5FOAR colonies. The doublet products at 59 and 37 nt in C presumably reflect a 1-nt “stuttering” during amplification, as these fragments constitute the 3′ end of their respective strands. The faint intensities of the 41- and 22-nt products are a result of the availability of only a single C residue for radioactive labeling.

RESULTS

Selective Identification of Expanded TNRs.

To evaluate the hairpin model, a quantitative genetic assay was developed that monitors TNR expansions in yeast (Fig. 2). In this assay, the S. pombe adh1 promoter exhibits specific spacing requirements for function in S. cerevisiae (30). For this promoter, the permissive distance between the TATA element and the transcription initiation site is 55–125 bp; shorter or longer distances result in transcription initiation at alternate sites (30). We fused the adh1 promoter to the reporter gene URA3 to take advantage of existing selections for the expression of this gene. When URA3 is expressed, cells are sensitive to the cytotoxic effects of the drug 5-fluoroorotic acid (5FOA; ref. 31). If URA3 is not expressed, cells are resistant to 5FOA and they will grow on media containing this compound. Insertion of TNRs between the TATA box and the transcription initiation site makes expression of URA3, and hence sensitivity or resistance to 5FOA, dependent on the length of the repeat tract (Fig. 2). We estimated that tract lengths of ≥33 repeats would disallow URA3 expression. Experimentally we found that tract lengths of 25 result in 5FOA sensitivity whereas ≥30 repeats yield 5FOA resistance, in satisfactory agreement with prediction. In the experiments here, starting-tract lengths were 25 repeats. Expansions were identified as spontaneous 5FOA-resistant colonies. We refer to TNR sequences based on their presence in the lagging daughter DNA strand.
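The spacing logic of the assay can be illustrated with a toy calculation in Python. The permissive window (55–125 bp) and the 3 bp added per repeat come from the text; the baseline distance of 102 bp between the TATA element and site I at 25 repeats is an assumption, chosen only so that the sketch reproduces the stated estimate that ≥33 repeats would push site I out of the window. The function names are hypothetical, and the sketch addresses expansions only.

```python
WINDOW_BP = (55, 125)     # permissive TATA-to-initiation distance (from the text)
BP_PER_REPEAT = 3         # each trinucleotide repeat adds 3 bp
START_REPEATS = 25        # starting tract length (from the text)
BASE_DISTANCE_BP = 102    # ASSUMED distance from TATA to site I at 25 repeats


def tata_to_site_i_distance(n_repeats: int) -> int:
    """Distance from TATA to site I for a tract of n_repeats (toy model)."""
    return BASE_DISTANCE_BP + BP_PER_REPEAT * (n_repeats - START_REPEATS)


def site_i_in_window(n_repeats: int) -> bool:
    """True if site I still lies inside the permissive window."""
    low, high = WINDOW_BP
    return low <= tata_to_site_i_distance(n_repeats) <= high


for n in (25, 30, 32, 33, 40):
    print(n, tata_to_site_i_distance(n), site_i_in_window(n))
```

Under this simple model a tract of 30 repeats would still be scored as permissive, whereas the experiments found that ≥30 repeats already confer 5FOA resistance. The sharp cutoff is therefore an idealization, which fits the authors' description of the agreement as satisfactory rather than exact.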

Expansions Are Frequent When CTG Repeats Occupy the Lagging Daughter Strand.

To ensure that TNRs are subject to DNA metabolism typical of the entire genome, repeat plasmids were integrated chromosomally by homology-based targeting. The reporter first was targeted to the URA3 locus on chromosome V. This integration results in a duplication of URA3 sequences, with one copy arising from the endogenous allele and the second, exogenous copy arising from the reporter gene. Because of the duplication, 5FOA resistance could arise by “popout” recombination, which could remove the exogenous URA3 sequences from the chromosome. Popouts were eliminated from our assay by maintaining strains on media lacking histidine. The targeting vector carries the HIS3 gene; if a popout occurred, it also would eliminate HIS3, leading to the inability to grow in the absence of histidine. Another type of event that possibly could cause 5FOA resistance is gene conversion of the reporter by the endogenous ura3–52 allele. However, gene conversions should be relatively infrequent because of the presence of the large, Ty element in the ura3–52 allele. Experimental results shown below indicate that gene conversions do not interfere significantly with the assay.

Results from strains harboring a (CTG)25 tract indicate that this TNR is unstable in yeast (Table 1). The CTG repeat tract yielded 5FOA resistance at a rate of 1 × 10⁻⁵ per generation. This rate is several orders of magnitude higher than spontaneous mutation rates in yeast, which are typically <10⁻⁸ per generation (32). Assignment of the CTG repeat to the lagging daughter strand stems from the knowledge that replication forks move from the 5′ end of the URA3 gene toward the 3′ end (K. Kolor, W. L. Fangman, and B. J. Brewer, personal communication). Since our reporter gene is integrated in the same direction as the endogenous allele, DNA replication also proceeds from left to right (Fig. 1A). Therefore, these results indicate that (CTG)25 is unstable when present on the lagging daughter strand.

Table 1.

Expansion rates at URA3

Lagging strand sequence    Strain background    Tract alterations per generation, mean (±SD)
(CTG)25                    Wild type            1.0 (±0.3) × 10⁻⁵
(CTG)25                    msh2                 0.9 (±0.5) × 10⁻⁵
(CTG)25                    rad52                1.2 (±0.2) × 10⁻⁵
(CAG)25                    Wild type            <5 × 10⁻⁷
(CAG)25                    msh2                 <3 × 10⁻⁸
(C,A,G)25                  Wild type            <5 × 10⁻⁸

If expansions occur primarily by a replicational event involving a large single-stranded intermediate such as a hairpin, then there should be little effect of mismatch correction because such heteroduplexes are refractory to this repair pathway (33, 34). When CTG expansions were measured in an msh2 mutant, there was no significant effect on the rate (0.9 × 10⁻⁵ per generation; Table 1). Similarly, loss of the major recombination factor RAD52 did not alter the rate of CTG expansions (1.2 × 10⁻⁵ per generation). Thus, neither mismatch repair nor recombination appears to play a major role in CTG expansions.

Molecular Analysis Confirms CTG Tract Expansions.

Single-colony PCR was employed to investigate the TNR tract in the 5FOAR colonies. The repeat tract locus was amplified with primers flanking the TNR. Genetically independent colonies arising either from the 5FOA-containing plate or from the parallel, nonselective plate were amplified and the products were displayed on a sequencing gel (Fig. 3). The unexpanded allele should yield a 193-nt amplification product (Fig. 3A) whereas expanded TNR tracts should generate larger species. Fig. 3B (lane 1) shows the expected 193-nt product from an unexpanded tract. Lanes 2–5 show the amplified DNA from four different 5FOAR colonies. The larger sizes, which correspond to tract lengths of 44, 42, 43, and 37 repeats, indicate that 5FOA resistance is a result of TNR expansion. Restriction digests of the PCR products confirmed that the repetitive element contains the alteration. Cleavage with SphI plus AflII results in a predictable pattern of bands (Fig. 3A) that is observed with the nonexpanded allele (Fig. 3C, lane 6). Note the 79-nt TNR fragment in lane 6. In the four expanded alleles, corresponding to the samples tested in Fig. 3B, only the TNR-containing fragment is larger. The size of the TNR fragment agrees well with the estimates from the uncut samples (Fig. 3B). The other bands, from flanking DNA, are unchanged. This analysis indicates that 5FOA resistance arises primarily from bona fide TNR expansions.
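The conversion between PCR product size and tract length follows directly from the numbers above: the 25-repeat allele gives a 193-nt product and a 79-nt TNR fragment after digestion, and each additional repeat adds 3 nt. A minimal Python sketch, with hypothetical function names, is:

```python
UNEXPANDED_PRODUCT_NT = 193   # uncut PCR product for 25 repeats (from the text)
UNEXPANDED_FRAGMENT_NT = 79   # SphI/AflII TNR fragment for 25 repeats (from the text)
START_REPEATS = 25
NT_PER_REPEAT = 3


def product_size_from_repeats(n_repeats: int) -> int:
    """Expected uncut PCR product size for a tract of n_repeats."""
    return UNEXPANDED_PRODUCT_NT + NT_PER_REPEAT * (n_repeats - START_REPEATS)


def tnr_fragment_size_from_repeats(n_repeats: int) -> int:
    """Expected size of the digested, TNR-containing fragment."""
    return UNEXPANDED_FRAGMENT_NT + NT_PER_REPEAT * (n_repeats - START_REPEATS)


def repeats_from_product_size(size_nt: float) -> float:
    """Tract length implied by an observed uncut product size."""
    return START_REPEATS + (size_nt - UNEXPANDED_PRODUCT_NT) / NT_PER_REPEAT


# Tract lengths reported for the four expanded alleles in Fig. 3B.
for n in (44, 42, 43, 37):
    print(n, product_size_from_repeats(n), tnr_fragment_size_from_repeats(n))
```

Since the Methods state that product sizes were determined to within ±2 repeat units, a tract length inferred this way carries an uncertainty of about ±6 nt in product size.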

Sizing of the PCR products from 76 genetically independent events revealed that 72 were accompanied by expansion of the triplet tract (Fig. 4). Thus, 95% of the 5FOAR colonies can be attributed to expansions. In the four cases where no expansion was observed, we assume that resistance to 5FOA arose by some other mechanism, such as mutation within the URA3 structural gene. Expansions ranged from 5 to 28 repeats; in the largest expansion, the repeat tract has approximately doubled to 53 repeats. The median expansion event was 10.5 repeats. No evidence was apparent for a strongly preferred expansion size.

Figure 4.


Distribution of expansion sizes. A summary histogram shows the change in PCR product size for 76 independent genetic events from wild-type cells harboring the (CTG)25 tract at the URA3 locus. Expansion sizes were estimated by comparing matched 5FOAS and 5FOAR product sizes (see Materials and Methods). Seventy-two of the samples exhibited size increases of +5 to +28 repeats, with a median value of +10.5. The remaining four samples showed no increase in product size.

(CAG)25 Tracts Are Much More Stable than (CTG)25 Repeats.

To assay for CAG expansions, we created clones in which the (CTG)25 tract was reversed to yield the complementary sequence (CAG)25. When integrated into the genome at URA3, the CAG reporter yields the configuration shown in Fig. 1B. We were surprised to find that this orientation was very stable, yielding 5FOA resistance at a rate of <5 × 10⁻⁷ per generation (Table 1). The CAG construct yielded no 5FOAR colonies in several experiments. The stability of this repeat was not detectably different from a reporter harboring a “scrambled” (C,A,G)25 control sequence, devoid of any repeating nature (Table 1). An msh2 mutation did not destabilize the CAG tract (<3 × 10⁻⁸ per generation), suggesting that mismatch repair is not interfering with CAG expansions. This orientation dependence suggests that, in our assay, (CAG)25 is unable to assume a hairpin configuration.

Two control experiments confirmed the unexpected stability of the CAG repeat. The strain was reconstructed and reassayed but yielded no expansions. We also tested the idea that something outside the TNR might be altered in the CAG construct. To make the CAG tract in a different way, the reporter plasmid used for the CTG construct was treated in vitro with SphI to liberate the repeat tract and then religated. Clones containing the CAG orientation were identified and used to construct the strain again, but no expansions were detected. These control experiments lend further weight in favor of an orientation effect.

Results at Another Locus Recapitulate the Orientation Dependence for Expansions.

Was the orientation dependence typical of the genome or was there some unusual position effect at the URA3 locus? This question was answered by targeting the reporter plasmid to the LYS2 locus on chromosome II. This experiment allowed further evaluation of orientation dependence because the reporter plasmid can be integrated such that the URA3 reporter and the endogenous LYS2 gene (13) are aligned in either the same direction (“forward,” as in Fig. 1 A and B) or in opposite directions (“reverse,” as in Fig. 1 C and D). In the forward direction, which is equivalent to the situation at the URA3 locus, we again observed a strong orientation effect. The rate of 5FOA resistance for the CTG construct was 3.6 × 10⁻⁵ per generation but only 8 × 10⁻⁸ for the CAG reporter (Table 2), a reduction of some 500-fold. The strong orientation effect at two chromosomal loci argues against a position effect.

Table 2.

Expansion rates at LYS2

Tract alterations per generation, mean (±SD)

Chromosomal alignment    (CTG)25               (CAG)25
Forward                  3.6 (±0.7) × 10⁻⁵     8 (±6) × 10⁻⁸
Reverse                  1.0 (±0.2) × 10⁻⁵     <1 × 10⁻⁸

Chromosomal alignment    (CTA)24CTG            CAG(TAG)24
Reverse                  <1.6 × 10⁻⁸           <1.3 × 10⁻⁸

If the orientation effect is a result of differential hairpin strengths of the CTG and CAG tracts, then by inverting the reporter within the genome, we should be able to make unstable tracts stable and vice versa. The rationale for this idea is shown in Fig. 1. Inversion of the reporter in Fig. 1A (with the genetically unstable CTG tract in the lagging daughter strand) will yield the predicted stable CAG tract shown in Fig. 1D. Similarly, inversion of the reporter in Fig. 1B (stable CAG tract) should result in frequent expansions (unstable CTG tract, Fig. 1C). The results in Table 2 show that inversion led to the predicted expansion rates; in the reverse alignment, the rate of 5FOA resistance for CTG was 1 × 10⁻⁵ per generation. In contrast, the CAG tract yielded an expansion rate of <1 × 10⁻⁸, for an orientation effect of about 1,000-fold.
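The fold differences quoted for the LYS2 experiments can be reproduced from the rates in Table 2. The short Python sketch below uses a hypothetical helper name. Because the reverse-alignment CAG rate is reported only as an upper bound (<1 × 10⁻⁸), the ratio computed for that alignment is a lower bound on the orientation effect.

```python
def fold_difference(rate_ctg: float, rate_cag: float) -> float:
    """Ratio of the CTG rate to the CAG rate (both per generation)."""
    return rate_ctg / rate_cag


# Mean rates from Table 2 (per generation).
forward_fold = fold_difference(3.6e-5, 8e-8)       # both values measured
reverse_fold_min = fold_difference(1.0e-5, 1e-8)   # CAG rate is an upper bound

print(f"forward alignment: about {forward_fold:.0f}-fold")
print(f"reverse alignment: at least about {reverse_fold_min:.0f}-fold")
```

The forward ratio comes out near 450, which the text rounds to some 500-fold. The standard deviations in Table 2 are not propagated here.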

As another test of the variable-strength hairpin idea, we examined the 5FOAR colonies arising from the experiments at LYS2. If CAG forms weaker hairpins, then it might be necessary to include more repeats to stabilize the hairpin than for CTG. If so, the rare CAG expansions should be larger than CTG expansions. PCR analysis supported this prediction; for eight CAG expansions, both the range (+18 to +38) and the median (+26) size are larger than that for 20 expansions of CTG (range, +10 to +22; median, +15). The CAG and CTG expansion sizes are significantly different, as judged by a Student’s t test (P < 0.001). The results with CTG at LYS2 are qualitatively similar to those observed at URA3 (Fig. 4). These observations, when coupled with the orientation effect and the inversion experiment, support the hairpin hypothesis for CTG and CAG expansions. This experiment also increases the largest observed expansion to 63 repeats.

Expansions also were evaluated for TNRs that are unlikely to form hairpins (7, 8). CTA was used as an alternative to CTG, and TAG served as a substitute for CAG. Both CTA and TAG were stable, yielding rates of <1.6 × 10⁻⁸ per generation (Table 2). We conclude that frequent expansions depend on the sequence of the TNR tract.

DISCUSSION

This paper describes a selective and quantitative genetic assay for TNR expansions in yeast. The expansions arise spontaneously in wild-type backgrounds, obviating the need for mutant strains. Expansions occur at high frequencies (≈10⁻⁵ per generation) for a (CTG)25 tract, indicating the unstable nature of this TNR. Molecular analysis of the expanded alleles indicated that 95% of the selected colonies harbored bona fide CTG expansions. Interpretation of the results is also facilitated by integration of the TNR at loci where the direction of replication is known, allowing assignment of the leading and lagging strands.

We feel that three results are particularly supportive of the hairpin model for CTG and CAG expansions. First is the inversion experiment in Table 2 in which TNRs were tested in the forward and reverse directions. Frequent expansions were observed only when CTG tracts resided on the lagging daughter strand. It is difficult to imagine other, nonreplicational mechanisms to explain the conversion of stable sequences to unstable and vice versa. Second, the rare expansions of CAG result in notably larger tracts than for CTG, consistent with the idea that CAG forms weaker hairpins. Third, CTA and TAG tracts and a randomized (C,A,G) sequence, which are very unlikely to adopt secondary structure, do not expand in our system.

Some of the unique features of TNR expansions are underscored by facets of our experiments. We find it remarkable that simple insertions of CTG/CAG tracts into yeast lead to genetic instability. This interpretation is consistent with conclusions by McMurray and colleagues (4, 6) that expansions are directed primarily by the DNA rather than the cellular environment. Many authors have speculated that flanking sequences from the human genes may influence expansions. Molecular modeling suggests that incorporation of flanking nucleotides helps stabilize TNR hairpins (7). In our system, however, no human flanking sequences are present, yet CTG tracts clearly are unstable. Perhaps inclusion of human flanking elements may increase the rate of expansions by adding stability to the putative hairpin. We note that in humans, TNRs can be extremely unstable, with expansion frequencies approaching 100% in extreme cases (reviewed in ref. 1). For CTG tracts in our system, the flanking elements are not essential for instability.

We hypothesize that a threshold feature may explain the strong orientation effects for CTG/CAG repeats. In our system, rate results and PCR findings are consistent with the idea of an operative threshold of <25 CTG repeats. By extension, we predict that the threshold for CAG sequences will be >25 repeats. These conclusions are consistent with in vitro melting temperature data for single-stranded 15-repeat molecules (summarized in ref. 8), which showed that CTG hairpins melt 9° higher than CAG hairpins. Our data with TNR contractions (27) also are consistent with the threshold explanation, if one assumes that expansions and contractions are governed by similar threshold values. In the contraction experiments, CAG or CTG tracts of 50 repeats deleted at high rates (≈10⁻³ per generation) regardless of the TNR orientation (27). The lack of an orientation effect for contractions suggests that a threshold of <50 repeats may exist. This conclusion supports our data with expansions. Other labs investigating TNR contractions in yeast have reported orientation effects, but only for longer repeats of 52–130 repeats (13, 17). It is possible that shorter repeats in our system are more sensitive to orientation, perhaps because of their placement near an active promoter element. Our estimate of a CTG threshold of <25 repeats in yeast is somewhat smaller than the corresponding estimate of roughly 35 repeats in humans (reviewed in ref. 1).

The results of this work are consistent with the hairpin hypothesis for CTG and CAG expansions. It will be interesting to see whether other TNRs that expand in humans, such as CGG and GAA, behave similarly in yeast. Data on other TNRs will help determine whether a single model explains all large expansions or whether multiple mechanisms may be involved.

Acknowledgments

We thank Bonny Brewer for sharing results before publication. We also thank Richard Pelletier, Mike Rolfsmeier, Troy Luster, and Luisa Pessoa-Brandão for valuable assistance. This work was supported by a research grant from the Muscular Dystrophy Association (to R.S.L.), by a postdoctoral fellowship from the Hereditary Disease Foundation (to J.J.M.), and by National Cancer Institute Cancer Center Support Grant P30 CA36727 (to the Eppley Institute).

ABBREVIATIONS

TNR, trinucleotide repeat; 5FOA, 5-fluoroorotic acid.

References