How Selenium Has Altered Our Understanding of the Genetic Code


Selenium is an essential micronutrient in the diet of many life forms, including humans and other mammals. Significant health benefits have been attributed to this element. It is rapidly becoming recognized as one of the more promising cancer chemopreventive agents (19), and there are strong indications that it has a role in reducing viral expression (4), in preventing heart disease and other cardiovascular and muscle disorders (23), and in delaying the progression of AIDS in human immunodeficiency virus-infected patients (3). Additional evidence suggests that selenium may have a role in mammalian development (51), in immune function (70), in male reproduction (30), and in slowing the aging process (70).

Despite the many potential health benefits of selenium, the means by which this element promotes better health are only just beginning to be elucidated (31, 91). There are about 20 known selenium-containing proteins in mammals (33), and it would seem very likely that several of these are mediators of health benefits of dietary selenium. Therefore, it is critical to understand how selenium is inserted into protein and the identities and functions of the resulting protein products. Selenium is present in naturally occurring selenium-containing proteins in two basic forms. It can be inserted posttranslationally as a dissociable cofactor (32). This rare form of protein-associated selenium has been found only in several bacterial molybdenum-containing enzymes and will not be discussed further in this review. Selenium is also cotranslationally inserted into protein as the amino acid selenocysteine (Sec). Such occurrence of this element in protein is widespread in all major domains of life and is responsible for the majority of biological effects of selenium. The elucidation of how Sec is incorporated into protein has progressed at a rapid pace in the last decade and has revealed some surprising results. In fact, unraveling this mystery has altered our understanding of the genetic code, as the code has now been expanded to include Sec as the 21st naturally occurring amino acid. When the code was deciphered in the mid-1960s (48, 79), 20 amino acids were assigned to 61 of the possible 64 codons within the triplet code and 3 codons were found to function as terminators for protein synthesis. Each of the 64 code words was therefore assigned a function, and there did not appear to be room for additional amino acids. Although it was recognized in the mid-1960s that one code word, AUG, had a dual role of initiating protein synthesis and inserting methionine at internal protein positions, the possibility that a second codon also had two functions was not considered at that time. 
We now know that UGA serves as both a termination codon and a Sec codon. The means by which UGA serves as a Sec codon and how Sec is biosynthesized and incorporated into protein have been examined in considerable detail with eubacteria (reviewed in reference 7) and with mammals (this review). While the fundamental mechanism of Sec insertion in these organisms appears to be similar, recent studies suggest that mammals evolved additional components that allow incorporation of multiple Secs into a single protein and provide stringent regulation of Sec biosynthesis. The present review discusses our current knowledge of these features in mammals.

It should be noted that selenium can also be incorporated nonspecifically into protein (42). The nonspecific occurrence of this element in protein arises when selenium replaces sulfur in the biosynthesis of cysteine or methionine and the resulting selenoamino acid (Sec or selenomethionine) is inserted in place of the natural amino acid. Such misincorporation of selenium into protein may be toxic; this subject has been reviewed elsewhere (42).

UGA DICTATES Sec INCORPORATION INTO PROTEIN

The initial studies, which suggested that UGA coded for Sec, involved sequencing selenoprotein genes and aligning their open reading frames with the amino acid sequences of the corresponding gene products. The genes for two Sec-containing proteins, glutathione peroxidase 1 (GPX1) in mammals (13, 77, 85) and formate dehydrogenase in Escherichia coli (103), were the first genes found to contain TGA in the open reading frame. Interestingly, the TGA codons aligned with the Sec residue in the corresponding gene products. Such studies did not demonstrate unequivocally that UGA codes for Sec. Theoretically, a tRNA decoding UGA could introduce a precursor of Sec into the nascent selenopeptide in response to UGA and the resulting amino acid residue would be modified to Sec posttranslationally. Since Sec, unlike the other 20 amino acids in the genetic code, was found to be biosynthesized on its tRNA (8, 41), the insertion of a precursor amino acid (e.g., phosphoserine [39 and references therein]) was a possibility. However, the fact that Sec was subsequently shown to be attached to its tRNA intracellularly in both bacterial (59) and mammalian (55) cells provided the strongest evidence at that time that Sec was indeed the 21st amino acid in the genetic code. The expanded genetic code that includes Sec is shown in Fig. 1.

FIG. 1.

The genetic code showing that Sec is the 21st amino acid and that Sec is coded by UGA. ∗, UGA and Sec; ▴, AUG, the other codon in the genetic code that serves a dual function.
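The codon bookkeeping summarized in Fig. 1 can be sketched in a few lines of Python. This is only an illustration, not a model of the Sec insertion machinery: the function name `translate` and the `sec_positions` argument are hypothetical, and the choice of which UGA codons are read as Sec is supplied by the caller rather than inferred from a SECIS element. Sec is written with the one-letter symbol U.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks the three terminators.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]
STOP_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")


def translate(mrna, sec_positions=frozenset()):
    """Translate from the first codon of mrna.

    A UGA whose codon index is in sec_positions (a hypothetical,
    caller-supplied set) is read as Sec ('U'); any other UGA,
    like UAA and UAG, terminates translation.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and (i // 3) in sec_positions:
            peptide.append("U")
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)
```

With the toy message `AUGGCUUGAGGUUAA`, the same UGA either ends the peptide or is read through as Sec, depending only on whether its codon index is listed in `sec_positions`.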

Sec tRNA, A MOST NOVEL tRNA

Sec tRNA (hereafter designated Sec tRNA[Ser]Sec and defined below in “Sec biosynthesis occurs on its tRNA”) is the only known tRNA that governs the expression of an entire class of proteins, the selenoproteins. Sec tRNA[Ser]Sec has therefore been called the key molecule (8) and the central component (41) in selenoprotein biosynthesis. The structure of Sec tRNA[Ser]Sec from mammals is shown in Fig. 2 in both a 9/4 (8) and a 7/5 (24) cloverleaf form (i.e., nine or seven paired bases in the acceptor stem and four or five paired bases in the T stem). Evidence for both secondary structures has been presented (46, 47).

FIG. 2.

Mammalian Sec tRNA[Ser]Sec shown in a 9/4 and a 7/5 cloverleaf model (see text). Brackets around bases at positions 11, 12, and 47c indicate those bases that have been shown to vary in different mammals, as discussed in the text.

Sec tRNA[Ser]Sec has additional features that distinguish it from all other tRNAs. For example, at 90 nucleotides in length, it is the longest eukaryotic tRNA sequenced to date (2, 24, 26, 39). This is due to an atypically long variable arm and the presence of 13 nucleotides in the acceptor and TΨC stem helices (where Ψ indicates pseudouridine) instead of the 12 normally found in all other tRNAs. Sec tRNA[Ser]Sec contains relatively few modified nucleotides (Fig. 2) compared to other tRNAs, which may have as many as 15 to 17 modified nucleotides. It may have up to six base pairs in the dihydrouracil stem instead of the three to four found in other tRNAs. Transcription of Sec tRNA[Ser]Sec is also unusual as it begins at the first nucleotide within the coding sequence (56) while all other tRNAs, whether they are of nuclear or organelle origin, have a 5′ leader sequence that must be processed. Sec tRNA[Ser]Sec, therefore, has a 5′ triphosphate on its terminal guanosine moiety. Many additional novel features of Sec tRNA[Ser]Sec transcription have also been reported, and this subject has been thoroughly reviewed elsewhere (42).

The Sec tRNA[Ser]Sec gene occurs in single copy in the genomes of all mammals examined thus far, including humans, mice, rats, rabbits, cattle, and Chinese hamsters (reference 10 and references therein). The primary sequence of the gene has also been determined in chickens, frogs, zebra fish, fruit flies, and nematodes (10); like that in mammals, it is 87 nucleotides long (the CCA terminus is added posttranscriptionally to make its final length of 90 nucleotides). The only exception to the occurrence of single-gene copy in the genomes of animals was found in zebra fish, where two gene copies exist (100). The genomes of humans and rabbits also encode one pseudogene, while that of Chinese hamsters encodes three pseudogenes.

The Sec tRNA[Ser]Sec population in mammalian cells contains two major isoforms that differ from each other by a single methyl group in the wobble position (position 34) of the anticodon (see Fig. 2). One isoform contains methylcarboxymethyl-5′-uridine (mcm5U) at position 34, and the other contains methylcarboxymethyl-5′-uridine-2′-_O_-methylribose (mcm5Um) (2, 26). mcm5U is the precursor of mcm5Um (17, 50), and addition of this methyl group marks the final step in Sec tRNA[Ser]Sec maturation. Interestingly, the addition of this methyl group is responsive to selenium status (15, 26, 40), and its presence confers dramatic changes in tertiary structure (26). Efficient methylation of mcm5U to form mcm5Um requires the prior synthesis of each modified base (m1A at position 58, Ψ at position 55, and isopentenyladenosine [i6A] at position 37) and an intact tertiary structure (50). Synthesis of m1A, Ψ, i6A, and mcm5U was not connected to primary and tertiary structure as stringently as that of mcm5Um. These studies, as well as those demonstrating increased methylation at the 2′-_O_-ribose position in the presence of selenium (15, 26, 40) and alteration in tertiary structure with 2′-_O_-ribose methylation (26, 50), suggest that the two major Sec tRNA[Ser]Sec isoforms have different physiological roles.

As noted above, the distributions and relative amounts of mcm5U and mcm5Um are influenced by selenium status, and their levels vary in different mammalian cells and tissues (15, 26, 40). The enrichment of the Sec tRNA[Ser]Sec population in the presence of selenium is apparently due to reduced turnover rather than enhanced transcription, as evidenced by studies showing this effect in Xenopus oocytes in the absence of de novo transcription (17) and by direct measurement of the effects of selenium on Sec tRNA[Ser]Sec turnover in CHO cells (R. J. Coppinger, B. A. Carlson, M. Butz, K. Esser, D. L. Hatfield, and A. M. Diamond, unpublished data).

Sec BIOSYNTHESIS OCCURS ON ITS tRNA

Although the biosyntheses of glutamine and asparagine can occur on their tRNAs, this mode of synthesizing amino acids is restricted to certain life forms and is not universal in nature (93). In contrast, the biosynthesis of Sec that is incorporated into protein in response to UGA codons is distinct from that of the other 20 amino acids in the genetic code in that its synthesis always occurs on its tRNA (7, 10). Since Sec can be attached to tRNACys by cysteinyl-tRNA synthetase and incorporated nonspecifically into protein in response to Cys codons (42), the evolution of Sec biosynthesis on its tRNA provides another means by which this amino acid could have been included in the genetic code. This proposal is consistent with the suggestion that the presence of Sec in the code occurred relatively late in the code's evolution (33).

Sec tRNA[Ser]Sec is initially aminoacylated with serine in both prokaryotes (7) and eukaryotes (10), and serine serves as the backbone for Sec synthesis (7, 10, 90). Since serine is attached to Sec tRNA[Ser]Sec by seryl tRNA synthetase and the identity elements in Sec tRNA[Ser]Sec are for serine and not Sec, but the amino acid inserted into protein is Sec, this tRNA has been designated Sec tRNA[Ser]Sec (41). The identity elements for mammalian Sec tRNA[Ser]Sec include the long variable arm and the discriminator base, both of which are essential for aminoacylation (80, 99). The acceptor, TΨC, and D stems also play a role in the identity process (1).

The biosynthesis of Sec from serine on Sec tRNA[Ser]Sec has been completely characterized for E. coli (7, 8), but the specific steps in this process in mammals are unknown. In E. coli, a pyridoxal phosphate-dependent Sec synthase catalyzes the removal of the hydroxyl group from serine to form an aminoacrylyl intermediate. This intermediate serves as the acceptor for activated selenium, resulting in the formation of selenocysteyl-tRNA[Ser]Sec (7, 8). In mammals, a minor seryl tRNA which decoded UGA (38) and formed phosphoseryl tRNA (reference 39 and references therein) was subsequently identified as Sec tRNA[Ser]Sec (55). The roles of the kinase and phosphoseryl tRNA[Ser]Sec in the biosynthesis of Sec have not been characterized. However, the formation of phosphoserine is consistent with a Sec synthase-catalyzed reaction, as phosphorylated serine would have a better leaving group than serine in the Sec biosynthetic pathway.

The active form of selenium that is donated to the intermediate in Sec biosynthesis has been identified in prokaryotes as monoselenophosphate, which is synthesized from selenide and ATP by selenophosphate synthetase (34). Although the active selenium donor in eukaryotes has not been characterized, it is likely the same selenium form (37, 49, 63). Two selenophosphate synthetase genes in mammals, designated Sps1 and Sps2, have been identified (37, 49, 63). SPS2 is a selenoprotein, suggesting that it is involved in the autoregulation of its own biosynthesis (37). Once the activated form of selenium is donated to the intermediate, the biosynthesis of Sec on tRNA[Ser]Sec is completed.

NOVEL INSERTION OF Sec INTO PROTEIN

The fact that UGA has a dual role of serving as a stop and a Sec codon (see Fig. 1) raises an important question of how the cell distinguishes between these two functions. Besides Sec tRNA[Ser]Sec and the in-frame UGA codon in selenoprotein mRNA, there are several other factors that are required for the donation of Sec to protein and dictate the specific function of UGA as Sec. These include (i) the _cis_-acting stem-loop structure, designated the Sec insertion sequence (SECIS) element (62); (ii) the SECIS-binding protein 2 (SBP2) (21, 22, 64); and (iii) the Sec-specific elongation factor (EFsec, also called mSelB) (27, 92).

SECIS elements.

SECIS elements are present in 3′ untranslated regions (3′-UTRs) of all eukaryotic selenoprotein genes (62). In archaea, SECIS elements are also located in the 3′-UTRs, but the structures themselves are different from the eukaryotic counterparts (82). Bacterial SECIS elements differ from both eukaryotic and archaeal structures and are located in the coding regions of selenoprotein genes, immediately downstream of Sec-encoding UGA codons (7).

Eukaryotic SECIS elements are composed of two helices separated by an internal loop; a SECIS core structure, Quartet, located at the base of helix 2; and an apical loop (Fig. 3). The Quartet is formed by four non-Watson-Crick interacting base pairs and is the main functional site of the stem-loop structure (94). When the apical loop is large enough, an additional ministem is formed that presumably stabilizes the SECIS element. The presence of this ministem was used to classify SECIS elements into form 1 and form 2 structures, with form 1 SECIS elements lacking, and form 2 SECIS elements containing, the ministem (35). These SECIS forms are interconvertible by mutations that extend or shorten the apical loop or by natural evolution of selenoprotein genes. Primary sequence conservation of eukaryotic SECIS elements is almost nonexistent, with the only strictly conserved nucleotides being UGA in the 5′ portion and GA in the 3′ portion of the Quartet. In addition, a nucleotide immediately preceding the Quartet and two unpaired nucleotides in the apical loop are adenosines in the majority of selenoprotein genes (52).

FIG. 3.

SECIS element consensus structure. Conserved nucleotides in the SECIS core (Quartet) are shown in boldface. Adenosines that are conserved in most, but not all, selenoprotein genes are also indicated.

The presence of a SECIS element in the 3′-UTR of selenoprotein genes dictates any in-frame TGA codon within the coding region to serve as Sec when a minimal spacing requirement between TGA and SECIS element (51 to 111 nucleotides) is met (62). This property suggests that SECIS elements are both necessary and sufficient for Sec insertion, provided that mRNA bearing an in-frame UGA and a 3′-UTR SECIS element have access to ribosome-based protein synthesis machinery and Sec-specific translation factors. This property also suggests that designing UGA-SECIS pairs in nucleotide sequences can be used for targeted insertion of Sec into protein.
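The spacing rule described above lends itself to a small check. The following Python sketch lists the in-frame TGA codons of a coding sequence that lie far enough upstream of a 3′-UTR SECIS element to be candidates for Sec. Several things here are assumptions made for illustration and are not taken from the cited work: the function name `sec_candidate_codons`, the way spacing is counted (nucleotides between the end of the TGA and the first nucleotide of the SECIS element), and the default threshold of 51 nucleotides, which is simply the lower end of the 51-to-111-nucleotide range quoted in the text.

```python
def sec_candidate_codons(cds, secis_start_in_utr, min_spacing=51):
    """Return codon indices of in-frame TGA codons that meet the spacing rule.

    cds: coding DNA sequence from the ATG through the final stop codon.
    secis_start_in_utr: offset of the SECIS element from the first
        3'-UTR nucleotide (0 means SECIS starts right after the CDS).
    min_spacing: assumed minimal TGA-to-SECIS distance in nucleotides.
    """
    candidates = []
    last_codon_start = len(cds) - 3  # the final codon is the true terminator
    for i in range(0, last_codon_start, 3):
        if cds[i:i + 3].upper() == "TGA":
            # Nucleotides between the end of this TGA and the SECIS start.
            spacing = (len(cds) - (i + 3)) + secis_start_in_utr
            if spacing >= min_spacing:
                candidates.append(i // 3)
    return candidates
```

A TGA that sits too close to the SECIS element is simply omitted from the result, which mirrors the statement that the spacing requirement must be met before an in-frame TGA serves as Sec.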

SBP2 and EFsec.

SECIS elements function by recruiting SBP2 to form a tight SECIS-SBP2 complex (Fig. 4) (21, 22, 64). SBP2 binds to the SECIS Quartet and also to sequences directly preceding the Quartet, but not to the apical loop (21, 29). The reason for strict conservation of the length of helix 2 (located between the Quartet and adenosines in the apical loop) is not understood. Evidence was also presented that SBP2 is stably associated with ribosomes via 28S rRNA in a manner independent of its Sec insertion function, suggesting that SBP2 preselects ribosomes for Sec insertion (21). An RNA-binding domain was identified in the C-terminal sequence of SBP2, and an additional domain was identified that was required for Sec insertion, but not for SECIS binding.

FIG. 4.

Mechanism of Sec insertion in eukaryotes (see text). Selenocysteyl-tRNA (in orange with Sec in yellow) is shown in a complex with EFsec (in blue) and SBP2 (in green) and the SECIS element (shown as a hairpin loop in black) that is ready for donation to the ribosomal A site to be decoded by UGA (shown in the selenoprotein mRNA in black). Once the Sec tRNA[Ser]Sec complex is donated to the A site, Sec tRNA[Ser]Sec is transferred to the peptidyl site and Sec is incorporated into the nascent selenopeptide. The growing selenopeptide is shown as alternating gold and blue balls attached to the tRNA in the peptidyl site. The mRNA (shown in black with its start and stop codons indicated) is attached to the smaller of the two ribosomal subunits, and the unacylated tRNA is shown leaving the ribosomal exit site.

Besides binding to SECIS elements and ribosomes, SBP2 binds EFsec, which in turn recruits Sec tRNA[Ser]Sec and inserts Sec into nascent polypeptides in response to UGA codons (27, 92). EFsec is specific for Sec and is different from EF1A, which is involved in insertion of the other 20 amino acids. SBP2 and EFsec jointly constitute the functional equivalent of the single SELB factor in bacteria (7). The occurrence of SBP2 and EFsec as separate proteins in eukaryotes suggests a mechanism for rapid exchange of the Sec tRNA[Ser]Sec-EFsec complex (from empty to aminoacyl tRNA bound), following Sec insertion.

Other factors.

Additional _trans_-acting factors have been implicated in Sec insertion in eukaryotes. Among these, Sec synthase is probably the major missing piece in the eukaryotic selenoprotein machinery. Bacterial Sec synthase was described several years ago (reference 7 and references therein), but its counterpart in archaea and eukaryotes is not known. As discussed above, the roles of the seryl tRNA[Ser]Sec kinase and phosphoseryl tRNA[Ser]Sec in Sec biosynthesis are not known, but characterization of the kinase would certainly shed light on this issue. As also discussed above, Sec tRNA[Ser]Sec exists in two isoforms that differ by a single methyl group, and the Sec tRNA[Ser]Sec methylase may in addition play a role in regulating the mammalian Sec insertion machinery.

UGA: STOP OR GO?

Even though we have considerable insight into the factors involved in Sec insertion into protein (see above), there are several additional aspects of selenoprotein biosynthesis that determine whether a UGA Sec codon dictates termination or readthrough. Sec insertion appears to be an inefficient process (references 28, 29, 64, and 66 and references therein), and clearly some Sec UGA codons support both readthrough and termination. Selenoprotein P (SelP) from rat plasma contains 10 Sec residues and occurs in four isoforms (65). The shorter isoforms arise from termination at the second, third, and seventh UGA codons. Thus, these UGA codons are programmed to dictate a cessation in SelP expression as well as a continuation in SelP production. What then determines the fate of a UGA Sec codon? In addition to the _cis_-elements UGA and SECIS that are essential for Sec insertion, there are several _trans_-acting factors as well as other _cis_-elements that influence the interplay between Sec incorporation and translation termination.

The nucleotide context of UGA is a _cis_-feature that also has a role in governing Sec incorporation versus translation termination (62, 69). In mammals, a purine at the position immediately 3′ to UGA (the +1 position) favors termination, while a pyrimidine in this position favors readthrough (62, 69). The base at the +2 position and the first codon (36) or the first two codons (78) immediately upstream of UGA also influence termination efficiency. A +1 pyrimidine followed by a +2 purine appears to favor termination (36); interestingly, all 10 Sec UGA codons in rat plasma SelP are followed by either a +1 purine or the +1, +2 pyrimidine-purine combination (65). It is not clear what role the penultimate codon or codons play in termination versus readthrough in SelP, but it would seem that all 10 Sec residues are encoded in favorable termination contexts. Furthermore, Ma et al. (65) suggested that _trans_-acting factors likely play an important role in the fate of SelP UGA Sec codons. It should be noted, however, that the +1 base and other downstream _cis_-acting elements have been reported to play only a minor role in termination in one study (78) but a more significant role in another study (36). The discrepancies in these two studies most certainly reflect the use of different model (reporter) systems. Different Sec codons manifest different insertion efficiencies (36, 65, 78) and thus may have different _cis_- and _trans_-acting requirements in establishing Sec insertion-translation termination interplay.
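The +1 and +2 context rules stated above can be written down as a simple classifier. This Python sketch is a deliberately crude reading of those rules: the function name `uga_context` and its return labels are hypothetical, the upstream codons are ignored, and, as the text notes, the reported strength of these effects differs between reporter systems, so the output should be read as a tendency rather than a prediction.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def uga_context(mrna, uga_index):
    """Classify the +1/+2 context of the UGA that starts at uga_index.

    Rules taken from the text: a +1 purine favors termination; a +1
    pyrimidine favors readthrough unless it is followed by a +2 purine,
    in which case termination appears to be favored.
    """
    if mrna[uga_index:uga_index + 3] != "UGA":
        raise ValueError("no UGA at the given index")
    downstream = mrna[uga_index + 3:uga_index + 5]
    if len(downstream) < 2:
        raise ValueError("need two nucleotides downstream of UGA")
    plus1, plus2 = downstream
    if plus1 in PURINES:
        return "favors termination"
    if plus1 in PYRIMIDINES and plus2 in PURINES:
        return "favors termination"
    return "favors readthrough"
```

Under these rules, only a +1 pyrimidine followed by a +2 pyrimidine is classified as favoring readthrough.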

_trans_-Acting factors, such as SBP2, EFsec, Sec tRNA[Ser]Sec, the termination factor eukaryotic release factor 1 (eRF1), and the eRF1- and ribosome-dependent GTPase eRF3, are likely candidates that regulate Sec incorporation-translation termination interplay (21, 29, 36, 64, 78). SBP2 is envisioned to function in Sec incorporation and to prevent translation termination (20, 21), and although this is most certainly the case, hard data supporting this model are lacking (21). As discussed above, SBP2 is bound tightly to ribosomes and this and other characteristics of SBP2 suggest that this factor is involved in ribosome selection for Sec incorporation (21). In this model, ribosomes containing SBP2 would incorporate Sec, while those that do not would terminate translation. If SBP2 is first bound to ribosomes, then a likely next step would involve the formation of a quaternary complex between SBP2, the SECIS element, EFsec, and Sec tRNA[Ser]Sec (21). What is so intriguing about this model is that SBP2 is the major player in determining the efficiency of selenoprotein synthesis.

Overexpressing selenoprotein mRNA either in vitro or in transformed cells provides a model system for determining what factors are limiting in selenoprotein expression, as the level of termination at the Sec UGA codon increases due to limitation of one or more of the _trans_-acting factors (reference 64 and references therein). Under these conditions, SBP2 enhances selenoprotein expression, whereas EFsec and Sec tRNA[Ser]Sec have only marginal effects, demonstrating that SBP2 is the limiting factor. SBP2 is also limiting in rabbit reticulocytes (21). Polysome loading onto mRNAs programmed for Sec incorporation was decreased in wild-type compared to cysteine mutant mRNAs where UGA is changed to a Cys codon (28, 66). However, polysome loading and Sec incorporation were increased by excess eRF1 (66) or SBP2 (28). These results suggest a defect in translation at the UGA codon in selenoprotein mRNAs and that this flaw in protein synthesis is influenced by _trans_-acting factors.

Once SBP2 is bound to a SECIS element, it does not readily dissociate from the element (21, 64), suggesting that SBP2 remains attached primarily to its element throughout protein synthesis. Increased levels of eRF1, however, were observed to have minimal effects on Sec incorporation (78) or to enhance this process (36), while excess eRF3 had no effect (36). Since it is envisioned that Sec incorporation competes with translation termination (28, 29, 64, 66) and that different UGA codons have different Sec incorporation efficiencies (36, 65, 78), it was suggested that the termination factors are at saturating levels intracellularly, and thus excess amounts of these components do not have any apparent effect on Sec incorporation (78). Alternatively, excess eRF1 may be involved in sequestering eRF3 and therefore not be available for UGA Sec codon competition (36). As noted above, the variations in findings reported in these two studies are most likely due to the use of different model systems.

Overexpression of Sec tRNA[Ser]Sec was found to enhance Sec insertion in one study involving an analysis of the factors affecting Sec incorporation and translation termination (78). This finding, however, is not consistent with studies showing that reductions (9, 14) or enrichments (75, 76) in the levels of the Sec tRNA[Ser]Sec population in mammalian cells or tissues do not affect selenoprotein biosynthesis. Inclusion of two or three UGA codons in the same reading frame was found to result in considerable reduction in synthesis of the full-length product compared to that observed with a single UGA codon (78). These investigators proposed, largely from this observation, that Sec insertion favors a nonprocessive rather than a processive mechanism (78).

SELENOPROTEINS: IDENTITY AND FUNCTIONS

The known Sec-containing proteins in animals are shown in Fig. 5. The number of selenoproteins identified has increased dramatically in the last several years. Interestingly, with the exception of selenophosphate synthetase, there is no overlap between eukaryotic and prokaryotic selenoproteomes (all selenoproteins in an organism). Bacterial and archaeal selenoproteins are primarily involved in catabolic processes and utilize selenium to catalyze various redox reactions (84). In contrast, functionally characterized eukaryotic selenoproteins participate in antioxidant and anabolic processes. These observations suggest an independent origin of prokaryotic and eukaryotic selenoproteomes (33).

FIG. 5.

Animal Sec-containing proteins. All currently known selenoproteins are listed (left). The relative sizes of selenoproteins (empty boxes) and the locations of Sec (red box) and an α-helix immediately downstream of Sec (green box) in the selenoprotein sequences are indicated (right).

In eukaryotes, disruption of the Sec tRNA[Ser]Sec gene is embryonically lethal, suggesting an essential function for one or more selenoproteins in development (9). One candidate for an essential selenoprotein gene is the thioredoxin reductase gene. The protein expressed by this gene is present in all living organisms, but its Sec-containing form occurs only in animals. Moreover, disruption of thioredoxin, a substrate for thioredoxin reductase, is lethal, as shown by studying mice lacking the thioredoxin gene (68).

One general theme is evident from the analysis of eukaryotic selenoproteins. Although these selenoproteins do not have sequence homology, similar structures, or related functions, the location of Sec in these proteins appears to be limited to only several positions. In fact, the majority of eukaryotic selenoproteins can be assigned to one of two groups according to Sec location (31). One selenoprotein group includes proteins containing Sec in the N-terminal portions of short domains. These proteins are largely αβ proteins, and Sec is often located in these proteins in the loop between a β-strand and an α-helix, according to secondary structure predictions. This location is similar to that of the CXXC motif (two cysteines separated by two other amino acids), which is involved in the redox reactions catalyzed by thiol-disulfide oxidoreductases. In fact, several selenoproteins employ a similar redox motif, except that one of the Cys residues is replaced by Sec. For example, SelW, SelT, SelM, BthD, and their homologs possess a CXXU motif (where U is Sec), whereas SelP and its homologs have a UXXC motif. Other selenoproteins of this group, such as the GPX homologs, contain only a single Sec (i.e., no Cys partner in the CXXC motif), suggesting that Sec forms either predicted intermolecular selenosulfide bonds or selenenic acid derivatives during redox catalysis.

The second group of eukaryotic selenoproteins is characterized by the presence of Sec in C-terminal sequences. In three mammalian thioredoxin reductases, which contain a C-terminal GCUG motif, Sec is located in the flexible C-terminal extension (86). This situation is functionally similar to the fusion of a low-molecular-weight redox compound to the C terminus of a common functional domain (in the case of thioredoxin reductases, it is a pyridine nucleotide disulfide oxidoreductase domain) (87). The function of the Sec-containing motif in thioredoxin reductase is to transfer reducing equivalents from the buried disulfide active site to the active center of a protein substrate. In the case of thioredoxin reductase 2, which contains an additional N-terminal thiol-disulfide oxidoreductase domain, the Sec center transfers electrons to this domain (87). An additional protein of the C-terminal Sec group is the Drosophila melanogaster G-rich protein (67). This protein also contains a C-terminal penultimate Sec residue, followed by a C-terminal glycine. The function of the G-rich protein is not known.

Independently of the location of Sec in functionally characterized selenoproteins, this amino acid appears to participate in redox reactions. In selenoenzymes where Sec has a close Cys partner (e.g., SelT, SelW, BthD, etc.), secondary structure appears to stabilize a highly reactive selenolate, whereas in the positions close to the C terminus, the advantage of Sec over Cys may be due to its lower pKa. Indeed, most cysteines are protonated under physiological pH conditions, whereas Sec residues (pKa, ∼5.5) are ionized. The role of steric differences has also been suggested to account for the use of Sec. For example, the C terminus of animal thioredoxin reductases, Gly-Cys-Sec-Gly, forms an intramolecular selenosulfide bond (57, 102). However, the corresponding disulfide bond is not stable due to the decreased atomic size of sulfur compared to that of selenium.
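The pKa argument above can be made quantitative with the Henderson-Hasselbalch relation, under which the deprotonated fraction of a side chain is 1 / (1 + 10^(pKa − pH)). The Python sketch below applies it to the Sec pKa of about 5.5 given in the text. The cysteine pKa of 8.3 and the pH of 7.4 are assumed, typical textbook values that do not come from this review, and the pKa of a residue in a real protein can shift considerably with its environment.

```python
def fraction_ionized(pka, ph=7.4):
    """Deprotonated (anionic) fraction of a thiol or selenol side chain.

    Henderson-Hasselbalch: [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    The default pH of 7.4 is an assumed physiological value.
    """
    return 1.0 / (1.0 + 10 ** (pka - ph))


SEC_PKA = 5.5  # value quoted in the text
CYS_PKA = 8.3  # assumed typical free-cysteine value, not from the text

sec_fraction = fraction_ionized(SEC_PKA)
cys_fraction = fraction_ionized(CYS_PKA)
```

With these inputs Sec is almost entirely in the selenolate form, while only a minor fraction of cysteine is present as thiolate, consistent with the statement that most cysteines are protonated at physiological pH.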

Among functionally characterized mammalian antioxidant selenoproteins are four glutathione peroxidases and three thioredoxin reductases. In addition, recent studies revealed that one of the new selenoproteins, SelR, is a zinc-containing methionine sulfoxide reductase with specificity for methionine-_R_-sulfoxides (54). MsrA, an enzyme catalyzing a complementary reaction (i.e., a methionine-_S_-sulfoxide reduction), has been known for decades (98). It has been implicated in antioxidant defense and the life span of mammals (74). With the discovery of SelR function, a possibility is raised that selenium is also involved in aging.

It should be noted that the functions of the majority of selenoproteins are not known. Characterization of their functions is an obvious direction in selenoprotein research.

HIERARCHY OF SELENOPROTEIN EXPRESSION

A characteristic of selenium deficiency in mammals is that a hierarchy exists with respect to maintaining the levels of individual selenoproteins and retaining selenium in different organs (5, 45, 58, 64, 72). For example, in selenium-deficient rats, GPX1 activity was reduced to 1% of that observed in the livers of selenium-sufficient rats and to about 4 to 9% in selenium-deficient heart, kidney, and lung tissue. GPX4 activity, however, was reduced only to 25 to 50% in these tissues and was unaffected in testes. Interestingly, transgenic mice expressing i6A-deficient Sec tRNA[Ser]Sec had reduced levels of selenium in their tissues and a hierarchy of selenoprotein activities similar to that observed with selenium-deficient mice (76). The extent of selenoprotein reduction varied in the i6A-deficient Sec tRNA[Ser]Sec mice, depending on the organ examined (76).

During selenium deprivation in the diets of rats and mice, the amounts of this element were substantially reduced in liver and kidney, while brain and testes retained most of their selenium (5, 44). The levels and maturation of Sec tRNA[Ser]Sec (15, 26, 40) and the efficiency of selenoprotein synthesis (see references 21 and 36 and references therein) are responsive to selenium status; thus, these parameters are more affected in liver and kidney than in brain and testes by changes in selenium status (15, 26, 44). The greater sensitivity of GPX1 activity to selenium deficiency has been attributed largely to an increased turnover in mRNA (18, 58, 83). The enhanced degradation of GPX1 mRNA under conditions of selenium deficiency occurs by the surveillance pathway, designated nonsense-mediated decay (NMD), where the UGA Sec codon is recognized as nonsense (73, 89, 97). Interestingly, the position of the UGA Sec codon relative to the sole, downstream intron in GPX1 mRNA determines whether the mRNA is subject to NMD (88). However, other selenoprotein mRNAs, such as DI1, GPX4, and SelP, are not as sensitive as GPX1 to NMD during selenium deprivation despite the presence of introns downstream of their UGA codons (44, 58, 89). The reduction in GPX1 activity in transgenic mice carrying i6A-deficient Sec tRNA[Ser]Sec is not likely due to mRNA turnover, since GPX1 mRNA levels were not significantly altered in the kidneys of these animals compared to those of wild-type animals (76). SBP2 has also been reported to preferentially recognize SECIS elements in specific selenoprotein mRNAs, suggesting a mechanism to account, at least in part, for selenoprotein expression hierarchy during selenium deficiency (64). However, another group found little or no difference in SBP2 recognition of the SECIS element and suggested that this is not likely to be a mechanism involved in selenoprotein hierarchy (29). Both groups agreed that if SBP2 recognition of SECIS elements is involved in the extreme sensitivity of GPX1 mRNA to NMD, then an additional factor must also be required. In any case, it would seem that there are several levels of regulation involved in determining the priority of selenoprotein synthesis under various biological conditions.

IDENTITY OF Sec UGA CODONS

Since UGA in the genetic code is most commonly used for the cessation of protein synthesis, the identification and correct annotation of selenoprotein genes containing Sec-encoding UGAs have been difficult. In fact, the vast majority of selenoprotein genes are incorrectly annotated in completely sequenced genomes, including the human genome. Typically, Sec-encoding TGA codons are recognized by the currently available annotation programs as stop signals; alternatively, the entire exons containing TGA codons are not recognized. In addition, there are cases in which in-frame 5′-UTR sequences or terminator UGA codons were incorrectly interpreted as Sec codons (12).
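The annotation problem can be illustrated with a toy translation routine. This is a sketch under stated assumptions, not a description of any actual annotation program: the sequence is invented for illustration and is not a real selenoprotein gene, the `recode_tga_as_sec` switch is a hypothetical stand-in for the SECIS-dependent decision made in the cell, and Sec is written with its one-letter symbol U.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str, recode_tga_as_sec: bool = False) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    If recode_tga_as_sec is True, every TGA except one in the final codon
    position is read as selenocysteine ('U') instead of stop. This flag is
    a hypothetical simplification of SECIS-dependent recoding.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        is_internal = i + 3 < len(cds)
        if codon == "TGA" and recode_tga_as_sec and is_internal:
            aa = "U"
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented example: ATG GCT TGA GGT AAA TAA
toy_cds = "ATGGCTTGAGGTAAATAA"
naive_protein = translate(toy_cds)
recoded_protein = translate(toy_cds, recode_tga_as_sec=True)
print(naive_protein, recoded_protein)
```

Read with the standard table alone, the toy sequence yields a two-residue product truncated at the TGA; with recoding it yields the full five-residue product containing Sec. This is the kind of discrepancy that leads annotation software to truncate or miss selenoprotein genes.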

Since selenoprotein genes do not have a common amino acid consensus sequence or a functional motif and the location of TGA within coding sequences is not universally conserved, the SECIS element provides an identifier that can help in the annotation of uncharacterized selenoprotein genes. Although conservation of the primary sequence of SECIS elements is low, their secondary structures are conserved. In addition, calculation of the free energy of SECIS elements as a measure of their stability aided in describing these structures computationally. A computer program, SECISearch, has been developed that is capable of identifying selenoprotein genes in nucleotide sequence databases (53). Initial applications of SECISearch or similar approaches to expressed sequence tag databases identified three selenoprotein genes and were the first examples of identification of new genes by searching for RNA structures (53, 60). Subsequently, this approach was applied to the entire genome of D. melanogaster (11, 67). These studies revealed the presence of three selenoprotein genes, including two proteins (G-rich protein and BthD) that had no homology to known proteins (67). Not surprisingly, these genes were incorrectly annotated in the completed fly genome. Identification of selenoprotein genes through recognition of SECIS elements should be useful in the future analysis of the human genome.
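The general idea of pairing an in-frame TGA with a downstream structural signal can be sketched as a simple two-step filter. This is emphatically not the SECISearch algorithm, which evaluates secondary structure and free energy: the regular expression `PLACEHOLDER_SECIS`, its spacing, and the `max_distance` window are hypothetical placeholders chosen only to make the example runnable, and the test sequence is invented.

```python
import re

# Hypothetical placeholder signature. Real SECIS detection depends on
# secondary structure and free-energy criteria, not on a plain regex.
PLACEHOLDER_SECIS = re.compile(r"ATGA[ACGT]{8,14}AA[ACGT]{8,20}GA")


def in_frame_tga_positions(seq: str, start: int = 0) -> list:
    """Return 0-based offsets of TGA codons in the frame beginning at start."""
    return [i for i in range(start, len(seq) - 2, 3) if seq[i:i + 3] == "TGA"]


def candidate_sec_codons(seq: str, start: int = 0, max_distance: int = 200) -> list:
    """Pair each in-frame TGA with the first downstream placeholder match.

    Returns (tga_offset, motif_offset) tuples; TGAs without a downstream
    match inside max_distance nucleotides are dropped.
    """
    hits = []
    for pos in in_frame_tga_positions(seq, start):
        window_start = pos + 3
        window = seq[window_start:window_start + max_distance]
        match = PLACEHOLDER_SECIS.search(window)
        if match:
            hits.append((pos, window_start + match.start()))
    return hits


# Invented sequence: an in-frame TGA followed by one placeholder match.
toy_seq = ("ATGGCTTGAGGT" + "C" * 10 + "ATGA" + "C" * 10
           + "AA" + "C" * 10 + "GA" + "CCC")
print(candidate_sec_codons(toy_seq))
```

A filter of this kind only nominates candidates. In practice, each candidate would still need structural evaluation and experimental confirmation before a TGA could be annotated as a Sec codon.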

CONSEQUENCES OF CHANGES IN Sec tRNA[Ser]Sec EXPRESSION

Since Sec tRNA[Ser]Sec is absolutely required for the expression of a relatively small class of proteins, genetic manipulations of this molecule can be used to study selenoproteins and the role of selenium in essential biological processes. The consequences of both overexpressing (75, 76) and underexpressing (9, 14) Sec tRNA[Ser]Sec have been examined, as well as the consequences of expressing different mutant Sec tRNA[Ser]Sec forms (reference 76 and see below). Chinese hamster ovary cells were transfected with varying numbers, up to as many as 10 (75), of Sec tRNA[Ser]Sec gene copies and transgenic mice carrying as many as 20 wild-type Sec tRNA[Ser]Sec transgenes were generated, but there was no detectable effect on selenoprotein biosynthesis in either study. Most of the increase in the amount of the Sec tRNA[Ser]Sec population occurred in mcm5U levels, suggesting that the methylase that converts this isoform to mcm5Um is limiting for tRNA maturation. The fact that selenoprotein biosynthesis was not affected by enriching the Sec tRNA[Ser]Sec population demonstrates that the Sec tRNA[Ser]Sec isoforms are not limiting in protein synthesis.

The Sec tRNA[Ser]Sec population has also been reduced by approximately half in mice that were heterozygous for a targeting vector lacking the Sec tRNA[Ser]Sec gene (9) and in mouse embryonic stem cells that were heterozygous for a similar targeting vector (14). GPX1 levels were virtually the same in wild-type and heterozygous cultured cells (14) and in each of the tissues examined in wild-type and heterozygous mice (9), suggesting that the Sec tRNA[Ser]Sec population is not limiting. As discussed above, removal of both copies of the Sec tRNA[Ser]Sec gene from the mouse genome is embryonically lethal, demonstrating that selenoprotein expression is essential in mammalian development (9).

The consequences of overexpressing either a mutant Sec tRNA[Ser]Sec lacking the highly modified i6A at position 37 (76) or containing an A at position 34 in place of mcm5U or mcm5Um (M. Moustafa, B. Carlson, M. El-Saadani, M. Rao, and D. Hatfield, unpublished data) on selenoprotein synthesis were examined by introducing multiple copies of the corresponding mutant gene into the mouse genome. The levels of several selenoproteins were altered in mice expressing either mutant Sec tRNA[Ser]Sec in a protein- and tissue-specific manner (reference 76 and unpublished data). Since the mRNA levels of those selenoproteins that were most affected by expression of the i6A− Sec tRNA[Ser]Sec remained essentially the same, the defect in selenoprotein synthesis occurred at the translation step. Maturation of Sec tRNA[Ser]Sec was inhibited in both these transgenic strains, as evidenced by the reduction in the mcm5Um isoform. These studies mark the first examples of transgenic mice engineered to encode functional tRNA transgenes and provide a model system for studying the role of specific selenoproteins in health.

UGA, A LOGICAL CODON CHOICE FOR Sec IN EVOLUTION

It can be argued that UGA is by far the most fascinating codon within the genetic code as it likely has served more functions than any other code word in evolution. For example, an examination of current genetic language shows that UGA functions as a termination codon (79); a Sec codon (8, 41); a cysteine codon in Euplotes octocarinatus (71); a tryptophan codon in mitochondria (81), Mycoplasma, and Spiroplasma (81, 95); an inefficiently read tryptophan codon in Bacillus subtilis (61); and an inefficiently read codon in E. coli that is presumably decoded by tryptophan tRNA (96). In mammals, the UGA stop codon in rabbit β-globin mRNA has been shown to serve as many as eight functions (16), including a stop codon; a suppressor codon that supports partial readthrough for Arg, Cys, Trp, and Ser tRNAs (the latter tRNA is Sec tRNA[Ser]Sec, which is aminoacylated with serine [16]); and a translation reading gap codon with the abyss consisting of one, two, or three codons. The fact that other globin mRNAs terminate in UAA or UAG, but these codons do not appear to serve as suppressor codons or to promote translation reading gaps, suggests that these functions are associated solely with UGA.

Since other stop or infrequently read codons can code for Sec when the anticodon in Sec tRNA[Ser]Sec is complementary to the corresponding codon used in place of TGA (6, 43), it would seem that any of a number of codons could have evolved for Sec. However, the variety of functions of UGA suggests that this codon has been loosely programmed in evolution and therefore is the most likely code word to have evolved for the infrequently used amino acid Sec. This possibility would seem to be even more plausible if the inclusion of Sec in the genetic code occurred in evolution after the code had evolved rather than if the original code accommodated Sec.

It should be noted that there are two contrasting proposals about when living organisms acquired the ability to synthesize selenoproteins (8, 33). One suggests that Sec was encoded by UGA in primitive anaerobic organisms and was a component of the primordial genetic code (8). In this theory, the subsequent increase of oxygen in the atmosphere by photosynthetic organisms counterselected against the use of Sec, because of the sensitivity of this amino acid to oxidation. An alternative hypothesis posits that Sec evolved only in the later stages of the development of the genetic code and that the number of selenoproteins accumulated rather than decreased in evolution (33). In contrast to the idea that a declining use of Sec occurred in evolution, this latter proposal suggested that many eukaryotic selenoproteins, serving as antioxidant and redox proteins, were employed by aerobic organisms to function in antioxidant systems.

As discussed above, Sec is dramatically different from any other of the 20 protein amino acids in the mode of its incorporation and basic biosynthetic steps. It is the only amino acid that directly requires a structural element in mRNA in addition to the information specified by the genetic code. It is synthesized on its own tRNA, while free Sec is not a substrate for selenoprotein synthesis. The Sec biosynthetic machinery is strikingly different from that of other amino acids and employs additional Sec-specific components. These unique features of Sec biosynthesis and insertion favor the view that Sec was added to the already existing genetic code to take advantage of the unique chemistry of selenium to counteract environmental stress and/or evolve new functions (33).

SUMMARY

The only addition of a new amino acid to the genetic code since this code was deciphered in the mid-1960s was the inclusion of the selenium-containing amino acid, Sec, that is coded by UGA. UGA, therefore, functions as both a signal for termination and a codon for Sec. Tremendous progress has been made in recent years in understanding the mechanism of how Sec is synthesized and inserted into nascent selenopeptides in mammals. This includes discoveries of how specific 3′-UTR mRNA structures, designated SECIS elements, function in recruiting SBP2, the Sec-specific EF, and selenocysteyl-tRNA[Ser]Sec into a large Sec insertion complex, the selenosome. In this unique amino acid insertion system, Sec tRNA[Ser]Sec is the key molecule that is used both as the site for Sec biosynthesis and for its incorporation into protein. The gene carrying this tRNA has been used as a tool to study the expression of selenoproteins by the introduction of additional wild-type and mutant transgenes into the mouse genome and by removal of the gene from the mouse genome. SECIS elements have been used in computational screens to identify a number of new selenoprotein genes, whose characterization will shed light on many biological and health-related properties of selenium.

Acknowledgments

We express our sincere appreciation to Marla J. Berry and Alan M. Diamond for their helpful suggestions regarding this review and to Bradley A. Carlson and Gregory V. Kryukov for assistance in preparing figures.

This work was supported by NIH grants GM061603 and CA080946 (to V.N.G.).

REFERENCES