Codon pairs in the genome of Escherichia coli
Related papers
bioRxiv (Cold Spring Harbor Laboratory), 2019
Translation initiation is a critical step in the regulation of protein synthesis, and it is subject to different control mechanisms, such as 5' UTR secondary structure and initiation codon context, that can influence the rates at which initiation and consequently translation occur. For some genes, translation elongation also affects the rate of protein synthesis. With a GFP library containing nearly all possible combinations of nucleotides from the 3rd to the 5th codon positions in the protein coding region of the mRNA, it was previously demonstrated that some nucleotide combinations increased GFP expression by up to four orders of magnitude. While it is clear that the codon region from positions 3 to 5 can influence protein expression levels of artificial constructs, its impact on endogenous proteins is still unknown. Through bioinformatics analysis, we identified the nucleotide combinations of the GFP library in Escherichia coli genes and examined the correlation between the expected levels of translation according to the GFP data and the experimental measures of protein expression. We observed that E. coli genes were enriched with the nucleotide compositions that enhanced protein expression in the GFP library, but surprisingly, this enrichment seemed to affect the translation efficiency only marginally. Nevertheless, our data indicate that different enterobacteria present a nucleotide composition enrichment similar to that of E. coli, suggesting an evolutionary pressure towards the conservation of short translational enhancer sequences.
Codon usage in bacteria: correlation with gene expressivity
Nucleic Acids Research, 1982
The nucleic acid sequence bank now contains over 600 protein coding genes, of which 107 are from prokaryotic organisms. Codon frequencies in each new prokaryotic gene are given. Analysis of genetic code usage in the 83 sequenced genes of the Escherichia coli genome (chromosome, transposons and plasmids) is presented, taking into account new data on gene expressivity and regulation as well as iso-tRNA specificity and cellular concentration. The codon composition of each gene is summarized using two indexes: one is based on the differential usage of iso-tRNA species during gene translation, the other on the choice between cytosine and uracil for the third base. A strong relationship between codon composition and mRNA expressivity is confirmed, even for genes transcribed in the same operon. The influence of codon use on peptide elongation rate and protein yield is discussed. Finally, the evolutionary aspect of codon selection in mRNA sequences is studied.
Codon usage in regulatory genes in Escherichia coli does not reflect selection for ‘rare’ codons
Nucleic Acids Research, 1986
It has often been suggested that differential usage of codons recognized by rare tRNA species, i.e. "rare codons", represents an evolutionary strategy to modulate gene expression. In particular, regulatory genes are reported to have an extraordinarily high frequency of rare codons. From E. coli we have compiled codon usage data for highly expressed genes, moderately/lowly expressed genes, and regulatory genes. We have identified a clear and general trend in codon usage bias, from the very high bias seen in very highly expressed genes and attributed to selection, to a rather low bias in other genes, which seems to be more influenced by mutation than by selection. There is no clear tendency for an increased frequency of rare codons in the regulatory genes, compared to a large group of other moderately/lowly expressed genes with low codon bias. From this, as well as a consideration of evolutionary rates of regulatory genes, and of experimental data on translation rates, we conclude that the pattern of synonymous codon usage in regulatory genes reflects primarily the relaxation of natural selection.
Missing Codon Pairs in the Genome of Escherichia Coli
The Escherichia coli genome contains 4290 open reading frames (ORFs). Using our own computer program, the full set of ORFs was analyzed for the occurrence of all possible combinations of codons in doublets (codon pairs) except for the combinations stop:stop and stop:sense codons. In this way, 19 missing pairs were identified among the 1358854 codon pairs analyzed. Two of the missing pairs, CCU:AGG and ACU:AGA, represented combinations between sense codons, and 17 were combinations of sense and stop codons. The stop codons in the latter group of missing pairs were clearly biased. Except for one codon pair (ACU:UGA), where the stop codon was UGA, the stop codon in all the rest was UAG. Surprisingly, there was not a single missing codon pair containing the stop codon UAA. The sense codons found in most of the missing codon pairs belonged to the category of rare codons.
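The abstract does not describe the authors' program, so the following is only a minimal Python sketch of how such a survey of missing codon pairs could be set up. The function names (`codon_pairs`, `missing_pairs`) and the toy ORFs are hypothetical, and the sketch assumes that each ORF is supplied in frame, starting at its first codon and including its stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STOPS = {"UAA", "UAG", "UGA"}


def codon_pairs(orf):
    """Return the adjacent in-frame codon pairs of one ORF (hypothetical helper)."""
    orf = orf.upper().replace("T", "U")          # accept DNA or RNA input
    usable = len(orf) - len(orf) % 3             # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return zip(codons, codons[1:])


def missing_pairs(orfs):
    """Count codon pairs over all ORFs and list the pairs never observed.

    Candidate pairs follow the abstract: the first codon is always a sense
    codon, so stop:stop and stop:sense combinations are excluded.
    """
    counts = Counter()
    for orf in orfs:
        counts.update(codon_pairs(orf))
    candidates = [(a, b) for a in CODONS if a not in STOPS for b in CODONS]
    return counts, [pair for pair in candidates if counts[pair] == 0]


# Toy input, not real E. coli data.
counts, missing = missing_pairs(["ATGCCTAGGTAA", "ATGAAATAA"])
```

With 61 sense codons there are 61 × 64 = 3904 candidate pairs, so a real genome-scale run would report only the few pairs whose count stays at zero.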
Biochemical and Biophysical Research Communications, 2004
Expression patterns in Escherichia coli of two small archaeal proteins with a natural content of about 30% rare codons were analyzed. The proteins, a histone-like protein from Sulfolobus shibatae (Ssh10) and a glutaredoxin-like protein from Methanobacterium thermoautotrophicum (mtGrx), were produced with expression plasmids encoding wild-type genes, codon-optimized synthetic genes, and GST-fusion genes. These constructs were expressed in BL21(DE3), its LysS derivative, and modified strains carrying copies of rare codon tRNAs or deletions in the RNase E gene. Both Ssh10 and mtGrx expression levels were constitutively high in BL21(DE3) and its derivatives, with the exception of the LysS phenotype, which prevented high-level expression of the Ssh10 wild-type gene. Surprisingly, a codon-optimized mtGrx gene construct displayed undetectable levels of protein production. The translational block observed with the synthetic mtGrx gene could be circumvented by using a synthetic mtGrx-glutathione S-transferase (GST) fusion construct or by in vitro translation. Taken together, the results underscore the importance of mRNA levels and RNA stability, but not necessarily tRNA abundance, for efficient heterologous protein production in E. coli.
Journal of bacteriology, 1996
A number of silent codon changes were made in two Escherichia coli genes. For the ompA gene, the replacement of seven consecutive frequently used codons with synonymous infrequently used codons reduced the ompA mRNA level and its half-life. For the bla gene, the exchange of 24 codons for the most frequently used synonymous codons extended the bla mRNA half-life. A modification of ribosome traffic could account for these observations.
BMC genomics, 2018
Small RNAs (sRNAs) are key regulators of gene expression in bacteria. In addition to modulating translation initiation, sRNAs can interact with mRNA coding regions to regulate mRNA stability and translation efficiency, enhancing or impeding progression of the ribosome along the mRNA. Since most amino acids are decoded by more than one (synonymous) codon, we asked whether there is a codon bias in the interaction of sRNAs with coding regions of mRNAs. Therefore, we explored whether there are differences in codon usage or tRNA availability according to whether an mRNA is regulated by sRNAs or not. We also explored these parameters in the coding interaction regions in mRNAs. We focused our analysis on sRNAs that regulate multiple mRNAs. We found differences in codon adaptation index and tRNA adaptation index between sRNA-regulated and non-sRNA-regulated mRNAs. Interestingly, the sRNA-mRNA interacting regions tended to be enriched in unpreferred codons decoded by scarce tRNAs. We al...
Error minimization explains the codon usage of highly expressed genes in Escherichia coli
Gene, 2007
Different organisms use synonymous codons with different preferences. Several measures have been introduced to compute the extent of codon usage bias within a gene or genome, among which the codon adaptation index (CAI) has been shown to be well correlated with mRNA levels of Escherichia coli. In this work an error adaptation index (eAI) is introduced, which estimates the level at which a gene can tolerate the effects of mistranslations. It is shown that the eAI has a strong correlation with CAI, as well as with mRNA levels, which suggests that the codons of highly expressed genes are selected so that mistranslation would have the minimum possible effect on the structure and function of the related proteins.
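The abstract does not define the eAI, so no attempt is made to reproduce it here. As background for the CAI it is compared against, the sketch below follows the commonly used Sharp and Li formulation: each codon receives a relative adaptiveness weight (its count in a reference set of highly expressed genes divided by the count of the most used synonymous codon), and the CAI of a gene is the geometric mean of those weights. The function names, the 0.5 pseudocount for unobserved codons, and the toy sequences are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Split an in-frame sequence into codons (DNA alphabet)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_genes, pseudocount=0.5):
    """Weight w = count(codon) / max count among its synonymous codons."""
    counts = Counter(c for gene in reference_genes for c in split_codons(gene))
    weights = {}
    for aa in set(AMINO) - {"*"}:                # stop codons are ignored
        family = [c for c, a in CODE.items() if a == aa]
        if len(family) == 1:                     # Met and Trp carry no information
            continue
        freq = {c: counts[c] if counts[c] else pseudocount for c in family}
        top = max(freq.values())
        for c in family:
            weights[c] = freq[c] / top
    return weights


def cai(gene, weights):
    """Geometric mean of the weights of the informative codons in a gene."""
    logs = [math.log(weights[c]) for c in split_codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set, not real expression data: AAA is used 3 times, AAG once.
w = relative_adaptiveness(["AAAAAAAAAAAG"])
score = cai("AAAAAG", w)                         # geometric mean of 1 and 1/3
```

In practice the reference set would be a curated list of highly expressed genes, and the choice of that set strongly affects the resulting weights.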
Biochemical and biophysical research communications, 2018
It is widely believed that if a high number of genes are found for any tRNA in a rapidly replicating bacterium, then the cytoplasmic levels of that tRNA will be high and an open reading frame containing a higher frequency of the complementary codon will be translated faster. This idea is based on correlations between the number of tRNA genes, tRNA concentration and the frequency of codon usage observed in a limited number of strains, as well as on the fact that artificially changing the number of tRNA genes alters translation efficiency and consequently the amount of properly folded protein synthesized. tRNA gene number may vary greatly in a genome due to duplications, deletions and lateral transfer, which in turn would alter the levels and functionality of many proteins. Such changes are potentially deleterious for fitness, and as a result it is expected that changes in tRNA gene numbers should be accompanied by a modification of the frequency of codon usage. In contrast to this mode...
RNA Biology, 2019
Translation initiation is a critical step in the regulation of protein synthesis, and it is subject to different control mechanisms, such as 5' UTR secondary structure and initiation codon context, that can influence the rates at which initiation and consequently translation occur. For some genes, translation elongation also affects the rate of protein synthesis. With a GFP library containing nearly all possible combinations of nucleotides from the 3rd to the 5th codon positions in the protein coding region of the mRNA, it was previously demonstrated that some nucleotide combinations increased GFP expression by up to four orders of magnitude. While it is clear that the codon region from positions 3 to 5 can influence protein expression levels of artificial constructs, its impact on endogenous proteins is still unknown. Through bioinformatics analysis, we identified the nucleotide combinations of the GFP library in Escherichia coli genes and examined the correlation between the expected levels of translation according to the GFP data and the experimental measures of protein expression. We observed that E. coli genes were enriched with the nucleotide compositions that enhanced protein expression in the GFP library, but surprisingly, this enrichment seemed to affect the translation efficiency only marginally. Nevertheless, our data indicate that different enterobacteria present a nucleotide composition enrichment similar to that of E. coli, suggesting an evolutionary pressure towards the conservation of short translational enhancer sequences.