Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence

Abstract

We report the results of a comprehensive study of the influence of gene expression on synonymous codons, amino acid composition, and intron presence and size in human protein-coding genes. First, in addition to a strong effect of isochores, we have detected the influence of transcription-associated mutational biases (TAMB) on gene composition. Genes expressed in different tissues show diverse degrees of TAMB, with genes expressed in testis showing the greatest influence. Second, the study of tissues with no evidence of TAMB reveals a consistent set of optimal synonymous codons favored in highly expressed genes. This result exposes the consequences of natural selection on synonymous composition to increase the efficiency of translation in the human lineage. Third, the overall amino acid composition of proteins closely resembles tRNA abundance, but amino acid composition does not differ among differentially expressed genes. Fourth, there is a negative relationship between expression and CDS length. Significantly, this is observed only among genes with introns, suggesting that the cause of this relationship in humans cannot be associated only with the costs of amino acid biosynthesis. Fifth, we show that broadly and highly expressed genes have more, although shorter, introns. The selective advantage of having more introns in highly expressed genes is likely counterbalanced by the containment of transcriptional costs and a minimum exon size for proper splicing.
