Diversification of transcriptional modulation: Large-scale identification and characterization of putative alternative promoters of human genes

  1. Kouichi Kimura1,
  2. Ai Wakamatsu2,3,
  3. Yutaka Suzuki4,14,
  4. Toshio Ota2,11,
  5. Tetsuo Nishikawa1,2,3,
  6. Riu Yamashita5,
  7. Jun-ichi Yamamoto2,3,
  8. Mitsuo Sekine6,
  9. Katsuki Tsuritani5,
  10. Hiroyuki Wakaguri4,
  11. Shizuko Ishii2,3,
  12. Tomoyasu Sugiyama2,12,
  13. Kaoru Saito2,
  14. Yuko Isono2,3,
  15. Ryotaro Irie2,
  16. Norihiro Kushida6,
  17. Takahiro Yoneyama6,
  18. Rie Otsuka6,
  19. Katsuhiro Kanda7,
  20. Takahide Yokoi7,
  21. Hiroshi Kondo7,
  22. Masako Wagatsuma7,
  23. Katsuji Murakawa8,
  24. Shinichi Ishida8,
  25. Tadashi Ishibashi8,
  26. Asako Takahashi-Fujii9,13,
  27. Tomoo Tanase9,13,
  28. Keiichi Nagai1,2,10,
  29. Hisashi Kikuchi6,
  30. Kenta Nakai5,
  31. Takao Isogai2,3, and
  32. Sumio Sugano4

  1 Life Science Research Laboratory, Central Research Laboratory, Hitachi, Ltd., Kokubunji, Tokyo, 185-8601, Japan
  2 Helix Research Institute, Kisarazu, Chiba, 292-0812, Japan
  3 Reverse Proteomics Research Institute, Kisarazu, Chiba, 292-0818, Japan
  4 Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Minato-ku, Tokyo, 108-8639, Japan
  5 Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, 108-8639, Japan
  6 Genome Analysis Center, Department of Biotechnology, National Institute of Technology and Evaluation, Shibuya-ku, Tokyo, 151-0066, Japan
  7 Life Science Group, Hitachi, Ltd., Kawagoe, Saitama, 350-1165, Japan
  8 Hitachi Science Systems, Ltd., Kokubunji, Tokyo, 185-8601, Japan
  9 Takara Shuzo Co., Ltd., Noji-cho, Kusatsu, Shiga, 525-0055, Japan
  10 Advanced Research Laboratory, Hitachi, Ltd., Kokubunji, Tokyo, 185-8601, Japan

Abstract

By analyzing 1,780,295 5′-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were grouped into 30,964 clusters that were separated from each other by more than 500 bp and thus are very likely to constitute mutually distinct alternative promoters. To our surprise, at least 7,674 (52%) human RefSeq genes were subject to regulation by putative alternative promoters (PAPs). On average, there were 3.1 PAPs per gene, with a composition of one CpG-island-containing promoter per 2.6 CpG-less promoters. In 17% of the PAP-containing loci, tissue-specific use of the PAPs was observed. The richest tissue sources of the tissue-specific PAPs were testis and brain. It was also intriguing that the PAP-containing promoters were enriched in the genes encoding signal transduction-related proteins and were rarer in the genes encoding extracellular proteins, possibly reflecting the varied functional requirements of the former and the restricted expression of the latter. The patterns of the first exons were highly diverse as well. On average, there were 7.7 different splicing types of first exons per locus, produced in part by the PAPs, suggesting that a wide variety of transcripts can be achieved by this mechanism. Our findings suggest that the use of alternative promoters and the consequent alternative use of first exons should play a pivotal role in generating the complexity required for the highly elaborated molecular systems in humans.
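
The clustering criterion stated in the abstract (TSS clusters separated from each other by more than 500 bp) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `cluster_tss`, the simple gap-based grouping, and the example coordinates are all assumptions made for illustration, and the sketch treats positions as lying on one strand of one locus.

```python
from typing import List


def cluster_tss(positions: List[int], min_gap: int = 500) -> List[List[int]]:
    """Group TSS positions (one strand, one locus) into clusters.

    Hypothetical helper: a new cluster starts whenever the distance to the
    previous TSS position exceeds min_gap (500 bp, as in the abstract).
    """
    clusters: List[List[int]] = []
    for pos in sorted(set(positions)):
        if clusters and pos - clusters[-1][-1] <= min_gap:
            # Within min_gap of the previous TSS: same putative promoter.
            clusters[-1].append(pos)
        else:
            # Gap larger than min_gap: start a new putative promoter cluster.
            clusters.append([pos])
    return clusters


# Made-up coordinates, purely for illustration.
example_positions = [100, 130, 480, 1200, 1250, 5000]
for cluster in cluster_tss(example_positions):
    print(cluster[0], cluster[-1], len(cluster))
```

Under this assumed rule, a locus yielding two or more clusters would count as having putative alternative promoters. The actual study may have applied additional filters that are not described in the abstract.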

Footnotes