The evolution and genomic landscape of CGB1 and CGB2 genes

Pille Hallast et al. Mol Cell Endocrinol. 2007.

Abstract

The origin of completely novel proteins is a significant question in evolution. The luteinizing hormone (LHB)/chorionic gonadotropin (CGB) gene cluster in humans contains a candidate example of this process. Two genes in this cluster (CGB1 and CGB2) exhibit nucleotide sequence similarity with the other LHB/CGB genes, but as a result of frameshifting are predicted to encode a completely novel protein. Our analysis of these genes from humans and related primates indicates a recent origin in the lineage specific to humans and African great apes. While the function of these genes is not yet known, they are strongly conserved between human and chimpanzee and exhibit three-fold lower diversity than LHB across human populations with no mutations that would disrupt the coding sequence. The 5'-upstream region of CGB1/2 contains most of the promoter sequence of hCGbeta plus a novel region proximal to the putative transcription start site. In silico prediction of putative transcription factor binding sites supports the hypothesis that CGB1 and CGB2 gene products are expressed in, and may contribute to, implantation and placental development.
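The effect of frameshifting described in the abstract can be illustrated with a minimal Python sketch. It uses the standard genetic code and a made-up toy sequence (`toy`, not taken from CGB1, CGB2 or any real gene), and it models the shift as a single-base deletion purely for illustration; the helper name `translate` is hypothetical. It shows only the general principle that a 1 bp shift in the reading frame changes every downstream codon.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered by T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequence, NOT a real CGB sequence.
toy = "ATGGCCAAGTTCGGATAA"

# Assumption for illustration: model a -1 bp shift by deleting one base
# right after the start codon, so all downstream codons are read differently.
shifted = toy[:3] + toy[4:]

original_protein = translate(toy)
shifted_protein = translate(shifted)
print(original_protein, shifted_protein)
```

In this toy case the two translations share only the initial methionine, which mirrors, in miniature, how a shifted open reading frame can yield a protein with no amino acid similarity to the one encoded by the original frame.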

Figures

Fig. 1

Genomic context of CGB1 and CGB2.

(A) Schematic presentation of the structure of the LHB/CGB gene cluster (covering 39.76 kb from LHB to CGB7), drawn approximately to scale. Individual LHB/CGB genes (white boxes) cover 1.11–1.466 kb. Arrows indicate the direction of transcription, from either the sense or the antisense strand. The experimentally identified hCGβ promoter sequence (Otani et al., 1988; white ovals) is also present, although more distally, upstream of the LHB, CGB1 and CGB2 genes; a detailed alignment of the promoter area is shown in C. The CGB1- and CGB2-specific insert is divided into a transcribed segment coding for the 5′-UTR, exon 1 and part of intron 1 of CGB1/CGB2 (black boxes; 255 bp) and an immediate 5′-upstream segment, which could serve as an additional promoter component (black ovals; CGB1 481 bp, CGB2 469 bp). The alignment of the non-coding 5′-upstream part of the insert is shown in D. Intergenic Neurotrophin 6 pseudogenes (psNTF6; striped boxes; <1.15 kb) originated through duplication of Neurotrophin 5 (NTF5) exon 3 (Hallast et al., 2005).

(B) The structure of CGB1 and CGB2 differs from a consensus hCGβ gene in the following aspects: (1) the hCGβ 5′-UTR has been replaced by a CGB1/2-specific insert that codes for the CGB1/2 5′-UTR, exon 1 (diagonally striped box) and part of intron 1 (black box), and that also provides a 481/469 bp upstream fragment, which could function as an additional promoter segment (black oval); (2) hCGβ exon 1 (horizontally striped box) is part of CGB1/2 intron 1; (3) the open reading frame (ORF) of exons 2 and 3 of CGB1/2 (grey boxes) is shifted by −1 bp compared to the hCGβ-coding genes; (4) the shifted ORF has led to an earlier STOP codon and a shorter exon 3. The alternative exon 1 and the shifted ORF of exons 2 and 3 code for a putative CGB1/2 protein with no amino acid similarity to the hCGβ subunit.

(C) Alignment of the proximal promoter of the hCGβ subunit coding genes (CGB, CGB5, CGB8, CGB7) with the homologous upstream segment of CGB1 and CGB2. The cAMP response element has been mapped from −311 to −202 (Albanese et al., 1991; black brackets) and the trophoblast-specific element (TSE) from −305 to −279 (Steger et al., 1993; dotted brackets). Other experimentally proven regulatory elements of the hCGβ promoter include activating protein 2 (AP2) and selective promoter factor 1 (Sp1) (Johnson and Jameson, 1999) as well as Ets-2 binding sites (Ghosh et al., 2003). *The CCAAT box was identified by the MatInspector and Alibaba TFBS prediction software.

(D) Prediction of transcription factor binding sites (TFBSs) in the 5′-upstream segment unique to CGB1 and CGB2 that was created by the insertion (see B). TFBSs predicted by both the MatInspector and Alibaba methods are marked with solid arrows above the aligned sequences of CGB1 and CGB2; TFBSs recognized by MatInspector alone are marked with broken arrows. TFBSs predicted solely from the CGB1 sequence are indicated with (*) and those predicted solely from the CGB2 sequence with (**). ATF: activating transcription factor; AP2: activating protein 2; Cdx2: caudal-related transcription factor; CREB: cAMP responsive element binding protein; ERE: estrogen response element; HIF: hypoxia-inducible factor 1; NFkappaB: nuclear factor κB; GATA2: GATA-binding protein 2; SF1: steroidogenic factor 1; Sp1: selective promoter factor 1. The transcription start site is indicated based on NCBI GenBank locus no. NG_000019.

Fig. 2

SNP patterns and fixed differences between humans and great apes in the CGB1 (A), CGB2 (B) and LHB (C) genes. Human polymorphic positions (vertical black bars) are marked with long bars for common SNPs (minor allele frequency >10%) and short bars for rare SNPs (<10%). For the comparison of humans and great apes, fixed differences (black arrows), non-synonymous changes (black arrows with an asterisk), SNPs found in apes (black triangles) and protein-altering insertions/deletions (exclamation marks) are shown.

References

    1. Albanese C., Kay T.W., Troccoli N.M., Jameson J.L. Novel cyclic adenosine 3′,5′-monophosphate response element in the human chorionic gonadotropin beta-subunit gene. Mol. Endocrinol. 1991;5:693–702.
    2. Barbujani G., Goldstein D.B. Africans and Asians abroad: genetic diversity in Europe. Annu. Rev. Genomics Hum. Genet. 2004;5:119–150.
    3. Barrett J.C., Fry B., Maller J., Daly M.J. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21:263–265.
    4. Bo M., Boime I. Identification of the transcriptionally active genes of the chorionic gonadotropin beta gene cluster in vivo. J. Biol. Chem. 1992;267:3179–3184.
    5. Berger P., Kranewitter W., Madersbacher S., Gerth R., Geley S., Dirnhofer S. Eutopic production of human chorionic gonadotropin beta (hCG beta) and luteinizing hormone beta (hLH beta) in the human testis. FEBS Lett. 1994;343:229–233.
