Comparative genomic sequence analysis of the human and mouse cystic fibrosis transmembrane conductance regulator genes

Comparative Study

Proc Natl Acad Sci U S A. 2000 Feb 1;97(3):1172-7.

doi: 10.1073/pnas.97.3.1172.

D C Jamison, J W Touchman, S L Chissoe, V V Braden Maduro, G G Bouffard, N L Dietrich, S M Beckstrom-Sternberg, L M Iyer, L A Weintraub, M Cotton, L Courtney, J Edwards, R Maupin, P Ozersky, T Rohlfing, P Wohldmann, T Miner, K Kemp, J Kramer, I Korf, K Pepin, L Antonacci-Fulton, R S Fulton, P Minx, L W Hillier, R K Wilson, R H Waterston, W Miller, E D Green

R E Ellsworth et al. Proc Natl Acad Sci U S A. 2000.

Abstract

The identification of the cystic fibrosis transmembrane conductance regulator gene (CFTR) in 1989 represents a landmark accomplishment in human genetics. Since that time, there have been numerous advances in elucidating the function of the encoded protein and the physiological basis of cystic fibrosis. However, numerous areas of cystic fibrosis biology require additional investigation, some of which would be facilitated by information about the long-range sequence context of the CFTR gene. For example, the latter might provide clues about the sequence elements responsible for the temporal and spatial regulation of CFTR expression. We thus sought to establish the sequence of the chromosomal segments encompassing the human CFTR and mouse Cftr genes, with the hope of identifying conserved regions of biologic interest by sequence comparison. Bacterial clone-based physical maps of the relevant human and mouse genomic regions were constructed, and minimally overlapping sets of clones were selected and sequenced, eventually yielding approximately 1.6 Mb and approximately 358 kb of contiguous human and mouse sequence, respectively. These efforts have produced the complete sequence of the approximately 189-kb and approximately 152-kb segments containing the human CFTR and mouse Cftr genes, respectively, as well as significant amounts of flanking DNA. Analyses of the resulting data provide insights about the organization of the CFTR/Cftr genes and potential sequence elements regulating their expression. Furthermore, the generated sequence reveals the precise architecture of genes residing near CFTR/Cftr, including one known gene (WNT2/Wnt2) and two previously unknown genes that immediately flank CFTR/Cftr.

Figures

Figure 1

Sequence maps of the genomic segments encompassing the human CFTR and mouse Cftr genes. High-resolution BAC/PAC-based physical maps of the human and mouse CFTR/Cftr regions were assembled, with the complete contig maps available at http://genome.nhgri.nih.gov/chr7/cftr. From each map, minimal overlapping sets of ordered clones were selected and completely sequenced. (A) The sequence map of the CFTR region on human chromosome 7q31.3 (7) consists of the indicated 16 ordered clones that together span ≈1.6 Mb. Numerous STSs have been mapped to this region (50, 58) (also see http://genome.nhgri.nih.gov/chr7), with a small subset indicated relative to their content in particular BACs/PACs. The CFTR gene resides in BACs RG068P20 and RG133K23. Note that the human clones are depicted as barely overlapping, reflecting the actual sequence records in GenBank. Before submission, the sequence generated from each human clone was trimmed to yield the nonredundant sequence from that clone flanked by very small amounts of sequence in common with adjacent clones. Thus, the actual overlaps between adjacent human clones are typically much larger than that reflected by the sequence in their GenBank records. (B) The sequence map of the Cftr region on mouse chromosome 6 consists of the indicated three ordered clones that together span ≈358 kb. Representative STSs used to assemble the mouse contig map are depicted relative to their content in particular BACs. Note that the mouse clones are depicted based on their size and degree of overlap with one another; a single GenBank record (accession no. AF162137) contains one contiguous sequence assembled from all three clones. Information about the indicated human and mouse STSs is available in GenBank.
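The selection of a minimal overlapping set of ordered clones can be illustrated with a small sketch. The source does not describe how the clones were actually chosen, so the following is only an assumption: a standard greedy interval cover over clones placed on a physical map. The clone names, coordinates, and the function name `minimal_tiling_path` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (name, start, end) in map coordinates; this layout is a hypothetical
# simplification of a clone-based physical map.
Clone = Tuple[str, int, int]


def minimal_tiling_path(clones: List[Clone], region_start: int, region_end: int) -> List[str]:
    """Greedy interval cover: among the clones that start at or before the
    position covered so far, take the one that reaches furthest right."""
    ordered = sorted(clones, key=lambda c: c[1])
    path: List[str] = []
    covered = region_start
    i = 0
    while covered < region_end:
        best = None
        # Scan every clone that overlaps or abuts the covered stretch.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"no clone extends coverage past position {covered}")
        path.append(best[0])
        covered = best[2]
    return path


# Made-up clones on a 400-unit region.
example = [("A", 0, 100), ("B", 50, 180), ("C", 90, 250), ("E", 200, 300), ("D", 240, 400)]
print(minimal_tiling_path(example, 0, 400))
```

In this toy layout, clones B and E are redundant because C and D reach further from the same covered positions. A real map would also have to account for clone quality and a required minimum overlap, which this sketch ignores.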

Figure 2

Long-range organization of the greater human and mouse CFTR/Cftr regions. (A) Schematic overview of the location of genes identified in both the human and mouse genomic sequence: WNT2/Wnt2, HSA_C7orf7/MMU_Orf3, CFTR/Cftr, and HSA_C7orf8/MMU_Orf4. Arrows indicate the direction of transcription. The available human sequence spans all four genes as well as extensive amounts of flanking DNA, whereas the generated mouse sequence ends in the middle of the Wnt2 gene on the centromeric side and in the middle of MMU_Orf4 on the telomeric side. Higher-resolution comparative depictions of the intron/exon organization of human and mouse CFTR/Cftr (B), WNT2/Wnt2 (C), HSA_C7orf7/MMU_Orf3 (D), and HSA_C7orf8/MMU_Orf4 (E) are also provided (in each case, with the human gene drawn above the mouse gene). The exon sizes are identical between species, whereas intron sizes vary. Note that the CFTR/Cftr exons are numbered as originally designated (2), even though the gene is now known to contain 27 exons (with exons 6A, 6B, 14A, 14B, 17A, and 17B).

Figure 3

Percent identity plots (PIPs) for human and mouse genomic sequences. Percent identity plots (60) (see http://globin.cse.psu.edu/pipmaker) for four 14-kb segments are provided: (A) a region containing exons 1–3 of WNT2/Wnt2 (nucleotides 33,000–47,000 of the human sequence in GenBank accession no. AC002465); (B) a region residing between the WNT2/Wnt2 gene and HSA_C7orf7/MMU_Orf3 with no known functional elements (nucleotides 81,000–95,000 in GenBank accession no. AC002465); (C) a region immediately upstream of CFTR/Cftr exon 1 (nucleotides 5,425–19,425 in GenBank accession no. AC000111); and (D) a region containing the proximal portion of CFTR/Cftr intron 1 (nucleotides 19,425–33,425 in GenBank accession no. AC000111). In C and D, the vertical stripes are used to highlight the gap-free regions in an ≈28-kb interval encompassing CFTR/Cftr exon 1 that have a higher percent identity than other gap-free regions in that interval of the same or larger length. Features in the PIP: tall black rectangle, exon; white pointed box, L1-type repeat; dark gray pointed box, LTR repeat; black triangle, MIR-type repeat; light gray triangles, other SINE-type repeat; dark gray triangles, all other interspersed repeats; short white rectangle, CpG island where 0.6 ≤ CpG/GpC < 0.75; short dark gray rectangle, CpG island where CpG/GpC ≥ 0.75.
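The two CpG island symbols in the legend differ only in the CpG/GpC threshold, which a short sketch can make concrete. This assumes the ratio is the count of CG dinucleotides divided by the count of GC dinucleotides within a candidate island; the caption does not state how islands are detected in the first place, so that step is left out. The function names `cpg_gpc_ratio` and `pip_island_symbol` are hypothetical, and returning `None` below 0.6 is also an assumption, since the legend lists no symbol for that case.

```python
from typing import Optional


def cpg_gpc_ratio(seq: str) -> float:
    """Ratio of CG to GC dinucleotide counts (overlapping windows)."""
    s = seq.upper()
    cpg = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")
    gpc = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "GC")
    if gpc == 0:
        # Assumption: treat an undefined ratio as infinite when any CG
        # exists, and as zero otherwise.
        return float("inf") if cpg else 0.0
    return cpg / gpc


def pip_island_symbol(ratio: float) -> Optional[str]:
    """Map a CpG/GpC ratio to the rectangle shade named in the legend."""
    if ratio >= 0.75:
        return "short dark gray rectangle"
    if ratio >= 0.6:
        return "short white rectangle"
    return None  # assumption: no CpG island symbol below 0.6


# Made-up sequence with 2 CG and 3 GC dinucleotides, so the ratio is 2/3.
toy = "CGACGTGCAGCTGC"
print(cpg_gpc_ratio(toy), pip_island_symbol(cpg_gpc_ratio(toy)))
```

The toy sequence is far too short to be a real CpG island; it only exercises the threshold logic.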

References

    1. Welsh M J, Tsui L-C, Boat T F, Beaudet A L. In: The Metabolic and Molecular Bases of Inherited Disease. Scriver C R, Beaudet A L, Sly W S, Valle D, editors. New York: McGraw–Hill; 1995. pp. 3799–3876.
    2. Rommens J M, Iannuzzi M C, Kerem B, Drumm M L, Melmer G, Dean M, Rozmahel R, Cole J L, Kennedy D, Hidaka N, et al. Science. 1989;245:1059–1065.
    3. Riordan J R, Rommens J M, Kerem B, Alon N, Rozmahel R, Grzelczak Z, Zielenski J, Lok S, Plavsic N, Chou J-L, et al. Science. 1989;245:1066–1073.
    4. Kerem B, Rommens J M, Buchanan J A, Markiewicz D, Cox T K, Chakravarti A, Buchwald M, Tsui L-C. Science. 1989;245:1073–1080.
    5. Collins F S. Nat Genet. 1992;1:3–6.