Recent segmental duplications in the working draft assembly of the brown Norway rat
Eray Tuzun et al. Genome Res. 2004 Apr.
Abstract
We assessed the content, structure, and distribution of segmental duplications (≥90% sequence identity, ≥5 kb length) within the published version of the Rattus norvegicus genome assembly (v.3.1). The overall fraction of duplicated sequence within the rat assembly (2.92%) is greater than that of the mouse (1%-1.2%) but significantly less than that of human (approximately 5%). Duplications were nonuniformly distributed, occurring predominantly as tandem and tightly clustered intrachromosomal duplications. Regions containing extensive interchromosomal duplications were observed, particularly within subtelomeric and pericentromeric regions. We identified 41 discrete genomic regions greater than 1 Mb in size, termed "duplication blocks." These appear to have been the target of extensive duplication over millions of years of evolution. Gene content within duplicated regions (approximately 1%) was lower than expected based on the genome representation. Interestingly, sequence contigs lacking chromosome assignment ("the unplaced chromosome") showed a marked enrichment for segmental duplication (45% of 75.2 Mb), indicating that segmental duplications have been problematic for sequence and assembly of the rat genome. Further targeted efforts are required to resolve the organization and complexity of these regions.
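The thresholds in the abstract (≥90% identity, ≥5 kb) and the notion of a "duplicated fraction" can be illustrated with a minimal sketch. This is not the authors' detection pipeline: the `Alignment` record, the function names, and the simplification that each record describes one copy of a pairwise alignment are all assumptions made for illustration. Overlapping intervals are merged so that each duplicated base is counted once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """One copy of a pairwise alignment (hypothetical record layout)."""
    chrom: str       # chromosome carrying this copy
    start: int       # 0-based start, in bp
    end: int         # end (exclusive), in bp
    identity: float  # percent sequence identity of the alignment


def is_segmental_duplication(aln, min_identity=90.0, min_length=5_000):
    """Apply the abstract's thresholds: >=90% identity and >=5 kb length."""
    return aln.identity >= min_identity and (aln.end - aln.start) >= min_length


def duplicated_bases(alignments):
    """Count non-redundant duplicated bases, merging overlaps per chromosome."""
    by_chrom = {}
    for aln in alignments:
        if is_segmental_duplication(aln):
            by_chrom.setdefault(aln.chrom, []).append((aln.start, aln.end))

    total = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:          # overlaps or touches the current run
                cur_end = max(cur_end, end)
            else:                         # gap: close the current run
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


def duplicated_fraction(alignments, genome_size):
    """Fraction of the genome covered by qualifying duplications."""
    return duplicated_bases(alignments) / genome_size
```

Varying `min_identity` and `min_length` in this sketch corresponds to the different identity and length thresholds plotted in Figure 1.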
Figures
Figure 1
Duplicated fraction in the rat genome. The figure depicts the proportion of the genome that shows duplication (A) when all genomic sequence was compared, and (B) for the rat genome excluding random, unassigned sequence contigs. Various lengths and % identity thresholds are shown. A very small portion of the rat genome shows segmental duplications with ≥99.5% sequence identity. This suggests that the majority of segmental duplications are bona fide and are not the result of missed allelic overlaps during genome assembly.
Figure 2
Sequence properties of rat segmental duplications. Distributions of the (A) length and (B) percent nucleotide sequence identity for segmental duplications are shown as a function of the number of aligned bp. Interchromosomal duplications (red); intrachromosomal duplications (blue).
Figure 3
Distribution of segmental duplications (≥90% and ≥10 kb) in the rat genome. The patterns of (A) interchromosomal duplications (red) and (B) intrachromosomal duplications (blue) are depicted for all duplications ≥90% sequence identity and ≥10 kb in length. For clarity, interchromosomal distribution patterns with the random, unassigned sequence contigs (chrUn) are not shown for (A). For more detail, including % identity and pairwise relationships of all duplications and alignments, see http://ratparalogy.cwru.edu.
Figure 4
(A) Segmental duplication content per chromosome. The relative proportion of intrachromosomal and interchromosomal duplications for each chromosome is shown. The above calculations treat the unmapped sequence as a separate chromosome when classifying duplications as inter- or intrachromosomal. Forty-five percent of the unplaced chromosome consists of duplicated sequence. (B) Duplication blocks. Rat segmental duplications clustered into larger regions ranging from 100 to 3000 kb in length. We termed these structures “duplication blocks.” Examples of duplication blocks on chromosomes 1 and 7 are presented (arrows) with the underlying degree of sequence identity for each pairwise alignment depicted below the graph. Chromosome 1, green; chromosome 7, red. A subtelomeric (t) and pericentromeric (p) block are indicated. These regions of the rat genome are typified by low gene density (RefSeq/EST/mRNA), a high frequency of gaps within the assembly, and an excess of pairwise alignments.
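The classification rule described for panel (A), in which unmapped sequence is treated as a separate chromosome, might be sketched as follows. The `chrUn` prefix convention, the function names, and the `(chrom_a, chrom_b, aligned_bp)` tuple layout are assumptions for illustration, not the authors' code. The tally counts aligned bp per pairwise alignment and does not remove overlap between alignments.

```python
from collections import defaultdict

UNPLACED = "chrUn"  # assumed label for the pooled unassigned contigs


def normalize(chrom):
    """Pool every unassigned contig under a single pseudo-chromosome."""
    return UNPLACED if chrom.startswith(UNPLACED) else chrom


def classify_pair(chrom_a, chrom_b):
    """Return 'intra' if both copies lie on the same chromosome, else 'inter'."""
    return "intra" if normalize(chrom_a) == normalize(chrom_b) else "inter"


def aligned_bp_per_chromosome(pairs):
    """Tally aligned bp by class for every chromosome involved in a pair.

    pairs: iterable of (chrom_a, chrom_b, aligned_bp) tuples.
    """
    totals = defaultdict(lambda: {"intra": 0, "inter": 0})
    for chrom_a, chrom_b, aligned_bp in pairs:
        kind = classify_pair(chrom_a, chrom_b)
        # A set credits an intrachromosomal pair to its chromosome only once.
        for chrom in {normalize(chrom_a), normalize(chrom_b)}:
            totals[chrom][kind] += aligned_bp
    return {chrom: dict(counts) for chrom, counts in totals.items()}
```

Under this rule, two unassigned contigs aligned to each other count as intrachromosomal within the unplaced chromosome, which is one reason that pseudo-chromosome can appear so heavily duplicated.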
References
- Bailey, J.A., Gu, Z., Clark, R.A., Reinert, K., Samonte, R.V., Schwartz, S., Adams, M.D., Myers, E.W., Li, P.W., and Eichler, E.E. 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007.
WEB SITE REFERENCES
- http://ratparalogy.cwru.edu; Segmental Duplication Database for Rat at CWRU.
- http://genome.ucsc.edu; Genome browser at Univ. California–Santa Cruz.
- http://www.hgsc.bcm.tmc.edu/; Human Genome Sequencing Center at Baylor College of Medicine.
- http://rgd.mcw.edu/; Rat Genome Database at Medical College of Wisconsin.
Grants and funding
- R01 GM058815/GM/NIGMS NIH HHS/United States
- ER62862/PHS HHS/United States
- GM58815/GM/NIGMS NIH HHS/United States
- HG002318/HG/NHGRI NIH HHS/United States
- R01 HG002318/HG/NHGRI NIH HHS/United States
- CA094816/CA/NCI NIH HHS/United States
- T32 GM007250/GM/NIGMS NIH HHS/United States