CAGGG Repeats and the Pericentromeric Duplication of the Hominoid Genome

Using a Pericentromeric Interspersed Repeat to Recapitulate the Phylogeny and Expansion of Human Centromeric Segmental Duplications

Molecular Biology and Evolution, 2003

Despite considerable advances in sequencing of the human genome over the past few years, the organization and evolution of human pericentromeric regions have been difficult to resolve. This is due, in part, to the presence of large, complex blocks of duplicated genomic sequence at the boundary between centromeric satellite and unique euchromatic DNA. Here, we report the identification and characterization of an approximately 49-kb repeat sequence that exists in more than 40 copies within the human genome. This repeat is specific to highly duplicated pericentromeric regions with multiple copies distributed in an interspersed fashion among a subset of human chromosomes. Using this interspersed repeat (termed PIR4) as a marker of pericentromeric DNA, we recovered and sequence-tagged 3 Mb of pericentromeric DNA from a variety of human chromosomes as well as nonhuman primate genomes. A global evolutionary reconstruction of the dispersal of PIR4 sequence and analysis of flanking sequence support a model in which pericentromeric duplications initiated before the separation of the great ape species (>12 MYA). Further, analyses of this duplication and associated flanking duplications narrow the major burst of pericentromeric duplication activity to a time just before the divergence of the African great ape and human species (5 to 7 MYA). These recent duplication exchange events substantially restructured the pericentromeric regions of hominoid chromosomes and created an architecture where large blocks of sequence are shared among nonhomologous chromosomes. This report provides the first global view of the series of historical events that have reshaped human pericentromeric regions over recent evolutionary time.

A study of the evolution of repeated DNA sequences in primates and the existence of a new class of repetitive sequences in primates

Journal of Molecular Biology, 1979

A new class of lowly repetitive DNA sequences has been detected in the primate genome. The renaturation rate of this sequence class is practically indistinguishable from the renaturation rate of single-copy sequences. Consequently, this lowly repetitive sequence class has not been previously observed in DNA renaturation rate studies. This new sequence class is significant in that it might occupy a major fraction of the primate genome. Based on a study of the thermal stabilities of DNA heteroduplexes constructed from human DNA and either bonnet monkey or galago DNAs, we are able to compare the relative mutation rates of repetitive and single-copy sequences in the primate genome. We find that the mutation rate of short, interspersed repetitive sequences is either less than or approximately equal to the mutation rate of single-copy sequences. This implies that the base sequence of these repetitive sequences is important to their biological function. We also find that numerous mutations have accumulated in interspersed repeated sequences since the divergence of galago and human. These mutations are only recognizable because they occur at specific sites in the repeated sequence rather than at random sites in the sequence. Although interspersed repetitive sequences from human and galago can readily cross-hybridize, these site-specific mutations identify them as being two distinct classes. In contrast, far fewer site-specific mutations have occurred since the divergence of human and monkey.

Ancient Genome Duplications Did Not Structure the Human Hox-Bearing Chromosomes

Genome Research, 2001

The fact that there are four homeobox (Hox) clusters in most vertebrates but only one in invertebrates is often cited as evidence for the hypothesis that two rounds of genome duplication by polyploidization occurred early in vertebrate history. In addition, it has been observed in humans and other mammals that numerous gene families include paralogs on two or more of the four Hox-bearing chromosomes (the chromosomes bearing the Hox clusters; i.e., human chromosomes 2, 7, 12, and 17), and the existence of these paralogs has been taken as evidence that these genes were duplicated along with the Hox clusters by polyploidization. We tested this hypothesis by phylogenetic analysis of 42 gene families including members on two or more of the human Hox-bearing chromosomes. In 32 of these families there was evidence against the hypothesis that gene duplication occurred simultaneously with duplication of the Hox clusters. Phylogenies of 14 families supported the occurrence of one or more gene duplications before the origin of vertebrates, and of 15 gene duplication times estimated for gene families evolving in a clock-like manner, only six were dated to the same time period early in vertebrate history during which the Hox clusters duplicated. Furthermore, of gene families duplicated around the same time as the Hox clusters, the majority showed topologies inconsistent with their having duplicated simultaneously with the Hox clusters. The results thus indicate that ancient events of genome duplication, if they occurred at all, did not play an important role in structuring the mammalian Hox-bearing chromosomes.

Duplication of a gene-rich cluster between 16p11.1 and Xq28: a novel pericentromeric-directed mechanism for paralogous genome evolution

Human Molecular Genetics, 1996

We have identified a 26.5 kb gene-rich duplication shared by human Xq28 and 16p11.1. Complete comparative sequence analysis of cosmids from both loci has revealed identical Xq28 and 16p11.1 genomic structures for both the human creatine transporter gene (SLC6A8) and five exons of the CDM gene (DXS1357E). Overall nucleotide similarity within the duplication was found to be 94.6%, suggesting that this interchromosomal duplication occurred within recent evolutionary time (7-10 mya). Based on comparisons between genomic and cDNA sequence, both the Xq28 creatine transporter and DXS1357E genes are transcriptionally active. Predicted translation of exons and RT-PCR analysis reveal that chromosome 16 paralogs likely represent pseudogenes. Comparative fluorescent in situ hybridization (FISH) analyses of chromosomes from various primates indicate that this gene-rich segment has undergone several duplications. In gorilla and chimpanzee, multiple pericentromeric localizations on a variety of chromosomes were found using probes from the duplicated region. In other species, such as the orangutan and gibbon, FISH signals were only identified at the distal end of the X chromosome, suggesting that the Xq28 locus represents the ancestral copy. Sequencing of the 16p11.1/Xq28 duplication breakpoints has revealed the presence of repetitive immunoglobulin-like CAGGG pentamer sequences at or near the paralogy boundaries. The mobilization and dispersal of this gene-rich 27 kb element to the pericentromeric regions of primate chromosomes defines an unprecedented form of recent genome evolution and a novel mechanism for the generation of genetic diversity among closely related species.
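
The abstract above infers a duplication age from nucleotide similarity between the two copies. A minimal sketch of that kind of reasoning follows, assuming a Jukes-Cantor correction and a constant substitution rate. Neither the model nor the rate is stated in the abstract, so both are assumptions here, and the function names and the placeholder rate are hypothetical. The sketch is not expected to reproduce the authors' 7-10 mya estimate, which depends on their own calibration.

```python
import math


def jukes_cantor_distance(identity: float) -> float:
    """Substitutions per site between two sequences, from fractional identity.

    Applies the Jukes-Cantor correction K = -3/4 * ln(1 - 4/3 * p),
    where p = 1 - identity is the observed proportion of differing sites.
    """
    p = 1.0 - identity
    if not 0.0 <= p < 0.75:
        raise ValueError("identity must be in the interval (0.25, 1.0]")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def divergence_time_years(identity: float, rate_per_site_per_year: float) -> float:
    """Estimated time since duplication, T = K / (2 * r).

    The factor of 2 reflects that both copies accumulate changes
    independently after the duplication event.
    """
    return jukes_cantor_distance(identity) / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    # 94.6% identity is the figure quoted in the abstract.
    # The rate below is a placeholder assumption, not a value from the paper.
    assumed_rate = 1e-9
    years = divergence_time_years(0.946, assumed_rate)
    print(f"Estimated age under the assumed rate: {years / 1e6:.1f} million years")
```

The estimate scales inversely with the chosen rate, so the rate should be treated as the main source of uncertainty in any date obtained this way.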

Segmental duplications and the evolution of the primate genome

Nature Reviews Genetics, 2002

...which indicates that there is likely to be an important link between the processes of chromosomal rearrangement and duplication. The abundance of segmental duplications, their central role in the emergence of genes with new function and their association with chromosomal instability have important implications for primate genome organization and evolution. Previously, we summarized the organization and structure of these duplications, as well as their potential role in domain accretion during vertebrate evolution [8,9]. In this Perspective, we discuss evidence that recent duplications have had a vital role in altering the genetic constitution of man and the great apes, both at the level of the gene and the genome.

Evolutionary History of Alpha Satellite DNA Repeats Dispersed within Human Genome Euchromatin

Genome Biology and Evolution, 2020

Major human alpha satellite DNA repeats are preferentially assembled within (peri)centromeric regions but are also dispersed within euchromatin in the form of clustered or short single repeat arrays. To study the evolutionary history of single euchromatic human alpha satellite repeats (ARs), we analyzed their orthologous loci across the primate genomes. The continuous insertion of euchromatic ARs throughout the evolutionary history of primates starting with the ancestors of Simiformes (45–60 Ma) and continuing up to the ancestors of Homo is revealed. Once inserted, the euchromatic ARs were stably transmitted to the descendant species, some exhibiting copy number variation, whereas their sequence divergence followed the species phylogeny. Many euchromatic ARs have sequence characteristics of (peri)centromeric alpha repeats suggesting heterochromatin as a source of dispersed euchromatic ARs. The majority of euchromatic ARs are inserted in the vicinity of other repetitive elements such...

Punctuated duplication seeding events during the evolution of human chromosome 2p11

Genome Research, 2005

Primate genomic sequence comparisons are becoming increasingly useful for elucidating the evolutionary history and organization of our own genome. Such studies are particularly informative within human pericentromeric regions—areas of particularly rapid change in genomic structure. Here, we present a systematic analysis of the evolutionary history of one ∼700-kb region of 2p11, including the first autosomal transition from pericentromeric sequence to higher-order α-satellite DNA. We show that this region is composed of segmental duplications corresponding to 14 ancestral segments ranging in size from 4 kb to ∼115 kb. These duplicons show 94%–98.5% sequence identity to their ancestral loci. Comparative FISH and phylogenetic analysis indicate that these duplicons are differentially distributed in human, chimpanzee, and gorilla genomes, whereas baboon has a single putative ancestral locus for all but one of the duplications. Our analysis supports a model where duplicative transposition...

A burst of segmental duplications in the genome of the African great ape ancestor

Nature, 2009

Wilson and King were among the first to recognize that the extent of phenotypic change between humans and great apes was dissonant with the rate of molecular change. Proteins are virtually identical [1,2]; cytogenetically there are few rearrangements that distinguish ape-human chromosomes [3]; rates of single-basepair change [4-7] and retroposon activity [8-10] have slowed particularly within hominid lineages when compared to rodents or monkeys. Here, we perform a systematic analysis of duplication content of four primate genomes (macaque, orangutan, chimpanzee and human) in an effort to understand the pattern and rates of genomic duplication during hominid evolution. We find that the ancestral branch leading to human and African great apes shows the most significant increase in duplication activity both in terms of basepairs and in terms of events. This duplication acceleration within the ancestral species is significant when compared to lineage-specific rate estimates even after accounting for copy-number polymorphism and homoplasy. We discover striking examples of recurrent and independent gene-containing duplications within the gorilla and chimpanzee that are absent in the human lineage. Our results suggest that the evolutionary properties of copy-number mutation differ significantly from other forms of genetic mutation and, in contrast to the hominid slowdown of single basepair mutations, there has been a genomic burst of duplication activity at this period during human evolution.

Ancestral reconstruction of segmental duplications reveals punctuated cores of human genome evolution

Nature Genetics, 2007

Human segmental duplications are hotspots for nonallelic homologous recombination leading to genomic disorders, copy-number polymorphisms and gene and transcript innovations. The complex structure and history of these regions have precluded a global evolutionary analysis. Combining a modified A-Bruijn graph algorithm with comparative genome sequence data, we identify the origin of 4,692 ancestral duplication loci and use these to cluster 437 complex duplication blocks into 24 distinct groups. The sequence-divergence data between ancestral-derivative pairs and a comparison with the chimpanzee and macaque genome support a 'punctuated' model of evolution. Our analysis reveals that human segmental duplications are frequently organized around 'core' duplicons, which are enriched for transcripts and, in some cases, encode primate-specific genes undergoing positive selection. We hypothesize that the rapid expansion and fixation of some intrachromosomal segmental duplication...