Mobile elements create structural variation: analysis of a complete human genome

Jinchuan Xing et al. Genome Res. 2009 Sep.

Abstract

Structural variants (SVs) are common in the human genome. Because approximately half of the human genome consists of repetitive, transposable DNA sequences, it is plausible that these elements play an important role in generating SVs in humans. Sequencing of the diploid genome of one individual human (HuRef) affords us the opportunity to assess, for the first time, the impact of mobile elements on SVs in an individual in a thorough and unbiased fashion. In this study, we systematically evaluated more than 8000 SVs to identify mobile element-associated SVs as small as 100 bp and specific to the HuRef genome. Combining computational and experimental analyses, we identified and validated 706 mobile element insertion events (including Alu, L1, SVA elements, and nonclassical insertions), which added more than 305 kb of new DNA sequence to the HuRef genome compared with the Human Genome Project (HGP) reference sequence (hg18). We also identified 140 mobile element-associated deletions, which removed approximately 126 kb of sequence from the HuRef genome. Overall, approximately 10% of the HuRef-specific indels larger than 100 bp are caused by mobile element-associated events. More than one-third of the insertion/deletion events occurred in genic regions, and new Alu insertions occurred in exons of three human genes. Based on the number of insertions and the estimated time to the most recent common ancestor of HuRef and the HGP reference genome, we estimated the Alu, L1, and SVA retrotransposition rates to be one in 21 births, 212 births, and 916 births, respectively. This study presents the first comprehensive analysis of mobile element-related structural variants in the complete DNA sequence of an individual and demonstrates that mobile elements play an important role in generating inter-individual structural variation.
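The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is not part of the study's pipeline; it only recombines numbers quoted above, and the function and constant names are illustrative assumptions. Because the abstract gives "more than 305 kb" and "approximately 126 kb", the derived means should be read as rough values, not exact ones.

```python
# Figures quoted in the abstract (approximate or lower-bound values as stated there).
INSERTION_EVENTS = 706
INSERTED_BP = 305_000   # "more than 305 kb" of new sequence
DELETION_EVENTS = 140
DELETED_BP = 126_000    # "approximately 126 kb" removed
BIRTHS_PER_INSERTION = {"Alu": 21, "L1": 212, "SVA": 916}


def mean_event_size(total_bp, n_events):
    """Average size in bp of one event, given the total sequence affected."""
    return total_bp / n_events


def relative_rates(births_per_insertion):
    """Express each retrotransposition rate relative to the most active element."""
    rates = {name: 1.0 / births for name, births in births_per_insertion.items()}
    top = max(rates.values())
    return {name: rate / top for name, rate in rates.items()}


def births_per_any_insertion(births_per_insertion):
    """Births per new insertion when the three element classes are pooled."""
    combined_rate = sum(1.0 / births for births in births_per_insertion.values())
    return 1.0 / combined_rate


def summarize():
    """Collect the derived quantities in one dictionary."""
    return {
        "mean_insertion_bp": mean_event_size(INSERTED_BP, INSERTION_EVENTS),
        "mean_deletion_bp": mean_event_size(DELETED_BP, DELETION_EVENTS),
        "net_gain_bp": INSERTED_BP - DELETED_BP,
        "relative_rates": relative_rates(BIRTHS_PER_INSERTION),
        "births_per_any_insertion": births_per_any_insertion(BIRTHS_PER_INSERTION),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

Under these inputs the mean insertion is a little over 430 bp and the mean deletion is 900 bp, so the mobile element-associated deletions are, on average, about twice as large as the insertions even though they are five times less numerous.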

Figures

Figure 1.

PCR confirmation of the candidate MASVs. Four agarose gel chromatographs of the PCR products from a confirmation panel are shown. The DNA sample in each lane is labeled above the panel. (Arrows) Expected sizes (in bp) of the PCR amplicons. Diagrams representing the structure of each MASV allele are shown on the right of the panel. (Black line) Flanking DNA sequence, (filled arrows) mobile elements. (A) Locus 1104685335585, an Alu insertion that is heterozygous in the HuRef donor and absent in all other samples. The PCR products in the chimpanzee and the rhesus monkey are slightly smaller because of the smaller size of a (CA)n dinucleotide repeat in these genomes. (B) Locus 1104685664564, an Alu insertion that is present in all human samples tested but absent in the chimpanzee and rhesus macaque. (C) Locus 1104685512583, an L1 recombination-mediated indel. Because the HuRef sample is homozygous for the small size allele, as are the chimpanzee and rhesus macaque, this indel is likely to be caused by an insertion in the reference assembly. (Black box) The tandem duplication section inside the L1. (D) Locus 1104685523196, a false-positive Alu recombination-mediated deletion (ARMD) event where HuRef and all other samples are homozygous for the no-deletion allele.

Figure 2.

Allele frequency distribution of 43 novel Alu insertions in 15 European individuals.

Figure 3.

Genomic distribution of MASVs. Positions of MASVs are shown on a human ideogram. (Red dots, left side of each chromosome) Positions of insertions, (blue dots, right side of each chromosome) positions of deletions.

Figure 4.

(A) Size distribution of L1 insertions. The number of insertions in 500-bp bins is shown. (B) Size distribution of Alu- and L1-mediated deletions. The percentage of total events in 500-bp bins (except the last one) is shown.

Figure 5.

Four types of common MASVs in the HuRef genome. (A) Classical retrotransposon insertion; (B) nonclassical insertions; (C) nonallelic homologous recombination-mediated insertion/deletion; (D) nonhomologous end-joining-mediated deletion. (TTAAAA) Standard L1 cleavage site for classical retrotransposition; (black lines) flanking regions, (gray lines) intervening regions, (dotted circles) homologous recombining regions, (red boxes) microhomology regions, (red arrow boxes) TSDs of each element.
