Natural genetic variation caused by transposable elements in humans

E Andrew Bennett et al. Genetics. 2004 Oct.

Abstract

Transposons and transposon-like repetitive elements collectively occupy 44% of the human genome sequence. In an effort to measure the levels of genetic variation that are caused by human transposons, we have developed a new method to broadly detect transposon insertion polymorphisms of all kinds in humans. We began by identifying 606,093 insertion and deletion (indel) polymorphisms in the genomes of diverse humans. We then screened these polymorphisms to detect indels that were caused by de novo transposon insertions. Our method was highly efficient and led to the identification of 605 nonredundant transposon insertion polymorphisms in 36 diverse humans. We estimate that this represents 25-35% of approximately 2075 common transposon polymorphisms in human populations. Because we identified all transposon insertion polymorphisms with a single method, we could evaluate the relative levels of variation that were caused by each transposon class. The average human in our study was estimated to harbor 1283 Alu insertion polymorphisms, 180 L1 polymorphisms, 56 SVA polymorphisms, and 17 polymorphisms related to other forms of mobilized DNA. Overall, our study provides significant steps toward (i) measuring the genetic variation that is caused by transposon insertions in humans and (ii) identifying the transposon copies that produce this variation.
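The figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers stated above. The per-person total and the per-class shares are plain sums and ratios computed here for illustration; they are not values reported in the abstract, and the abstract does not say how the authors derived the 25-35% range.

```python
# Figures quoted in the abstract.
identified = 605          # nonredundant transposon insertion polymorphisms found
estimated_common = 2075   # approximate number of common transposon polymorphisms

# Simple ratio; the abstract states that 605 represents 25-35% of ~2075.
fraction_found = identified / estimated_common

# Estimated polymorphisms per average individual, by transposon class.
per_person = {"Alu": 1283, "L1": 180, "SVA": 56, "other": 17}

# Illustrative derived values (not reported in the abstract).
total_per_person = sum(per_person.values())
shares = {name: count / total_per_person for name, count in per_person.items()}

print(f"fraction of common polymorphisms identified: {fraction_found:.1%}")
print(f"estimated polymorphisms per person (sum): {total_per_person}")
for name, share in shares.items():
    print(f"  {name}: {share:.1%}")
```

The ratio 605/2075 falls inside the stated 25-35% range, and Alu insertions account for the large majority of the per-person estimate.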

Figures

Figure 1.

Computational pipeline for indel and transposon polymorphism discovery. A flow chart of the computational steps that were followed for the discovery of indel and transposon polymorphisms is shown on the left (boxes). A breakdown of the number of traces present at the end of each step is shown on the right. The number of indel and transposon polymorphisms identified is listed at the bottom. Note that the numbers are broken down for each of the three populations examined. TSC, the SNP Consortium traces; WGS, whole-genome shotgun traces; WCS, whole-chromosome shotgun traces.

Figure 2.

Proposed new evolutionary lineages of Alu. For each Alu subfamily, the number of polymorphic copies retaining the subfamily consensus sequence is compared to groups sharing one or more base pair changes (in parentheses). In several cases, the majority of polymorphic copies of a given subfamily diverge from the subfamily consensus by a few shared changes. An evolutionary progression can be inferred (from left to right) in which new shared base changes appear to have been acquired. The 10 proposed novel evolutionary groups are indicated by braces to the right of the groups. The total number of elements within the group is indicated in the parentheses. These data suggest that a significant number of Alu insertions in the genome can serve as new source genes to produce offspring elements. Only elements that are at least 80% full length are represented.

Figure 3.

PCR validation studies. The strategy for the PCR validation assays is shown along with some examples of these assays. (A) The locations of the four primers (A, B, C, and D) that were used to evaluate transposon polymorphisms by PCR are shown. (B) A typical Alu PCR assay is shown in which primers flanking the transposon (A and D) are used to determine whether a given Alu copy is present in the genome of an individual. The larger band is produced when the element is present, whereas the smaller band is produced when the element is absent. Lane M, 1-kb marker; −C, negative control lacking template DNA; lanes 1–12, 12 PCR reactions evaluating DNA samples from the Coriell panel. (C) A typical L1 PCR assay is shown in which two PCRs are performed to identify all of the alleles present in the Coriell panel. The assay on the left uses the A and D primers to identify alleles that lack an L1 insertion, whereas the assay on the right uses primers C and D (or A and B) to identify alleles containing the L1 insertion. These two assays are used together to evaluate whether a given allele is homozygous or heterozygous in a given individual. The lanes are the same as for B. For cases in which the L1 element is relatively short (<2 kb), the allele containing the L1 insertion often was detected in the A plus D assay as well. (D) A typical SVA assay is depicted. These assays are performed the same way as the L1 assays (in C above), using two assays to evaluate all SVA alleles present.
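The two-assay logic described for L1 and SVA elements can be written as a small decision function. This is a minimal sketch, not the authors' scoring procedure: the function name, argument names, and genotype labels are hypothetical. It assumes that `empty_band` refers only to the small empty-site product of the A plus D assay, so the larger A plus D product that short L1 insertions (<2 kb) can also yield is not counted as evidence of an empty allele.

```python
def call_genotype(empty_band: bool, filled_band: bool) -> str:
    """Combine two PCR results into a genotype call for one individual.

    empty_band:  empty-site product seen in the A + D assay
                 (an allele lacking the insertion is present)
    filled_band: product seen in the C + D (or A + B) assay
                 (an allele containing the insertion is present)
    """
    if empty_band and filled_band:
        return "heterozygous"
    if filled_band:
        return "homozygous insertion"
    if empty_band:
        return "homozygous empty"
    # Neither assay produced a band: the reaction failed or the locus is unscorable.
    return "no call"


# Hypothetical band patterns for three samples: (A + D result, C + D result).
samples = {"sample_1": (True, True), "sample_2": (False, True), "sample_3": (True, False)}
for name, (empty_band, filled_band) in samples.items():
    print(name, call_genotype(empty_band, filled_band))
```

For the Alu assay, a single A plus D reaction carries the same information, because the larger and smaller bands correspond to the filled and empty alleles, respectively.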
