The insertional history of an active family of L1 retrotransposons in humans

Genome Res. 2004 Jul;14(7):1221-31.

doi: 10.1101/gr.2326704. Epub 2004 Jun 14.

Stéphane Boissinot et al. Genome Res. 2004 Jul.

Abstract

As humans contain a currently active L1 (LINE-1) non-LTR retrotransposon family (Ta-1), the human genome database likely provides only a partial picture of Ta-1-generated diversity. We used a non-biased method to clone Ta-1 retrotransposon-containing loci from representatives of four ethnic populations. We obtained 277 distinct Ta-1 loci and identified an additional 67 loci in the human genome database. This collection represents approximately 90% of the Ta-1 population in the individuals examined and is thus more representative of the insertional history of Ta-1 than the human genome database, which lacked approximately 40% of our cloned Ta-1 elements. As both polymorphic and fixed Ta-1 elements are as abundant in GC-poor genomic regions as ancestral L1 elements are, the enrichment of L1 elements in GC-poor areas is likely due to insertional bias rather than selection. Although the chromosomal distribution of Ta-1 inserts is generally a function of chromosomal length and gene density, chromosome 4 deviates significantly from this pattern and has been much more hospitable to Ta-1 insertions than any other chromosome. The intra-chromosomal distribution of Ta-1 elements is also not uniform: Ta-1 elements tend to cluster, and the maximal gaps between Ta-1 inserts are larger than would be expected from a model of uniform random insertion.

Copyright 2004 Cold Spring Harbor Laboratory Press

Figures

Figure 1

Structure of a typical full-length human L1 element. The 5′ untranslated region (UTR) has a regulatory function; open reading frame 1 (ORF I) encodes an RNA-binding protein; ORF II encodes the L1 replicase and contains an endonuclease (EN) domain and a reverse transcriptase (RT) domain; and the 3′ UTR contains a conserved G-rich polypurine motif. Genomic copies of L1 usually end in an A-rich stretch (open rectangle, see Moran and Gilbert 2002). The partial alignment of Ta-1, Ta-0, L1PA2, and L1PA5 consensus sequences shows the positions of oligonucleotide 1, which is specific for Ta-1, and of oligonucleotide 2, which includes the ACA trinucleotide that is diagnostic of the Ta family (see Methods). The numbers indicate the position of the sequence on alignment ALIGN-000165 at the EMBL_ALIGN database (Boissinot and Furano 2001).

Figure 2

Frequency distribution of Ta-1-containing alleles. The extent of polymorphism of the indicated Ta-1-containing inserts in 141 individuals was determined as described in the Methods. All DNAs for these studies were obtained from the Coriell Institute for Medical Research and included individuals from the following populations: Chinese, Japanese, Druze, Biaka and Mbuti Pygmy, Melanesian, Atayal, Ami, Caucasian American, and African American.

Figure 3

Size distribution of autosomal L1 elements. Elements are grouped in 1000-bp intervals, except for the smallest class, which shows the frequency of elements 500–1000 bp long. The elements analyzed were 183 Ta-1 elements (those whose size was known from the human genome database), 282 L1PA2 elements, and 305 L1PA5 elements.

Figure 4

Frequency of L1 elements in different GC fractions of the human genome. The % GC was calculated over 20-kb windows. The bins from left to right correspond to increasing GC fractions in 2% steps. The family-specific L1 curves were built using the DNA flanks of 299 Ta-1 (squares), 300 L1PA2 (triangles), and 324 L1PA5 (diamonds) inserts. The total genome curve (heavy line) was built using the “gcPercent” tables at http://genome.ucsc.edu/ for all chromosomes.
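
The windowed % GC calculation described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the choice to center the 20-kb window on the insertion site, and the handling of ambiguous bases are all assumptions made for the example.

```python
def gc_percent(seq):
    """Percent G+C among unambiguous bases (A, C, G, T).

    Returns None when the sequence has no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else None


def flank_gc(seq, site, window=20_000):
    """% GC of a window centered on an insertion site (assumed layout).

    The window is clipped at the ends of the sequence.
    """
    half = window // 2
    start = max(0, site - half)
    end = min(len(seq), site + half)
    return gc_percent(seq[start:end])


def gc_bin(pct, width=2.0):
    """Lower edge of the GC bin (2% wide by default) that contains pct."""
    return width * int(pct // width)
```

Counting how many inserts of each family fall into each `gc_bin` value, and dividing by the family total, would give one frequency curve of the kind plotted in the figure.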

Figure 5

Insertion sites of Ta-1 elements in the human genome. Ta-1 integration sites are shown as tick marks above each chromosome. Tick marks with thinner lines impinging directly on the chromosome diagrams are those identified in the human genome database. Tick marks with heavier lines offset from the chromosomes are the ones that we cloned. Solid circles indicate polymorphic inserts, open circles indicate fixed ones, and tick marks without either are indeterminate (not amenable to PCR, see Methods and Table 1). The number in parentheses indicates the number of Ta-1 insertion sites on each chromosome. Chromosome 15 shows the positions of only 4 of its 5 Ta-1 elements because one of them was located on an unassigned segment. The shaded boxes indicate the number of known genes per 5-Mb segment. The unadjusted k-specific P values for some apparent clusters are given.

Figure 6

Chromosomal distribution of Ta-1 elements. (A) The number of Ta-1 elements per chromosome is positively correlated with the physical length of the chromosome in megabases (Mb, gaps removed); P < 0.0001. The solid line is the expectation if Ta-1 elements are distributed in proportion to chromosome length. The 95% confidence limits (dashed lines) are calculated as ±1.96 times the square root of the predicted number of sites to account for Poisson counting error. (B) The density of Ta-1 elements on each chromosome (Ta-1/Mb) is negatively correlated with gene density (Genes/Mb); R = –0.49, P < 0.004.
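
The expectation line and the 95% limits in panel A follow directly from the rule stated in the caption: the predicted count is proportional to chromosome length, and the limits are the prediction ±1.96 times its square root. A minimal sketch, with hypothetical function names and no data from the paper:

```python
import math


def expected_counts(lengths_mb, total_inserts):
    """Expected inserts per chromosome if inserts scale with length.

    lengths_mb maps a chromosome name to its length in Mb (gaps removed).
    """
    total_len = sum(lengths_mb.values())
    return {name: total_inserts * length / total_len
            for name, length in lengths_mb.items()}


def poisson_band(expected, z=1.96):
    """Approximate 95% limits: expected +/- z * sqrt(expected)."""
    half_width = z * math.sqrt(expected)
    return expected - half_width, expected + half_width


def outside_band(observed, expected, z=1.96):
    """True when an observed count falls outside the approximate limits."""
    low, high = poisson_band(expected, z)
    return observed < low or observed > high
```

A chromosome whose observed count makes `outside_band` return True lies outside the dashed lines; this is the sense in which the abstract describes chromosome 4 as deviating from the length-based expectation.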

Figure 7

Distribution of P values. The uncorrected P values for the minimal k-span (k = 6) and the maximal gap size are shown for all chromosomes. Similar results were found for k = 7 and 8 (not shown). The rectangles show the distribution of P values expected under uniform random insertion.
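
One way to make these two statistics concrete is a small Monte Carlo sketch. It rests on assumptions that the caption does not confirm: that the minimal k-span is the shortest interval spanned by k consecutive inserts, that the maximal gap is the largest distance between adjacent inserts, and that P values can be approximated by simulation. The authors' own calculation may differ (see Methods), and all names below are hypothetical.

```python
import random


def min_k_span(positions, k):
    """Shortest distance spanned by k consecutive inserts (assumed definition)."""
    pos = sorted(positions)
    if len(pos) < k:
        raise ValueError("need at least k positions")
    return min(pos[i + k - 1] - pos[i] for i in range(len(pos) - k + 1))


def max_gap(positions):
    """Largest distance between adjacent inserts (assumed definition)."""
    pos = sorted(positions)
    if len(pos) < 2:
        raise ValueError("need at least two positions")
    return max(b - a for a, b in zip(pos, pos[1:]))


def simulate_p_values(positions, length, k, trials=2000, seed=0):
    """Monte Carlo P values under uniform random insertion.

    Returns (p_span, p_gap): the fraction of simulated chromosomes whose
    minimal k-span is at least as small as the observed one, and whose
    maximal gap is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n = len(positions)
    obs_span = min_k_span(positions, k)
    obs_gap = max_gap(positions)
    span_hits = 0
    gap_hits = 0
    for _ in range(trials):
        sim = [rng.uniform(0, length) for _ in range(n)]
        if min_k_span(sim, k) <= obs_span:
            span_hits += 1
        if max_gap(sim) >= obs_gap:
            gap_hits += 1
    return span_hits / trials, gap_hits / trials
```

Under uniform insertion, P values computed this way across chromosomes should be roughly evenly spread between 0 and 1; an excess of small values for either statistic would point to clustering or to unusually large gaps.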

References

    1. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215: 403–410.
    2. Badge, R.M., Alisch, R.S., and Moran, J.V. 2003. ATLAS: A system to selectively identify human-specific L1 insertions. Am. J. Hum. Genet. 72: 823–838.
    3. Boissinot, S. and Furano, A.V. 2001. Adaptive evolution in LINE-1 retrotransposons. Mol. Biol. Evol. 18: 2186–2194.
    4. Boissinot, S., Chevret, P., and Furano, A.V. 2000. L1 (LINE-1) retrotransposon evolution and amplification in recent human history. Mol. Biol. Evol. 17: 915–928.
    5. Boissinot, S., Entezam, A., and Furano, A.V. 2001. Selection against deleterious LINE-1-containing loci in the human lineage. Mol. Biol. Evol. 18: 926–935.

WEB SITE REFERENCES

    1. http://genome.ucsc.edu
    2. http://ftp.genome.washington.edu/RM/RepeatMasker.html
    3. http://genome-www5.stanford.edu/
    4. http://www.repeatmasker.org; RepeatMasker.
