Characterization of pre-insertion loci of de novo L1 insertions
Stephen L Gasior et al. Gene. 2007.
Abstract
The human Long Interspersed Element-1 (LINE-1) and the Short Interspersed Element (SINE) Alu comprise 28% of the human genome. They share the same L1-encoded endonuclease for insertion, which recognizes an A+T-rich sequence. Under a simple model of insertion distribution, this nucleotide preference would lead to the prediction that the populations of both elements would be biased towards A+T-rich regions. Genomic L1 elements do show an A+T-rich bias. In contrast, Alu is biased towards G+C-rich regions when compared to the genome average. Several analyses have demonstrated that relatively recent insertions of both elements show less G+C content bias relative to older elements. We have analyzed the repetitive element and G+C composition of more than 100 pre-insertion loci derived from de novo L1 insertions in cultured human cancer cells, which should represent an evolutionarily unbiased set of insertions. An A+T-rich bias is observed in the 50 bp flanking the endonuclease target site, consistent with the known target site for the L1 endonuclease. The L1, Alu, and G+C content of 20 kb of the de novo pre-insertion loci shows a different set of biases than that observed for fixed L1s in the human genome. In contrast to the insertion sites of genomic L1s, the de novo L1 pre-insertion loci are relatively L1-poor, Alu-rich and G+C neutral. Finally, a statistically significant cluster of de novo L1 insertions was localized in the vicinity of the c-myc gene. These results suggest that the initial insertion preference of L1, while A+T-rich in the initial vicinity of the break site, can be influenced by the broader content of the flanking genomic region and have implications for understanding the dynamics of L1 and Alu distributions in the human genome.
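The G+C measurements described in the abstract (a short window around the endonuclease target site and a much wider window around the same locus) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `gc_percent` and `flank_gc`, the 0-based cut-site convention, and the choice to ignore ambiguous bases are all assumptions made for the example.

```python
def gc_percent(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T).

    Ambiguous characters such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("no unambiguous bases in sequence")
    return 100.0 * gc / total


def flank_gc(sequence: str, site: int, flank: int) -> float:
    """Percent G+C of up to `flank` bases on each side of a cut site.

    `site` is a hypothetical 0-based cut position: the cut lies between
    sequence[site - 1] and sequence[site]. Windows are clipped at the
    ends of the sequence.
    """
    if not 0 <= site <= len(sequence):
        raise ValueError("site lies outside the sequence")
    start = max(0, site - flank)
    end = min(len(sequence), site + flank)
    return gc_percent(sequence[start:end])


# Toy locus: an A+T-only core embedded in a G+C-only background.
toy = "GC" * 10 + "AT" * 5 + "GC" * 10
narrow = flank_gc(toy, site=25, flank=5)   # window covers only the A+T core
wide = flank_gc(toy, site=25, flank=25)    # window covers the whole toy locus
```

On the toy sequence the narrow window is entirely A+T while the wide window is mostly G+C, which mirrors the qualitative point that local and regional base composition around one site can differ.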
Figures
Fig 1
Alu, L1, and G+C content of de novo HeLa L1 pre-insertion loci and random sites. The flanks of de novo L1 pre-insertion sites (closed squares) and simulated sites (open circles) were characterized by the frequency of loci at each level of A) %Alu content, B) %L1 content, and C) %G+C content for 10 kb on either side of the endonuclease site/random site, and D) %G+C content for 25 bp on either side of the endonuclease site/random site. X-axis values are binned by 5.
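The binning mentioned in the caption can be illustrated with a short sketch. It assumes that "binned by 5" means bins 5 percentage points wide and that the y-axis is the percentage of loci falling in each bin; the function name `bin_frequencies` and the example values are hypothetical and do not come from the paper.

```python
from collections import Counter


def bin_frequencies(values, width=5):
    """Map each bin start (0, 5, 10, ...) to the percent of loci in that bin.

    `values` are percentages in the range 0..100. A value of exactly 100
    is placed in the last bin so that every bin has the same width.
    """
    values = list(values)
    if not values:
        raise ValueError("no values to bin")
    last_bin = 100 - width
    counts = Counter()
    for v in values:
        if not 0 <= v <= 100:
            raise ValueError(f"value out of range: {v}")
        start = min(int(v // width) * width, last_bin)
        counts[start] += 1
    return {start: 100.0 * n / len(values) for start, n in sorted(counts.items())}


# Hypothetical %G+C values for five loci.
freqs = bin_frequencies([0, 4.9, 5, 42, 100])
```

Plotting such frequencies for the de novo loci and for the simulated sites on the same axes gives the kind of comparison the figure describes.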
Fig 2
Chromosome distribution of de novo L1 and SINE pre-insertion loci. A) De novo L1 and SINE insertion sites were mapped to individual chromosomes according to the chromosome assignment of the highest contig BLAST hit, and each chromosome is presented as a % of the total (white). This was compared to the percent chromosome content with corrections for the HeLa karyotype (black). The chromosome distributions of >1000 random insertion sites from a computer simulation with corrections for the HeLa karyotype are also shown (grey). B) Plot and curve fits of chromosome distributions of de novo L1 and SINE insertions (x-axis, as % of total) versus % chromosome content as calculated previously (squares, black line) and versus % ASV insertions (grey diamonds, grey line, right y-axis). C) Plot and curve fit of chromosome distributions of de novo L1 and Alu insertions (x-axis, as % of total) versus a computer simulation of random sites in HeLa (y-axis, as % of total).
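One way to picture a random-site simulation with a karyotype correction is to weight each chromosome by its length multiplied by its copy number, then draw a uniform position on the chosen chromosome. The sketch below rests on that assumption; the caption does not specify how the correction was made, and the chromosome names, lengths, and copy numbers here are invented toy values, not HeLa data.

```python
import random


def expected_share(chrom_lengths, copy_number):
    """Percent of total DNA contributed by each chromosome.

    Weight = length x copy number; chromosomes missing from
    `copy_number` default to 2 copies (an assumption of this sketch).
    """
    weights = {c: chrom_lengths[c] * copy_number.get(c, 2) for c in chrom_lengths}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {c: 100.0 * w / total for c, w in weights.items()}


def simulate_sites(chrom_lengths, copy_number, n, seed=0):
    """Draw `n` random (chromosome, 0-based position) pairs.

    Chromosomes are sampled in proportion to length x copy number,
    and the position is uniform along the chosen chromosome.
    """
    rng = random.Random(seed)
    chroms = list(chrom_lengths)
    weights = [chrom_lengths[c] * copy_number.get(c, 2) for c in chroms]
    picks = rng.choices(chroms, weights=weights, k=n)
    return [(c, rng.randrange(chrom_lengths[c])) for c in picks]


# Toy genome with hypothetical lengths and copy numbers.
toy_lengths = {"chrA": 100, "chrB": 100}
toy_copies = {"chrA": 3, "chrB": 1}
shares = expected_share(toy_lengths, toy_copies)
sites = simulate_sites(toy_lengths, toy_copies, n=1000, seed=1)
```

Comparing the observed per-chromosome percentages of insertions against `shares`, or against tallies from `sites`, corresponds to the kind of comparison shown in panels A and C.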
Fig 3
Insertions near the c-myc locus. A schematic of the c-myc locus with the 5’ flanking pseudogene POU5F1P1 and the 3’ flanking PVRT gene is presented. The locations of 4 de novo L1 insertions are marked with arrows above the genes pointing down. The location within c-myc of a somatic L1 insertion/rearrangement from a breast cancer and the site of a canine L1 insertion are shown with arrows pointing up.
Grants and funding
- R01 GM045668/GM/NIGMS NIH HHS/United States
- P20 RR020152/RR/NCRR NIH HHS/United States
- R01 GM060518/GM/NIGMS NIH HHS/United States