Natural mutagenesis of human genomes by endogenous retrotransposons

Rebecca C Iskow et al. Cell. 2010.

Abstract

Two abundant classes of mobile elements, namely Alu and L1 elements, continue to generate new retrotransposon insertions in human genomes. Estimates suggest that these elements have generated millions of new germline insertions in individual human genomes worldwide. Unfortunately, current technologies are not capable of detecting most of these young insertions, and the true extent of germline mutagenesis by endogenous human retrotransposons has been difficult to examine. Here, we describe technologies for detecting these young retrotransposon insertions and demonstrate that such insertions indeed are abundant in human populations. We also found that new somatic L1 insertions occur at high frequencies in human lung cancer genomes. Genome-wide analysis suggests that altered DNA methylation may be responsible for the high levels of L1 mobilization observed in these tumors. Our data indicate that transposon-mediated mutagenesis is extensive in human genomes and is likely to have a major impact on human biology and diseases.

Figures

Figure 1. Strategy to sequence retrotransposon insertion junctions

The two approaches used to sequence retrotransposon insertion junctions are depicted: ABI sequencing (left side) and pyrosequencing (right side). The “transposon” is either L1-Ta or Alu. Human genomic DNA is digested with a restriction endonuclease and ligated to a linker. The linker is partially double-stranded, with a 3′ amine group on the short strand, which prevents amplification of random genomic DNA from the linker primers. Amplification occurs only if there is extension from the transposon-specific primer. This extension completes the double-stranded linker and creates the sequence to which the linker-specific primer anneals, allowing the PCR to proceed. Left side: ABI sequencing. After an initial PCR amplification, a second round of PCR is performed with nested primers. Second-round PCR products are cloned into pBLUESCRIPT using a restriction site in the nested retrotransposon primer and another restriction site in the linker. The resulting colonies are sent for ABI Sanger capillary sequencing. Right side: pyrosequencing. After an initial PCR amplification, a second PCR is performed with nested primers. The retrotransposon nested primer contains the “A” adapter sequence for 454 sequencing, whereas the linker-specific primer contains an 8 bp barcode unique to each sample and the “B” adapter sequence. Samples are pooled in equimolar ratios for emulsion PCR with beads binding only the “A” end. Sequencing therefore occurs from the “B” end only, avoiding possible problems with sequencing through the poly(A) tail of L1. A similar approach was used with Alu, except that the 5′ junctions were amplified and sequenced. Please see Experimental Procedures and Extended Experimental Procedures for primers and additional details.
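Because samples are pooled before emulsion PCR, reads must be assigned back to samples by their 8 bp barcode. The following is a minimal sketch of that step, not the authors' pipeline. It assumes that the barcode occupies the first 8 bases of each read and that only exact matches are accepted; the function name `demultiplex` and the barcode table are hypothetical.

```python
from collections import defaultdict

BARCODE_LEN = 8  # the caption states an 8 bp barcode per sample


def demultiplex(reads, barcode_to_sample):
    """Group reads by their leading barcode.

    Assumption: the barcode is the first BARCODE_LEN bases of the read
    and must match a known barcode exactly. Returns a dict mapping each
    sample to its reads (barcode trimmed) and a list of unassigned reads.
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read in reads:
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)  # unknown or error-containing barcode
        else:
            by_sample[sample].append(insert)
    return dict(by_sample), unassigned
```

Exact matching is the simplest choice; a real run would probably also need to tolerate sequencing errors within the barcode.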

Figure 2. Characterization of novel L1 elements

A) For both the ABI sequencing and pyrosequencing experiments, L1 elements that were not present in the human reference sequence (hg18) were compared with L1 polymorphisms that have been deposited in dbRIP (Wang et al., 2006). The Venn diagram depicts the relationships between our candidates and L1 insertions that have been deposited in dbRIP but are absent from hg18. Note the minimal overlap between our data sets and the L1 polymorphisms in dbRIP, indicating that our L1-seq method has recovered many novel insertions. Our assays predominantly detected L1-Ta and pre-Ta elements (Table S5). B) Correlation of estimated and previously determined allelic frequencies. For the pyrosequencing experiment, allelic frequencies were estimated from sequence data and compared to known allelic frequencies of polymorphic L1 elements (n = 63). C) A subset of L1 elements that appeared to be "singletons", based on their presence in a single individual (from sequence data) and absence from hg18 and dbRIP, were verified by PCR in pools of diverse human DNA. The Individual lane is the individual from whom the L1 was initially sequenced. Pool lanes contain DNA from 15 diverse humans. Chimp is Coriell #NA03448A and Negative is a control PCR with no template. D) Histogram of allelic frequencies for polymorphic L1s in dbRIP (Wang et al., 2006). Allelic frequency data were extracted from dbRIP (n = 153). E) Allelic frequencies of L1s identified by ABI Sanger dideoxy sequencing. Allelic frequencies were determined by PCR in a panel of 46 diverse individuals (n = 46). F) Allelic frequencies of L1s identified by pyrosequencing. Allelic frequencies were determined as depicted in B (n = 650).
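For a PCR-genotyped panel such as the one in panel E, an allelic frequency is the number of insertion-bearing chromosomes divided by the total number of chromosomes scored. The sketch below illustrates that arithmetic only. It assumes diploid genotypes coded as 0, 1, or 2 copies of the insertion allele, with `None` marking a failed PCR; the function name and this coding are illustrative and are not taken from the paper.

```python
def allele_frequency(genotypes):
    """Return the insertion allele frequency for one locus.

    genotypes: iterable of 0, 1, or 2 (copies of the insertion allele
    per diploid individual); None marks a failed assay and is skipped.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no genotype calls for this locus")
    if any(g not in (0, 1, 2) for g in called):
        raise ValueError("genotypes must be 0, 1, 2, or None")
    # Each called individual contributes two chromosomes.
    return sum(called) / (2 * len(called))
```

For example, four called individuals with genotypes 0, 1, 2, and 1 carry 4 insertion alleles on 8 chromosomes, a frequency of 0.5.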

Figure 3. PCR validation of somatic insertions and identification of a hypomethylation signature in tumors with new L1 insertions

A) Nine somatic L1 insertions were identified by using PCR assays to screen low-frequency L1 insertions that were initially identified from our pyrosequencing data (Table S2). Shown here are those verified as somatic insertions by their presence in the tumor tissue and absence from adjacent normal tissue. Negative lanes are control PCRs with no template. Primers flanking the putative insertion sites were used to amplify the pre-insertion alleles. Primers within the 3′ end of an L1 consensus sequence and downstream of the putative insertion sites were used to verify the presence of each L1. Anonymous patient identifiers are on the left. NCI-H1395 is a tumor-derived cell line matched with its normal B-lymphocyte control (both from ATCC). The three somatic L1 insertion candidates that have been reported previously in the literature are shown in Figure S1. B) The methylation status of the 20 lung tumors and their adjacent normal tissues used in the pyrosequencing experiment was analyzed by Illumina Infinium analysis. Fifty-nine probes were identified whose changes in methylation status in the tumor specimens relative to matched normal tissues were tightly correlated with somatic L1 retrotransposition (Table S6). The six tumors that were positive for somatic L1 insertions (blue) clustered together, along with one of the remaining tumor samples, ANCO119 (seventh from the left; orange). This tumor lacked a somatic L1 insertion, suggesting that it might nevertheless have an L1-permissive state. The signature expands to 1,928 correlated probes if tumor 119 is included in the L1-permissive class and the least correlated of the six L1-positive tumors (ANCO106) is shifted to the L1-negative class (see Experimental Procedures).
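The screening logic in panel A (present in tumor, absent from matched normal tissue) can be expressed as a simple filter over insertion coordinates. The sketch below is illustrative and is not the authors' pipeline. It assumes each insertion is a hypothetical `(chromosome, position)` tuple and that two calls within `window` bases of each other represent the same insertion; the 50-base default is an arbitrary placeholder. Candidates produced this way would still need PCR validation, as in the figure.

```python
def is_near(site, others, window):
    """True if any site in `others` lies on the same chromosome within `window` bases."""
    chrom, pos = site
    return any(c == chrom and abs(p - pos) <= window for c, p in others)


def somatic_candidates(tumor_sites, normal_sites, window=50):
    """Return tumor insertion sites with no nearby call in the matched normal.

    window is a hypothetical tolerance for junction-position differences
    between independent reads of the same insertion.
    """
    normal_sites = list(normal_sites)
    return sorted(
        site for site in set(tumor_sites)
        if not is_near(site, normal_sites, window)
    )
```

Positions are compared with a tolerance rather than exactly because independent reads of the same junction may not map to an identical base.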

References

    1. Badge RM, Alisch RS, Moran JV. ATLAS: a system to selectively identify human-specific L1 insertions. Am J Hum Genet. 2003;72:823–838.
    2. Batzer MA, Deininger PL. Alu repeats and human genomic diversity. Nat Rev Genet. 2002;3:370–379.
    3. Bestor TH. The host defense function of genomic methylation patterns. Novartis Found Symp. 1998;214:187–195.
    4. Boissinot S, Chevret P, Furano AV. L1 (LINE-1) retrotransposon evolution and amplification in recent human history. Mol Biol Evol. 2000;17:915–928.
    5. Boissinot S, Entezam A, Furano AV. Selection against deleterious LINE-1-containing loci in the human lineage. Mol Biol Evol. 2001;18:926–935.
