The Human Genome Project Reveals a Continuous Transfer of Large Mitochondrial Fragments to the Nucleus

Journal Article

Tobias Mourier ,

Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Copenhagen, Denmark

Anders J. Hansen ,

Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Copenhagen, Denmark

Eske Willerslev ,

Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Copenhagen, Denmark

Peter Arctander

Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Copenhagen, Denmark

Published:

01 September 2001

Tobias Mourier, Anders J. Hansen, Eske Willerslev, Peter Arctander, The Human Genome Project Reveals a Continuous Transfer of Large Mitochondrial Fragments to the Nucleus, Molecular Biology and Evolution, Volume 18, Issue 9, September 2001, Pages 1833–1837, https://doi.org/10.1093/oxfordjournals.molbev.a003971
Mitochondrial genomes are believed to gradually transfer DNA fragments (numts) into the nuclear chromosomes of eukaryotic cells during evolution (reviewed in Zhang and Hewitt 1996). This assumption relies on hybridization studies of mitochondrial DNA sequences (mtDNA) (Tsuzuki et al. 1983), sequencing of numts (e.g., Lopez et al. 1994; Arctander 1995; Zischler et al. 1995; Herrnstadt et al. 1999), and similarity searches in sequence databases (Blanchard and Schmidt 1996; Bensasson et al. 2001). Here we present the first extensive analysis of numts in the human nuclear genome. Through a combination of conventional BLAST alignment (Altschul et al. 1997) and a DNA block aligning (DBA) algorithm (Jareborg, Birney, and Durbin 1999), we searched roughly 93.5% of the human genome (http://www.ncbi.nlm.nih.gov/genome/seq/) for numts. This approach revealed three notable findings. First, several numts exceed the size of the longest human numt reported to date (Herrnstadt et al. 1999). Second, all parts of the mitochondrial DNA are represented in the nuclear genome. Finally, the integration of mtDNAs into the nucleus is a continuous evolutionary process, thereby verifying previous beliefs (Zhang and Hewitt 1996; Wallace et al. 1997; Herrnstadt et al. 1999).

Through the web service provided by NCBI (http://www.ncbi.nlm.nih.gov/), we compared the complete human mitochondrial DNA and the working draft of the human nuclear genome (as of mid-April 2001) using BLAST. This procedure was followed by alignment using the DBA algorithm (Jareborg, Birney, and Durbin 1999), which finds collinear blocks of conserved sequence while allowing for indels between blocks. The rationale for this twofold alignment procedure stems from the assumption that two mechanisms may obscure the BLAST alignment. First, the extant mtDNA will have diverged from the ancestral sequence. Second, as the numts are presumably released from selection, larger deletions and insertions may take place.

Hits from the BLAST search (default settings) in the same sense and within the vicinity (4–6,128 bp) of each other were considered to potentially stem from a single insertion event. If such a group of hits involved more than 100 identical positions, the genomic sequence covering all the hits and their intervening sequences was retrieved. This sequence was aligned to the corresponding mtDNA sequence using the DBA algorithm. The sequences were considered the result of a single insertion event if the DBA algorithm was able to align more than 80% of the mtDNA sequence in a collinear way.
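The grouping and acceptance criteria described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the code used in the study: the `Hit` record and all function names are hypothetical, 6,128 bp is treated as a maximum gap (the text reports 4–6,128 bp as the distances between grouped hits), and identical positions are summed over the hits of a group. The DBA alignment itself is not reproduced; only its 80% threshold is applied to an aligned length supplied by the caller.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hit:
    """One BLAST hit of mtDNA against a genomic contig (hypothetical record)."""
    contig: str
    strand: str      # '+' or '-': the sense of the hit
    start: int       # genomic start, 1-based inclusive
    end: int         # genomic end, inclusive
    identities: int  # number of identical positions in the hit


MAX_GAP_BP = 6128        # assumption: largest distance between grouped hits
MIN_IDENTITIES = 100     # a group must involve MORE than this many identities
MIN_DBA_FRACTION = 0.80  # DBA must align MORE than this fraction of the mtDNA


def group_hits(hits: List[Hit], max_gap: int = MAX_GAP_BP) -> List[List[Hit]]:
    """Group hits on the same contig and strand that lie within max_gap bp."""
    groups: List[List[Hit]] = []
    for hit in sorted(hits, key=lambda h: (h.contig, h.strand, h.start)):
        if groups:
            prev = groups[-1]
            same_locus = (prev[0].contig == hit.contig
                          and prev[0].strand == hit.strand)
            group_end = max(h.end for h in prev)
            if same_locus and hit.start - group_end <= max_gap:
                prev.append(hit)
                continue
        groups.append([hit])
    return groups


def candidate_groups(hits: List[Hit]) -> List[List[Hit]]:
    """Keep groups whose hits together exceed MIN_IDENTITIES identities."""
    return [g for g in group_hits(hits)
            if sum(h.identities for h in g) > MIN_IDENTITIES]


def genomic_span(group: List[Hit]) -> Tuple[int, int]:
    """Genomic interval covering all hits and their intervening sequence."""
    return min(h.start for h in group), max(h.end for h in group)


def is_single_insertion(aligned_mt_bp: int, mt_region_bp: int) -> bool:
    """Apply the >80% criterion to a DBA result computed elsewhere."""
    return aligned_mt_bp / mt_region_bp > MIN_DBA_FRACTION
```

The sketch does not check collinearity, which in the procedure described above is a property of the DBA alignment rather than of the BLAST hits.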

Following the above criteria, we found 296 numts ranging between 106 and 14,654 bp in size (table 1). Fifteen of these were found to be longer than 5,842 bp, previously reported by Herrnstadt et al. (1999) as the length of the longest human numt.

Furthermore, we found that all positions of the mitochondrial genome are represented in the nuclear DNA, with the domain comprising the control region being relatively underrepresented (fig. 1). As this could be an artifact caused by the distal position of the control region in the linear mtDNA sequence, we constructed an alternative representation in which the control region was central. Neither this nor the removal of the low-complexity filter of BLAST produced additional hits to this region (not shown). The deficiency of numts from the control region probably results from the significantly higher evolutionary rate of extant mtDNA in this region (Saccone, Pesole, and Sbisá 1991). This hypothesis is further supported by the increased number of numts in the region comprising the central conserved domain (fig. 1).
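A per-position count of the kind summarized in figure 1, and the re-linearization that places the control region centrally, could be computed along the following lines. The interval representation (1-based, inclusive mtDNA coordinates, with start greater than end marking a numt that spans the origin of the circular genome) and the function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def coverage(intervals: Sequence[Tuple[int, int]], genome_len: int) -> List[int]:
    """For each 1-based mtDNA position, count how many numts cover it.

    An interval with start > end is taken to wrap around the origin of the
    circular genome (assumed convention).
    """
    diff = [0] * (genome_len + 1)  # difference array, one extra slot at the end
    for start, end in intervals:
        if start <= end:
            diff[start - 1] += 1
            diff[end] -= 1
        else:
            # Split a wrapping interval into [start, genome_len] and [1, end].
            diff[start - 1] += 1
            diff[genome_len] -= 1
            diff[0] += 1
            diff[end] -= 1
    counts, running = [], 0
    for delta in diff[:genome_len]:
        running += delta
        counts.append(running)
    return counts


def rotate(seq: str, new_origin: int) -> str:
    """Re-linearize a circular sequence so that 1-based position new_origin
    becomes the first base."""
    i = new_origin - 1
    return seq[i:] + seq[:i]
```

With such counts, the statement that all positions are represented corresponds to `all(c > 0 for c in coverage(...))`, and `rotate` gives an alternative linear query in which a chosen region no longer sits at the ends of the sequence.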

Interestingly, we found four numts covering the complete control region (table 1), signifying that at least these numts result from a DNA-based transfer (for a discussion see Shay and Werbin [1992] and references therein).

To estimate the time of insertion of the numts, we collected all numt-mitochondria alignments longer than 2,000 bp (i.e., either complete numts, if they were completely alignable, or subsets of numts of which DBA blocks exceeded 2,000 bp) and aligned these with the corresponding mtDNA sequences from a variety of mammals. The phylogenetic analysis supported the general conviction that numt DNAs are continually integrated into the nuclear genome as a result of several independent evolutionary events (fig. 2).
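The selection of alignment blocks for the phylogenetic analysis can be illustrated with a short Python sketch. The input format (pairs of numt identifier and block length) and the way letter suffixes are assigned to several blocks from one numt are assumptions for illustration; the suffixes actually used are those given in table 1.

```python
from collections import defaultdict
from string import ascii_lowercase
from typing import Dict, List, Tuple

MIN_BLOCK_BP = 2000  # blocks must be LONGER than 2,000 bp


def select_blocks(blocks: List[Tuple[int, int]]) -> List[Tuple[str, int]]:
    """Keep blocks longer than MIN_BLOCK_BP and label them per numt.

    blocks: (numt_id, block_length_bp) pairs in input order.
    Returns (label, block_length_bp); a numt contributing two or more
    blocks gets letter suffixes (a, b, ...) in input order.
    """
    by_numt: Dict[int, List[int]] = defaultdict(list)
    for numt_id, length in blocks:
        if length > MIN_BLOCK_BP:
            by_numt[numt_id].append(length)

    labelled: List[Tuple[str, int]] = []
    for numt_id, lengths in by_numt.items():
        if len(lengths) == 1:
            labelled.append((str(numt_id), lengths[0]))
        else:
            for suffix, length in zip(ascii_lowercase, lengths):
                labelled.append((f"{numt_id}{suffix}", length))
    return labelled
```

Each selected block would then be aligned with the corresponding mtDNA region of the other mammals and analyzed separately.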

Since we used the working draft of the human nuclear genome for analysis, we cannot exclude that some of the recent integration events are simply due to erroneous sequencing of mitochondrial contamination. However, this will not change the above conclusions. On the contrary, the above findings may be an underestimate, since recently transferred numts may not have reached fixation (e.g., Zischler et al. 1995) and therefore may not be present in the available human genome draft.

This study presents the first large-scale survey of human numts based on the Human Genome Project, an initial step on the way to a complete catalog of human numts.

As previously stated (Perna and Kocher 1996), human numts may serve as both obstacles and tools in understanding the evolution of the human mitochondria. For example, the large number of long numts can confound studies on mitochondrial heteroplasmy as well as phylogenetic and population studies using mtDNA markers. For these studies, decisive knowledge of human numts may be crucial in detecting erroneous results due to false amplification of nuclear homologs.

On the other hand, since numts may be regarded as “molecular fossils” of mtDNA (Zischler, Geisert, and Castresana 1998), they may provide fruitful insight into the evolution of modern human mitochondria and help to uncover the evolutionary basis of contemporary human diseases related to the genetics of the mitochondria.

Supplementary Materials

A table of all 296 human numts is provided on the Molecular Biology and Evolution web site.

Pekka Pamilo, Reviewing Editor

Keywords: mitochondrial DNA nuclear insertions human genome

Address for correspondence and reprints: Tobias Mourier, Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark. [email protected] .

Table 1 The 60 Longest Human Numts

Fig. 1.—Circular diagram of the number of numts descending from a given position in the mitochondria (thick line). The inner hatched circle depicts the mitochondria, with the two hypervariable segments of the control region (encompassing the central conserved domain) highlighted (black)

Fig. 2.—Consensus tree of the phylogenetic positions of human numts, based on 35 individual bootstrap analyses of all blocks from the DBA alignment longer than 2,000 bp. The trees were constructed in PAUP*, version 4.0b4a (Swofford 1998), using the neighbor-joining algorithm based on maximum-likelihood (ML) distance measures. The shape parameters of the gamma distributions, α (0.28–0.43), and the transition-transversion rates (1.6–2.9) were estimated using ML. Six branching points are depicted on the tree (A–F). On the basis of 100% support with 100 bootstrap replicates and platypus as the outgroup, numts could be confined to one or more of the branching points, as shown below the tree. For example, numts listed in the gray box (A) have 100% bootstrap support positioned at branching point A, whereas numts listed in the box (A–B) with the same support can only be confined to either branching point A or branching point B. Needless to say, numts in the box covering all positions (A–F) are restricted to the primate clade, but their exact position is undetermined. If two or more alignment blocks come from the same numt, these have letter suffixes (see table 1 for details). The following mtDNA sequences were used (GenBank accession numbers in parentheses): human (Homo sapiens; NC_001807), chimpanzee (Pan troglodytes; NC_001643), gorilla (Gorilla gorilla; NC_001645), orangutan (Pongo pygmaeus; NC_001646), gibbon (Hylobates lar; NC_002082), baboon (Papio hamadryas; NC_001992), wallaroo (Macropus robustus; NC_001794), opossum (Didelphis virginiana; NC_001610), and platypus (Ornithorhynchus anatinus; NC_000891).
Nonprimate placentals: alpaca (Lama pacos; NC_002504), armadillo (Dasypus novemcinctus; NC_001821), bat (Chalinolobus tuberculatus; NC_002626), cat (Felis catus; NC_001700), cow (Bos taurus; NC_001567), European hedgehog (Erinaceus europaeus; NC_002080), flying fox (Pteropus scapulatus; NC_002619), guinea pig (Cavia porcellus; NC_000884), Madagascar hedgehog (Echinops telfairi; NC_002631), rabbit (Oryctolagus cuniculus; NC_001913), squirrel (Sciurus vulgaris; NC_002369), and tree shrew (Tupaia belangeri; NC_002521).

We thank Douda Bensasson, Kasi B. Desfor, Sylvia Mathiasen, and Seirian Sumner for help and discussions. A.J.H. and E.W. were supported by the VELUX foundation of 1981, Denmark. A.J.H. and E.W. contributed equally to this work and should be regarded as joint authors.

References

Altschul S. F., T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.

Arctander P. 1995. Comparison of a mitochondrial gene and a corresponding nuclear pseudogene. Proc. R. Soc. Lond. B Biol. Sci. 262:13-19.

Bensasson D., D.-X. Zhang, D. Hartl, G. Hewitt. 2001. Mitochondrial pseudogenes: evolution's misplaced witnesses. Trends Ecol. Evol. 16:314-321.

Blanchard J. L., G. W. Schmidt. 1996. Mitochondrial DNA migration events in yeast and humans: integration by a common end-joining mechanism and alternative perspectives on nucleotide substitution patterns. Mol. Biol. Evol. 13:537-548.

Herrnstadt C., W. Clevenger, S. S. Ghosh, C. Anderson, E. Fahy, S. Miller, N. Howell, R. E. Davis. 1999. A novel mitochondrial DNA-like sequence in the human nuclear genome. Genomics 60:67-77.

Jareborg N., E. Birney, R. Durbin. 1999. Comparative analysis of noncoding regions of 77 orthologous mouse and human gene pairs. Genome Res. 9:815-824.

Lopez J. V., N. Yuhki, R. Masuda, W. Modi, S. J. O'Brien. 1994. Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat. J. Mol. Evol. 39:174-190.

Perna N. T., T. D. Kocher. 1996. Mitochondrial DNA: molecular fossils in the nucleus. Curr. Biol. 6:128-129.

Saccone C., G. Pesole, E. Sbisá. 1991. The main regulatory region of mammalian mitochondrial DNA: structure-function model and evolutionary pattern. J. Mol. Evol. 33:83-91.

Shay J. W., H. Werbin. 1992. New evidence for the insertion of mitochondrial DNA into the human genome: significance for cancer and aging. Mutat. Res. 275:227-235.

Swofford D. L. 1998. PAUP*: phylogenetic analysis using parsimony (*and other methods). Version 4. Sinauer, Sunderland, Mass.

Tsuzuki T., H. Nomiyama, C. Setoyama, S. Maeda, K. Shimada. 1983. Presence of mitochondrial-DNA-like sequences in the human nuclear DNA. Gene 25:223-229.

Wallace D. C., C. Stugard, D. Murdock, T. Schurr, M. D. Brown. 1997. Ancient mtDNA sequences in the human nuclear genome: a potential source of errors in identifying pathogenic mutations. Proc. Natl. Acad. Sci. USA 94:14900-14905.

Zhang D.-X., G. M. Hewitt. 1996. Nuclear integrations: challenges for mitochondrial DNA markers. Trends Ecol. Evol. 11:247-251.

Zischler H., H. Geisert, A. von Haeseler, S. Pääbo. 1995. A nuclear ‘fossil’ of the mitochondrial D-loop and the origin of modern humans. Nature 378:489-492.

Zischler H., H. Geisert, J. Castresana. 1998. A hominoid-specific nuclear insertion of the mitochondrial d-loop: implications for reconstructing ancestral mitochondrial sequences. Mol. Biol. Evol. 15:463-469.
