Replication-associated strand asymmetries in mammalian genomes: toward detection of replication origins
Comparative Study
Proc Natl Acad Sci U S A. 2005 Jul 12;102(28):9836-41.
doi: 10.1073/pnas.0500577102. Epub 2005 Jun 28.
- PMID: 15985556
- PMCID: PMC1174978
- DOI: 10.1073/pnas.0500577102
Marie Touchon et al. Proc Natl Acad Sci U S A. 2005.
Abstract
In the course of evolution, mutations do not affect both strands of genomic DNA equally. This imbalance mainly results from asymmetric DNA mutation and repair processes associated with replication and transcription. In prokaryotes, prevalence of G over C and T over A is frequently observed in the leading strand. The sign of the resulting TA and GC skews changes abruptly when crossing replication-origin and termination sites, producing characteristic step-like transitions. In mammals, transcription-coupled skews have been detected, but so far, no bias has been associated with replication. Here, analysis of intergenic and transcribed regions flanking experimentally identified human replication origins and the corresponding mouse and dog homologous regions demonstrates the existence of compositional strand asymmetries associated with replication. Multiscale analysis of human genome skew profiles reveals numerous transitions that allow us to identify a set of 1,000 putative replication initiation zones. Around these putative origins, the skew profile displays a characteristic jagged pattern also observed in mouse and dog genomes. We therefore propose that in mammalian cells, replication termination sites are randomly distributed between adjacent origins. Taken together, these analyses constitute a step toward genome-wide studies of replication mechanisms.
Figures
Fig. 1.
TA and GC skew profiles around experimentally determined human replication origins. (a) The skew profiles were determined in 1-kbp windows in regions surrounding (±100 kbp without repeats) experimentally determined human replication origins (see Data and Methods). (Upper) TA and GC cumulated skew profiles Σ_TA (thick line) and Σ_GC (thin line). (Lower) Skew S calculated in the same regions. The ΔS amplitudes associated with these origins, calculated as the difference of the skews measured in 20-kbp windows on both sides of the origins, are: MCM4 (31%), HSPA4 (29%), TOP1 (18%), MYC (14%), SCA7 (38%), and AR (14%). (b) Cumulated skew profiles calculated in the six regions of the mouse genome homologous to the human regions analyzed in a. (c) Cumulated skew profiles in the six regions of the dog genome homologous to the human regions analyzed in a. The abscissa (x) represents the distance (in kilobase pairs) of a sequence window to the corresponding origin; the ordinate represents the values of S given in percent. Red, (+) genes (coding strand identical to the Watson strand); blue, (–) genes (coding strand opposite to the Watson strand); black, intergenic regions. In c, genes are not represented.
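The quantities named in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. It assumes the commonly used definitions S_TA = (T − A)/(T + A), S_GC = (G − C)/(G + C) and S = S_TA + S_GC, which the caption itself does not spell out, and it ignores repeat removal. The function names (`skew`, `windowed_skew`, `cumulated`, `delta_s`) are hypothetical.

```python
def skew(seq):
    """Return (S_TA, S_GC, S) for one window, as fractions (not percent).

    Assumed definitions: S_TA = (T-A)/(T+A), S_GC = (G-C)/(G+C), S = S_TA + S_GC.
    """
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    s_ta = (t - a) / (t + a) if (t + a) else 0.0
    s_gc = (g - c) / (g + c) if (g + c) else 0.0
    return s_ta, s_gc, s_ta + s_gc


def windowed_skew(seq, window=1000):
    """Total skew S in consecutive, non-overlapping windows."""
    return [skew(seq[i:i + window])[2]
            for i in range(0, len(seq) - window + 1, window)]


def cumulated(values):
    """Running sum of per-window skews (a cumulated skew profile)."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


def delta_s(seq, origin, flank=20000):
    """Skew of the flank right of `origin` minus skew of the flank left of it."""
    left = skew(seq[max(0, origin - flank):origin])[2]
    right = skew(seq[origin:origin + flank])[2]
    return right - left


# Toy sequence: A-rich on the left of position 100, T-rich on the right.
toy = "A" * 60 + "T" * 40 + "T" * 60 + "A" * 40
print(windowed_skew(toy, window=100))
print(delta_s(toy, origin=100, flank=100))
```

On the toy sequence the skew changes sign at position 100, so `delta_s` is positive, which is the signature of an upward jump. To cumulate TA and GC skews separately, as in the upper panels, `cumulated` can be applied to the first or second component of `skew` instead of the total.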
Fig. 2.
Skew S in regions situated on both sides of human replication origins. The mean values of S were calculated in intergenic regions and in intronic regions situated 5′ (Left) and 3′ (Right) of the six origins analyzed in Fig. 1a. Colors are as in Fig. 1; mean values are in percent ± SEM.
Fig. 3.
Histograms of the |ΔS| amplitudes of the jumps in the S profile. Using the wavelet transform, a set of 5,101 discontinuities was detected (2,415 upward jumps and 2,686 downward jumps; see Data and Methods). The |ΔS| amplitude was calculated as in Fig. 1a. (a) |ΔS| distributions of the jumps presenting G + C < 42%, corresponding to 1,647 upward jumps and 1,755 downward jumps; the threshold |ΔS| ≥ 12.5% (vertical line) corresponded to 1,012 upward jumps that were retained as putative replication origins and to 211 downward jumps (r = 0.21). (b) |ΔS| distributions of the jumps presenting G + C > 42%, with |ΔS| ≥ 12.5% corresponding to 528 upward jumps and 280 downward jumps (r = 0.53). The G + C content was measured in the 100-kbp window surrounding the jump position. Upward jumps are shown in black, and downward jumps are shown with dots. The abscissa represents the values of the |ΔS| amplitudes calculated in percent.
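As a quick arithmetic check, the two values of r quoted in this caption are consistent with r being the ratio of downward to upward jumps that pass the threshold. That reading is an assumption, since the caption does not define r; the counts below are taken directly from the caption, and the dictionary layout is purely illustrative.

```python
# Jumps with |ΔS| >= 12.5%, by G+C class (counts from the caption).
above_threshold = {
    "G+C < 42%": {"upward": 1012, "downward": 211},
    "G+C > 42%": {"upward": 528, "downward": 280},
}


def down_up_ratio(counts):
    """Ratio of downward to upward jumps (assumed meaning of r)."""
    return counts["downward"] / counts["upward"]


ratios = {label: round(down_up_ratio(c), 2)
          for label, c in above_threshold.items()}
print(ratios)  # {'G+C < 42%': 0.21, 'G+C > 42%': 0.53}
```

Under this reading, upward jumps dominate strongly in the low G + C class, where roughly one downward jump passes the threshold for every five upward ones.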
Fig. 4.
Mean skew profile of intergenic regions around putative replication origins. The skew S was calculated in 1-kbp windows (Watson strand) around the position (±300 kbp without repeats) of the 1,012 upward jumps (Fig. 3); 5′ and 3′ transcript extremities were extended by 0.5 and 2 kbp, respectively (filled circles), or by 10 kbp at both ends (stars) (see Data and Methods). The abscissa represents the distance (in kilobase pairs) to the corresponding origin; the ordinate represents the skews calculated for the windows situated in intergenic regions (mean values for all discontinuities and for 10 consecutive 1-kbp window positions). The skews are given in percent (vertical bars, SEM). The lines correspond to linear fits of the values of the skew (stars) for x < –100 kbp and x > 100 kbp.
Fig. 5.
S profiles along mammalian genome fragments. (a) Fragment of chromosome 20 including the TOP1 origin (red vertical line). (b and c) Chromosome 4 and chromosome 9 fragments, respectively, with low G + C content (36%). (d) Chromosome 22 fragment with larger G + C content (48%). In a and b, vertical lines correspond to selected putative origins; yellow lines are linear fits of the S values between successive putative origins. Black, intergenic regions; red, (+) genes; blue, (–) genes. Note the fully intergenic regions upstream of TOP1 in a and from positions 5,290–6,850 kbp in c. (e) Fragment of mouse chromosome 4 homologous to the human fragment shown in c. (f) Fragment of dog chromosome 5 syntenic to the human fragment shown in c. In e and f, genes are not represented.
Fig. 6.
Model of replication termination. Schematic representation of the skew profiles associated with three replication origins O_1, O_2, and O_3; we suppose that these replication origins are adjacent, bidirectional origins with similar replication efficiency. The abscissae represent the sequence positions; the ordinates represent the S values (arbitrary units). Upward (or downward) steps correspond to origin (or termination) positions. For convenience, the termination sites are symmetric relative to O_2. (Left) Three different termination positions T_i, T_j, and T_k, leading to elementary skew profiles S_i, S_j, and S_k. (Center) Superposition of these three profiles. (Right) Superposition of a large number of elementary profiles leading to the final factory-roof pattern. In the simple model, termination occurs with equal probability on both sides of the origins, leading to the linear profile (thick line). In the alternative model, replication termination occurs at lower rates close to the origins, leading to a flattening of the profile (gray line).
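The simple model in this caption can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the authors' code: it places three origins on an integer grid, draws one termination site uniformly at random between each pair of adjacent origins in every replication cycle, assigns S = +1 from an origin up to the termination site and S = −1 from there to the next origin, and averages many such elementary profiles. The function names and the ±1 skew units are hypothetical.

```python
import random


def elementary_profile(origins, terminations):
    """Skew profile of one replication cycle.

    Between two adjacent origins, S = +1 up to the termination site and
    S = -1 beyond it, giving an upward step at each origin and a downward
    step at each termination site.
    """
    profile = []
    for left, right, t in zip(origins, origins[1:], terminations):
        profile += [1 if x < t else -1 for x in range(left, right)]
    return profile


def mean_profile(origins, cycles=2000, seed=0):
    """Average of elementary profiles with uniformly random termination sites."""
    rng = random.Random(seed)
    total = [0] * (origins[-1] - origins[0])
    for _ in range(cycles):
        terminations = [rng.randint(left, right)
                        for left, right in zip(origins, origins[1:])]
        for i, s in enumerate(elementary_profile(origins, terminations)):
            total[i] += s
    return [value / cycles for value in total]


profile = mean_profile([0, 100, 200])
# Print every 25th position to show the shape of the averaged profile.
for x in range(0, 200, 25):
    print(x, round(profile[x], 2))
```

With uniform termination, the averaged skew falls roughly linearly from about +1 just after one origin to about −1 just before the next, then jumps back up at that origin, which reproduces the factory-roof shape of the linear profile. The flattened alternative could be explored by replacing `rng.randint` with a distribution that disfavors sites near the origins.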
Similar articles
- Replication-associated mutational asymmetry in the human genome.
Chen CL, Duquenne L, Audit B, Guilbaud G, Rappailles A, Baker A, Huvet M, d'Aubenton-Carafa Y, Hyrien O, Arneodo A, Thermes C. Mol Biol Evol. 2011 Aug;28(8):2327-37. doi: 10.1093/molbev/msr056. Epub 2011 Mar 2. PMID: 21368316
- DNA replication induces compositional biases in yeast.
Marsolier-Kergoat MC, Goldar A. Mol Biol Evol. 2012 Mar;29(3):893-904. doi: 10.1093/molbev/msr240. Epub 2011 Sep 26. PMID: 21948086
- Replication-associated strand asymmetries in vertebrate genomes and implications for replicon size, DNA replication origin, and termination.
Hou WR, Wang HF, Niu DK. Biochem Biophys Res Commun. 2006 Jun 16;344(4):1258-62. doi: 10.1016/j.bbrc.2006.04.039. Epub 2006 Apr 24. PMID: 16650814
- From GC skews to wavelets: a gentle guide to the analysis of compositional asymmetries in genomic data.
Touchon M, Rocha EP. Biochimie. 2008 Apr;90(4):648-59. doi: 10.1016/j.biochi.2007.09.015. Epub 2007 Sep 29. PMID: 17988781 Review.
- [How genomic approaches help the understanding of the initiation of DNA replication].
Miotto B. Med Sci (Paris). 2017 Feb;33(2):143-150. doi: 10.1051/medsci/20173302009. Epub 2017 Feb 27. PMID: 28240205 Review. French.
Cited by
- Transcription-Replication Conflicts as a Source of Genome Instability.
Goehring L, Huang TT, Smith DJ. Annu Rev Genet. 2023 Nov 27;57:157-179. doi: 10.1146/annurev-genet-080320-031523. Epub 2023 Aug 8. PMID: 37552891 Free PMC article. Review.
- Strand asymmetries across genomic processes.
Moeckel C, Zaravinos A, Georgakopoulos-Soares I. Comput Struct Biotechnol J. 2023 Mar 11;21:2036-2047. doi: 10.1016/j.csbj.2023.03.007. eCollection 2023. PMID: 36968020 Free PMC article. Review.
- Genome-wide measurement of DNA replication fork directionality and quantification of DNA replication initiation and termination with Okazaki fragment sequencing.
Wu X, Liu Y, d'Aubenton-Carafa Y, Thermes C, Hyrien O, Chen CL, Petryk N. Nat Protoc. 2023 Apr;18(4):1260-1295. doi: 10.1038/s41596-022-00793-5. Epub 2023 Jan 18. PMID: 36653528 Review.
- DNA replication initiation shapes the mutational landscape and expression of the human genome.
Murat P, Perez C, Crisp A, van Eijk P, Reed SH, Guilbaud G, Sale JE. Sci Adv. 2022 Nov 11;8(45):eadd3686. doi: 10.1126/sciadv.add3686. Epub 2022 Nov 9. PMID: 36351018 Free PMC article.
- Determination of human DNA replication origin position and efficiency reveals principles of initiation zone organisation.
Guilbaud G, Murat P, Wilkes HS, Lerner LK, Sale JE, Krude T. Nucleic Acids Res. 2022 Jul 22;50(13):7436-7450. doi: 10.1093/nar/gkac555. PMID: 35801867 Free PMC article.