On the origin and continuing evolution of SARS-CoV-2
Natl Sci Rev. 2020 Jun;7(6):1012-1023.
doi: 10.1093/nsr/nwaa036. Epub 2020 Mar 3.
Xiaolu Tang 1, Changcheng Wu 1, Xiang Li 2, Yuhe Song 2, Xinmin Yao 1, Xinkai Wu 1, Yuange Duan 1, Hong Zhang 1, Yirong Wang 1, Zhaohui Qian 3, Jie Cui 2, Jian Lu 1
- PMID: 34676127
- PMCID: PMC7107875
- DOI: 10.1093/nsr/nwaa036
Xiaolu Tang et al. Natl Sci Rev. 2020 Jun.
Abstract
The SARS-CoV-2 epidemic started in late December 2019 in Wuhan, China, and has since impacted a large portion of China and raised major global concern. Herein, we investigated the extent of molecular divergence between SARS-CoV-2 and other related coronaviruses. Although we found only 4% variability in genomic nucleotides between SARS-CoV-2 and a bat SARS-related coronavirus (SARSr-CoV; RaTG13), the difference at neutral sites was 17%, suggesting that the divergence between the two viruses is much larger than previously estimated. Our results suggest that the development of new variations in functional sites in the receptor-binding domain (RBD) of the spike protein, seen in SARS-CoV-2 and in pangolin SARSr-CoVs, is likely caused by natural selection in addition to recombination. Population genetic analyses of 103 SARS-CoV-2 genomes indicated that these viruses had two major lineages (designated L and S) that are well defined by two different SNPs showing nearly complete linkage across the viral strains sequenced to date. We found that the L lineage was more prevalent than the S lineage within the limited patient samples we examined. The implication of these evolutionary changes for disease etiology remains unclear. These findings strongly underscore the urgent need for further comprehensive studies that combine viral genomic data with epidemiological studies of coronavirus disease 2019 (COVID-19).
Keywords: SARS-CoV-2; molecular evolution; population genetics; virus.
© The Author(s) 2020. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd.
Figures
Figure 1.
Molecular divergence and selective pressures during the evolution of SARS-CoV-2 and related viruses. (A) The phylogenetic tree of SARS-CoV-2 and related coronaviruses. The branch length (dS) is presented, and the dN/dS (ω) value is given in parentheses. The phylogenetic tree was reconstructed with the synonymous sites in the concatenated CDSs of nine conserved ORFs (orf1ab, E, M, N, S, ORF3a, ORF6, ORF7a and ORF7b). (B) Conservation of six critical amino acid residues in the spike (S) protein. The critical active sites are Y442, L472, N479, D480, T487, and Y491 in SARS-CoV, and they correspond to L455, F486, Q493, S494, N501, and Y505 in SARS-CoV-2 (marked with inverted triangles), respectively. (C) Three candidate positively selected sites (marked with inverted triangles) in the receptor-binding domain (RBD) of the spike protein (S:439N, S:483V and S:493Q) and the surrounding 10 amino acids.
Figure 2.
The frequency spectra of derived mutations in 103 SARS-CoV-2 viruses. Note the derived alleles of synonymous mutations are skewed towards higher frequencies than those of nonsynonymous mutations.
Figure 3.
Linkage disequilibrium between SNPs in the SARS-CoV-2 viruses. (A) LD plot of each pair of SNPs among the 29 sites that have minor alleles in at least two strains. The numbers near the slashes at the top of the image show the coordinates of the sites in the genome. The color of each square is given by the standard (D'/LOD) scheme, and the number in each square is the r² value. (B) The r² of each pair of SNPs (y-axis) against the genomic distance between that pair (x-axis). (C) The LOD of each pair of SNPs (y-axis) against the genomic distance between that pair (x-axis). Note that in both (B) and (C), the red point represents the LD between the SNPs at 8,782 and 28,144.
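As an illustration of the r² statistic plotted in panels (A) and (B), the following minimal Python sketch computes the linkage disequilibrium coefficient D and r² for one pair of biallelic sites from a list of two-site haplotypes. The function name `ld_stats` and the toy haplotype counts are hypothetical and are not taken from the study; the formula is the standard r² = D² / (pA(1 − pA) pB(1 − pB)).

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic sites.

    haplotypes: list of 2-character strings; character 0 is the allele
    at the first site, character 1 the allele at the second site.
    """
    n = len(haplotypes)
    a_alleles = sorted({h[0] for h in haplotypes})
    b_alleles = sorted({h[1] for h in haplotypes})
    if len(a_alleles) != 2 or len(b_alleles) != 2:
        raise ValueError("both sites must be biallelic")

    # Pick one reference allele per site (the choice does not affect r2).
    a, b = a_alleles[0], b_alleles[0]
    p_a = sum(h[0] == a for h in haplotypes) / n    # frequency of a at site 1
    p_b = sum(h[1] == b for h in haplotypes) / n    # frequency of b at site 2
    p_ab = sum(h == a + b for h in haplotypes) / n  # frequency of haplotype ab

    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


# Toy data (hypothetical): two haplotypes in complete linkage.
complete = ["CT"] * 7 + ["TC"] * 3
# Adding a single third haplotype lowers r2 below 1.
nearly_complete = complete + ["CC"]

print(ld_stats(complete)[1])
print(ld_stats(nearly_complete)[1])
```

With only two haplotypes present, r² equals 1; a single strain carrying a third haplotype is enough to pull it below 1, which is the sense in which linkage can be "nearly complete".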
Figure 4.
Haplotype analysis of SARS-CoV-2 viruses. (A) The haplotype networks of SARS-CoV-2 viruses. Blue represents the L lineage, and red is the S lineage. Note that in this study, we marked each sample with a unique ID that starts with the geographical location, followed by the date the virus was isolated (see Table S1 for details). The IDs contain no information about the patient's race or ethnicity. ZJ, Zhejiang; YN, Yunnan; WH, Wuhan; USA, United States of America; TW, Taiwan; SZ, Shenzhen; SD, Shandong; SC, Sichuan; JX, Jiangxi; JS, Jiangsu; HZ, Hangzhou; GZ, Guangzhou; GD, Guangdong; FS, Foshan; CQ, Chongqing. (B) Evolution of the L and S lineages of SARS-CoV-2 viruses. ‘.’, the nucleotide sequence is identical; ‘-’, gap.
Figure 5.
The unrooted phylogenetic tree of the 103 SARS-CoV-2 genomes. The ID of each sample is the same as in Fig. 4A. Note WH_2019/12/31.a represents the reference genome (NC_045512). Note SZ_2020/01/13.a had C at both positions 8,782 and 28,144 in the genome, belonging to neither L nor S lineage.
Figure 6.
The heteroplasmy of SARS-CoV-2 viruses in human patients. The viruses isolated from a patient who lived in the United States (USA_2020/01/21.a, GISAID ID: EPI_ISL_404253) had the genotype Y (C or T) at both 8,782 and 28,144. The most likely explanation is that this patient was infected by both the L and S lineages. Note that the reference belongs to the L lineage.
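The two-SNP lineage definition used in the captions above can be sketched in a few lines of Python. The text here states only that the reference genome is L lineage and that C at both positions fits neither lineage; the specific allele pairs below (C/T for L, T/C for S) are an assumption based on the reference genome NC_045512 and should be checked against the full paper. The function names are hypothetical, and `classify_genome` assumes the sequence is already in reference coordinates (no insertions before position 28,144).

```python
# Assumed defining alleles at 1-based reference positions 8,782 and 28,144.
L_ALLELES = ("C", "T")  # assumption: matches the reference (L lineage)
S_ALLELES = ("T", "C")  # assumption: both sites differ from the reference


def classify_lineage(base_8782, base_28144):
    """Assign a lineage label from the bases at the two defining sites."""
    pair = (base_8782.upper(), base_28144.upper())
    if pair == L_ALLELES:
        return "L"
    if pair == S_ALLELES:
        return "S"
    if "Y" in pair:
        # IUPAC code Y means C or T: a mixed signal at that site.
        return "ambiguous"
    return "unassigned"


def classify_genome(aligned_seq):
    """Classify a genome string given in reference coordinates."""
    return classify_lineage(aligned_seq[8782 - 1], aligned_seq[28144 - 1])


print(classify_lineage("C", "T"))  # L
print(classify_lineage("T", "C"))  # S
print(classify_lineage("C", "C"))  # unassigned
print(classify_lineage("Y", "Y"))  # ambiguous
```

Under these assumptions, a sample with C at both sites (as described for SZ_2020/01/13.a) is left unassigned, and a sample with Y at both sites (as described for USA_2020/01/21.a) is flagged as ambiguous rather than forced into one lineage.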