On the origin and continuing evolution of SARS-CoV-2

Natl Sci Rev. 2020 Jun;7(6):1012-1023.

doi: 10.1093/nsr/nwaa036. Epub 2020 Mar 3.

Xiaolu Tang 1, Changcheng Wu 1, Xiang Li 2, Yuhe Song 2, Xinmin Yao 1, Xinkai Wu 1, Yuange Duan 1, Hong Zhang 1, Yirong Wang 1, Zhaohui Qian 3, Jie Cui 2, Jian Lu 1

Xiaolu Tang et al. Natl Sci Rev. 2020 Jun.

Abstract

The SARS-CoV-2 epidemic started in late December 2019 in Wuhan, China, and has since impacted a large portion of China and raised major global concern. Herein, we investigated the extent of molecular divergence between SARS-CoV-2 and other related coronaviruses. Although we found only 4% variability in genomic nucleotides between SARS-CoV-2 and a bat SARS-related coronavirus (SARSr-CoV; RaTG13), the difference at neutral sites was 17%, suggesting that the divergence between the two viruses is much larger than previously estimated. Our results suggest that the new variations at functional sites in the receptor-binding domain (RBD) of the spike protein seen in SARS-CoV-2 and in pangolin SARSr-CoVs are likely caused by natural selection in addition to recombination. Population genetic analyses of 103 SARS-CoV-2 genomes indicated that these viruses fall into two major lineages (designated L and S) that are well defined by two different SNPs showing nearly complete linkage across the viral strains sequenced to date. We found that the L lineage was more prevalent than the S lineage within the limited patient samples we examined. The implications of these evolutionary changes for disease etiology remain unclear. These findings strongly underscore the urgent need for further comprehensive studies that combine viral genomic data with epidemiological studies of coronavirus disease 2019 (COVID-19).

Keywords: SARS-CoV-2; molecular evolution; population genetics; virus.

© The Author(s) 2020. Published by Oxford University Press on behalf of China Science Publishing & Media Ltd.

Figures

Figure 1.

Molecular divergence and selective pressures during the evolution of SARS-CoV-2 and related viruses. (A) The phylogenetic tree of SARS-CoV-2 and the related coronaviruses. The branch length (dS) is presented, and the dN/dS (ω) value is given in parentheses. The phylogenetic tree was reconstructed with the synonymous sites in the concatenated CDSs of nine conserved ORFs (orf1ab, E, M, N, S, ORF3a, ORF6, ORF7a and ORF7b). (B) Conservation of six critical amino acid residues in the spike (S) protein. The critical active sites are Y442, L472, N479, D480, T487, and Y491 in SARS-CoV, and they correspond to L455, F486, Q493, S494, N501, and Y505 in SARS-CoV-2 (marked with inverted triangles), respectively. (C) Three candidate positively selected sites (marked with inverted triangles) in the receptor-binding domain (RBD) of the spike protein (S:439N, S:483V and S:493Q) and the surrounding 10 amino acids.

Figure 2.

The frequency spectra of derived mutations in 103 SARS-CoV-2 viruses. Note the derived alleles of synonymous mutations are skewed towards higher frequencies than those of nonsynonymous mutations.
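A frequency spectrum of this kind can be tallied from alignment columns once an ancestral base has been assigned to each site. The sketch below is a minimal illustration, not the authors' pipeline: the function names, the input layout (one string per variable site, one character per genome), and the toy columns are all hypothetical. Running it separately on synonymous and nonsynonymous sites would give the two spectra being compared.

```python
from collections import Counter


def derived_allele_counts(column, ancestral):
    """Count each derived (non-ancestral) base in one alignment column.

    Ambiguous bases and gaps are ignored (an assumption of this sketch).
    """
    return Counter(b for b in column.upper() if b in "ACGT" and b != ancestral)


def frequency_spectrum(columns, ancestral_states):
    """Map derived-allele count -> number of mutations seen at that count."""
    spectrum = Counter()
    for column, ancestral in zip(columns, ancestral_states):
        for count in derived_allele_counts(column, ancestral).values():
            spectrum[count] += 1
    return dict(spectrum)


# Hypothetical toy data: three sites across four genomes.
toy_columns = ["AAAT", "CCGG", "TTTT"]
toy_ancestral = ["A", "C", "T"]
spectrum = frequency_spectrum(toy_columns, toy_ancestral)
```

In the toy data, one mutation is carried by a single genome and one by two genomes; the third site is invariant and contributes nothing.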

Figure 3.

Linkage disequilibrium between SNPs in the SARS-CoV-2 viruses. (A) LD plot of all SNP pairs among the 29 sites that have minor alleles in at least two strains. The numbers near the slashes at the top of the image show the coordinates of the sites in the genome. The color of each square follows the standard (D'/LOD) scheme, and the number in each square is the r² value. (B) The r² of each pair of SNPs (y-axis) against the genomic distance between that pair (x-axis). (C) The LOD of each pair of SNPs (y-axis) against the genomic distance between that pair (x-axis). Note that in both (B) and (C), the red point represents the LD between the SNPs at 8,782 and 28,144.
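For readers unfamiliar with the LD statistics named in this caption, the following sketch computes D, |D'| and r² for one pair of biallelic sites in haploid genomes, using the textbook definitions. It is an illustration only: the function name, the input layout (one base per genome at each site), and the toy genotypes are assumptions, and it does not reproduce the software behind the figure or its LOD scores.

```python
def ld_stats(site_a, site_b):
    """Return (D, |D'|, r2) for two biallelic sites in haploid genomes.

    site_a and site_b are equal-length strings holding one base per genome.
    """
    if len(site_a) != len(site_b) or not site_a:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    a1, b1 = site_a[0], site_b[0]  # arbitrary reference allele at each site
    p_a = sum(x == a1 for x in site_a) / n
    p_b = sum(y == b1 for y in site_b) / n
    p_ab = sum(x == a1 and y == b1 for x, y in zip(site_a, site_b)) / n

    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r2 = d * d / denom

    # D' rescales D by its maximum possible magnitude given the frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, abs(d) / d_max, r2


# Hypothetical toy genotypes for five genomes at two perfectly linked sites.
d, d_prime, r2 = ld_stats("CCCTT", "TTTCC")
```

With perfectly linked sites, as in the toy input, both |D'| and r² come out at 1 (up to floating-point error); two independent sites give values near 0.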

Figure 4.

Haplotype analysis of SARS-CoV-2 viruses. (A) The haplotype networks of SARS-CoV-2 viruses. Blue represents the L lineage, and red is the S lineage. Note that in this study, we marked each sample with a unique ID that starts with the geographical location, followed by the date the virus was isolated (see Table S1 for details). The IDs do not contain information on the patient's race or ethnicity. ZJ, Zhejiang; YN, Yunnan; WH, Wuhan; USA, United States of America; TW, Taiwan; SZ, Shenzhen; SD, Shandong; SC, Sichuan; JX, Jiangxi; JS, Jiangsu; HZ, Hangzhou; GZ, Guangzhou; GD, Guangdong; FS, Foshan; CQ, Chongqing. (B) Evolution of the L and S lineages of SARS-CoV-2 viruses. ‘.’, The nucleotide sequence is identical; ‘-’, gap.

Figure 5.

The unrooted phylogenetic tree of the 103 SARS-CoV-2 genomes. The ID of each sample is the same as in Fig. 4A. Note that WH_2019/12/31.a represents the reference genome (NC_045512). Note that SZ_2020/01/13.a had C at both positions 8,782 and 28,144 in the genome and therefore belongs to neither the L nor the S lineage.

Figure 6.

The heteroplasmy of SARS-CoV-2 viruses in human patients. The virus isolated from a patient who lived in the United States (USA_2020/01/21.a, GISAID ID: EPI_ISL_404253) had the genotype Y (C or T) at both 8,782 and 28,144. The most likely explanation is that this patient was infected by both the L and S lineages. Note that the reference genome belongs to the L lineage.
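The lineage calls described in these captions can be expressed as a small lookup on the bases at positions 8,782 and 28,144. The sketch below assumes the allele definitions from the full paper, which this page does not spell out: L carries C at 8,782 and T at 28,144 (the reference state), and S carries T at 8,782 and C at 28,144. The function name and the returned labels are hypothetical.

```python
def classify_lineage(base_8782, base_28144):
    """Assign a lineage label from the bases at genome positions 8,782 and 28,144.

    Assumed definitions (taken from the full paper, not from this page):
    L = (C, T), S = (T, C). The IUPAC code Y means "C or T".
    """
    pair = (base_8782.upper(), base_28144.upper())
    if pair == ("C", "T"):
        return "L"
    if pair == ("T", "C"):
        return "S"
    if pair == ("Y", "Y"):
        # Both sites ambiguous: consistent with a mix of L and S genomes.
        return "mixed (possible L and S co-infection)"
    return "unassigned"


reference_call = classify_lineage("C", "T")   # reference genome state
sz_call = classify_lineage("C", "C")          # pattern reported for SZ_2020/01/13.a
usa_call = classify_lineage("Y", "Y")         # pattern reported for USA_2020/01/21.a
```

Under these assumptions the reference genome is called L, the C/C genome from Fig. 5 is left unassigned, and the Y/Y genotype is flagged as a possible mixed infection rather than forced into either lineage.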

