Global landscape of SARS-CoV-2 mutations and conserved regions
Related papers
2021
Background: In late December 2019, a pneumonia of unknown cause was observed among residents of Wuhan, China. The disease, named coronavirus disease 2019 (COVID-19) and declared a pandemic by the World Health Organization (WHO) on March 11, 2020, has resulted in the deaths of millions of people across the globe. Prior to the current COVID-19 pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), two other coronavirus outbreaks, caused by severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), occurred within the last few decades. This review looks at the characteristics that distinguish SARS-CoV-2 from the other coronaviruses (SARS-CoV and MERS-CoV) and their significance for control strategies, including diagnostics. Materials and Methods: Using the keywords “coronavirus mutation”, “nucleotide substitution”, “coronavirus evolution”, “SARS-CoV-2”, “COVID-19” publis...
Pathogens
In December 2019, the first cases of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) were identified in the city of Wuhan, China. Since then, it has spread worldwide with new mutations being reported. The aim of the present study was to monitor the changes in genetic diversity and track non-synonymous substitutions (dN) that could be implicated in the fitness of SARS-CoV-2 and its spread in different regions between December 2019 and November 2020. We analyzed 2213 complete genomes from six geographical regions worldwide, which were downloaded from GenBank and GISAID databases. Although SARS-CoV-2 presented low genetic diversity, there has been an increase over time, with the presence of several hotspot mutations throughout its genome. We identified seven frequent mutations that resulted in dN substitutions. Two of them, C14408T>P323L and A23403G>D614G, located in the nsp12 and Spike protein, respectively, emerged early in the pandemic and showed a consi...
SARS‑CoV‑2 mutation hotspots incidence in different geographic regions
Microbial Biosystems
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a positive-sense, single-stranded RNA virus of the betacoronavirus group that causes COVID-19 (coronavirus disease 2019) and originally emerged in China. RNA viruses are known for their high mutation rates. The mutation rate determines the genome variability and evolution of a virus and can therefore allow it to evade the immune system, become more infectious, change in virulence, and possibly develop resistance to antivirals. A total of 311 SARS-CoV-2 whole-genome sequences were retrieved from the GISAID database for the period from 1 January 2020 to 31 August 2020. The sequences were checked for sequence purity, and multiple sequence alignment against the reference sequence was performed using Clustal Omega, which is embedded in the Jalview software, and BLAST tools. We recorded four newly emerged, high-frequency mutations in all six geographic regions, at positions 2416, 18877, 23401, and 27964. Most of the recorded hotspots were detected in Asia, Europe, and North America. Our findings suggest that SARS-CoV-2 is evolving continuously. Further investigation is required to determine the impact of these mutations: whether they would lead to the emergence of drug-resistant viral strains or strains with increased infectivity and pathogenicity, and how they affect vaccine development and immunogenesis.
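The hotspot screen described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (they report using Clustal Omega in Jalview and BLAST). It assumes the genomes have already been aligned to the reference and that the reference has no gap columns, so alignment columns match reference coordinates. The function name `hotspot_positions`, the `min_fraction` threshold, and the toy sequences are all hypothetical.

```python
from collections import Counter


def hotspot_positions(reference, aligned_genomes, min_fraction=0.10):
    """Return {1-based position: fraction of genomes that differ from the reference}.

    Only positions whose mismatch fraction is at least `min_fraction` are kept.
    The threshold is an illustrative assumption, not a value from the abstract.
    """
    if not aligned_genomes:
        return {}
    mismatch_counts = Counter()
    for genome in aligned_genomes:
        if len(genome) != len(reference):
            raise ValueError("all sequences must share the alignment length")
        for index, (ref_base, base) in enumerate(zip(reference, genome)):
            # Skip gaps and ambiguous calls so they are not counted as substitutions.
            if ref_base in "ACGT" and base in "ACGT" and base != ref_base:
                mismatch_counts[index + 1] += 1
    total = len(aligned_genomes)
    return {
        position: count / total
        for position, count in sorted(mismatch_counts.items())
        if count / total >= min_fraction
    }


# Toy alignment (hypothetical data, not real SARS-CoV-2 sequence).
reference = "ACGTACGTAC"
genomes = [
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGAACGTAC",
    "ACGAAC-TAC",  # the gap at column 7 is ignored, not treated as a mutation
]
hotspots = hotspot_positions(reference, genomes, min_fraction=0.5)
print(hotspots)
```

In this toy input, three of the four genomes carry a T-to-A change at column 4, so that column is the only one reported. Running the same function separately per geographic region would give the regional comparison the abstract describes.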
Assessment of intercontinents mutation hotspots and conserved domains within SARS-CoV-2 genome
Infection, Genetics and Evolution, 2021
Coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 pathogen, has led to waves of a global pandemic, claiming lives and posing a serious threat to public health and to social and physical interactions. To evaluate the mutational landscape and conserved regions in the genome of the causative pathogen, we analysed 7213 complete SARS-CoV-2 protein sequences from infected patients across all regions, mined from the Global Initiative on Sharing All Influenza Data (GISAID) repository through the EpiCoV web interface. The regions of origin and the corresponding numbers of sequences mined are as follows: Asia (2487), Oceania (2027), Europe (1240), Africa (717), South America (391), and North America (351). Highly recurrent mutations were observed across regions, namely T265I in non-structural protein 2 (nsp2), L3606F in nsp6, P4715L in RNA-dependent RNA polymerase (RdRp), D614G in the spike glycoprotein, R203K and G204R in the nucleocapsid phosphoprotein, and Q57H in ORF3a, alongside well-conserved envelope and membrane proteins, 3CLpro, and spike S2 domains. Comparative analyses of the viral sequences reveal that the P4715L and D614G mutations are the most recurrent and concurrent, predominating in Africa (97.20%) and Europe (89.83%) and moderately prevalent in Asia (61.60%). Mutation rates are central to viral transmissibility, evolution, and virulence, and help viruses evade host immunity and develop drug resistance. It is therefore important to understand the mutational spectra of the SARS-CoV-2 genome across regions. This will help in identifying specific genomic sites as potential targets for drug design and vaccine development, monitoring the spread of the virus, and unraveling its evolution, virulence, and transmissibility.
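The regional figures in the abstract above lend themselves to a short worked sketch in Python. The first part only checks that the reported regional counts add up to the stated total of 7213. The second part shows one plausible way a "recurrent and concurrent" percentage could be computed per region. The function name `cooccurrence_by_region` and the per-genome records are hypothetical, and nothing here is taken from the authors' actual workflow.

```python
from collections import defaultdict

# Regional sequence counts as reported in the abstract.
region_counts = {
    "Asia": 2487,
    "Oceania": 2027,
    "Europe": 1240,
    "Africa": 717,
    "South America": 391,
    "North America": 351,
}
total_sequences = sum(region_counts.values())


def cooccurrence_by_region(records, mutation_a, mutation_b):
    """Percentage of genomes in each region that carry both mutations.

    `records` is an iterable of (region, set_of_mutation_labels) pairs.
    """
    totals = defaultdict(int)
    both = defaultdict(int)
    for region, mutations in records:
        totals[region] += 1
        if mutation_a in mutations and mutation_b in mutations:
            both[region] += 1
    return {
        region: round(100 * both[region] / totals[region], 2)
        for region in totals
    }


# Hypothetical per-genome mutation calls, for illustration only.
records = [
    ("Africa", {"P4715L", "D614G"}),
    ("Africa", {"P4715L", "D614G", "Q57H"}),
    ("Asia", {"L3606F"}),
    ("Asia", {"P4715L", "D614G"}),
    ("Asia", {"D614G"}),
    ("Asia", set()),
]
shares = cooccurrence_by_region(records, "P4715L", "D614G")
print(total_sequences, shares)
```

Counting a genome only when both labels are present is what separates a co-occurrence rate from the frequency of either mutation alone. The abstract does not say which of the two its percentages report, so this definition is an assumption.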
In late December 2019, an emerging viral infection, COVID-19, was identified in Wuhan, China, and became a global pandemic. Characterization of the genetic variants of SARS-CoV-2 is crucial for following and evaluating its spread across countries. In this study, we collected and analyzed 3,067 SARS-CoV-2 genomes isolated from 55 countries during the first three months after the emergence of the virus. Using comparative genomic analysis, we traced the profiles of the whole-genome mutations and compared the frequency of each mutation in the studied population. The accumulation of mutations during the epidemic period, together with their geographic locations, was also monitored. The results showed 782 variant sites, of which 512 (65.47%) had a non-synonymous effect. Frequencies of mutated alleles revealed the presence of 38 recurrent non-synonymous mutations, including ten hotspot mutations with a prevalence higher than 0.10 in this population and distributed in six SARS-CoV-2 genes. The distribution o...
Genomic, geographic and temporal distributions of SARS-CoV-2 mutations
The COVID-19 pandemic is the most significant public health issue in recent history. Its causal agent, SARS-CoV-2, has evolved rapidly since its first emergence in December 2019. Mutations in the viral genome have critical impacts on the adaptation of viral strains to the local environment, and may alter the characteristics of viral transmission, disease manifestation, and the efficacy of treatment and vaccination. Using the complete sequences of 1,932 SARS-CoV-2 genomes, we examined the genomic, geographic and temporal distributions of aged, new, and frequent mutations of SARS-CoV-2, and identified six phylogenetic clusters of the strains, which also exhibit geographic preferences across continents. Mutations in the form of single nucleotide variations (SNVs) provide a direct interpretation for the six phylogenetic clusters. Linkage disequilibrium, haplotype structure, evolutionary process, and the global distribution of mutations together sketch the mutational history. Additionally, we found a positive correlation between the average mutation count and case fatality, and this correlation strengthened over time, suggesting an important role of SNVs in disease outcomes. This study suggests that SNVs may become an important consideration in virus detection, clinical treatment, drug design, and vaccine development to avoid target shifting, and that continued isolation and sequencing is a crucial component in the fight against this pandemic.
Significance Statement: Mutation is the driving force of evolution for viruses like SARS-CoV-2, the causal agent of COVID-19. In this study, we discovered that the genome of SARS-CoV-2 is changing rapidly from the originally isolated form. These mutations have spread around the world as the virus has caused more than 2.5 million infections and 170 thousand deaths. We found that fourteen frequent mutations identified in this study can characterize the six main clusters of SARS-CoV-2 strains. In addition, we found that the mutation burden is positively correlated with the fatality of COVID-19 patients. Understanding mutations in the SARS-CoV-2 genome will provide useful insight for the design of treatment and vaccination.
2020
The all-pervasiveness and dynamic nature of the COVID-19 pandemic warrant comprehensive and constant surveillance of the numerous mutations that are accumulating in global SARS-CoV-2 genomes and contributing to the microevolution of the various lineages of the novel coronavirus. This would help us gain insights into the evolving pathogenicity of the virus and thereby improve our control and therapeutic strategies. This study explores the genome-wide frequency, gene-wise distribution, and molecular nature of the large repertoire of point mutations detected across a global dataset of 3,608 SARS-CoV-2 RNA genomes short-listed from a total of 5,485 whole-genome sequences deposited in GenBank up to 4 June 2020, using a download filter that eliminated all incomplete or gapped sequences. Phylogenomic analysis involving all existing SARS-CoV-2 lineages, represented by 3,740 whole-genome sequences from human sources (out of a total of 63,894 sequences stored in the GISAID repository, as on 15 J...
Scientific Reports
The identification of deleterious mutations in different variants of SARS-CoV-2 and their roles in the morbidity of COVID-19 patients has yet to be thoroughly investigated. To unravel the spectrum of mutations and their effects within SARS-CoV-2 genomes, we analyzed 5,724 complete genomes from deceased COVID-19 patients sourced from the GISAID database. This analysis was conducted using the Nextstrain platform, applying a generalized time-reversible model for evolutionary phylogeny. These genomes were compared to the reference strain (hCoV-19/Wuhan/WIV04/2019) using MAFFT v7.470. Our findings revealed that SARS-CoV-2 genomes from deceased individuals belonged to 21 Nextstrain clades, with clade 20I (Alpha variant) being the most predominant, followed by clade 20H (Beta variant) and clade 20J (Gamma variant). The majority of SARS-CoV-2 genomes from deceased patients (33.4%) were sequenced in North America, while the lowest percentage (0.98%) came from Africa. The ‘G’ clade was domina...
Identification of Novel Missense Mutations in a Large Number of Recent SARS-CoV-2 Genome Sequences
Background: SARS-CoV-2 infection has spread to over 200 countries since it was first reported in December 2019. Significant country-specific variations in infection and mortality rates have been noted. Although country-specific differences in public health response have had a large impact on infection rate control, it is currently unclear whether evolution of the virus itself has also contributed to variations in infection and mortality rates. Previous studies on SARS-CoV-2 mutations were based on the analysis of ~160 SARS-CoV-2 sequences available until mid-February 2020. By mid-April, >550 SARS-CoV-2 sequences had been deposited in GenBank, and over 8,200 in the GISAID database. Methods: We performed a sequence analysis on 474 SARS-CoV-2 genomes submitted to GenBank up to April 11, 2020 by multiple alignment using Map to a Reference Assembly and Variants/SNP identification. The results were verified on a larger set of 8,126 hCoV-19 (SARS-CoV-2) sequences from the GISAID database. Results: We identified 5 recently emerged mutations in many isolates (up to 40%). Our analysis highlights 5
2021
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of the current coronavirus pandemic, was first confirmed in China in December 2019. In the current study, we assessed the genome variation of the SARS-CoV-2 viruses circulating in Asia in the first months of the pandemic. We analyzed 131 randomly selected complete sequences of SARS-CoV-2 from December 2019 to April 2020. The results showed that there were fifteen major mutations in Asia, most of which had co-evolved. These prevalent co-mutations resulted in clades G, GH, GR, S and O. Furthermore, sequences with the 26144G>T point mutation had low variability and no co-mutation, and formed clade V. Our results indicate that most of the viruses circulating in Asia early in the pandemic fell into five co-mutation groups.