Comparative analysis of twelve genomes of three novel group 2c and group 2d coronaviruses reveals unique group and subgroup features

Comparative Study

J Virol. 2007 Feb;81(4):1574-85.

doi: 10.1128/JVI.02182-06. Epub 2006 Nov 22.

Patrick C Y Woo, Ming Wang, Susanna K P Lau, Huifang Xu, Rosana W S Poon, Rongtong Guo, Beatrice H L Wong, Kai Gao, Hoi-Wah Tsoi, Yi Huang, Kenneth S M Li, Carol S F Lam, Kwok-Hung Chan, Bo-Jian Zheng, Kwok-Yung Yuen

Patrick C Y Woo et al. J Virol. 2007 Feb.

Abstract

Twelve complete genomes of three novel coronaviruses, namely bat coronavirus HKU4 (bat-CoV HKU4) and bat-CoV HKU5 (both putative group 2c) and bat-CoV HKU9 (putative group 2d), were sequenced. Comparative genome analysis showed that the various open reading frames (ORFs) of the genomes of the three coronaviruses had significantly higher amino acid identities to those of other group 2 coronaviruses than to those of group 1 and 3 coronaviruses. Phylogenetic trees constructed using chymotrypsin-like protease, RNA-dependent RNA polymerase, helicase, spike, and nucleocapsid all showed that the group 2a and 2b and putative group 2c and 2d coronaviruses are more closely related to each other than to group 1 and 3 coronaviruses. Unique genomic features distinguishing between these four subgroups, including the number of papain-like proteases, the presence or absence of hemagglutinin esterase, small ORFs between the membrane and nucleocapsid genes and ORFs (NS7a and NS7b), bulged stem-loop and pseudoknot structures downstream of the nucleocapsid gene, transcription regulatory sequence, and ribosomal recognition signal for the envelope gene, were also observed. This is the first time that NS7a and NS7b downstream of the nucleocapsid gene have been found in a group 2 coronavirus. The high Ka/Ks ratio of NS7a and NS7b in bat-CoV HKU9 implies that these two group 2d-specific genes are under high selective pressure and hence are rapidly evolving. The four subgroups of group 2 coronaviruses probably originated from a common ancestor. Further molecular epidemiological studies on coronaviruses in the bats of other countries, as well as in other animals, and complete genome sequencing will shed more light on coronavirus diversity and their evolutionary histories.
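The Ka/Ks argument in the abstract can be illustrated with a minimal sketch. It assumes that Ka (nonsynonymous substitutions per nonsynonymous site) and Ks (synonymous substitutions per synonymous site) have already been estimated by some method; the abstract does not say which method or software was used, and it reports no numeric values. The function names, the `tolerance` band around 1, and the example numbers are hypothetical and serve only to show the conventional reading of the ratio.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks from precomputed per-site substitution rates."""
    if ka < 0:
        raise ValueError("Ka must be non-negative")
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    return ka / ks


def selection_regime(ratio: float, tolerance: float = 0.1) -> str:
    """Conventional interpretation of a Ka/Ks ratio.

    `tolerance` is an illustrative band around 1, not a value from the paper.
    """
    if ratio > 1 + tolerance:
        return "positive (diversifying) selection"
    if ratio < 1 - tolerance:
        return "purifying selection"
    return "approximately neutral"


if __name__ == "__main__":
    # Hypothetical values, not taken from the study.
    for ka, ks in [(0.50, 0.25), (0.05, 0.50)]:
        r = ka_ks_ratio(ka, ks)
        print(f"Ka={ka}, Ks={ks}, Ka/Ks={r:.2f}: {selection_regime(r)}")
```

A gene with a ratio well above those of its neighbors in the same genome is accumulating amino acid changes comparatively fast, which is the sense in which the abstract describes NS7a and NS7b as rapidly evolving.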

Figures

FIG. 1.

Phylogenetic analysis of amino acid sequences of the 393-bp fragment of RNA-dependent RNA polymerase of coronaviruses identified from bats in the present study. The tree was constructed by the neighbor-joining method using the Jukes-Cantor correction and bootstrap values calculated from 1,000 trees. The scale bar indicates the estimated number of substitutions per 50 amino acids. Coronaviruses identified in the present study are shown in boldface. Coronaviruses from bats are shaded in gray. HCoV-229E (NC_002645); PEDV, porcine epidemic diarrhea virus (NC_003436); TGEV (NC_002306); FIPV (AY994055); HCoV-NL63 (NC_005831); bat-CoV HKU2 (DQ249235), HKU4 (DQ074652), HKU5 (DQ249219), HKU6 (DQ249224), HKU7 (DQ249226), and HKU8 (DQ249228); CoV-HKU1 (NC_006577); HCoV-OC43 (NC_005147); MHV, murine hepatitis virus (NC_006852); BCoV, bovine coronavirus (NC_003045); PHEV, porcine hemagglutinating encephalomyelitis virus (NC_007732); SDAV; SARS-CoV (human), human SARS coronavirus (NC_004718); SARS-CoV (Civet), civet SARS-like coronavirus (AY304488); bat-SARS-CoV HKU3, bat-SARS-like coronavirus HKU3 (DQ022305); IBV, infectious bronchitis virus (NC_001451); TCoV, turkey coronavirus (AF124991); IBV-like, IBV isolated from peafowl (AY641576). Other abbreviations are as defined in the text.

FIG. 2.

Genome organizations of bat-CoV HKU4, bat-CoV HKU5, bat-CoV HKU9, and representative coronaviruses from each group. Papain-like proteases (PL1, PL2, and PL) and the nonstructural proteins are represented by white boxes. Hemagglutinin esterase (HE), spike (S), envelope (E), membrane (M), and nucleocapsid (N) are represented by gray boxes.

FIG. 3.

Multiple alignments of PLpro of SARS-CoV, btCoV/133/05 (NC_008315), bat-CoV HKU4, bat-CoV HKU5, bat-CoV HKU9, and IBV and PL2pro of HCoV-229E, TGEV, HCoV-OC43, and MHV. Amino acids conserved across all coronaviruses are highlighted in black. Amino acids conserved in 60 to 90% of the coronaviruses are highlighted in gray. The conserved Cys and His amino acid residues of the catalytic dyad are marked with an asterisk, the conserved postulated metal-chelating Cys and His residues are marked with a “#” symbol, and the conserved aromatic amino acid immediately downstream of the catalytic Cys is marked with a “+” symbol.

FIG. 4.

Predicted bulged stem-loop and pseudoknot structures downstream of N in genomes of bat-CoV HKU4 and bat-CoV HKU5. Stop codons for the N genes are boxed. Broken lines indicate alternative base pairing.

FIG. 5.

Phylogenetic analysis of chymotrypsin-like protease (3CLpro), RNA-dependent RNA polymerase (Pol), helicase (Hel), spike (S), and nucleocapsid (N) of bat-CoV HKU4, bat-CoV HKU5, and bat-CoV HKU9. The trees were constructed by the neighbor-joining method using the Jukes-Cantor correction and bootstrap values calculated from 1,000 trees. We included 327, 949, 609, 1,661, and 582 amino acid positions in 3CLpro, Pol, helicase, S and N, respectively, in the analysis. The scale bar indicates the estimated number of substitutions per 10 amino acids. Abbreviations are as defined in the text or in the legend to Fig. 1.
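The Jukes-Cantor correction named in the legends for Fig. 1 and Fig. 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the legends do not name the software used, and the code assumes the standard equal-rates form of the correction, d = -b ln(1 - p/b) with b = (k - 1)/k, where p is the observed proportion of differing sites and k is the number of character states (20 for amino acids, 4 for nucleotides). The function names and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return diffs / compared


def jukes_cantor(p: float, n_states: int = 20) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p.

    n_states=20 assumes amino acid data; use 4 for nucleotides.
    Returns infinity when p reaches the saturation limit (k - 1)/k.
    """
    b = (n_states - 1) / n_states
    if p >= b:
        return math.inf
    return -b * math.log(1 - p / b)


if __name__ == "__main__":
    # Toy aligned peptides, not sequences from the study.
    s1 = "ACDEFGHIKL"
    s2 = "ACDEFGHIKV"
    p = p_distance(s1, s2)
    print(f"p-distance: {p:.3f}")
    print(f"Jukes-Cantor distance (20 states): {jukes_cantor(p):.4f}")
```

The corrected distance is always at least as large as the raw p-distance because it accounts for multiple substitutions at the same site; a matrix of such pairwise distances is the input to neighbor-joining tree construction.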
