FluShuffle and FluResort: new algorithms to identify reassorted strains of the influenza virus by mass spectrometry

Aaron TL Lun et al. BMC Bioinformatics. 2012.

Abstract

Background: Influenza is one of the oldest and deadliest infectious diseases known to man. Reassorted strains of the virus pose the greatest risk to both human and animal health and have been associated with all pandemics of the past century, with the possible exception of the 1918 pandemic, resulting in tens of millions of deaths. We have developed and tested new computer algorithms, FluShuffle and FluResort, which enable reassorted viruses to be identified by the most rapid and direct means possible. These algorithms enable reassorted influenza, and other, viruses to be rapidly identified to allow prevention strategies and treatments to be more efficiently implemented.

Results: The FluShuffle and FluResort algorithms were tested with both experimental and simulated mass spectra of whole virus digests. FluShuffle considers different combinations of viral protein identities that match the mass spectral data using a Gibbs sampling algorithm employing a mixed protein Markov chain Monte Carlo (MCMC) method. FluResort utilizes those identities to calculate the weighted distance of each across two or more different phylogenetic trees constructed through viral protein sequence alignments. Each weighted mean distance value is normalized by conversion to a Z-score to establish a reassorted strain.
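The normalization step described above can be illustrated with a minimal sketch. The abstract does not give FluResort's exact formulas, so everything here is an assumption made for illustration: the function names, the use of candidate probabilities as weights, and the use of a background sample of tree distances as the reference distribution for the Z-score.

```python
from statistics import mean, pstdev


def weighted_mean_distance(probs, distances):
    """Probability-weighted mean of tree distances.

    probs:     dict mapping a candidate strain name to its probability
               (hypothetical output of an identity-sampling step).
    distances: dict mapping the same names to a distance, in
               substitutions per site, measured in another protein's tree.
    """
    total = sum(probs.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(p * distances[name] for name, p in probs.items()) / total


def z_score(value, background):
    """Express a distance as a Z-score against a background sample.

    The background sample (assumed here) stands in for the distances
    expected between proteins of non-reassorted strains.
    """
    sigma = pstdev(background)
    if sigma == 0:
        raise ValueError("background distances have zero spread")
    return (value - mean(background)) / sigma


if __name__ == "__main__":
    # Hypothetical candidate probabilities and tree distances.
    probs = {"strain_a": 0.75, "strain_b": 0.25}
    distances = {"strain_a": 0.0, "strain_b": 0.4}
    d = weighted_mean_distance(probs, distances)
    print(d, z_score(d, [0.01, 0.02, 0.03, 0.05]))
```

Under these assumptions, a large positive Z-score would indicate that two proteins sit unusually far apart in each other's trees, which is the kind of signal a reassortment test looks for. The threshold for calling a strain reassorted is not stated in the abstract and is left out of the sketch.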

Conclusions: The new FluShuffle and FluResort algorithms can correctly identify the origins of influenza viral proteins and the number of reassortment events required to produce the strains from the high resolution mass spectral data of whole virus proteolytic digestions. This has been demonstrated in the case of constructed vaccine strains as well as common human seasonal strains of the virus. The algorithms significantly improve the capability of the proteotyping approach to identify reassorted viruses that pose the greatest pandemic risk.

Figures

Figure 1

An overview of the computational strategy and algorithms used to establish viral protein identity and reassorted strains. The algorithms FluShuffle and FluResort are shaded.

Figure 2

High resolution MALDI mass spectrum of the (a) tryptic and (b) Glu-C endoproteinase whole virus digest of the PanVax vaccine against the 2009 H1N1 influenza pandemic strains. Peaks labelled Glu-C denote autolysis products.

Figure 3

Phylogenetic tree for the hemagglutinin protein (H1 subtype) with colouration of its predicted identity within the PanVax strain. Irrelevant clades have been collapsed for clarity. A scale bar is shown that represents distance as substitutions per site. The location of the expected strain origin (A/California/07/2009) is labelled and the sum of probabilities for its clade of close relatives is shown in brackets as a percentage. The location of the A/Puerto Rico/08/1934 strain is also labelled.

Figure 4

Phylogenetic tree for the nucleoprotein for influenza type A with colouration of its predicted identity within the PanVax strain. The location of the expected origin (A/Puerto Rico/08/1934) is labelled and the sum of probabilities for its clade of close relatives is shown in brackets as a percentage. The location of the A/California/07/2009 strain is also labelled.

Figure 5

High resolution MALDI mass spectrum of the tryptic whole virus digest of the A/Solomon Islands/03/2006 strain. Peaks labelled trypsin denote autolysis products.

Figure 6

Phylogenetic tree for the hemagglutinin protein (H1 subtype) with colouration of its predicted identity within the type A/Solomon Islands/03/2006 strain. The location of the expected identity is labelled and the sum of probabilities for its clade of close relatives is shown in brackets as a percentage. The clade of closely related sequences with the greatest sum probability is marked in bold. The clade containing seasonal H1N1 strains is also shown with its sum of probabilities.
