Discovering hidden viral piracy
Related papers
Journal of Virology, 2014
The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein,
Viral Proteins Acquired from a Host Converge to Simplified Domain Architectures
PLoS Computational Biology, 2012
The infection cycle of viruses creates many opportunities for the exchange of genetic material with the host. Many viruses integrate their sequences into the genome of their host for replication. These processes may lead to viral acquisition of host sequences. Such sequences are prone to accumulation of mutations and deletions. However, in rare instances, sequences acquired from a host become beneficial for the virus. We searched for unexpected sequence similarity among the 900,000 viral proteins and all proteins from cellular organisms. Here, we focus on viruses that infect metazoa. The high-conservation analysis yielded 187 instances of highly similar viral-host sequences. Only a small number of them represent viruses that hijacked host sequences. The low-conservation sequence analysis utilizes the Pfam family collection. About 5% of the 12,000 statistical models archived in Pfam are composed of viral-metazoan proteins. In about half of Pfam families, we provide indirect support for the directionality from the host to the virus. The other families are either wrongly annotated or reflect an extensive sequence exchange between the viruses and their hosts. In about 75% of cross-taxa Pfam families, the viral proteins are significantly shorter than their metazoan counterparts. The tendency for shorter viral proteins relative to their related host proteins accounts for the acquisition of only a fragment of the host gene, the elimination of an internal domain and shortening of the linkers between domains. We conclude that, along viral evolution, the host-originated sequences accommodate simplified domain compositions. We postulate that the trimmed proteins act by interfering with the fundamental function of the host including intracellular signaling, post-translational modification, protein-protein interaction networks and cellular trafficking. We compiled a collection of hijacked protein sequences. These sequences are attractive targets for manipulation of viral infection.
VirusMINT: a viral protein interaction database
Nucleic Acids Research, 2009
Understanding the consequences on host physiology induced by viral infection requires complete understanding of the perturbations caused by virus proteins on the cellular protein interaction network. The VirusMINT database (http://mint.bio.uniroma2.it/virusmint/) aims at collecting all protein interactions between viral and human proteins reported in the literature. VirusMINT currently stores over 5000 interactions involving more than 490 unique viral proteins from more than 110 different viral strains. The whole data set can be easily queried through the search pages and the results can be displayed with a graphical viewer. The curation effort has focused on manuscripts reporting interactions between human proteins and proteins encoded by some of the most medically relevant viruses: papilloma viruses, human immunodeficiency virus 1, Epstein-Barr virus, hepatitis B virus, hepatitis C virus, herpes viruses and Simian virus 40.
The Landscape of Human Proteins Interacting with Viruses and Other Pathogens
PLoS Pathogens, 2008
Infectious diseases result in millions of deaths each year. Mechanisms of infection have been studied in detail for many pathogens. However, many questions are relatively unexplored. What are the properties of human proteins that interact with pathogens? Do pathogens interact with certain functional classes of human proteins? Which infection mechanisms and pathways are commonly triggered by multiple pathogens? In this paper, to our knowledge, we provide the first study of the landscape of human proteins interacting with pathogens. We integrate human-pathogen protein-protein interactions (PPIs) for 190 pathogen strains from seven public databases. Nearly all of the 10,477 human-pathogen PPIs are for viral systems (98.3%), with the majority belonging to the human-HIV system (77.9%). We find that both viral and bacterial pathogens tend to interact with hubs (proteins with many interacting partners) and bottlenecks (proteins that are central to many paths in the network) in the human PPI network. We construct separate sets of human proteins interacting with bacterial pathogens, viral pathogens, and those interacting with multiple bacteria and with multiple viruses. Gene Ontology functions enriched in these sets reveal a number of processes, such as cell cycle regulation, nuclear transport, and immune response that participate in interactions with different pathogens. Our results provide the first global view of strategies used by pathogens to subvert human cellular processes and infect human cells. Supplementary data accompanying this paper is available at http://staff.vbi.vt.edu/dyermd/publications/dyer2008a.html.
PLOS ONE, 2021
MERS-CoV, SARS-CoV, and SARS-CoV-2 are highly pathogenic viruses that can cause severe pneumonic diseases in humans. Unfortunately, no effective treatment is available to combat these viruses. Domain-motif interactions (DMIs) are an essential means by which viruses mimic and hijack the biological processes of host cells. Disentangling how viruses achieve this can help to develop new rational therapies. Data mining was performed to obtain DMIs stored as regular expressions (regexp) in the 3DID and ELM databases. The mined regexp information was mapped onto the coronaviruses’ proteomes. Most motifs on viral proteins that could interact with human proteins are shared across the coronavirus species, indicating that molecular mimicry is a common strategy for coronavirus infection. Enrichment ontology analysis for protein domains showed shared biological process and molecular function terms related to carbon source utilization and potassium channel regulation. Some of the...
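The regexp-mapping step described in this abstract can be illustrated with a minimal sketch. It assumes motifs are available as plain regular expressions and scans a protein sequence for every occurrence, including overlapping ones. The motif names, patterns, and toy sequence below are hypothetical placeholders, not entries taken from ELM or 3DID, and this is not the authors' pipeline.

```python
import re

# Hypothetical motif patterns in an ELM-like regular-expression style.
# Names and patterns are illustrative only, not real database entries.
MOTIFS = {
    "example_PxxP": r"P..P",
    "example_RGD": r"RGD",
}


def map_motifs(sequence, motifs):
    """Return (motif_name, start, end, matched_text) for every match.

    Positions are 0-based, with an exclusive end. Overlapping matches are reported.
    """
    hits = []
    for name, pattern in motifs.items():
        # Wrapping the pattern in a lookahead with a capture group makes
        # finditer report matches that overlap one another.
        for m in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, m.start(1), m.end(1), m.group(1)))
    return sorted(hits, key=lambda h: (h[1], h[0]))


# Toy sequence (an assumption for illustration, not a real viral protein).
toy_protein = "MKPAPPLPRGDS"
hits = map_motifs(toy_protein, MOTIFS)
for name, start, end, text in hits:
    print(f"{name}\t{start}-{end}\t{text}")
```

In practice the same function would be applied to each protein in a proteome. A regexp match alone says nothing about whether the motif is accessible or functional, so hits are candidates rather than confirmed interactions.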
Structure homology and interaction redundancy for discovering virus–host protein interactions
EMBO reports, 2013
Virus-host interactomes are instrumental to understand global perturbations of cellular functions induced by infection and discover new therapies. The construction of such interactomes is, however, technically challenging and time consuming. Here we describe an original method for the prediction of high-confidence interactions between viral and human proteins through a combination of structure and high-quality interactome data. Validation was performed for the NS1 protein of the influenza virus, which led to the identification of new host factors that control viral replication.
2017
Viruses are obligate intracellular parasites predominant in all domains of life. Viral infections cause disease and death in humans and in organisms raised for food, such as plants and cattle. Short viral genomes encode mechanisms for subverting host cell subsystems. These subversion mechanisms are based on virus-host protein-protein interactions. The decline in next-generation sequencing costs has spurred exponential growth in the number of viral genomes in bioinformatic databases. In contrast, the cost of experimental high-throughput techniques for determining virus-host protein-protein interactions is not decreasing at the same rate. Although the number of virus-host protein-protein interactions in databases has also been growing, it is still too low to analyze viral subversion mechanisms with systems biology approaches. There is a need for computational virus-host protein-protein interaction prediction methods. However, the number of viral and host...
Molecular Ecology
Coronaviruses (CoVs) have complex genomes that encode a fixed array of structural and nonstructural components, as well as a variety of accessory proteins that differ even among closely related viruses. Accessory proteins often play a role in the suppression of immune responses and may represent virulence factors. Despite their relevance for CoV phenotypic variability, information on accessory proteins is fragmentary. We applied a systematic approach based on homology detection to create a comprehensive catalogue of accessory proteins encoded by CoVs. Our analyses grouped accessory proteins into 379 orthogroups and 12 super-groups. No orthogroup was shared by the four CoV genera and very few were present in all or most viruses in the same genus, reflecting the dynamic evolution of CoV genomes. We observed differences in the distribution of accessory proteins in CoV genera. Alphacoronaviruses harboured the largest diversity of accessory open reading frames (ORFs), deltacoronaviruses the smallest. However, the average number of accessory proteins per genome was highest in betacoronaviruses. Analysis of the evolutionary history of some orthogroups indicated that the different CoV genera adopted similar evolutionary strategies. Thus, alphacoronaviruses and betacoronaviruses acquired phosphodiesterases and spike-like accessory proteins independently, whereas horizontal gene transfer from reoviruses endowed betacoronaviruses and deltacoronaviruses with fusion-associated small transmembrane (FAST) proteins. Finally, analysis of accessory ORFs in annotated CoV genomes indicated ambiguity in their naming. This complicates cross-communication among researchers and hinders automated searches of large data sets (e.g., PubMed, GenBank). We suggest that orthogroup membership is used together with a naming system to provide information on protein function.