STRING v9.1: protein-protein interaction networks, with increased coverage and integration

Nucleic Acids Res. 2013 Jan;41(Database issue):D808-15.

doi: 10.1093/nar/gks1094. Epub 2012 Nov 29.

PMID: 23203871
PMCID: PMC3531103
DOI: 10.1093/nar/gks1094

Andrea Franceschini et al. Nucleic Acids Res. 2013 Jan.

Abstract

Complete knowledge of all direct and indirect interactions between proteins in a given cell would represent an important milestone towards a comprehensive description of cellular mechanisms and functions. Although this goal is still elusive, considerable progress has been made, particularly for certain model organisms and functional systems. Currently, protein interactions and associations are annotated at various levels of detail in online resources, ranging from raw data repositories to highly formalized pathway databases. For many applications, a global view of all the available interaction data is desirable, including lower-quality data and/or computational predictions. The STRING database (http://string-db.org/) aims to provide such a global perspective for as many organisms as feasible. Known and predicted associations are scored and integrated, resulting in comprehensive protein networks covering >1100 organisms. Here, we describe the update to version 9.1 of STRING, introducing several improvements: (i) we extend the automated mining of scientific texts for interaction information, to now also include full-text articles; (ii) we entirely re-designed the algorithm for transferring interactions from one model organism to the other; and (iii) we provide users with statistical information on any functional enrichment observed in their networks.

Figures

Figure 1.

Improved procedure for interaction transfer between organisms. Left: steps 1 and 2 of the functional association transfer pipeline. In the first step, the individual links between proteins are combined into a score between orthologous groups, sequentially, from the strongest link (thick line) to the weakest (thin). Each subsequent score is down-weighted, based both on the similarity of its organism to organisms that have already contributed to the combined scores, and on the number of proteins from the same organism inside the orthologous group. In the second step of the transfer pipeline, the links between orthologous groups are transferred back to individual protein pairs belonging to these groups. This is done sequentially from the lowest to the highest taxonomy level. In the above example, the two transferred links from the highest taxonomic level (orange links) are penalized for the increase in the number of proteins from the target species in one of the orthologous groups. Right: ROC curves indicating the performance of predicted interolog scores, benchmarked against KEGG pathways; an inferred link between two proteins is considered to be a true positive when both proteins are annotated to be together in at least one shared KEGG pathway.
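
The first step of the transfer pipeline described in the Figure 1 caption can be illustrated with a small sketch. This is not the published STRING algorithm: the caption does not give a formula, so the sketch assumes a noisy-OR style combination and a hypothetical weight of one minus the highest similarity to any organism that has already contributed. The names combine_group_score and similarity are illustrative, and the second down-weighting factor (the number of proteins from the same organism inside the orthologous group) is left out.

```python
from typing import Callable, List, Tuple

# One link between two proteins: (score in [0, 1], organism it was observed in).
Link = Tuple[float, str]


def combine_group_score(
    links: List[Link],
    similarity: Callable[[str, str], float],
) -> float:
    """Combine per-organism link scores into one score between two orthologous groups.

    Links are processed from strongest to weakest. Each link is down-weighted
    by its redundancy with organisms that have already contributed.
    ASSUMPTIONS (not taken from the paper): the noisy-OR combination and the
    weight 1 - max(similarity) are illustrative choices only.
    """
    seen: List[str] = []
    prob_no_link = 1.0  # running probability that no contribution supports the link
    for score, organism in sorted(links, key=lambda link: link[0], reverse=True):
        # Redundancy is the highest similarity to any organism seen so far.
        redundancy = max((similarity(organism, other) for other in seen), default=0.0)
        weight = 1.0 - redundancy
        prob_no_link *= 1.0 - score * weight
        seen.append(organism)
    return 1.0 - prob_no_link


def same_organism(a: str, b: str) -> float:
    """Toy similarity: identical organisms are fully redundant, all others independent."""
    return 1.0 if a == b else 0.0


example_links = [(0.8, "yeast"), (0.5, "yeast"), (0.5, "fly")]
combined = combine_group_score(example_links, same_organism)
```

With the toy similarity above, the second yeast link adds nothing because yeast has already contributed, while the fly link still raises the combined score. A realistic similarity function would return graded values, for example derived from taxonomic distance.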

Figure 2.

Network visualization and statistical analysis of a user-supplied protein list. The STRING screenshot shows a user-supplied set of genes, here a selection of cancer genes as annotated in the COSMIC database (52). The set is restricted to those genes that are known to predispose to cancer already when mutated in the germline, and that have at least one connection in STRING. The inset illustrates the website’s new functionality for automatically detecting statistically enriched functions or processes in a network. In this example, one of the detected processes (nucleotide excision repair) is of interest and has been selected; STRING automatically highlighted the corresponding nodes in the network, where they are seen to form a densely connected module.
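
The kind of enrichment statistic mentioned in the Figure 2 caption can be sketched as follows. The abstract and caption do not say which test STRING applies, so this sketch assumes a one-sided hypergeometric test, a common choice for this task. The function name and all counts are hypothetical, and no multiple-testing correction is included.

```python
from math import comb


def hypergeom_enrichment_p(
    n_background: int,  # proteins in the background set
    n_annotated: int,   # background proteins annotated with the term
    n_selected: int,    # proteins in the user-supplied list
    n_hits: int,        # selected proteins annotated with the term
) -> float:
    """Return P(X >= n_hits) for a draw of n_selected proteins without replacement.

    ASSUMPTION: a one-sided hypergeometric test; the source does not state
    which test the STRING website uses.
    """
    total = comb(n_background, n_selected)
    upper = min(n_annotated, n_selected)
    # math.comb returns 0 when k > n, so impossible terms drop out of the sum.
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background proteins, 3 annotated, 3 selected, all 3 are hits.
p_value = hypergeom_enrichment_p(10, 3, 3, 3)
```

A small p-value suggests the term is over-represented in the list relative to the background. When many terms are tested at once, the p-values would normally be corrected for multiple testing.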

Cited by

Deciphering molecular landscape of breast cancer progression and insights from functional genomics and therapeutic explorations followed by in vitro validation.
Khan B, Qahwaji R, Alfaifi MS, Athar T, Khan A, Mobashir M, Ashankyty I, Imtiyaz K, Alahmadi A, Rizvi MMA. Sci Rep. 2024 Nov 20;14(1):28794. doi: 10.1038/s41598-024-80455-6. PMID: 39567714. Free PMC article.
VI-VS: calibrated identification of feature dependencies in single-cell multiomics.
Boyeau P, Bates S, Ergen C, Jordan MI, Yosef N. Genome Biol. 2024 Nov 15;25(1):294. doi: 10.1186/s13059-024-03419-z. PMID: 39548591. Free PMC article.
Dataset of Panda sperm proteome.
Liu S, Wang T, Liu Y, Wang S, Li F, Chen J, Hu X, Zhang M, Wang J, Li Y, James A, Hou R, Cai K. Data Brief. 2024 Oct 21;57:111052. doi: 10.1016/j.dib.2024.111052. eCollection 2024 Dec. PMID: 39525650. Free PMC article.
Striatal dopamine gene network moderates the effect of early adversity on the risk for adult psychiatric and cardiometabolic comorbidity.
Barth B, Arcego DM, de Mendonça Filho EJ, de Lima RMS, Parent C, Dalmaz C, Portella AK, Pokhvisneva I, Meaney MJ, Silveira PP. Sci Rep. 2024 Nov 9;14(1):27349. doi: 10.1038/s41598-024-78465-5. PMID: 39521843. Free PMC article.
Identifying the HIV-Resistance-Related Factors and Regulatory Network via Multi-Omics Analyses.
Long X, Liu G, Liu X, Zhang C, Shi L, Zhu Z. Int J Mol Sci. 2024 Nov 1;25(21):11757. doi: 10.3390/ijms252111757. PMID: 39519306. Free PMC article.
