GOSim--an R-package for computation of information theoretic GO similarities between terms and gene products
Holger Fröhlich et al. BMC Bioinformatics. 2007.
Abstract
Background: With the increased availability of high-throughput data, such as DNA microarray data, researchers can produce large amounts of biological data. When analyzing such data, there is often a need to explore the similarity of genes not only with respect to their expression, but also with respect to their functional annotation, which can be obtained from the Gene Ontology (GO).
Results: We present the freely available software package GOSim, which calculates the functional similarity of genes based on various information-theoretic similarity concepts for GO terms. GOSim extends existing tools by providing additional, recently developed functional similarity measures for genes. These can be used, for example, to cluster genes according to their biological function. Conversely, they can be used to evaluate the homogeneity of a given grouping of genes with respect to their GO annotation. GOSim hence provides the researcher with a flexible and powerful tool to combine knowledge stored in GO with experimental data. It can be seen as complementary to other tools that, for instance, search for significantly overrepresented GO terms within a given group of genes.
Conclusion: GOSim is implemented as a package for the statistical computing environment R and is distributed under the GPL through the CRAN project.
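As an illustration of what an information-theoretic term similarity looks like, the following minimal Python sketch computes Resnik-style and Lin-style similarities on a toy ontology. It is not GOSim's R code or API: the term names, the graph in `PARENTS`, and the probabilities in `PROB` are invented for the example, and the formulas are the textbook definitions (information content IC(t) = -log2 p(t); Resnik similarity is the IC of the most informative common ancestor; Lin similarity normalizes it by the ICs of the two terms).

```python
import math

# Hypothetical toy ontology (not real GO terms): term -> list of parent terms.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A", "B"],
}

# Made-up probabilities p(t) that a gene is annotated with t or a descendant of t.
PROB = {"root": 1.0, "A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC(t) = -log2 p(t): rarer terms carry more information."""
    return -math.log2(PROB[term])


def resnik(t1, t2):
    """IC of the most informative common ancestor of t1 and t2."""
    common = ancestors(t1) & ancestors(t2)
    return max(information_content(t) for t in common)


def lin(t1, t2):
    """Resnik similarity normalized by the ICs of both terms."""
    denom = information_content(t1) + information_content(t2)
    return 2 * resnik(t1, t2) / denom if denom > 0 else 0.0


if __name__ == "__main__":
    print(resnik("C", "D"), lin("C", "D"))
```

In this toy graph, `C` and `D` share the ancestors `A` and `root`, so their Resnik similarity is the information content of `A`; terms whose only common ancestor is the root get a similarity of zero.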
Figures
Figure 1
Example of a GO graph starting with leaves GO:0007166 and GO:0007267.
Figure 2
Idea of an optimal assignment: each GO term belonging to gene 2 is assigned to exactly one GO term belonging to gene 1 such that the overall GO term similarity is maximized.
Figure 3
Genes embedded into a feature space defined by the GO similarity to certain prototype genes. Principal component analysis was used to reduce the dimensionality of the feature space, and the first two principal components are displayed.
Figure 4
Clustering silhouette of the upregulated genes (cDNA chips).
Figure 5
Clustering silhouette of the downregulated genes (cDNA chips).
Figure 6
Clustering silhouette of the upregulated genes (Affymetrix chips).
Figure 7
Clustering silhouette of the downregulated genes (Affymetrix chips).
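The optimal assignment idea from Figure 2 can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not the package's implementation: the function name `optimal_assignment` and the similarity values are hypothetical, and exhaustive search is only practical for a handful of terms (a real implementation would use a dedicated assignment algorithm and would typically normalize the score).

```python
from itertools import permutations


def optimal_assignment(sim):
    """Find the best one-to-one assignment of gene 2 terms to gene 1 terms.

    sim[i][j] is the similarity between term i of gene 1 and term j of gene 2.
    Gene 2 must not have more terms than gene 1, so that every gene 2 term
    can be matched to a distinct gene 1 term.
    Returns (best_score, rows), where rows[j] is the gene 1 term index
    assigned to gene 2 term j.
    """
    n_rows, n_cols = len(sim), len(sim[0])
    if n_cols > n_rows:
        raise ValueError("gene 2 must not have more terms than gene 1")
    best_score, best_rows = float("-inf"), None
    # Enumerate every way to pick distinct gene 1 terms for the gene 2 terms.
    for rows in permutations(range(n_rows), n_cols):
        score = sum(sim[r][c] for c, r in enumerate(rows))
        if score > best_score:
            best_score, best_rows = score, rows
    return best_score, best_rows


# Hypothetical term-by-term similarities: 3 terms for gene 1, 2 for gene 2.
SIM = [
    [0.9, 0.8],
    [0.8, 0.1],
    [0.2, 0.3],
]

if __name__ == "__main__":
    print(optimal_assignment(SIM))
```

The example matrix is chosen so that a greedy strategy (taking the single largest entry first) is not optimal: pairing the first gene 2 term with the second gene 1 term, and the second gene 2 term with the first gene 1 term, gives a higher total than starting from the 0.9 entry.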
Similar articles
- GeneTools--application for functional annotation and statistical hypothesis testing.
Beisvag V, Jünge FK, Bergum H, Jølsum L, Lydersen S, Günther CC, Ramampiaro H, Langaas M, Sandvik AK, Laegreid A. BMC Bioinformatics. 2006 Oct 24;7:470. doi: 10.1186/1471-2105-7-470. PMID: 17062145. Free PMC article.
- DynGO: a tool for visualizing and mining of Gene Ontology and its associations.
Liu H, Hu ZZ, Wu CH. BMC Bioinformatics. 2005 Aug 9;6:201. doi: 10.1186/1471-2105-6-201. PMID: 16091147. Free PMC article.
- SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.
Wang C, Lefkowitz EJ. BMC Bioinformatics. 2004 Oct 28;5:171. doi: 10.1186/1471-2105-5-171. PMID: 15511296. Free PMC article.
- Estimating the annotation error rate of curated GO database sequence annotations.
Jones CE, Brown AL, Baumann U. BMC Bioinformatics. 2007 May 22;8:170. doi: 10.1186/1471-2105-8-170. PMID: 17519041. Free PMC article.
- Microarray analysis of gene expression: considerations in data mining and statistical treatment.
Verducci JS, Melfi VF, Lin S, Wang Z, Roy S, Sen CK. Physiol Genomics. 2006 May 16;25(3):355-63. doi: 10.1152/physiolgenomics.00314.2004. Epub 2006 Mar 22. PMID: 16554544. Review.
Cited by
- simona: a comprehensive R package for semantic similarity analysis on bio-ontologies.
Gu Z. BMC Genomics. 2024 Sep 16;25(1):869. doi: 10.1186/s12864-024-10759-4. PMID: 39285315. Free PMC article.
- Leveraging integrative toxicogenomic approach towards development of stressor-centric adverse outcome pathway networks for plastic additives.
Sahoo AK, Chivukula N, Madgaonkar SR, Ramesh K, Marigoudar SR, Sharma KV, Samal A. Arch Toxicol. 2024 Oct;98(10):3299-3321. doi: 10.1007/s00204-024-03825-z. Epub 2024 Aug 3. PMID: 39097536. Free PMC article.
- In Silico Models to Validate Novel Blood-Based Biomarkers.
Sadlon A. Methods Mol Biol. 2024;2785:321-344. doi: 10.1007/978-1-0716-3774-6_20. PMID: 38427203.
- Transcriptomic analysis towards identification of defence-responsive genes and pathways upon application of Sargassum seaweed extract on tomato plants infected with Macrophomina phaseolina.
Bosmaia TC, Agarwal P, Dangariya M, Khedia J, Gangapur DR, Agarwal PK. 3 Biotech. 2023 Jun;13(6):179. doi: 10.1007/s13205-023-03565-4. Epub 2023 May 12. PMID: 37193326. Free PMC article.
- Gene-SCOUT: identifying genes with similar continuous trait fingerprints from phenome-wide association analyses.
Middleton L, Harper AR, Nag A, Wang Q, Reznichenko A, Vitsios D, Petrovski S. Nucleic Acids Res. 2022 May 6;50(8):4289-4301. doi: 10.1093/nar/gkac274. PMID: 35474393. Free PMC article.