WebGestalt: an integrated system for exploring gene sets in various biological contexts
Bing Zhang et al. Nucleic Acids Res. 2005.
Abstract
High-throughput technologies have led to the rapid generation of large-scale datasets about genes and gene products. These technologies have also shifted our research focus from 'single genes' to 'gene sets'. We have developed a web-based integrated data mining system, WebGestalt (http://genereg.ornl.gov/webgestalt/), to help biologists in exploring large sets of genes. WebGestalt is composed of four modules: gene set management, information retrieval, organization/visualization, and statistics. The management module uploads, saves, retrieves and deletes gene sets, as well as performs Boolean operations to generate the unions, intersections or differences between different gene sets. The information retrieval module currently retrieves information for up to 20 attributes for all genes in a gene set. The organization/visualization module organizes and visualizes gene sets in various biological contexts, including Gene Ontology, tissue expression pattern, chromosome distribution, metabolic and signaling pathways, protein domain information and publications. The statistics module recommends and performs statistical tests to suggest biological areas that are important to a gene set and warrant further investigation. In order to demonstrate the use of WebGestalt, we have generated 48 gene sets with genes over-represented in various human tissue types. Exploration of all the 48 gene sets using WebGestalt is available for the public at http://genereg.ornl.gov/webgestalt/wg_enrich.php.
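The Boolean operations described for the gene set management module can be illustrated with a minimal sketch. It is not WebGestalt's own code: the function name `boolean_ops` and the gene symbols are hypothetical, and the sketch assumes each gene set is simply a collection of unique identifiers.

```python
# Hypothetical gene sets; the symbols are illustrative only and are not
# taken from the WebGestalt paper or its example data.
set_a = {"TP53", "BRCA1", "EGFR", "MYC"}
set_b = {"EGFR", "MYC", "KRAS"}


def boolean_ops(a, b):
    """Return the union, intersection, and both differences of two gene sets."""
    return {
        "union": a | b,          # genes in either set
        "intersection": a & b,   # genes shared by both sets
        "a_minus_b": a - b,      # genes only in the first set
        "b_minus_a": b - a,      # genes only in the second set
    }


result = boolean_ops(set_a, set_b)
for name, genes in result.items():
    print(name, sorted(genes))
```

Treating gene sets as Python sets keeps each operation a single expression, but it assumes both sets use the same identifier type (for example, both use gene symbols).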
Figures
Figure 1
Schematic overview of WebGestalt. WebGestalt is composed of four main modules: gene set management, information retrieval, organization/visualization and statistics. The gene set management module uploads, saves, retrieves and deletes gene sets, as well as performs Boolean operations to generate the unions, intersections and differences between gene sets. The uploading tool accepts datasets defined by experiment data, GO categories or chromosome location ranges. WebGestalt is flexible in the input identifier (Entrez Gene ID, Swiss-Prot ID, Ensembl ID, Unigene ID, gene symbol and Affymetrix Probe Set ID). The saving tool saves sub-sets of genes generated by the organization/visualization module. The information retrieval module currently retrieves information for up to 20 attributes for all genes in a gene set, including nomenclatures, various gene identifiers, map and functional information. Retrieved information can be exported to Microsoft Excel files. The organization/visualization module organizes and visualizes a gene set in figures or tables using eight sub-modules: GO Tree, Tissue Expression Bar Chart, Chromosome Distribution Chart, KEGG Table and Maps, BioCarta Table and Maps, Protein Domain Table, PubMed Table and GRIF Table. The statistics module provides two statistical tests, the hypergeometric test and Fisher's exact test and suggests important biological areas in a gene set.
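The hypergeometric test named for the statistics module can be sketched as follows. This is an illustration of the general over-representation calculation, not WebGestalt's implementation: the function name `hypergeom_enrichment_p` and all counts in the usage example are hypothetical. The function computes the upper-tail probability of seeing at least `k` category genes in a gene set of size `n`, given `K` category genes among `N` reference genes. The one-sided Fisher's exact test on the corresponding 2x2 table yields the same tail probability.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    k: genes in the gene set that belong to the category
    n: size of the gene set
    K: genes in the reference that belong to the category
    N: size of the reference
    """
    upper = min(n, K)  # X cannot exceed the set size or the category size
    total = comb(N, n)  # number of ways to draw n genes from the reference
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible combinations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: a 23-gene set, 5 of which fall in a category that
# holds 200 of 20000 reference genes.
p_value = hypergeom_enrichment_p(k=5, n=23, K=200, N=20000)
print(f"enrichment P-value: {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` (Python 3.8 or later) avoids floating-point underflow for small tail probabilities; for large numbers of categories, a statistics library would likely be faster.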
Figure 2
Enriched DAG under ‘biological process’ for a set of 23 genes that are significantly over-represented in adrenal cortex, using all genes in the human genome as a reference. The enriched GO categories are brought together and visualized as a DAG. Categories in red are enriched ones while those in black are non-enriched parents. Listed in the boxes are the name of the GO category, the number of genes in the category and the _P_-value indicating the significance of enrichment.
Grants and funding
- P01 DA015027/DA/NIDA NIH HHS/United States
- R21 AA013532/AA/NIAAA NIH HHS/United States
- P01-DA015027/DA/NIDA NIH HHS/United States
- U01-AA013532/AA/NIAAA NIH HHS/United States