Reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes
Am J Hum Genet. 2006 Jun;78(6):1011-25.
doi: 10.1086/504300. Epub 2006 Apr 25.
- PMID: 16685651
- PMCID: PMC1474084
Lude Franke et al. Am J Hum Genet. 2006 Jun.
Abstract
Most common genetic disorders have a complex inheritance and may result from variants in many genes, each contributing only weak effects to the disease. Pinpointing these disease genes within the myriad of susceptibility loci identified in linkage studies is difficult because these loci may contain hundreds of genes. However, in any disorder, most of the disease genes will be involved in only a few different molecular pathways. If we know something about the relationships between the genes, we can assess whether some genes (which may reside in different loci) functionally interact with each other, indicating a joint basis for the disease etiology. There are various repositories of information on pathway relationships. To consolidate this information, we developed a functional human gene network that integrates information on genes and the functional relationships between genes, based on data from the Kyoto Encyclopedia of Genes and Genomes, the Biomolecular Interaction Network Database, Reactome, the Human Protein Reference Database, the Gene Ontology database, predicted protein-protein interactions, human yeast two-hybrid interactions, and microarray co-expressions. We applied this network to interrelate positional candidate genes from different disease loci and then tested 96 heritable disorders for which the Online Mendelian Inheritance in Man database reported at least three disease genes. Artificial susceptibility loci, each containing 100 genes, were constructed around each disease gene, and we used the network to rank these genes on the basis of their functional interactions. By following up the top five genes per artificial locus, we were able to detect at least one known disease gene in 54% of the loci studied, representing a 2.8-fold increase over random selection. 
This suggests that our method can significantly reduce the cost and effort of pinpointing true disease genes in analyses of disorders for which numerous loci have been reported but for which most of the genes are unknown.
Figures
Figure 1
Basic principles of the prioritization method for positional candidate genes with the use of a functional human gene network. The method integrates different gene-gene interaction data sources in a Bayesian way (left panel). Subsequently, this gene network is used to prioritize positional candidate genes, with all genes assigned an initial score of zero. In the example (right panel), three different susceptibility loci are analyzed, each containing a disease gene (P, Q, or R) and two nondisease genes. In each locus, the three positional candidate genes increase the scores of nearby genes in the gene network, by use of a kernel function that models the relationship between gene-gene distance and score effect. Genes within each locus are ranked on the basis of their eventual effect score, corrected for differences in the topology of the network (see the “Material and Methods” section).
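The scoring idea described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' Prioritizer implementation: the Gaussian kernel, the `bandwidth` parameter, the use of unweighted shortest-path distance, the rule that only genes from other loci contribute to a score, and all function and gene names are assumptions made for the example. The topology correction mentioned in the caption is omitted.

```python
import math
from collections import deque


def undirected(edges):
    """Build an adjacency map from a list of (gene, gene) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distances(graph, source):
    """Shortest-path distance (in edges) from source to every reachable gene."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def kernel(distance, bandwidth=1.0):
    """Assumed Gaussian kernel: the score effect decays with network distance."""
    return math.exp(-(distance ** 2) / (2 * bandwidth ** 2))


def rank_loci(graph, loci, bandwidth=1.0):
    """Score every candidate gene, then rank the genes within each locus."""
    locus_of = {gene: name for name, genes in loci.items() for gene in genes}
    scores = {gene: 0.0 for gene in locus_of}  # all genes start at zero
    for source in locus_of:
        for target, d in bfs_distances(graph, source).items():
            # Assumption: only candidates from a different locus add to a score.
            if target in locus_of and locus_of[target] != locus_of[source]:
                scores[target] += kernel(d, bandwidth)
    return {
        name: sorted(genes, key=lambda g: scores[g], reverse=True)
        for name, genes in loci.items()
    }


# Hypothetical toy network: P, Q, and R interact; a1 and b1 share one edge.
graph = undirected([("P", "Q"), ("Q", "R"), ("P", "R"), ("a1", "b1")])
loci = {"A": ["a1", "P", "a2"], "B": ["b1", "Q", "b2"], "C": ["c1", "R", "c2"]}
ranked = rank_loci(graph, loci)
```

In this toy example the three mutually connected genes P, Q, and R each receive two kernel contributions from other loci and therefore rank first in their respective loci.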
Figure 2
Integration of data sets in four gene networks. a, Data sets were benchmarked against a set of 55,606 known true-positive gene pairs derived from BIND, KEGG, HPRD, and Reactome and 800,608 true-negative gene pairs derived from GO. The Venn diagram indicates the data sources from which the true positives were derived and their degree of overlap. Numbers in parentheses indicate the number of interactions that are provided by each of the data sets. b, Potential gene-gene interactions derived from GO, microarray coexpression data, and human and orthologous protein-protein interaction data were integrated using a Bayesian classifier. The steps involved in building this classifier are shown.
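One common way to integrate evidence sources against such a benchmark is a naive Bayes scheme based on likelihood ratios; the sketch below assumes that approach and does not reproduce the exact steps shown in the figure. The benchmark totals come from the caption, while the bin counts, the prior odds, the conditional-independence assumption, and the function names are hypothetical.

```python
import math

TRUE_POSITIVES = 55_606    # benchmark true-positive gene pairs (from the caption)
TRUE_NEGATIVES = 800_608   # benchmark true-negative gene pairs (from the caption)


def likelihood_ratio(pos_in_bin, neg_in_bin,
                     pos_total=TRUE_POSITIVES, neg_total=TRUE_NEGATIVES):
    """P(evidence bin | true positive) / P(evidence bin | true negative)."""
    if neg_in_bin == 0:
        raise ValueError("bin contains no true negatives; ratio is undefined")
    return (pos_in_bin / pos_total) / (neg_in_bin / neg_total)


def combine(ratios, prior_odds):
    """Naive Bayes: multiply the prior odds by each source's likelihood ratio.

    Assumes the evidence sources are conditionally independent.
    """
    return prior_odds * math.prod(ratios)


def odds_to_probability(odds):
    return odds / (1 + odds)


# Hypothetical bin counts for two evidence sources supporting one gene pair.
lr_coexpression = likelihood_ratio(pos_in_bin=900, neg_in_bin=1_200)
lr_interaction = likelihood_ratio(pos_in_bin=400, neg_in_bin=150)
posterior = combine([lr_coexpression, lr_interaction], prior_odds=0.001)
print(f"posterior probability: {odds_to_probability(posterior):.3f}")
```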
Figure 3
ROC curve of the GO network, the MA+PPI network, and the combined GO+MA+PPI network. The baseline (solid gray line) indicates the performance of a classifier that would be totally uninformative.
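For readers unfamiliar with ROC curves, the following sketch shows how such a curve and its area can be computed from classifier scores. It is a generic illustration with hypothetical scores and labels, not the evaluation code used in the paper, and it assumes all scores are distinct (ties are not handled). An uninformative classifier follows the diagonal and has an area of 0.5.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) points.

    Each point corresponds to lowering the decision threshold past one more
    prediction. Assumes distinct scores and at least one label of each class.
    """
    ordered = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ordered:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical predictions: label 1 marks a true gene-gene interaction.
perfect = auc(roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
mixed = auc(roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 0, 1]))
```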
Figure 4
Accuracy of positional candidate-gene prioritization. a and b, Percentage of the 409 disease genes that were ranked among the top 5 (a) or top 10 (b) genes per locus, after artificial susceptibility loci of varying widths around these genes were constructed and when different types of gene networks were used. The baselines (gray lines) indicate the percentage of disease genes expected to rank among the top 5 or top 10 genes by chance. c, ROC curves for susceptibility loci that contain 50, 100, or 150 genes.
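The top-5 and top-10 percentages and their chance baselines can be illustrated with a short sketch. The ranks below are hypothetical, the function names are illustrative, and the baseline assumes that a disease gene is equally likely to land at any rank within its locus.

```python
def top_k_rate(ranks, k):
    """Fraction of disease genes whose within-locus rank is k or better."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


def chance_rate(k, locus_size):
    """Expected top-k fraction if ranks were assigned uniformly at random."""
    return min(k, locus_size) / locus_size


# Hypothetical ranks (1 = best) of six disease genes in loci of 100 genes.
ranks = [1, 3, 7, 12, 48, 90]
observed_top5 = top_k_rate(ranks, 5)
observed_top10 = top_k_rate(ranks, 10)
baseline_top5 = chance_rate(5, 100)
baseline_top10 = chance_rate(10, 100)
```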
Figure 5
Probability of detecting at least one disease gene when a fixed number of top-ranked positional candidate genes—as ranked by Prioritizer—are followed up for each locus. Each locus contains either 100 or 150 genes, and the GO+MA+PPI+TP network was employed. The baselines (dashed lines) show the probability of detecting at least one disease gene if a fixed number of arbitrarily chosen genes in each locus are followed up.
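The random-selection baseline in this figure can be approximated with a simple probability calculation. The sketch below assumes exactly one disease gene per locus, independent loci, and genes chosen uniformly without replacement within each locus; the number of loci per disease is a free parameter here, and the function name is illustrative.

```python
def chance_of_hit(k, locus_size, n_loci):
    """Probability that at least one disease gene is among k randomly chosen
    genes per locus, assuming one disease gene in each of n_loci loci."""
    miss_one_locus = 1 - min(k, locus_size) / locus_size
    return 1 - miss_one_locus ** n_loci


# Following up 5 arbitrary genes in each of 3 loci of 100 genes.
baseline = chance_of_hit(k=5, locus_size=100, n_loci=3)
print(f"{baseline:.4f}")
```

Under these assumptions the baseline rises with both the number of genes followed up and the number of loci analyzed, which is why it should be computed for the same settings as the observed detection rate.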
Figure 6
Prioritizer analysis of breast cancer. Susceptibility loci, each containing 100 genes, were defined around 10 known breast cancer genes. The 10 highest-ranked genes for each locus are shown in the graph, with colors indicating the locus in which they reside. Use of the GO+MA+PPI network led to four breast cancer genes (PIK3CA, CHEK2, BARD1, and TP53 [circles]) being ranked in the top 10. Chr. = chromosome.
Figure A1
Difference in likelihood ratios between genes that were represented on the microarrays and genes that were not.
Figure A2
Degree distributions for the four networks. The MA+Y2H network has a topology that most closely follows a scale-free, power-law distribution, compared with the other three networks.
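A degree distribution and a rough check for power-law behavior can be sketched as follows. A straight-line fit on log-log axes is only a crude heuristic, and nothing here is claimed to match the analysis behind the figure; the edge list, the example counts, and the function names are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Map each degree k to the number of genes that have k interactions."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


def loglog_slope(distribution):
    """Least-squares slope of log(count) against log(degree).

    For a power law p(k) ~ k**(-gamma), the slope approximates -gamma.
    Requires at least two distinct degrees.
    """
    xs = [math.log(k) for k in distribution]
    ys = [math.log(count) for count in distribution.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical star-shaped network: one hub gene with three partners.
star = degree_distribution([("hub", "a"), ("hub", "b"), ("hub", "c")])

# Hypothetical counts that follow k**-2 exactly, so the slope is about -2.
slope = loglog_slope({1: 10_000, 10: 100, 100: 1})
```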
Similar articles
- Walking the interactome for prioritization of candidate disease genes.
Köhler S, Bauer S, Horn D, Robinson PN. Am J Hum Genet. 2008 Apr;82(4):949-58. doi: 10.1016/j.ajhg.2008.02.013. Epub 2008 Mar 27. PMID: 18371930 Free PMC article.
- Syndrome to gene (S2G): in-silico identification of candidate genes for human diseases.
Gefen A, Cohen R, Birk OS. Hum Mutat. 2010 Mar;31(3):229-36. doi: 10.1002/humu.21171. PMID: 20052752
- Finding genes influencing susceptibility to complex diseases in the post-genome era.
Rannala B. Am J Pharmacogenomics. 2001;1(3):203-21. doi: 10.2165/00129785-200101030-00005. PMID: 12083968 Review.
- Speeding disease gene discovery by sequence based candidate prioritization.
Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS. BMC Bioinformatics. 2005 Mar 14;6:55. doi: 10.1186/1471-2105-6-55. PMID: 15766383 Free PMC article.
- Insights into genetics, human biology and disease gleaned from family based genomic studies.
Posey JE, O'Donnell-Luria AH, Chong JX, Harel T, Jhangiani SN, Coban Akdemir ZH, Buyske S, Pehlivan D, Carvalho CMB, Baxter S, Sobreira N, Liu P, Wu N, Rosenfeld JA, Kumar S, Avramopoulos D, White JJ, Doheny KF, Witmer PD, Boehm C, Sutton VR, Muzny DM, Boerwinkle E, Günel M, Nickerson DA, Mane S, MacArthur DG, Gibbs RA, Hamosh A, Lifton RP, Matise TC, Rehm HL, Gerstein M, Bamshad MJ, Valle D, Lupski JR; Centers for Mendelian Genomics. Genet Med. 2019 Apr;21(4):798-812. doi: 10.1038/s41436-018-0408-7. Epub 2019 Jan 18. PMID: 30655598 Free PMC article. Review.
Cited by
- A human proteogenomic-cellular framework identifies KIF5A as a modulator of astrocyte process integrity with relevance to ALS.
Szebényi K, Barrio-Hernandez I, Gibbons GM, Biasetti L, Troakes C, Beltrao P, Lakatos A. Commun Biol. 2023 Jun 29;6(1):678. doi: 10.1038/s42003-023-05041-4. PMID: 37386082 Free PMC article.
- Network expansion of genetic associations defines a pleiotropy map of human cell biology.
Barrio-Hernandez I, Schwartzentruber J, Shrivastava A, Del-Toro N, Gonzalez A, Zhang Q, Mountjoy E, Suveges D, Ochoa D, Ghoussaini M, Bradley G, Hermjakob H, Orchard S, Dunham I, Anderson CA, Porras P, Beltrao P. Nat Genet. 2023 Mar;55(3):389-398. doi: 10.1038/s41588-023-01327-9. Epub 2023 Feb 23. PMID: 36823319 Free PMC article.
- Deafness gene screening based on a multilevel cascaded BPNN model.
Liu X, Teng L, Zuo W, Zhong S, Xu Y, Sun J. BMC Bioinformatics. 2023 Feb 20;24(1):56. doi: 10.1186/s12859-023-05182-7. PMID: 36803022 Free PMC article.
- Identification of Warning Transition Points from Hepatitis B to Hepatocellular Carcinoma Based on Mutation Accumulation for the Early Diagnosis and Potential Drug Treatment of HBV-HCC.
Xu F, Meng Q, Wu F, Wang Y, Yang W, Tong Y, Liu L, Chen X. Oxid Med Cell Longev. 2022 Sep 5;2022:3472179. doi: 10.1155/2022/3472179. eCollection 2022. PMID: 36105485 Free PMC article.
- Identifying genes targeted by disease-associated non-coding SNPs with a protein knowledge graph.
Vlietstra WJ, Vos R, van Mulligen EM, Jenster GW, Kors JA. PLoS One. 2022 Jul 13;17(7):e0271395. doi: 10.1371/journal.pone.0271395. eCollection 2022. PMID: 35830458 Free PMC article.
References
Web Resources
- Biomolecular Interaction Network Database (BIND), http://bind.ca/
- Ensembl, http://www.ensembl.org/index.html
- GeneNetwork, http://www.genenetwork.nl
- Human Protein Reference Database (HPRD), http://www.hprd.org/
- Kyoto Encyclopedia of Genes and Genomes (KEGG), http://www.genome.jp/kegg/