Estimating dataset size requirements for classifying DNA microarray data
Comparative Study
Sayan Mukherjee et al. J Comput Biol. 2003.
Abstract
A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems representing a broad spectrum of levels of complexity.
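As an illustration of the two ingredients named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the three-parameter inverse power-law form e(n) = a * n^(-alpha) + b, where b is the asymptotic error rate, and fits it with a simple grid search over b. The function names (fit_inverse_power_law, samples_needed, permutation_p_value), the grid-search fitting strategy, and the error_fn callback are hypothetical choices made for this example. The permutation test shown is a generic label-shuffling test and may differ in detail from the procedure in the paper.

```python
import math
import random


def fit_inverse_power_law(sizes, errors, grid_steps=200):
    """Fit e(n) = a * n**(-alpha) + b to observed (size, error) pairs.

    Assumption: all errors are strictly positive. For each candidate
    asymptote b in [0, min(errors)), a and alpha come from ordinary least
    squares on log(e - b) = log(a) - alpha * log(n). The candidate with
    the smallest squared error in the original scale is returned.
    """
    log_n = [math.log(n) for n in sizes]
    mean_x = sum(log_n) / len(log_n)
    sxx = sum((x - mean_x) ** 2 for x in log_n)
    e_min = min(errors)
    best = None
    for k in range(grid_steps):
        b = e_min * k / grid_steps
        log_e = [math.log(e - b) for e in errors]
        mean_y = sum(log_e) / len(log_e)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_e))
        slope = sxy / sxx
        a = math.exp(mean_y - slope * mean_x)
        alpha = -slope
        sse = sum((a * n ** (-alpha) + b - e) ** 2
                  for n, e in zip(sizes, errors))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    _, a, alpha, b = best
    return a, alpha, b


def samples_needed(target_error, a, alpha, b):
    """Invert the fitted curve: smallest n with predicted error <= target.

    Returns None when the target lies at or below the asymptote b,
    because the fitted curve never reaches it.
    """
    if target_error <= b:
        return None
    return math.ceil(((target_error - b) / a) ** (-1.0 / alpha))


def permutation_p_value(labels, error_fn, n_permutations=1000, seed=0):
    """Estimate how often shuffled labels do at least as well as the real ones.

    error_fn(labels) is a caller-supplied (hypothetical) function that
    trains and evaluates a classifier on the given labels and returns an
    error rate. The +1 terms keep the estimate away from exactly zero.
    """
    rng = random.Random(seed)
    observed = error_fn(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if error_fn(shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)
```

In use, one would measure cross-validated error at several training-set sizes, pass those pairs to fit_inverse_power_law, and then call samples_needed with a target error rate to extrapolate how many samples a future experiment might require. Such an extrapolation is only as reliable as the fitted curve, so it is best read as a rough planning figure.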