Support vector machine classification and validation of cancer tissue samples using microarray expression data
T S Furey et al. Bioinformatics. 2000 Oct.
Abstract
Motivation: DNA microarray experiments, which generate thousands of gene expression measurements, are being used to gather information from tissue and cell samples regarding gene expression differences that will be useful in diagnosing disease. We have developed a new method to analyse this kind of data using support vector machines (SVMs). This analysis consists of both classification of the tissue samples and an exploration of the data for mislabeled or questionable tissue results.
Results: We demonstrate the method in detail on samples consisting of ovarian cancer tissues, normal ovarian tissues, and other normal tissues. The dataset consists of expression experiment results for 97,802 cDNAs for each tissue. As a result of computational analysis, a tissue sample is discovered and confirmed to be wrongly labeled. Upon correction of this mistake and the removal of an outlier, perfect classification of tissues is achieved, but not with high confidence. We identify and analyse a subset of genes from the ovarian dataset whose expression is highly differentiated between the types of tissues. To show robustness of the SVM method, two previously published datasets from other types of tissues or cells are analysed. The results are comparable to those previously obtained. We show that other machine learning methods also perform comparably to the SVM on many of those datasets.
Availability: The SVM software is available at http://www.cs.columbia.edu/~bgrundy/svm.
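The Results mention identifying a subset of genes whose expression is highly differentiated between tissue types. The abstract does not say how genes were scored, so the following is only a minimal sketch under an assumed criterion: a signal-to-noise score, |mean(+) - mean(-)| / (stdev(+) + stdev(-)), computed per gene over the two classes. The function names, gene names, and toy expression values are hypothetical and are not taken from the paper or its software.

```python
from statistics import mean, stdev


def signal_to_noise(pos, neg):
    """Assumed per-gene score: |mean(pos) - mean(neg)| / (stdev(pos) + stdev(neg))."""
    gap = abs(mean(pos) - mean(neg))
    spread = stdev(pos) + stdev(neg)
    if spread == 0:
        # Constant within both classes: perfectly separating if the means differ.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_genes(expression, labels, top_k):
    """Rank genes by the assumed score.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of +1 / -1 class labels, aligned with the samples
    Returns (names of the top_k genes, dict of all scores).
    """
    scores = {}
    for gene, values in expression.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == -1]
        scores[gene] = signal_to_noise(pos, neg)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: three "cancer" (+1) and three "normal" (-1) samples.
labels = [1, 1, 1, -1, -1, -1]
expression = {
    "gene_a": [9.0, 10.0, 11.0, 1.0, 2.0, 3.0],  # strongly differentiated
    "gene_b": [5.0, 6.0, 7.0, 4.0, 6.0, 8.0],    # same mean in both classes
    "gene_c": [2.0, 4.0, 6.0, 5.0, 7.0, 9.0],    # weakly differentiated
}

top, scores = rank_genes(expression, labels, top_k=2)
print(top)  # ['gene_a', 'gene_c']
```

In a setting like the one described, the top-ranked genes would then be passed to the classifier in place of the full set of measurements; whether that matches the authors' exact procedure cannot be determined from the abstract alone.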