Exploring the performance of feature selection method using breast cancer dataset

Abstract

Breast cancer is the most common type of cancer, occurring mostly in females. In recent years, many researchers have devoted effort to automating the diagnosis of breast cancer by developing different machine learning models. However, the quality and quantity of features in a breast cancer diagnostic dataset have a significant effect on the accuracy and efficiency of a predictive model. Feature selection is an effective method for reducing dimensionality and improving the accuracy of a predictive model. Its purpose is to determine the features required for training the model and to remove irrelevant and duplicate features. A duplicate feature is one that is highly correlated with another feature. The objective of this study is to conduct experimental research on three different feature selection methods for breast cancer prediction. Sequential, embedded and chi-square feature selection are implemented using a breast cancer diagnostic dataset. The study compares the performance of sequential embedde...
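
The abstract defines a duplicate feature as one that is highly correlated with another feature. A minimal sketch of that idea follows, assuming Pearson correlation as the measure and a cut-off of 0.95 on its absolute value. The function names, the threshold and the toy values are all hypothetical and are not taken from the study; this is not one of the three methods the study compares.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def drop_duplicate_features(columns, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept.

    columns: dict mapping feature name -> list of values (scanned in insertion order).
    threshold: assumed cut-off on |r|; not a value reported in the study.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy values made up for illustration; they are not from the breast cancer dataset.
toy = {
    "radius": [1.0, 2.0, 3.0, 4.0, 5.0],
    "perimeter": [2.1, 4.0, 6.2, 8.1, 9.9],  # nearly a multiple of radius
    "texture": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = drop_duplicate_features(toy)
print(selected)
```

Because the scan is greedy, the feature that appears first is the one that survives from each correlated pair, so the column order affects the result.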

Biographies of authors

Tsehay Admassu Assegie is a Lecturer at the College of Natural & Computational Science, Injibara University, Ethiopia. He holds an M.Sc. degree in Computer Science. His research areas are machine learning, medical image analysis and pattern recognition. He has published over 26 research articles in refereed and Scopus-indexed international journals. He can be contacted at email: tsehayadmassu@inu.edu.et.

Dr. Ravulapalli Lakshmi Tulasi is currently working as a Professor in the Department of Computer Science and Engineering, R.V.R & J.C College of Engineering, Guntur, Andhra Pradesh, India. Her research interests include Machine Learning, Data Mining, Information Retrieval Systems, and Semantic Web. She can be contacted at email: rtulasi.2002@gmail.com.

Vadivel Elanangai is currently working as an Assistant Professor in the Department of Electrical and Electronics Engineering at St. Peter's Institute of Higher Education and Research, Avadi, Chennai. She has 11 years of teaching experience and is currently doing research in Image Processing. Her current research interests include Image Processing, VLSI Design, Fuzzy Logic, and Artificial Neural Networks. She has also published research papers in reputed journals and conference proceedings. She can be contacted at email: elanagai123@gmail.com.

Napa Komal Kumar is currently working as an Assistant Professor in the Department of Computer Science and Engineering at St. Peter's Institute of Higher Education and Research, Avadi, Chennai. His research interests include Machine Learning, Data Mining, and Cloud Computing. He can be contacted at email: komalkumarnapa@gmail.com.