Diagnosis of Breast Cancer on Decision Tree Modeling with R Packages

Classification of Breast Cancer Tissues using Decision Tree Algorithms

Nowadays, healthcare data are enormous, composite, and diverse because they contain data of different types, and extracting knowledge from these data is essential. For this purpose, data mining techniques can be used to mine knowledge by building models from healthcare datasets. At present, the classification of breast cancer patients is a demanding research challenge for many researchers. To build a classification model for cancer patients, we used four classification algorithms, J48, REPTree, RandomForest, and RandomTree, and tested them on a dataset taken from UCI. The main aim of this paper is to classify each patient as benign (not cancer) or malignant (cancer), based on diagnostic measurements included in the dataset.

Breast Cancer Classification using Decision Tree Algorithms

International Journal of Advanced Computer Science and Applications

Cancer is a major health issue that affects individuals all over the world. This disease has claimed the lives of many people and will continue to do so in the future. Breast cancer has recently surpassed cervical cancer as the most frequent cancer among women in both industrialized and developing countries, and it is now the second leading cause of cancer mortality among women. A high number of women die each year as a result of this disease. Breast cancer is significantly easier to treat if caught early. This paper introduces a decision tree-based data mining technique for early detection of breast cancer with the highest accuracy, which helps patients to recover. Breast growths are classed as benign (unable to penetrate surrounding tissue) or malignant (able to infiltrate adjacent tissue). Two tests were included in the review. The primary study uses 10 breast cancer samples from the Kaggle archive, whereas the follow-up study uses 286 breast cancer samples from the same pool. The Decision Tree's accuracy in the first trial was 100%, while it was 97.9% in the follow-up inquiry. These findings justify the use of the proposed machine learning-based Decision Tree classifier in pre-evaluating patients for triage and decision-making prior to the availability of data.

Diagnosis of Breast Cancer using Decision Tree Data Mining Technique

Cancer is a major problem all around the world. It is a disease that is fatal in many cases; it has affected the lives of many and will continue to affect many more. Breast cancer is the second leading cause of cancer deaths in women today, and in recent years it has become the most common cancer among women in both the developed and the developing world. 40,000 women die from this disease each year, which is one woman every 13 minutes.

Comparison of decision tree methods for breast cancer diagnosis

2013

In almost all parts of the world, breast cancer is one of the major causes of death among women. At the same time, it is one of the most curable cancers if it is diagnosed at an early stage. This paper tries to find a model that diagnoses and classifies breast cancer with high accuracy and that will help both patients and doctors in the future. Here we present several different decision tree methods for classifying breast cancer with high accuracy. The results achieved in this research are very promising (accuracy is 96.49%), particularly compared to previous research in which decision tree techniques were used. The Breast Cancer Wisconsin (Original) dataset was used as the benchmark test.

Performance Analysis of Breast Cancer Classification Using Decision Tree Classifiers

International Journal of Current Pharmaceutical Research, 2017

Breast cancer is one of the most dangerous cancers among women above 35 years of age worldwide. The breast is made up of lobules that secrete milk and thin milk ducts that carry milk from the lobules to the nipple. Breast cancer mostly occurs either in the lobules or in the milk ducts. The most common type of breast cancer is ductal carcinoma, which starts in the ducts and spreads across the lobules and surrounding tissues. According to the medical survey, about 125.0 new cases of breast cancer per 100,000 women are diagnosed each year in the United States, and 21.5 per 100,000 women die of this disease. Also, 246,660 new cases of women with cancer are estimated for the year 2016. Early diagnosis of breast cancer is a key factor for the long-term survival of cancer patients. Classification plays an important role in breast cancer detection and is used by researchers to analyse and classify medical data. In this research work, a priority-based decision tree classifier algorithm has been implemented for Wisconsin Brea...

Classification and feature selection of breast cancer data based on decision tree algorithm

Studies in Informatics and Control, 2003

Medical information systems have received a lot of research attention in the past. As a result of advances in hardware and software technologies, the nature of medical information systems has changed from performing only record-keeping functions to more decision-making-oriented functionalities. Large collections of medical data are a valuable resource from which potentially new and useful knowledge can be discovered through data mining. Data mining is an increasingly popular field that uses statistical, visualization, machine learning, and other data manipulation and knowledge extraction techniques to gain insight into the relationships and patterns hidden in the data. It is very useful if the results of data mining can be communicated to humans in an understandable way. In this paper, we introduce an efficient symbolic machine learning algorithm to identify the important breast cancer attributes needed for interpretation. The proposed technique is based on an inductive decision tree learning algorithm that has low complexity with high transparency and accuracy. In addition, among all features, we use only the subset of features that leads to the best performance. The proposed technique is evaluated using real data of 699 samples for building the decision tree. Evaluation shows that the ratio of correct classification of new cases is high.
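As an illustration of how an inductive decision tree learner can rank attributes, the sketch below scores features by information gain. The abstract does not name the split criterion the authors used, so information gain (as in ID3-style learners) is an assumption here, and the feature names and toy records are hypothetical values made up for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records (cell_size, cell_shape); not taken from any real dataset.
rows = [("large", "irregular"), ("large", "regular"), ("small", "regular"),
        ("small", "irregular"), ("large", "irregular"), ("small", "regular")]
labels = ["malignant", "malignant", "benign", "benign", "malignant", "benign"]

feature_names = ["cell_size", "cell_shape"]
gains = {name: information_gain(rows, labels, i)
         for i, name in enumerate(feature_names)}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

A tree learner of this kind would split on the highest-scoring feature first and could discard features whose gain stays near zero, which is one simple way to arrive at a reduced feature subset.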

A Novel Approach to Perform Analysis and Prediction on Breast Cancer Dataset using R

International Journal of Grid and Distributed Computing, 2018

Screening affects the cancer mortality rate by decreasing the number of advanced cancers with poor diagnosis, while cancer treatment works by decreasing the case-fatality rate. The prediction of breast cancer survivability has been a challenging research problem for many researchers. The objective of this research work is to propose a novel model that can analyze breast cancer data and make efficient predictions. The contributions of this paper are as follows. We collected three different datasets from the UCI Machine Learning repositories. We propose an approach in which a detailed comparison is made between feature selection algorithms. We trained models on the datasets using the Decision Tree, Random Forest, and Support Vector Machine (SVM) machine learning algorithms. An attempt was made to understand the impact of the model selection metric in predicting different classes of breast cancer. The results indicated that the Random Forest is the best predictor with 0.98 accuracy on the holdout sample, SVM came second with 0.97 accuracy, the Decision Tree came out with 0.96, and the worst of the four was the condition tree with 0.95 accuracy. Finally, we performed prediction using a Neural Network with three hidden layers and measured its efficiency using Root Mean Square Error (RMSE) along with its variations.
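For readers unfamiliar with the RMSE measure mentioned above, a minimal sketch follows. The labels and predicted scores are invented for illustration and are not results from the paper; the function name rmse is likewise an assumption.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Square Error between true values and predictions."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values: 1 = malignant, 0 = benign, with made-up network outputs.
actual = [1, 0, 1, 1, 0]
predicted = [0.9, 0.2, 0.7, 0.6, 0.1]
score = rmse(actual, predicted)
print(round(score, 4))
```

Lower values indicate predictions closer to the true labels; a perfect predictor would score 0.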

Decision Tree Algorithms for Predictive Modeling in Breast Cancer Treatment

2022 IEEE 2nd International Maghreb Meeting of the Conference on Sciences and Techniques of Automatic Control and Computer Engineering (MI-STA), 2022

Successful treatment of breast cancer increases the chance of survival among women. Machine learning can support the discovery of knowledge and important patterns in medical data through a good predictive model. The study aims to identify an effective predictive model for the primary treatment of breast cancer after diagnosis. The dataset was collected from the records of patients with malignant disease at the Benghazi Medical Center. The decision tree algorithms J48, CART, and Random Forest were used to build three models with the WEKA software. The dataset was divided into two subsets, training and testing. Prediction accuracy, sensitivity, specificity, area under the curve, the Kappa statistic, and Mean Absolute Error were used to compare the models' performance. The study results showed that Random Forest was the better model for both the training data and the test data compared to the J48 and CART models. The Random Forest model could also provide good information and is able to recognize important patterns. This study concluded that Random Forest might be a good model for predicting breast cancer treatment.
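Several of the comparison metrics listed above can be derived from a binary confusion matrix. The sketch below shows the standard formulas for accuracy, sensitivity, specificity, and Cohen's kappa; the counts are hypothetical and the function name binary_metrics is an assumption, not part of WEKA or the study.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, and Cohen's kappa from confusion counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    # Agreement expected by chance, from the row and column marginals.
    expected = (((tp + fn) / total) * ((tp + fp) / total)
                + ((fp + tn) / total) * ((fn + tn) / total))
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "kappa": kappa}


# Hypothetical counts for a 100-case test set; not results from the study.
metrics = binary_metrics(tp=45, fn=5, fp=10, tn=40)
print(metrics)
```

Kappa corrects accuracy for chance agreement, which is why it is often reported next to accuracy when class proportions differ.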

Predicting Breast Cancer Treatment Using Decision Tree Algorithms and Statistical Metrics

Background: In both developed and developing countries, breast cancer is the most common cancer in women. It is also the second leading cause of cancer death in women. However, appropriate and successful treatment has a positive effect on the survival rate of a patient with cancer, according to WHO's report in 2016. Classification algorithms are frequently used to analyze breast cancer data and make predictions. The main objective of this research is to identify the best prediction model for breast cancer treatment by using decision tree algorithms. Materials and Methods: Data such as patients' ages and stages of the disease were collected from patients' records at the BMC. The dataset comprised 336 malignant cases and 10 features. Three decision tree algorithms were used to develop breast cancer prediction models: J48, CART, and Random Forest were used as classifiers. This research used WEKA software to build and evaluate the models. Statistical performance metrics such as Classifier Accuracy, Kappa Statistic, and ROC Curves were used to evaluate the models. Results: Experimental results showed the effectiveness of all models, but the Random Forest classifier performed better with the training dataset. The results also showed that the sensitivity, specificity, ROC area, and classification accuracy of the Random Forest model were above 91% for the test dataset. It also had the highest value of the Kappa Statistic and the lowest value of Mean Absolute Error. Conclusion: The research concluded that the Random Forest algorithm was the best predictive model for both the training dataset and the test dataset in this research.
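The ROC area reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; this is a generic illustration with invented scores, not WEKA's implementation, and the function name roc_auc is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores; 1 marks the positive class.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of cases, which is acceptable for a dataset of a few hundred patients but would normally be replaced by a rank-based computation on larger data.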

A Support Vector Machine and Decision Tree Based Breast Cancer Prediction

International Journal of Engineering and Advanced Technology

The first step in the diagnosis of breast cancer is the identification of the disease. Early detection of breast cancer is important for reducing the mortality rate due to breast cancer. Machine learning algorithms can be used to identify breast cancer. Supervised machine learning algorithms such as the Support Vector Machine (SVM) and the Decision Tree are widely used in classification problems, such as the identification of breast cancer. In this study, a machine learning model is proposed that employs two learning algorithms, namely the support vector machine and the decision tree. A dataset from the Kaggle data repository consisting of 569 malignant and benign observations is used to develop the proposed model. Finally, the model is evaluated on the test set using accuracy, the confusion matrix, precision, and recall as performance metrics. The analysis result showed that the support vector machine (SVM) has better accuracy and fewer misclassificatio...