Rough Set Approach for Generation of Classification Rules of Breast Cancer Data
Related papers
Journal of the American society for …, 2004
Rough set theory is a relatively new intelligent technique used in the discovery of data dependencies; it evaluates the importance of attributes, discovers the patterns of data, reduces all redundant objects and attributes, and seeks the minimum subset of attributes. Moreover, it is being used for the extraction of rules from databases. In this paper, we present a rough set approach to attribute reduction and generation of classification rules from a set of medical datasets. For this purpose, we first introduce a rough set reduction technique to find all reducts of the data that contain the minimal subset of attributes associated with a class label for classification. To evaluate the validity of the rules based on the approximation quality of the attributes, we introduce a statistical test to evaluate the significance of the rules. Experimental results from applying the rough set approach to the set of data samples are given and evaluated. In addition, the rough set classification accuracy is also compared to the well-known ID3 classifier algorithm. The study showed that the theory of rough sets is a useful tool for inductive learning and a valuable aid for building expert systems.
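The reduct search described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is a brute-force search over a hypothetical toy decision table (the names `ROWS`, `positive_region` and `all_reducts` are illustrative assumptions), and it treats a reduct as a minimal attribute subset that preserves the positive region of the full attribute set.

```python
from itertools import combinations

# Hypothetical toy decision table: (condition attribute values, decision).
ROWS = [
    ({"a": 0, "b": 0, "c": 1}, "benign"),
    ({"a": 0, "b": 1, "c": 1}, "malignant"),
    ({"a": 1, "b": 0, "c": 0}, "benign"),
    ({"a": 1, "b": 1, "c": 0}, "malignant"),
]


def positive_region(rows, attrs):
    """Indices of objects whose indiscernibility class has a single decision."""
    blocks = {}
    for i, (cond, _) in enumerate(rows):
        blocks.setdefault(tuple(cond[a] for a in attrs), []).append(i)
    pos = set()
    for members in blocks.values():
        if len({rows[i][1] for i in members}) == 1:
            pos.update(members)
    return pos


def all_reducts(rows, attrs):
    """Brute-force search for minimal subsets preserving the positive region."""
    target = positive_region(rows, attrs)
    reducts = []
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            # A superset of a known reduct cannot be minimal.
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if positive_region(rows, subset) == target:
                reducts.append(subset)
    return reducts


print(all_reducts(ROWS, ["a", "b", "c"]))
```

In this toy table attribute `b` alone determines the decision, so it is the only reduct. The exhaustive search is exponential in the number of attributes and is meant only to make the definition concrete.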
Rough Set Theory and Decision Rules in Data Analysis of Breast Cancer Patients
Lecture Notes in Computer Science, 2004
In this paper, an approach based on rough set theory and the induction of decision rules is applied to analyse relationships between condition attributes describing breast cancer patients and their treatment results. The data set contains 228 breast cancer patients described by 16 attributes and is divided into two classes: the 1st class comprises patients who had not experienced cancer recurrence; the 2nd class comprises patients who had cancer recurrence. In the first phase of the analysis, the rough set based approach is applied to determine attribute importance for the patients' classification. A set of selected attributes that ensured high quality of classification was obtained. Then, decision rules were generated by means of an algorithm inducing the minimal cover of the learning examples. The usefulness of these rules for predicting therapy results was evaluated by means of cross-validation. Moreover, the syntax of selected rules was interpreted by physicians. Proceeding in this way, they formulated some indications which may be helpful in making decisions about the treatment of breast cancer patients. To sum up, this paper presents a case study of applying rough set theory to analyse medical data.
Hybrid system based on rough sets and genetic algorithms for medical data classifications
2013
Computational intelligence provides significant support to the biomedical domain. The application of machine learning techniques in medicine has evolved from physicians' needs. Screening, medical imaging, pattern classification, and prognosis are some examples of health care support systems. Medical data typically has its own characteristics, such as large size and many features, and continuous, real-valued attributes that describe patients' investigations. Therefore, discretization and feature selection are considered key issues in improving the knowledge extracted from patients' investigation records. In this paper, a hybrid system that integrates Rough Sets (RS) and a Genetic Algorithm (GA) is presented for the efficient classification of medical data sets of different sizes and dimensionalities. The Genetic Algorithm is applied to reduce the dimension of the medical data sets, and RS decision rules are used for efficient classification. Furthermore, the proposed system applies Entropy Gain Information (EI) for the discretization process. Four biomedical data sets are tested by the proposed system (EI-GA-RS), and the highest score was obtained on three different data sets. Other hybrid techniques matched the highest accuracy of the proposed technique, but the proposed system remains among the highest-scoring systems for three different data sets. EI as a discretization technique is also a common component of the best results on the mentioned data sets, while RS as an evaluator achieved the best results on three different data sets.
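As a rough illustration of entropy-based discretization, the sketch below picks the single binary cut point on a continuous attribute that maximizes information gain. This is one common formulation and is assumed here for illustration only; the abstract does not specify the exact EI procedure, and the function names `entropy` and `best_cut` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut_point, gain) for the binary split with the highest gain.

    Candidate cuts are midpoints between consecutive distinct sorted values.
    Returns (None, 0.0) when no split improves on the unsplit entropy.
    """
    pairs = sorted(zip(values, labels))
    base = entropy([label for _, label in pairs])
    best_point, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no boundary between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best_gain:
            best_point, best_gain = (low + high) / 2, gain
    return best_point, best_gain


# Hypothetical measurements with two classes.
point, gain = best_cut([1.0, 2.0, 3.0, 4.0], ["n", "n", "y", "y"])
print(point, gain)
```

Applying the same search recursively to each resulting interval would yield a multi-interval discretization; a stopping criterion would be needed, and none is assumed here.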
Application of Rough Set Theory in Medical Health Care Data Analytics
International Journal of Advanced Science and Technology, 2019
Rough Set Theory (RST) is a mathematical tool used to deal with vague, imprecise, inconsistent and uncertain knowledge. RST-based research has been applied in machine learning, inductive reasoning, decision support systems and knowledge discovery applications. Popular methods such as finding reducts and the core, and feature selection and reduction through the concept of approximations, have attracted researchers to apply RST further to high-dimensional data such as social networks, IoT applications and Big Data analytics. In this article we attempt to summarize the basic concepts and characteristics of RST, some evolutionary extensions of RST, and applications limited to medical data analysis. With learners in mind, a survey of RST-based software tools and packages is outlined along with their functionalities. The article also identifies the importance of RST in the domain of medical or clinical data analytics, and exhibits the strengths and limitations of the respective underlying approaches.
Applications of Rough Sets in Health Sciences and Disease Diagnosis
2015
Soft computing is a consortium of techniques that work together to set up a flexible information processing capability for handling real-life ambiguous situations. It aims at solving problems involving uncertainty and imprecision by mimicking human-like decision making. Fuzzy set theory is an approach that has been widely adopted in such situations. Rough Set Theory (RST) is another soft computing approach that uses sets to represent vague or incomplete knowledge and provides a framework for the approximation of concepts. It has been widely used to deal with imprecision in health sciences, such as in patient diagnosis and disease classification. In this paper we present a review of rough set theory and its applications in disease diagnosis with several examples using real data sets. Key-Words: Rough Set Theory, Soft Computing, Vague data, Imprecision, Health Sciences, Disease diagnosis
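The approximation of concepts mentioned in this abstract can be sketched as follows. The patient table, attribute names and the functions `partition` and `approximations` are hypothetical and chosen only to illustrate the standard lower and upper approximation definitions; they do not come from the paper.

```python
def partition(objects, attrs):
    """Group object names into indiscernibility classes over attrs."""
    blocks = {}
    for name, description in objects.items():
        key = tuple(description[a] for a in attrs)
        blocks.setdefault(key, set()).add(name)
    return list(blocks.values())


def approximations(objects, attrs, target):
    """Return (lower, upper) approximations of the target set."""
    lower, upper = set(), set()
    for block in partition(objects, attrs):
        if block <= target:   # block lies entirely inside the concept
            lower |= block
        if block & target:    # block overlaps the concept
            upper |= block
    return lower, upper


# Hypothetical patients described by two symptoms.
PATIENTS = {
    "p1": {"fever": "yes", "cough": "yes"},
    "p2": {"fever": "yes", "cough": "yes"},
    "p3": {"fever": "no", "cough": "yes"},
    "p4": {"fever": "no", "cough": "no"},
}
SICK = {"p1", "p3"}  # the concept to approximate

lower, upper = approximations(PATIENTS, ["fever", "cough"], SICK)
print(sorted(lower), sorted(upper))
```

Here `p1` and `p2` share the same symptoms but only `p1` belongs to the concept, so both fall in the boundary region (upper minus lower), while `p3` is certainly in the concept.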
Knowledge Mining from Clinical Datasets Using Rough Sets and Backpropagation Neural Network
Computational and Mathematical Methods in Medicine, 2015
The availability of clinical datasets and knowledge mining methodologies encourages researchers to pursue research in extracting knowledge from clinical datasets. Different data mining techniques have been used for mining rules, and mathematical models have been developed to assist the clinician in decision making. The objective of this research is to build a classifier that will predict the presence or absence of a disease by learning from the minimal set of attributes that has been extracted from the clinical dataset. In this work, the rough set indiscernibility relation method with a backpropagation neural network (RS-BPNN) is used. This work has two stages. The first stage is the handling of missing values to obtain a smooth data set and the selection of appropriate attributes from the clinical dataset by the indiscernibility relation method. The second stage is classification using a backpropagation neural network on the selected reducts of the dataset. The classifier has been tested with hepat...
A Framework for Intelligent Medical Diagnosis Using Rough Set with Formal Concept Analysis
International Journal of Artificial Intelligence & Applications, 2011
Medical diagnosis processes vary in the degree to which they attempt to deal with different complicating aspects of diagnosis, such as the relative importance of symptoms, varied symptom patterns and the relations between the diseases themselves. Based on decision theory, many mathematical models such as crisp sets, probability distributions, fuzzy sets and intuitionistic fuzzy sets were developed in the past to deal with the complicating aspects of diagnosis. However, many such models fail to include important aspects of expert decisions. Therefore, Pawlak made an effort to process inconsistencies in data with the introduction of rough set theory. Although rough set theory has major advantages over the other methods, it generates too many rules, which creates difficulties when making decisions. Therefore, it is essential to minimize the decision rules. In this paper, we use two processes, a pre-process and a post-process, to mine suitable rules and to explore the relationships among the attributes. In the pre-process we use rough set theory to mine suitable rules, whereas in the post-process we apply formal concept analysis to these suitable rules to explore better knowledge and the most important factors affecting decision making.
A combined data mining approach using rough set theory and case-based reasoning in medical datasets
Decision Science Letters, 2014
Case-based reasoning (CBR) is the process of solving new cases by retrieving the most relevant ones from an existing knowledge base. Since irrelevant or redundant features remarkably increase not only memory requirements but also the time complexity of case retrieval, reducing the number of dimensions is an issue worth considering. This paper uses rough set theory (RST) to reduce the number of dimensions in a CBR classifier with the aim of increasing accuracy and efficiency. CBR exploits a distance-based co-occurrence of categorical data to measure the similarity of cases. This distance is based on the proportional distribution of different categorical values of features. The weight used for a feature is the average of the co-occurrence values of the features. The combination of RST and CBR has been applied to real categorical datasets of Wisconsin Breast Cancer, Lymphography, and Primary cancer. The 5-fold cross-validation method is used to evaluate the performance of the proposed approach. The results show that this combined approach lowers computational costs and improves performance metrics including accuracy and interpretability compared to other approaches developed in the literature.
Rough Set Approach in Machine Learning: A Review
International Journal of Computer Applications, 2012
Rough Set (RS) theory can be considered a tool to reduce input dimensionality and to deal with vagueness and uncertainty in datasets. Over the years, there has been a rapid growth of interest in rough set theory and its applications in artificial intelligence and cognitive sciences, especially in research areas such as machine learning, intelligent systems, inductive reasoning, pattern recognition, data preprocessing, knowledge discovery, decision analysis, and expert systems. This paper discusses the basic concepts of rough set theory and points out some rough set-based research directions and applications. The discussion also includes a review of rough set theory in various machine learning techniques such as clustering, feature selection and rule induction.
Rough Sets in Medical Informatics Applications
2009
Rough sets offer an effective approach to managing uncertainties and can be employed for tasks such as data dependency analysis, feature identification, dimensionality reduction, and pattern classification. As these tasks are common in many medical applications, it is only natural that rough sets, despite their relative 'youth' compared to other techniques, provide a suitable method in such applications.