New learning models for generating classification rules based on rough set approach

A Rough Set Based Classification Model for the Generation of Decision Rules

International Journal of Database Theory and Application, 2014

This paper addresses an important classification task in the analysis of the huge amounts of data stored in databases and other repositories. Numerous classification models are available in the literature to predict the class of objects whose class label is unknown, but most of these models are not capable of handling imperfect data. In view of this, the present paper proposes a new rough set based classification model for deriving classification (IF-THEN) rules. The developed model is applied to a bank-loan applications database to classify each application as safe, unsafe, or risky; however, the proposed model can also be used to analyze data from other domains.

Rough Set Approach in Machine Learning: A Review

International Journal of Computer Applications, 2012

The Rough Set (RS) theory can be considered as a tool to reduce the input dimensionality and to deal with vagueness and uncertainty in datasets. Over the years, there has been a rapid growth in interest in rough set theory and its applications in artificial intelligence and cognitive sciences, especially in research areas such as machine learning, intelligent systems, inductive reasoning, pattern recognition, data preprocessing, knowledge discovery, decision analysis, and expert systems. This paper discusses the basic concepts of rough set theory and points out some rough set-based research directions and applications. The discussion also includes a review of rough set theory in various machine learning techniques like clustering, feature selection and rule induction.

Learning rules from incomplete training examples by rough sets

Expert Systems with Applications, 2002

Machine learning can extract desired knowledge from existing training examples and ease the development bottleneck in building expert systems. Most learning approaches derive rules from complete data sets. If some attribute values are unknown in a data set, it is called incomplete. Learning from incomplete data sets is usually more difficult than learning from complete data sets. In the past, the rough-set theory was widely used in dealing with data classification problems. In this paper, we deal with the problem of producing a set of certain and possible rules from incomplete data sets based on rough sets. A new learning algorithm is proposed, which can simultaneously derive rules from incomplete data sets and estimate the missing values in the learning process. Unknown values are first assumed to be any possible values and are gradually refined according to the incomplete lower and upper approximations derived from the given training examples. The examples and the approximations then interact with each other to derive certain and possible rules and to estimate appropriate unknown values. The rules derived can then serve as knowledge concerning the incomplete data set.
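The refinement idea above — treating an unknown value as compatible with anything and classifying objects by whether their whole similarity class agrees on the decision — can be sketched with a simple tolerance relation. The names and toy table below are illustrative assumptions, not the paper's actual algorithm:

```python
def similar(x, y, attrs):
    """Tolerance relation: an unknown value "?" matches any value."""
    return all(x[a] == y[a] or "?" in (x[a], y[a]) for a in attrs)

def certain_and_possible(table, attrs, decision, target):
    """Objects whose whole tolerance class has the target decision support
    certain rules; objects whose class merely overlaps it support possible rules."""
    certain, possible = [], []
    for oid, obj in table.items():
        cls = [o for o in table.values() if similar(obj, o, attrs)]
        decisions = {o[decision] for o in cls}
        if decisions == {target}:
            certain.append(oid)
        elif target in decisions:
            possible.append(oid)
    return certain, possible

# Toy incomplete decision table: "?" marks a missing value.
table = {
    1: {"fever": "yes", "cough": "?",   "flu": "yes"},
    2: {"fever": "yes", "cough": "yes", "flu": "yes"},
    3: {"fever": "no",  "cough": "yes", "flu": "no"},
}
c, p = certain_and_possible(table, ["fever", "cough"], "flu", "yes")
# c == [1, 2]: each tolerance class contains only flu=yes objects,
# so both objects yield certain rules; p == [] here.
```

In the paper's algorithm the unknown values are then progressively refined against these approximations; this sketch stops at the first classification step.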

Application of Rough Set Theory in Data Mining

Rough set theory is a relatively new method for dealing with the vagueness and uncertainty emphasized in decision making. Data mining is a discipline that makes an important contribution to data analysis, the discovery of new meaningful knowledge, and autonomous decision making, and rough set theory offers a viable approach for extracting decision rules from data. This paper introduces the fundamental concepts of rough set theory and other aspects of data mining, with a discussion of data representation in rough set theory, including pairs of attribute-value blocks, information tables, reducts, the indiscernibility relation, and decision tables. Additionally, the rough set notions of lower and upper approximations and of certain and possible rule sets are introduced. Finally, some applications of data mining systems based on rough set theory are described.
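The lower and upper approximations mentioned above, and the indiscernibility relation they rest on, can be computed directly from a decision table. A minimal Python sketch, with an assumed toy table rather than anything from the paper:

```python
from itertools import groupby

def indiscernibility_classes(table, attrs):
    """Partition object ids into classes with identical values on attrs."""
    key = lambda oid: tuple(table[oid][a] for a in attrs)
    ids = sorted(table, key=key)                 # groupby needs sorted input
    return [set(g) for _, g in groupby(ids, key=key)]

def approximations(table, attrs, target):
    """Lower approximation: classes wholly inside target (certain membership).
    Upper approximation: classes overlapping target (possible membership)."""
    lower, upper = set(), set()
    for cls in indiscernibility_classes(table, attrs):
        if cls <= target:
            lower |= cls
        if cls & target:
            upper |= cls
    return lower, upper

table = {
    1: {"temp": "high", "flu": "yes"},
    2: {"temp": "high", "flu": "no"},
    3: {"temp": "low",  "flu": "no"},
}
flu_cases = {oid for oid in table if table[oid]["flu"] == "yes"}
low, up = approximations(table, ["temp"], flu_cases)
# low == set(): objects 1 and 2 are indiscernible on temp, so nothing is certain;
# up == {1, 2}: both possibly have flu. The gap between them is the boundary region.
```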

Rough Set Algorithms in Classification Problem

Studies in Fuzziness and Soft Computing, 2000

In the paper we present some algorithms, based on rough set theory, that can be used for classifying new cases. Most of the algorithms were implemented and included in the Rosetta system [43]. We present several methods for computing decision rules based on reducts. We discuss the problem of discretizing real-valued attributes to increase the performance of the algorithms and the quality of the decision rules. Finally, we deal with the problem of resolving conflicts between decision rules that classify a new case into different categories (classes).
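Real-valued discretization of the kind discussed here usually begins by generating candidate cut points between consecutive attribute values whose decision classes differ; a subsequent step then picks a minimal subset of those cuts. The first step can be sketched as follows (illustrative only, not Rosetta's implementation):

```python
def candidate_cuts(values, labels):
    """Midpoints between consecutive sorted values whose class labels differ:
    the usual initial cut set for decision-aware discretization."""
    pairs = sorted(zip(values, labels))
    cuts = []
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 != l2 and v1 != v2:
            cuts.append((v1 + v2) / 2)
    return cuts

temps = [36.0, 37.0, 39.0, 40.0]
flu   = ["no", "no", "yes", "yes"]
cuts = candidate_cuts(temps, flu)
# cuts == [38.0]: the only class boundary lies between 37.0 (no) and 39.0 (yes)
```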

A Generic Scheme for Generating Prediction Rules Using Rough Sets

2009

This chapter presents a generic scheme for generating prediction rules for stock market prediction based on the rough set approach. To increase the efficiency of the prediction process, a rough sets with Boolean reasoning discretization algorithm is used to discretize the data. A rough set reduction technique is then applied to find all reducts of the data, each containing a minimal subset of attributes associated with a class label for prediction. Finally, rough set dependency rules are generated directly from all the generated reducts.
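To make the reduct-to-rule step concrete: a reduct is a minimal attribute subset that still discerns every pair of objects with different decisions, and rules are read off directly from the reduct's value combinations. A brute-force illustrative sketch with assumed toy data, not the chapter's scheme:

```python
def discerns_all(table, attrs, decision):
    """True if attrs distinguish every pair of objects with different decisions
    (the defining property a reduct must preserve)."""
    ids = list(table)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if table[a][decision] != table[b][decision] and \
               all(table[a][c] == table[b][c] for c in attrs):
                return False
    return True

def rules_from_reduct(table, reduct, decision):
    """Read distinct IF-THEN rules off the reduct's value combinations."""
    return {(tuple((a, obj[a]) for a in reduct), obj[decision])
            for obj in table.values()}

table = {
    1: {"outlook": "sunny", "windy": "no",  "play": "yes"},
    2: {"outlook": "rain",  "windy": "yes", "play": "no"},
    3: {"outlook": "sunny", "windy": "yes", "play": "yes"},
}
assert discerns_all(table, ["outlook"], "play")   # "outlook" alone suffices
for cond, dec in sorted(rules_from_reduct(table, ["outlook"], "play")):
    print("IF", " AND ".join(f"{a}={v}" for a, v in cond), "THEN play =", dec)
# IF outlook=rain THEN play = no
# IF outlook=sunny THEN play = yes
```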

A new Enhanced Automated Fuzzy-Based Rough Decision Model

In this paper, we introduce a new automated fuzzy-based rough decision model algorithm. Our algorithm consists of three phases: (1) automatic attribute fuzzification; (2) elimination of redundant attributes using rough set theory; (3) generation of fuzzy-rough rules, with automatic calculation of a fitness value (confidence) and support for each rule. In phase one, the user inputs the number of fuzzy sets for each attribute; our algorithm determines the maximum and minimum values of each attribute and automatically calculates the width (∆) that divides the universe of discourse of each attribute into "n" intervals according to the number of fuzzy sets, and then automatically calculates the width (δi) from the width (∆). In phase two, we use rough set techniques to reduce the number of attributes coming from phase one and to produce fuzzy-rough rules. In phase three, our algorithm automatically calculates the confidence (weight, or fitness value) and support of each fuzzy-rough rule, and then the total weight (fitness value) of all linguistic rules. Our algorithm is applied to the dataset of (X. Hu, T. Lin, and J. Han) [ ], and the resulting fitness value is reported.

Rough set theory was first introduced by Pawlak in the early 1980s [ , , , ] and has since been applied in many areas, such as machine learning, knowledge discovery, and expert systems. It provides powerful tools for the analysis and mining of imprecise and ambiguous data, and many rough set models have been developed in the rough set community over the last decades [ , , ]. Applying traditional rough set models to large datasets in data mining applications has exposed one of the strong drawbacks of classical rough set theory: it assumes that all attribute values are discrete. In real-life datasets, attribute values can be both symbolic and real-valued, so the traditional theory has difficulty handling such values.

There is therefore a need for methods capable of computing set approximations and attribute reductions over real-valued datasets. This can be done by combining (integrating) fuzzy sets and rough sets into a fuzzy-based rough model [ , , , , , ]. Another major drawback of the traditional fuzzy-based rough model is that the linguistic values (fuzzy sets) for the numeric values of each attribute must be determined by the membership functions of those linguistic terms, under which every element of the universe of discourse belongs to each fuzzy set with some grade (degree of membership). The user must therefore define the membership-function parameters of these linguistic values from his own viewpoint, which differs from one user to another. We therefore propose a new automated fuzzy-based rough decision model algorithm that defines these membership-function parameters automatically: the user specifies only the number of fuzzy sets (linguistic values); the maximum and minimum values of each attribute are determined automatically; the algorithm calculates the width (∆) that divides the universe of discourse "u" of each attribute into "n" intervals according to the number of fuzzy sets; and it then automatically calculates the width (δi) from the width (∆). A further strong drawback of traditional rough set theory is the inefficiency of rough set methods and algorithms for computing core attributes and reducts and for identifying dispensable attributes, which limits the suitability of the traditional rough set model in data mining applications. Further investigation reveals that most existing rough set models [ , , , ] do not integrate with relational database systems, and many computationally intensive operations are performed on flat files rather than exploiting high-performance database set operations.

Moreover, little attention has been paid to designing new rough set models that effectively combine database technologies to generate core attributes and reducts, so as to make their computation efficient and scalable on large datasets. To overcome this problem, a new rough sets model based on database systems has been introduced [ , ], which redefines some concepts of rough set theory, such as core attributes and reducts, in terms of relational algebra, so that they can be computed with very efficient set-oriented database operations, such as Cardinality (to denote the count) and Projection. The paper is organized as follows: Section gives an overview of rough set theory based on the model proposed by Pawlak [ , ], with some examples. In Section , we give an
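The automatic fuzzification step described in this paper — dividing each attribute's universe of discourse into n intervals of width ∆ computed from the attribute's minimum and maximum — can be sketched as below. The evenly spaced triangular membership functions are our assumption; the paper's exact formulas for ∆ and δi may differ:

```python
def fuzzify_attribute(values, n_sets):
    """Divide [min, max] into evenly spaced triangular fuzzy sets.
    Assumed scheme: width delta = (max - min) / (n_sets - 1), one fuzzy set
    centered on each grid point, membership falling linearly to 0 at distance delta."""
    lo, hi = min(values), max(values)
    delta = (hi - lo) / (n_sets - 1)
    centers = [lo + i * delta for i in range(n_sets)]
    def membership(x, center):
        return max(0.0, 1.0 - abs(x - center) / delta)
    return centers, membership

ages = [20, 35, 50, 65, 80]
centers, mu = fuzzify_attribute(ages, 4)
# centers == [20.0, 40.0, 60.0, 80.0]; an age of 50 belongs equally to the
# second and third fuzzy sets: mu(50, 40.0) == mu(50, 60.0) == 0.5
```

With this scheme the user supplies only the number of fuzzy sets, matching the paper's goal of removing hand-tuned membership parameters.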

Rough Set: Buzzword of Data Classification

2016

Classification is an important data mining technique with broad applications in every walk of life. It is the task of assigning each item in a data set to one of a predefined set of classes or groups. The present study compares the performance of the Naïve Bayes, Random Forest, K-Star, Multilayer Perceptron, and J48 classification algorithms with that of Rough Set Theory. The paper presents experimental results on classification accuracy and shows that the accuracy of Rough Set Theory is better than that of the other algorithms.