Code Slicing to Improve Case Classification Accuracy

An Efficient and Effective Case Classification Method Based On Slicing

ijcim.th.org

One of the most important tasks in real-world applications is classifying particular situations and/or events as belonging to a certain class. To solve the classification problem, accurate classifier systems or models must be built. Several computational intelligence methodologies have been applied to construct such classifiers from particular cases or data. This paper introduces a new classification method based on slicing techniques that were originally proposed for procedural programming languages. The paper also discusses two common classification algorithms used in data mining or in general AI: the Induction of Decision Tree Algorithm (ID3) and the Base Learning Algorithm (C4.5). Finally, the paper compares the proposed method with the two selected classification algorithms across several domains.
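As a rough illustration of the attribute-selection step that ID3 is generally known for, the sketch below computes information gain on a toy case set. The attribute names, the data, and the helper names `entropy` and `information_gain` are hypothetical and are not taken from the paper; the slicing-based method itself is not shown because the abstract does not describe it.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(cases, labels, attribute):
    """Entropy reduction obtained by splitting the cases on one attribute."""
    partitions = {}
    for case, label in zip(cases, labels):
        partitions.setdefault(case[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy cases, not data from the paper.
cases = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

# ID3 picks the attribute with the highest gain as the next split.
best = max(["outlook", "windy"], key=lambda a: information_gain(cases, labels, a))
```

Here `outlook` separates the two classes perfectly while `windy` carries no information, so `outlook` would be chosen as the root split.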

Classification by similarity: An overview of statistical methods of case-based reasoning

Computers in Human Behavior, 1995

There has recently been a great deal of interest in case-based reasoning, the generation of solutions to new problems using methods which have served for similar problems in the past. Much of the commonly available computer software is however concerned with "case-retrieval." The latter involves the matching of an observation for which the outcome is not known, to a database of examples for which the outcome is known. Various types of case retrieval, or "classification by similarity" (CBS), algorithms are discussed. Several CBS algorithms, as well as various other techniques, were applied to two small datasets. Although more comparisons are required, the CBS algorithms were found to perform significantly better than a linear discriminant analysis on a predominantly binary dataset. A single-nearest-neighbor technique, first developed in the 1950s, performed particularly well on this dataset. A more sophisticated CBS algorithm, based upon a type of neural network, performed consistently well on both datasets. As CBS techniques generally encourage the

Retrieval of Similarity Measures of Code Component

IRA-International Journal of Technology & Engineering (ISSN 2455-4480), 2017

Modern programming languages, especially object-oriented languages, make it easy to create libraries of reusable components (e.g. class definitions). Most software companies design such components and reuse them wherever applicable. Maintaining these components (i.e. a class library) and accessing them at the right time and in the right form is challenging because of the large number of components in a library. Object-oriented programming supports code reuse. A major challenge in programming is to improve the learning quality and productivity of software developers, subject teachers, and students. To support programming in Java, the researcher implemented a design retrieval algorithm that makes it possible to search through potentially reusable Java classes. The proposed work selects appropriate descriptors of the input cases (.java files), separates the code components automatically, and stores them in a repository. The different levels of ambiguity in the selection of cases are controlled through data preprocessing techniques from data mining. A set of adjustments is applied to obtain the similarity of the code components.

Retrieval of Java Program Code Components using Case Based Reasoning (CBR)

International Journal of Engineering and Advanced Technology, 2020

Object-oriented programming (OOP) facilitates the creation of libraries of reusable software components. When developing a new system, reuse can be applied to an existing system with prior modifications. Reuse decreases the time and effort required to develop the new system. To support the reuse of program code, a proper code retrieval process is necessary, one that makes it possible to search for similar code components in the Java programming environment. The OOP paradigm has a specific style of writing program code: the code is a collection of objects, classes, and methods. It is very easy to store the cases and to reuse or revise them wherever necessary. To obtain the similarity between program code components, an efficient retrieval method is needed. The retrieval phase can retrieve program code components as classes, methods, and interfaces, depending on the components selected by the user. A purely case-based approach is adopted for revising...

Feature Selection for Improving Case-Based Classifiers on High-Dimensional Data Sets

Flairs, 2005

Case-based reasoning (CBR) is a suitable paradigm for class discovery in molecular biology, where the rules that define the domain knowledge are difficult to obtain, and there is not sufficient knowledge for formal knowledge representation. To extend the capabilities of this paradigm, we propose logistic regression for CBR (LR4CBR), a method that uses logistic regression as a feature selection (FS) method for CBR systems. Our method not only improves the prediction accuracy of CBR classifiers in biomedical domains, but also selects a subset of features that have meaningful relationships with their class labels. In this paper, we introduce two methods to rank features for logistic regression. We show that using logistic regression as a filter FS method outperforms other FS techniques, such as Fisher and t-test, which have been widely used in analyzing biological data sets. The FS methods are combined with a computational framework for a CBR system called TA3. We also evaluate the method on two mass spectrometry data sets, and show that the prediction accuracy of TA3 improves from 90% to 98% and from 79.2% to 95.4%. Finally, we compare our list of discovered biomarkers with the lists of selected biomarkers from other studies for the mass spectrometry data sets, and show the overlapping biomarkers.

A New Strategy for Case-Based Reasoning Retrieval Using Classification Based on Association

Lecture Notes in Computer Science, 2016

Case-Based Reasoning (CBR) is an important area of research in the field of Artificial Intelligence. It aims to solve new problems by adapting solutions that were used to solve previous, similar ones. Among the four typical phases (retrieval, reuse, revise, and retain), retrieval is a key phase in the CBR approach, as the retrieval of wrong cases can lead to wrong decisions. To accomplish the retrieval process, a CBR system exploits Similarity-Based Retrieval (SBR). However, SBR tends to depend strongly on similarity knowledge, ignoring other forms of knowledge that can further improve retrieval performance. The aim of this study is to integrate class association rules (CARs), a special case of association rules (ARs), to discover a set of rules that can form an accurate classifier in a database.
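To make the notion of a class association rule concrete, here is a deliberately simplified sketch that mines rules of the form (attribute = value) → class using the usual support and confidence thresholds. The abstract does not say how the CARs are mined or integrated into retrieval, so the single-item restriction, the function name `mine_single_item_cars`, and the toy cases are all assumptions made for illustration.

```python
from collections import Counter


def mine_single_item_cars(cases, min_support, min_confidence):
    """Return rules as ((attribute, value), class_label, support, confidence).

    Simplification: antecedents are limited to a single attribute-value pair.
    """
    n = len(cases)
    antecedent_counts = Counter()  # how often each (attribute, value) occurs
    rule_counts = Counter()        # how often it co-occurs with each class
    for features, label in cases:
        for item in features.items():
            antecedent_counts[item] += 1
            rule_counts[(item, label)] += 1

    rules = []
    for (item, label), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[item]
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, label, support, confidence))
    # Strongest rules first: by confidence, then by support.
    return sorted(rules, key=lambda r: (-r[3], -r[2]))


# Hypothetical toy case base, not data from the paper.
cases = [
    ({"fever": "yes", "cough": "yes"}, "flu"),
    ({"fever": "yes", "cough": "no"}, "flu"),
    ({"fever": "no", "cough": "yes"}, "cold"),
    ({"fever": "no", "cough": "no"}, "healthy"),
]
rules = mine_single_item_cars(cases, min_support=0.5, min_confidence=0.9)
```

With these thresholds only the rule fever = yes → flu survives, with support 0.5 and confidence 1.0.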

Classification in the Retrieval Phase of Case-based Reasoning

2017

Case-based reasoning (CBR) is a problem-solving technique that uses previous experiences to solve new problems. Among the four phases of CBR, retrieval is the first and the most important phase, as it lays the foundation of the entire CBR cycle. Retrieval aims to retrieve similar cases from the case base, given a new situation. CBR systems typically use a strategy called similarity-based retrieval for retrieving cases. One of the derivatives of similarity-based retrieval is the k-nearest neighbor (k-NN) algorithm. In this paper, we compare the performance of k-NN, Fuzzy nearest neighbor (Fuzzy NN), and Genetic Programming (GP) classifiers for the retrieval of cases. We evaluate these algorithms in WEKA, with benchmark data sets for classification from UCI.
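Similarity-based retrieval with k-NN can be sketched in a few lines. The version below assumes numeric feature vectors, Euclidean distance, and unweighted majority voting; these are common defaults rather than the configuration evaluated in the paper, and the case base is a hypothetical toy example.

```python
import math
from collections import Counter


def knn_classify(case_base, query, k=3):
    """Retrieve the k stored cases closest to the query and return the majority label."""
    neighbors = sorted(case_base, key=lambda case: math.dist(case[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical case base of (feature vector, class label) pairs.
case_base = [
    ((1.0, 1.0), "a"),
    ((1.2, 0.8), "a"),
    ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"),
    ((5.2, 4.9), "b"),
]

label_near_a = knn_classify(case_base, (1.1, 1.0), k=3)
label_near_b = knn_classify(case_base, (5.1, 5.0), k=3)
```

The second query shows the voting step at work: its three nearest neighbors are two "b" cases and one "a" case, so the majority label is "b".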

Weighted Fuzzy Similarity Relations in Case-Based Reasoning: a Case Study in Classification

This paper describes a fuzzy similarity relation approach to Case-Based Reasoning. Residuated implication operators are used to create a fuzzy resemblance relation between cases, modeling the basic CBR principle "the more similar the problem descriptions are, the more similar the solution descriptions are" as a fuzzy gradual rule. We take the classification of Schistosomiasis prevalence estimation in a region of Brazil as a case study, in order to investigate the effects, in such a framework, of weighting cases individually in classification tasks, considering a set of training strategies.

Integrating rules and cases for the classification task

Lecture Notes in Computer Science, 1995

The recent progress in Case-Based Reasoning has shown that one of the most important challenges in developing future AI methods will be to combine and synergistically utilize general and case-based knowledge. In this paper, a very rudimentary kind of integration for the classification task, based on simple heuristics, is sketched: "To solve a problem, first try to use the conventional rule-based approach. If it does not work, try to remember a similar problem you have solved in the past and adapt the old solution to the new situation." This heuristic approach relies on a knowledge base that consists of a rule base and an exception case base. The method of generating this kind of knowledge base from a set of examples is described. The proposed approach is tested and compared with alternative approaches. The experimental results show that the presented integration method can lead to an improvement in accuracy and comprehensibility.

... particular experience, exceptions and/or non-typical situations. The problem solver can classify a new case by means of the following algorithm:

If a new case is covered by some rule
Then apply a solution from a rule with the highest priority
Else adapt the solution from the most similar case
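The rule-first, case-fallback algorithm quoted above might look roughly like this in Python. The rule representation, the attribute-overlap `similarity` function, and the toy animals are assumptions made for illustration; in particular, the fallback simply reuses the label of the most similar exception case, whereas the paper speaks of adapting its solution.

```python
def similarity(case, stored_features):
    """Hypothetical similarity: number of attributes with identical values."""
    return sum(1 for key, value in case.items() if stored_features.get(key) == value)


def classify(case, rules, exception_cases):
    """Apply the highest-priority covering rule, else fall back to the nearest exception case."""
    matching = [rule for rule in rules if rule["condition"](case)]
    if matching:
        return max(matching, key=lambda rule: rule["priority"])["label"]
    # No rule covers the case: reuse the solution of the most similar stored case.
    nearest = max(exception_cases, key=lambda c: similarity(case, c["features"]))
    return nearest["label"]


# Hypothetical rule base and exception case base.
rules = [
    {"condition": lambda c: c["legs"] == 4, "priority": 1, "label": "mammal"},
    {"condition": lambda c: c["legs"] == 4 and c["scales"] == "yes",
     "priority": 2, "label": "reptile"},
]
exception_cases = [
    {"features": {"legs": 2, "scales": "no", "flies": "yes"}, "label": "bird"},
    {"features": {"legs": 0, "scales": "yes", "flies": "no"}, "label": "fish"},
]

by_priority = classify({"legs": 4, "scales": "yes", "flies": "no"}, rules, exception_cases)
by_rule = classify({"legs": 4, "scales": "no", "flies": "no"}, rules, exception_cases)
by_case = classify({"legs": 2, "scales": "no", "flies": "no"}, rules, exception_cases)
```

The three calls exercise the three paths: two competing rules resolved by priority, a single covering rule, and a case that no rule covers and that is therefore classified from the exception case base.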