PREDICTING PERFORMANCE OF CLASSIFICATION ALGORITHMS

COMPARATIVE ANALYSIS OF CLASSIFICATION TECHNIQUES USING WEKA

Data mining is the process of extracting interesting, non-trivial, implicit, previously unknown, and potentially useful patterns or knowledge from various data sources using a range of techniques. Classification is the process of finding a model that describes and distinguishes data classes or concepts. Several classification algorithms exist in data mining, each with its own strengths and weaknesses, and no single algorithm is the most suitable for all classes of data. This project evaluates the performance of three classification algorithms: the decision tree, Naive Bayes, and k-nearest neighbour (k-NN) algorithms. The Waikato Environment for Knowledge Analysis (WEKA) was used to analyze the algorithms; the performance parameters are classification accuracy, error rate, execution time, confusion matrix, and area under the curve. Five datasets were used for the analysis: the Iris, chronic kidney disease, breast cancer, diabetes, and hypothyroid datasets. The datasets were obtained from the UCI Machine Learning Repository and split into training and testing sets in two ratios, 60%/40% and 70%/30%. The decision tree algorithm was found to be more accurate than the Naive Bayes and k-NN algorithms. In terms of execution time, k-NN outperformed Naive Bayes and the decision tree on all five datasets. However, k-NN also recorded a higher average error rate across the five datasets. Therefore, no single algorithm is best suited to every situation; the performance of a classification algorithm depends on the type and size of the dataset, i.e., one algorithm may be appropriate for a given dataset while another is not.
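The percentage split and the accuracy-based measures named in the abstract can be illustrated with a minimal sketch. It does not use WEKA and is not the authors' procedure: the function names (`percentage_split`, `confusion_matrix`, `accuracy`) and the label lists are hypothetical, chosen only to show how accuracy, error rate, and a confusion matrix relate to one another. Area under the curve is left out because it requires class probability scores.

```python
import random
from collections import Counter


def percentage_split(rows, train_fraction, seed=0):
    """Shuffle rows and split them into (train, test) by train_fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical labels for a held-out test set (not data from the paper).
actual = ["yes", "yes", "no", "no", "yes", "no", "no", "yes", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "no", "yes", "no", "no"]

acc = accuracy(actual, predicted)
error_rate = 1 - acc  # error rate is the complement of accuracy
matrix = confusion_matrix(actual, predicted)

# A 60%/40% split of 150 row indices (150 is the size of the Iris dataset).
train, test = percentage_split(range(150), 0.6)

print(f"train={len(train)} test={len(test)}")
print(f"accuracy={acc:.2f} error_rate={error_rate:.2f}")
for pair in sorted(matrix):
    print(pair, matrix[pair])
```

Passing `0.7` instead of `0.6` gives the 70%/30% split. The fixed `seed` keeps the shuffle repeatable, so different classifiers can be compared on the same partition.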

Comparison of Different Datasets Using Various Classification Techniques with Weka

2014

Data mining refers to mining or extracting knowledge from huge volumes of data. Classification assigns each item in a set of data to one of a predefined set of classes. It is an important data mining technique with broad applications across many kinds of data. In this paper, different datasets from the University of California, Irvine (UCI) are compared using different classification techniques. Each technique is evaluated with respect to accuracy and execution time, and the performance evaluation is carried out with the J48, Simple CART (Classification and Regression Trees), BayesNet, and NaiveBayesUpdatable classification algorithms.

Performance evaluation of different classification techniques using different datasets

International Journal of Electrical and Computer Engineering (IJECE), 2019

Data mining has become one of the technologies that play a major role in business intelligence. However, to use the outcome of data mining, the user must go through several processes, one of which is classifying the data. Data classification is the processing and organization of data into specific categories so that it can be used as effectively and efficiently as possible. In data mining, no single technique is applicable to all datasets, and many data users waste a lot of time trying different classification techniques in order to find the most appropriate one. This paper shows how the results differ when different techniques are applied to the same data, and evaluates the performance of different classification techniques using different datasets. Four classification techniques were chosen for this study: BayesNet, NaiveBayes, Multilayer Perceptron, and J48. Their performance was tested under two parameters: the time taken to build the model of the dataset and the percentage of instances classified correctly. The experiments were carried out using Weka 3.8. The results demonstrate that the Multilayer Perceptron classifier gave the best overall accuracy in classifying the instances, while the NaiveBayes classifier gave the lowest accuracy on each dataset.
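The two parameters used in this study, model build time and percentage accuracy, can be measured with a small harness like the one sketched below. This is an illustration under stated assumptions, not the Weka 3.8 workflow: `MajorityClassifier` is a hypothetical stand-in for a real learner, and `evaluate` is an invented helper name.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in learner: always predicts the most common training label."""

    def fit(self, rows, labels):
        self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        return [self.label_ for _ in rows]


def evaluate(classifier, train_rows, train_labels, test_rows, test_labels):
    """Return (seconds spent building the model, percent classified correctly)."""
    start = time.perf_counter()
    classifier.fit(train_rows, train_labels)
    build_seconds = time.perf_counter() - start  # times model building only

    predictions = classifier.predict(test_rows)
    correct = sum(p == t for p, t in zip(predictions, test_labels))
    return build_seconds, 100.0 * correct / len(test_labels)


# Hypothetical toy data; the feature rows are ignored by the stand-in learner.
train_rows, train_labels = [[1], [2], [3]], ["a", "a", "b"]
test_rows, test_labels = [[4], [5], [6], [7]], ["a", "b", "a", "a"]

build_seconds, accuracy_pct = evaluate(
    MajorityClassifier(), train_rows, train_labels, test_rows, test_labels
)
print(f"build time: {build_seconds:.6f} s, accuracy: {accuracy_pct:.1f}%")
```

Timing only the `fit` call keeps build time separate from prediction time. For very fast learners a single measurement is noisy, so averaging over repeated runs would be a reasonable refinement.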

An Empirical Evaluation of Data Mining Classification Algorithms

Data mining is the process of extracting interesting knowledge from large datasets by combining methods from statistics and artificial intelligence with database management. Classification is one of the main functionalities in the field of data mining. It is a form of data analysis that can be used to extract models describing important data classes. The well-known classification methods are decision tree classification, neural network classification, Naive Bayes classification, k-nearest neighbour classification, and Support Vector Machine (SVM) classification. In this paper, we present a comparison of five classification algorithms: J48, which is based on C4.5 decision tree learning; Multilayer Perceptron (MLP), which uses the multilayer feed-forward neural network approach; instance-based k-nearest neighbour (IBk); Naive Bayes (NB); and Sequential Minimal Optimization (SMO), an algorithm for training support vector machines. The performance of these classification algorithms is compared with respect to classifier accuracy, error rates, classifier build time, and other statistical measures in the WEKA tool. The results show that there is no universal classification algorithm that works best for all datasets.

Evaluation of Various Classification Techniques of Weka Using Different Datasets

International Journal of Advance Research and Innovative Ideas in Education, 2016

In this paper we compare various classification methods on datasets from the UCI Machine Learning Repository using WEKA. Three measures were observed for each technique during the experiments: accuracy, the kappa statistic, and mean absolute error. This work was carried out as a performance evaluation of the J48, MultilayerPerceptron, Naive Bayes, and SMO classifiers. Four types of secondary data were used for this work.
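For readers unfamiliar with the kappa statistic and mean absolute error, the sketch below shows one common way to compute them. It is an assumption-laden illustration rather than WEKA's implementation: the function names, labels, and probability values are hypothetical, and the mean absolute error here is a simplified binary form (the distance between a 0/1 class indicator and the predicted probability of the positive class), which may differ from the figure WEKA reports.

```python
from collections import Counter


def cohen_kappa(actual, predicted):
    """Cohen's kappa: agreement between actual and predicted labels beyond chance."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: sum over labels of P(actual=label) * P(predicted=label).
    expected = sum(
        actual_counts[label] * predicted_counts[label] for label in actual_counts
    ) / (n * n)
    # Undefined when expected == 1 (every label identical on both sides).
    return (observed - expected) / (1 - expected)


def mean_absolute_error(actual, probabilities, positive="yes"):
    """Mean |indicator - p|, where p is the predicted probability of `positive`."""
    targets = [1.0 if a == positive else 0.0 for a in actual]
    return sum(abs(t - p) for t, p in zip(targets, probabilities)) / len(targets)


# Hypothetical test-set labels and predicted probabilities of the "yes" class.
actual = ["yes", "yes", "no", "no", "yes", "no", "no", "yes", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "no", "yes", "no", "no"]
probabilities = [0.9, 0.4, 0.2, 0.1, 0.8, 0.6, 0.3, 0.7, 0.2, 0.1]

kappa = cohen_kappa(actual, predicted)
mae = mean_absolute_error(actual, probabilities)
print(f"kappa={kappa:.4f} mae={mae:.2f}")
```

Kappa corrects raw accuracy for the agreement expected by chance, which matters on imbalanced datasets where predicting the majority class already yields high accuracy.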

PERFORMANCE EVALUATION OF THE DATA MINING CLASSIFICATION METHODS

The paper aims to analyze the performance of different classification models used in the data mining process. Classification is the most widely used supervised learning technique in data mining. It is the process of identifying a set of features and templates that describe the data classes or concepts. We applied various classification algorithms to different datasets to streamline and improve algorithm performance.

Comparison of Different Classification Techniques Using Different Datasets

2013

In this paper, different data mining classification techniques are compared using diverse datasets from the University of California, Irvine (UCI). The accuracy of each technique and the time it requires for execution are observed. Data mining refers to extracting or mining knowledge from huge volumes of data. Classification is an important data mining technique with broad applications; it assigns each item in a set of data to one of a predefined set of classes or groups, and it is used in every field of life. This work was carried out as a performance evaluation of the J48, MultilayerPerceptron, NaiveBayesUpdatable, and BayesNet classification algorithms. The Naive Bayes algorithm is based on probability and the J48 algorithm is based on decision trees. The paper sets out to make a comparative evaluation of the classifiers J48, MultilayerPerceptron, NaiveBayesUpdatable, and BayesNet in the context of Labour, Soyabean and Weather da...

Performance Analysis of Different Classification Methods in Data Mining for Diabetes Dataset Using WEKA Tool

S. R. Priyanka Shetty, Sujata Joshi, "Performance Analysis of Different Classification Methods in Data Mining for Diabetes Dataset Using WEKA Tool", March 15 Volume 3 Issue 3, International Journal on Recent and Innovation Trends in Computing and Communication (IJRITCC), ISSN: 2321-8169, PP: 1168 - 1173, DOI: 10.17762/ijritcc2321-8169.150361

Comparative Analysis of Various Decision Tree Classification Algorithms using WEKA

Classification is a technique for constructing a function or set of functions that predicts the class of instances whose class label is not known. The discovered knowledge is usually presented in the form of high-level, easy-to-understand classification rules. Various techniques are used to classify data, one family of which is decision tree algorithms. This paper presents a comparative analysis of various decision tree based classification algorithms. In the experiments, the effectiveness of the algorithms is evaluated by comparing their results on 5 datasets from the UCI and KEEL repositories.

Related papers