Learning Process Termination Criteria
Related papers
Learning curves: Asymptotic values and rate of convergence
1994
Training classifiers on large databases is computationally demanding. It is desirable to develop efficient procedures for reliably predicting a classifier's suitability for a given task, so that resources can be assigned to the most promising candidates or freed for exploring new ones. We propose such a practical and principled predictive method: practical because it avoids the costly procedure of training poor classifiers on the whole training set, and principled because of its theoretical foundation. The effectiveness of the proposed procedure is demonstrated for both single- and multi-layer networks.
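The idea of predicting asymptotic performance from early training can be sketched as extrapolating a learning curve fitted on small subsets. The power-law form and the scipy-based fit below are illustrative assumptions, not the authors' exact procedure.

```python
# Sketch: extrapolate a classifier's learning curve from small training subsets.
# The power-law form err(n) = a + b * n**(-c) is a common empirical choice;
# it is an assumption here, not necessarily the paper's model.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    return a + b * n ** (-c)

# Measured test errors at increasing training-set sizes (hypothetical data).
sizes = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
errors = np.array([0.31, 0.26, 0.22, 0.20, 0.185])

# Fit the curve; bounds keep the asymptote and decay rate in a sensible range.
params, _ = curve_fit(power_law, sizes, errors,
                      p0=[0.15, 2.0, 0.5],
                      bounds=([0.0, 0.0, 0.0], [1.0, np.inf, 2.0]))
a, b, c = params
print(f"predicted asymptotic error: {a:.3f}")
print(f"predicted error at n=50000: {power_law(50000.0, a, b, c):.3f}")
```

The fitted asymptote `a` estimates the best error the classifier can reach, so poor candidates can be discarded before ever training on the full set.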
An Empirical Study of Fitting Learning Curves
2001
Abstract: It is well known that many learning algorithms show diminishing returns as training data size increases. This paper empirically studies fitting learning curves on large data sets in search of a principled stopping criterion. Such a criterion is particularly useful when the data size is huge, as in most data mining applications. Learning curves are obtained by running the decision tree algorithm C4.5.
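A stopping criterion of the kind this abstract describes can be sketched as progressive sampling: grow the training set geometrically and stop once the accuracy gain from the last step falls below a tolerance. The schedule, tolerance, and the use of scikit-learn's CART tree as a stand-in for C4.5 are all illustrative assumptions.

```python
# Sketch: progressive-sampling stopping rule (illustrative, not the paper's
# exact criterion). Train on geometrically growing subsets and stop once the
# accuracy gain from the last growth step drops below a tolerance.
from sklearn.tree import DecisionTreeClassifier  # CART stand-in for C4.5
from sklearn.metrics import accuracy_score

def progressive_sample(X_train, y_train, X_val, y_val,
                       start=500, growth=2.0, tol=0.002):
    n = min(start, len(X_train))
    best_acc, clf = 0.0, None
    while n <= len(X_train):
        clf = DecisionTreeClassifier().fit(X_train[:n], y_train[:n])
        acc = accuracy_score(y_val, clf.predict(X_val))
        if acc - best_acc < tol:          # diminishing returns reached: stop
            return clf, n, acc
        best_acc, n = acc, int(n * growth)
    return clf, len(X_train), best_acc    # used all the data before flattening
```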
Early Assessment of Classification Performance
2004
The ability to distinguish between objects is fundamental to learning and to intelligent behavior in general. The difference between two things is the information we seek; processed information, in turn, forms the basis of knowledge. Automatic extraction of knowledge has been of interest ever since the advent of computing, and has received wide attention with the successes of data mining. Classification is one of the tasks of data mining: it provides a mapping from attributes (observations) to pre-specified classes, and objects are assigned to different classes based on the distinctions between them.
For small-scale machine learning projects there is currently no established threshold for categorising datasets, which leaves inexperienced users such as students without guidance when assessing and comparing the performance of machine learning algorithms. Addressing this gap, this paper presents a step-by-step guide for identifying the dataset threshold for performance estimators in supervised machine learning experiments. Identifying the threshold involves performing experiments on four datasets of different sample sizes from the University of California Irvine (UCI) machine learning repository. The sample sizes are categorised according to the number of attributes and the number of instances in each dataset. The identified dataset threshold will help inexperienced machine learning experimenters to categorise datasets correctly and hence select the appropriate performance estimation method.
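A size-to-estimator rule of the kind described might look like the following sketch; the cut-offs and method choices are illustrative assumptions, not the thresholds the paper identifies.

```python
# Sketch: pick a performance-estimation method from dataset size.
# The cut-offs below are illustrative assumptions, not the paper's thresholds.
def choose_estimator(n_instances: int, n_attributes: int) -> str:
    if n_instances < 100 or n_instances < 10 * n_attributes:
        return "leave-one-out cross-validation"   # tiny relative to dimensionality
    if n_instances < 1000:
        return "repeated 10-fold cross-validation"
    if n_instances < 100_000:
        return "10-fold cross-validation"
    return "single hold-out split"                # huge data: one split suffices

print(choose_estimator(n_instances=150, n_attributes=20))
```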
PERFORMANCE ANALYSIS OF LEARNING AND CLASSIFICATION
Different learning and classification algorithms are used to learn patterns and categorise data according to the frequency of, and relationships between, attributes in a given dataset. The desired result has always been higher accuracy in predicting future values or events from the given dataset. These algorithms are crucial in the field of knowledge discovery and data mining and have been extensively researched and improved for accuracy.
Model Selection and Assessment for Classification Using Validation
Lecture Notes in Computer Science, 2005
We address the problem of determining the size of the test set that can guarantee statistically significant results in classifier error estimation and in selection of the best classifier from a given set. We focus on the case of the 0-1 valued loss function and provide one- and two-sided optimal bounds for validation (also known as the hold-out estimate or the train-and-test method). We also calculate the smallest sample size necessary to obtain the bound for a given estimation accuracy and reliability, and present the results in tables. Finally, we propose strategies for classifier design using the derived bounds.
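For intuition, a standard Hoeffding argument for 0-1 loss gives a (generally looser) test-set size than the paper's optimal bounds: for accuracy ε and reliability 1 − δ, n ≥ ln(2/δ)/(2ε²) suffices in the two-sided case. A small computation, not the paper's own derivation:

```python
# Sketch: Hoeffding-style test-set size for 0-1 loss.
# This is the classical (loose) bound, not the paper's tighter optimal bounds.
import math

def holdout_size(eps: float, delta: float, two_sided: bool = True) -> int:
    """Smallest n with P(|err_hat - err| > eps) <= delta under Hoeffding."""
    factor = 2.0 if two_sided else 1.0
    return math.ceil(math.log(factor / delta) / (2.0 * eps ** 2))

print(holdout_size(eps=0.05, delta=0.05))  # -> 738 test examples
```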
An Empirical Comparison Of Supervised Learning Processes
International Journal of …
Data mining as a formal discipline is only two decades old, but it has developed phenomenally and matured in this short span. In this paper we present an empirical study of supervised learning processes based on an empirical evaluation of different classification algorithms. We include most of the supervised learning processes based on different pre-pruning and post-pruning criteria, applied to ten datasets collected from internationally renowned agencies. Specific models are presented, results are generated, and issues related to the different processes are analysed. We also compare our study with benchmark results for the various datasets and classification algorithms, and report results for all algorithms with fifteen performance measures out of a set of twenty-three calculated measures, making this a comprehensive study.
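Computing a battery of performance measures like those reported can be sketched with scikit-learn; the measures chosen here are a small illustrative subset, not the paper's list of twenty-three.

```python
# Sketch: a small subset of classification performance measures
# (illustrative; not the paper's full list of twenty-three).
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, cohen_kappa_score, matthews_corrcoef)

def evaluate(y_true, y_pred) -> dict:
    return {
        "accuracy":  accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "recall":    recall_score(y_true, y_pred, average="macro"),
        "f1":        f1_score(y_true, y_pred, average="macro"),
        "kappa":     cohen_kappa_score(y_true, y_pred),
        "mcc":       matthews_corrcoef(y_true, y_pred),
    }

print(evaluate([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))
```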
A Comparative Framework for Evaluating Classification Algorithms
Data mining methods have been widely used for extracting valuable knowledge from large amounts of data, and classification algorithms are the most popular models. A model is selected with respect to its classification accuracy, so the performance of each classifier plays a crucial role. This paper applies several classification models to multiple datasets and compares the accuracy of the results. The relationship between dataset characteristics and accuracy is also discussed, and finally a regression model is introduced for predicting classifier accuracy on a given dataset.
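A regression model relating dataset characteristics to accuracy is a form of meta-learning; a minimal sketch, assuming simple meta-features (instances, attributes, classes) and a linear model rather than whatever form the paper uses, could look like this.

```python
# Sketch: meta-regression predicting classifier accuracy from dataset traits.
# The meta-features and the linear form are assumptions, not the paper's model.
import numpy as np
from sklearn.linear_model import LinearRegression

# One row per benchmark dataset: [log #instances, #attributes, #classes],
# paired with the accuracy a fixed classifier achieved on it (hypothetical).
meta_X = np.array([[np.log(150),  4, 3],
                   [np.log(1000), 20, 2],
                   [np.log(5000), 10, 5],
                   [np.log(300),  8, 2]])
meta_y = np.array([0.95, 0.82, 0.74, 0.88])

model = LinearRegression().fit(meta_X, meta_y)
new_dataset = np.array([[np.log(2000), 15, 3]])
print(f"predicted accuracy: {model.predict(new_dataset)[0]:.3f}")
```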
We examine the learning-curve sampling method, an approach for applying machine-learning algorithms to large data sets. The approach is based on the observation that the computational cost of learning a model increases as a function of the sample size of the training data, whereas the accuracy of a model has diminishing improvements as a function of sample size. Thus, the learning-curve sampling method monitors the increasing costs and performance as larger and larger amounts of data are used for training, and terminates learning when future costs outweigh future benefits. In this paper, we formalize the learning-curve sampling method and its associated cost-benefit tradeoff in terms of decision theory. In addition, we describe the application of the learning-curve sampling method to the task of model-based clustering via the expectation-maximization (EM) algorithm. In experiments on three real data sets, we show that the learning-curve sampling method produces models that are nearly as accurate as those trained on complete data sets, but with dramatically reduced learning times. Finally, we describe an extension of the basic learning-curve approach for model-based clustering that results in an additional speedup. This extension is based on the observation that the shape of the learning curve for a given model and data set is roughly independent of the number of EM iterations used during training. Thus, we run EM for only a few iterations to decide how many cases to use for training, and then run EM to full convergence once the number of cases is selected.
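The cost-benefit termination rule can be sketched as: keep sampling while the marginal benefit of more data exceeds its marginal cost. The linear utility constants and the generic `train_fn`/`score_fn` hooks below are illustrative assumptions, not the paper's decision-theoretic formalization.

```python
# Sketch: learning-curve sampling with a cost-benefit stopping rule.
# The linear utility (value per accuracy point vs. cost per second) is an
# illustrative assumption, not the paper's decision-theoretic formulation.
import time

def learning_curve_sampling(train_fn, score_fn, data, schedule,
                            value_per_point=100.0, cost_per_sec=1.0):
    prev_score, model = None, None
    for n in schedule:                           # e.g. [1000, 2000, 4000, ...]
        t0 = time.perf_counter()
        model = train_fn(data[:n])               # cost grows with sample size
        elapsed = time.perf_counter() - t0
        score = score_fn(model)                  # benefit has diminishing returns
        if prev_score is not None:
            marginal_benefit = value_per_point * (score - prev_score)
            marginal_cost = cost_per_sec * elapsed   # last step's training cost
            if marginal_benefit <= marginal_cost:    # future costs outweigh benefits
                break
        prev_score = score
    return model
```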
2015
A framework for user modeling is presented that is useful for both supervised and unsupervised machine learning techniques; it reduces the development cost typically associated with knowledge-based approaches and with the labelled data required for supervised user modeling. Research in bioinformatics draws on experimental data: vast amounts of it populate current biological databases, and bioinformatics has applied machine learning concepts to this data with considerable success. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Several graph-based transductive learning algorithms, as well as standard methods including support vector machines and, as a special case, regularized least squares, can be obtained within it. We use properties of reproducing kernel Hilbert spaces to prove new representer theorems that provide the...
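A well-known instance of such a semi-supervised, representer-theorem-based learner is Laplacian-regularized least squares; the sketch below is in that spirit, with the kernel choice, graph construction, and both regularization weights as assumptions rather than anything this abstract specifies.

```python
# Sketch: Laplacian-regularized least squares, a semi-supervised learner that
# combines labeled and unlabeled data through a graph Laplacian penalty.
# Kernel, k-NN graph, and both lambdas are illustrative assumptions.
import numpy as np

def lap_rls(X_lab, y_lab, X_unlab, gamma=1.0, lam_A=1e-2, lam_I=1e-2, k=5):
    X = np.vstack([X_lab, X_unlab])
    n, l = len(X), len(X_lab)
    # RBF kernel over labeled + unlabeled points.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)
    # Symmetric k-NN graph and its Laplacian L = D - W (assumes distinct points).
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(sq[i])[1:k + 1]    # skip self at position 0
        W[i, nbrs] = W[nbrs, i] = 1.0
    L = np.diag(W.sum(1)) - W
    # Squared loss on labeled rows only: J selects the first l points.
    J = np.zeros((n, n)); J[:l, :l] = np.eye(l)
    y = np.concatenate([y_lab, np.zeros(n - l)])
    # Representer theorem: f(.) = sum_i alpha_i k(x_i, .); closed-form alpha.
    A = J @ K + lam_A * l * np.eye(n) + (lam_I * l / n ** 2) * (L @ K)
    alpha = np.linalg.solve(A, y)
    # Return a predictor that evaluates the kernel expansion at new points.
    return lambda Z: np.exp(
        -gamma * ((Z[:, None, :] - X[None, :, :]) ** 2).sum(-1)) @ alpha
```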