A wrapper approach for feature selection based on Bat Algorithm and Optimum-Path Forest
Related papers
BBA: A binary bat algorithm for feature selection
2012
Feature selection aims to find the most important information from a given set of features. As this task can be seen as an optimization problem, the combinatorial growth of the possible solutions may make an exhaustive search infeasible. In this paper we propose a new nature-inspired feature selection technique based on the behaviour of bats, which has never been applied in this context so far. The wrapper approach combines the exploration power of the bats with the speed of the Optimum-Path Forest classifier to find the set of features that maximizes the accuracy on a validating set. Experiments conducted on five public datasets have demonstrated that the proposed approach can outperform some well-known swarm-based techniques.
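To make the wrapper idea above concrete, here is a minimal sketch of the fitness evaluation it describes: a binary mask selects feature columns, and a fast classifier scores the subset on a held-out validation split. Optimum-Path Forest is not available in scikit-learn, so a k-NN classifier stands in purely for illustration; the function name and signature are ours, not the paper's.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def wrapper_fitness(mask, X_train, y_train, X_valid, y_valid):
    """Validation accuracy of the feature subset encoded by a boolean mask."""
    if not mask.any():                         # an empty subset cannot be evaluated
        return 0.0
    cols = np.flatnonzero(mask)
    clf = KNeighborsClassifier(n_neighbors=3)  # stand-in for the OPF classifier
    clf.fit(X_train[:, cols], y_train)
    return clf.score(X_valid[:, cols], y_valid)
```

A swarm optimizer then treats this score as the objective to maximize over candidate masks.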
An Optimization of Feature Selection for Classification using Bat Algorithm
International Journal of Recent Technology and Engineering, 2021
Data mining is the process of searching large existing databases in order to extract new and useful information. It plays a major and vital role nowadays in all sorts of fields such as medicine, engineering, banking, education, and fraud detection. In this paper, feature selection, which is a part of data mining, is performed for classification. The role of feature selection is examined in the context of deep learning and its relation to feature engineering. Feature selection is a preprocessing technique which selects the appropriate features from the dataset to obtain accurate results and outcomes for the classification. Nature-inspired optimization algorithms such as Ant Colony, Firefly, Cuckoo Search, and Harmony Search have shown better performance, giving the best accuracy rate with a smaller number of selected features and a good F-measure value. These algorithms are used to perform classification that accurately predicts the target class for each case in the dataset. We propose a techni...
Naive Bayes-guided bat algorithm for feature selection
TheScientificWorldJournal, 2013
With the amount of data and information said to double every 20 months or so, feature selection has become highly important and beneficial. Further improvements in feature selection will positively affect a wide array of applications in fields such as pattern recognition, machine learning, and signal processing. In this work, a bio-inspired method, the Bat Algorithm hybridized with a Naive Bayes classifier, is presented. The performance of the proposed feature selection algorithm was investigated using twelve benchmark datasets from different domains and was compared to three other well-known feature selection algorithms. The discussion focuses on four perspectives: number of features, classification accuracy, stability, and feature generalization. The results showed that BANB significantly outperformed the other algorithms in selecting a lower number of features, hence removing irrelevant, redundant, or noisy features while maintaining the classification accuracy. BANB is also prove...
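As an illustration of how a Naive Bayes classifier can guide the search, the sketch below uses a common wrapper formulation that rewards cross-validated accuracy while lightly penalizing subset size; the weighting alpha and the function names are our assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def nb_fitness(mask, X, y, alpha=0.9):
    """Reward Naive Bayes accuracy; lightly penalize subset size."""
    if not mask.any():
        return 0.0
    cols = np.flatnonzero(mask)
    acc = cross_val_score(GaussianNB(), X[:, cols], y, cv=5).mean()
    size_term = 1.0 - cols.size / X.shape[1]   # smaller subsets score higher
    return alpha * acc + (1.0 - alpha) * size_term
```

With alpha close to 1, accuracy dominates and the size term acts only as a tie-breaker, which matches the paper's emphasis on fewer features at maintained accuracy.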
Enhanced Feature Subset Selection Using Niche Based Bat Algorithm
Redundant and irrelevant features degrade the accuracy of a classifier. In order to avoid redundancy and irrelevancy problems, feature selection techniques are used. Finding the most relevant feature subset that can enhance the accuracy rate of the classifier is one of the most challenging parts. This paper presents a new solution for finding relevant feature subsets using the niche based bat algorithm (NBBA). It is compared with existing state-of-the-art approaches, including evolutionary based approaches. The multi-objective bat algorithm (MOBA) selected eight, 16, and 248 features with 93.33%, 93.54%, and 78.33% accuracy on the ionosphere, sonar, and Madelon datasets, respectively. The multi-objective genetic algorithm (MOGA) selected 10, 17, and 256 features with 91.28%, 88.70%, and 75.16% accuracy on the same datasets, respectively. The multi-objective particle swarm optimization (MOPSO) selected nine, 21, and 312 features with 89.52%, 91.93%, and 76% accuracy on the above datasets, respectively. In comparison, NBBA selected six, 19, and 178 features with 93.33%, 95.16%, and 80.16% accuracy on the above datasets, respectively. The niche multi-objective genetic algorithm selected eight, 15, and 196 features with 93.33%, 91.93%, and 79.16% accuracy on the above datasets, respectively. Finally, the niche multi-objective particle swarm optimization selected nine, 19, and 213 features with 91.42%, 91.93%, and 76.5% accuracy on the above datasets, respectively. Hence, the results show that MOBA outperformed MOGA and MOPSO, and NBBA outperformed the niche multi-objective genetic algorithm and the niche multi-objective particle swarm optimization.
Feature Selection Using Different Transfer Functions for Binary Bat Algorithm
International Journal of Mathematical, Engineering and Management Sciences, 2020
Feature selection is an important and fundamental step in the preprocessing of many classification and machine learning problems. Feature selection (FS) methods are used to reduce the amount of data used and to achieve high classification accuracy (CA) with fewer features by deleting irrelevant data that often cause confusion for the classifiers. In this work, the bat algorithm (BA), a recent metaheuristic, is applied as a wrapper type of FS technique. Six different variants of BA (BA-S and BA-V) are proposed, where each uses a transfer function (TF) to map the solutions from continuous space to discrete space. The experimental results show that the BA-V methods (that is, those using V-shaped transfer functions) have proven effective and efficient in selecting subsets of features with high classification accuracy.
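For readers unfamiliar with the S-shaped versus V-shaped distinction, the sketch below shows one canonical form of each family mapping a bat's continuous velocity to a bit decision. The paper proposes six specific variants; the two forms here (logistic sigmoid and |tanh|) are common examples from the binary-metaheuristic literature, chosen only for illustration.

```python
import numpy as np

def s_shaped(v):
    """S-shaped transfer: logistic sigmoid of the velocity."""
    return 1.0 / (1.0 + np.exp(-v))

def v_shaped(v):
    """One common V-shaped transfer: |tanh(v)|."""
    return np.abs(np.tanh(v))

def binarize(position, velocity, rng, use_v=True):
    """Map a continuous velocity to a new binary position."""
    r = rng.random(velocity.shape)
    if use_v:
        p = v_shaped(velocity)
        return np.where(r < p, 1 - position, position)  # V-shaped: flip the bit
    p = s_shaped(velocity)
    return (r < p).astype(position.dtype)               # S-shaped: set the bit
```

The practical difference is visible in the update rule: an S-shaped TF sets each bit directly from the sigmoid probability, while a V-shaped TF flips the current bit when the velocity magnitude is large, which tends to preserve good partial solutions.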
Hybrid binary bat enhanced particle swarm optimization algorithm for solving feature selection problems
Applied Computing and Informatics, 2018
In this paper, we present a new hybrid binary version of the bat and enhanced particle swarm optimization algorithms in order to solve feature selection problems. The proposed algorithm is called the Hybrid Binary Bat Enhanced Particle Swarm Optimization Algorithm (HBBEPSO). In the proposed HBBEPSO algorithm, we combine the bat algorithm, whose echolocation capability helps explore the feature space, with an enhanced version of particle swarm optimization, whose strength is converging to the best global solution in the search space. In order to investigate the general performance of the proposed HBBEPSO algorithm, it is compared with the original optimizers and other optimizers that have been used for feature selection in the past. A set of assessment indicators is used to evaluate and compare the different optimizers over 20 standard data sets obtained from the UCI repository. Results prove the ability of the proposed HBBEPSO algorithm to search the feature space fo...
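The paper's exact update rules are not reproduced here, but a hybrid of this kind typically merges the bat algorithm's frequency-driven pull toward the global best with PSO's personal-best and global-best attraction terms. The following is an illustrative sketch under that assumption; all parameter values and names are placeholders, not the paper's.

```python
import numpy as np

def hybrid_velocity(v, x, pbest, gbest, rng,
                    f_min=0.0, f_max=2.0, w=0.7, c1=1.5, c2=1.5):
    """One velocity step mixing a bat frequency term with PSO attraction terms."""
    f = f_min + (f_max - f_min) * rng.random()   # bat frequency draw
    bat_term = (x - gbest) * f                   # standard BA velocity pull
    pso_term = (c1 * rng.random(x.shape) * (pbest - x) +
                c2 * rng.random(x.shape) * (gbest - x))
    return w * v + bat_term + pso_term
```

A transfer function like those shown earlier would then binarize the updated positions for feature selection.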
A Novel Approach for Feature Selection based on the Bee Colony Optimization
One of the successful methods in classification problems is feature selection. Feature selection algorithms try to classify an instance with a lower dimension, instead of the huge number of available features, with higher and acceptable accuracy. In fact, an instance may contain useless features which might result in misclassification. An appropriate feature selection method tries to increase the effect of significant features while ignoring insignificant subsets of features. In this work, feature selection is formulated as an optimization problem and a novel feature selection procedure is proposed in order to achieve better classification results. Experiments over a standard benchmark demonstrate that applying Bee Colony Optimization in the context of feature selection is a feasible approach and improves the classification results.
2016
High dimensionality of the feature space affects classification accuracy and computational complexity due to the redundant, irrelevant, and noisy features present in the dataset. Feature selection extracts the more informative and distinctive features from a dataset to improve classification accuracy. Nature-inspired algorithms are well-known metaheuristic search algorithms used in solving combinatorial optimization problems. Previously, we proposed FS algorithms based on ACO, ABC, and EABC, and encouraged by the convincing results produced by these algorithms, we proposed the Firefly Algorithm (FA), Cuckoo Search (CS), and Harmony Search (HS) for the feature selection procedure. This paper proposes a new method of feature selection that uses FA to optimize the selection of features. Ten UCI datasets have been used to evaluate the proposed algorithm. Experimental results show that FA-based feature selection results in an optimal feature subset configuration and increased classificati...
Swarm search for feature selection in classification
2013 IEEE 16th International Conference on Computational Science and Engineering, 2013
Finding an appropriate set of features from data of high dimensionality for building an accurate classification model is a well-known NP-hard computational problem. Unfortunately, in data mining, some big data are not only big in volume but are also described by a large number of features. Many feature subset selection algorithms have been proposed in the past; they are nevertheless far from perfect. Since exhaustively trying every possible combination of features by brute force takes seemingly forever, stochastic optimization may be a solution. In this paper, we propose a new feature selection algorithm, called Swarm Search, for finding an optimal feature set using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier as its fitness function and plugging in any metaheuristic algorithm to facilitate the heuristic search. Simulation experiments are carried out by testing Swarm Search over a high-dimensional dataset, with different classification algorithms and various metaheuristic algorithms. Swarm Search is observed to achieve satisfactory results.
Selecting Optimal Feature Set in High-Dimensional Data by Swarm Search
Journal of Applied Mathematics, 2013
Selecting the right set of features from data of high dimensionality for inducing an accurate classification model is a tough computational challenge. It is almost an NP-hard problem, as the combinations of features escalate exponentially as the number of features increases. Unfortunately, in data mining, as well as in other engineering applications and bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since it takes seemingly forever to use brute force in exhaustively trying every possible combination of features, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier into its fitness function and plugging in any metaheuristic algorithm to facilitate heuristic search. Simulation experiments are carried out by testing Swarm Search over some high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experiment results show that Swarm Search is able to attain relatively low error rates in classification without shrinking the size of the feature subset to its minimum.
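The classifier-agnostic design both Swarm Search abstracts emphasize can be sketched as follows: the fitness is a closure built around whatever estimator is supplied, and the optimizer only ever sees masks and scores. The random-search loop below is merely a stand-in for a real swarm metaheuristic, and all names are illustrative rather than the paper's API.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import cross_val_score

def make_fitness(estimator, X, y, cv=3):
    """Build a fitness function around any scikit-learn style classifier."""
    def fitness(mask):
        if not mask.any():
            return 0.0
        cols = np.flatnonzero(mask)
        return cross_val_score(clone(estimator), X[:, cols], y, cv=cv).mean()
    return fitness

def random_search(fitness, n_features, n_iters=200, seed=0):
    """Stand-in optimizer: any swarm metaheuristic with this interface plugs in."""
    rng = np.random.default_rng(seed)
    best_mask, best_fit = None, -1.0
    for _ in range(n_iters):
        mask = rng.random(n_features) < 0.5   # random candidate subset
        fit = fitness(mask)
        if fit > best_fit:
            best_mask, best_fit = mask, fit
    return best_mask, best_fit
```

Because the optimizer and the classifier interact only through the fitness callable, swapping in a different classifier or a different metaheuristic requires no change to the rest of the pipeline, which is exactly the flexibility the abstracts describe.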