Protein Superfamily Classification Using Adaptive Evolutionary Radial Basis Function Network

Two-Stage Approach for Protein Superfamily Classification

Computational Biology Journal, 2013

We deal with the problem of protein superfamily classification, in which the family membership of a newly discovered amino acid sequence is predicted. Correct prediction matters greatly to researchers and drug analysts because it aids the discovery of new drugs. As this problem falls broadly under the category of pattern classification, we optimize feature extraction in the first stage and classifier design in the second stage, with the overall objective of maximizing the accuracy of the classifier. In the feature extraction phase, a genetic algorithm (GA)-based wrapper approach is used to select a few eigenvectors from the principal component analysis (PCA) space; the selection is encoded as a binary string in the chromosome. The positions of the 1's in the chromosome determine which eigenvectors are used to build the transformation matrix, which then maps the original high-dimensional feature space to a lower-dimensional feature space. Using PCA-NSGA-II (non-dominated sorting GA), the non-dominated solutions obtained from the Pareto front resolve the trade-off between the number of eigenvectors selected and the accuracy obtained by the classifier. In the second stage, the recursive orthogonal least squares algorithm (ROLSA) is used to train a radial basis function network (RBFN), selecting the optimal number of hidden centres and updating the output layer weighting matrix. This approach can be applied to large data sets with much lower computer memory requirements. As a result, very small architectures with few hidden centres are obtained that show a higher level of accuracy.
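The chromosome-to-transformation-matrix step described above can be illustrated with a minimal sketch. It assumes NumPy, random stand-in data, and a hand-picked chromosome; the function names `pca_eigenvectors` and `transformation_matrix` are hypothetical and are not taken from the paper, and the GA search itself is not shown.

```python
import numpy as np

def pca_eigenvectors(X):
    """Eigenvectors of the covariance of X (as columns), sorted by decreasing eigenvalue."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: covariance is symmetric
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order]

def transformation_matrix(eigvecs, chromosome):
    """Keep the eigenvectors whose chromosome bit is 1."""
    mask = np.asarray(chromosome, dtype=bool)
    return eigvecs[:, mask]

# Illustrative data only: 50 samples with 6 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))

eigvecs = pca_eigenvectors(X)
chromosome = [1, 0, 1, 0, 0, 1]              # hypothetical GA individual
W = transformation_matrix(eigvecs, chromosome)

# Map the centred data into the lower-dimensional space.
X_reduced = (X - X.mean(axis=0)) @ W
print(X_reduced.shape)                       # (50, 3)
```

In a wrapper approach, each chromosome would be scored by training the classifier on `X_reduced`; that evaluation loop is omitted here.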

Evolutionary Generalized Radial Basis Function neural networks for improving prediction accuracy in gene classification using feature selection

Applied Soft Computing, 2012

Radial Basis Function Neural Networks (RBFNNs) have been successfully employed in several function approximation and pattern recognition problems. The use of different RBFs in RBFNNs has been reported in the literature; this study centres on Generalized Radial Basis Function Neural Networks (GRBFNNs). An interesting property of the GRBF is that it can continuously and smoothly reproduce different RBFs by changing a single real parameter. In addition, different RBF shapes can be mixed within one RBFNN. The Generalized Radial Basis Function (GRBF) is based on the Generalized Gaussian Distribution (GGD), which adds a shape parameter to the standard Gaussian distribution. Moreover, this paper describes a hybrid approach, the Hybrid Algorithm (HA), which combines evolutionary and gradient-based learning methods to estimate the architecture, weights and node topology of GRBFNN classifiers. The feasibility and benefits of the approach are demonstrated on six gene microarray classification problems taken from the bioinformatic and biomedical domains. Three filters were applied: Fast Correlation-Based Filter (FCBF), Best Incremental Ranked Subset (BIRS), and Best Agglomerative Ranked Subset (BARS). Their purpose was to identify, among the thousands of genes in microarray data, the salient expression genes that directly contribute to determining the class membership of each pattern. After the different gene subsets were obtained, the proposed methodology was run using the selected gene subsets as new input variables. The results confirm that the GRBFNN classifier leads to a promising improvement in accuracy.
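As a rough illustration of how one shape parameter can reproduce different RBFs, the sketch below uses a generalized-Gaussian-style kernel, exp(-(d / r)^tau). This exact parameterization is an assumption, since the abstract does not give the formula; the names `grbf`, `radius` and `tau` are hypothetical.

```python
import numpy as np

def grbf(x, centre, radius, tau):
    """Generalized-Gaussian-style basis function (assumed form): exp(-(d / radius) ** tau)."""
    d = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(centre, dtype=float))
    return float(np.exp(-(d / radius) ** tau))

x, c = [1.0, 2.0], [0.0, 0.0]
for tau in (1.0, 2.0, 4.0):
    # tau = 2 gives a Gaussian-shaped response; tau = 1 a sharper, Laplacian-like one;
    # larger tau gives a flatter top with a steeper fall-off.
    print(tau, grbf(x, c, radius=2.0, tau=tau))
```

With this form, every choice of `tau` gives the value 1 at the centre and exp(-1) at distance `radius`, so only the shape of the decay changes.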

Assessment of Gaussian radial basis function network on protein secondary structures

2001 Conference Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2001

Studies of the radial basis function (RBF) network on protein secondary structures are presented. Secondary structure prediction is a useful first step in understanding how the amino acid sequence of a protein determines its native state. If the secondary structure is known, it is possible to derive a comparatively small number of tertiary structures using the secondary structural element pack. The Gaussian RBF is studied with different window sizes on the dataset developed by Qian and Sejnowski and on a dissimilar dataset by Chandonia. The RBF network predicts each position in turn from a local window of residues, sliding this window along the length of the sequence. It is shown that the Gaussian RBF network is not an appropriate technique for predicting the secondary structural state from sequence.
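The sliding-window input scheme can be sketched as follows. This is an illustrative encoding only: the one-hot representation, the extra padding symbol, and the name `window_features` are assumptions, not details from the paper, and the window size is arbitrary.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"          # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def window_features(sequence, window=5):
    """One-hot encode a window of residues centred on each position.

    Positions that fall outside the sequence use an extra 'padding' symbol.
    `window` is assumed to be odd.
    """
    half = window // 2
    n_symbols = len(AMINO_ACIDS) + 1           # +1 for padding
    feats = np.zeros((len(sequence), window * n_symbols))
    for pos in range(len(sequence)):
        for offset in range(-half, half + 1):
            j = pos + offset
            symbol = INDEX[sequence[j]] if 0 <= j < len(sequence) else len(AMINO_ACIDS)
            feats[pos, (offset + half) * n_symbols + symbol] = 1.0
    return feats

feats = window_features("ACDKL", window=3)
print(feats.shape)                             # (5, 63)
```

Each row is the input vector for predicting the structural state of the residue at the centre of its window.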

Classification by Evolutionary Generalized Radial Basis Functions

This paper proposes a novel neural network model that uses generalized kernel functions for the hidden layer of a feed-forward network (Generalized Radial Basis Functions, GRBF), where the architecture, weights and node typology are learned through an evolutionary programming algorithm. This new kind of model is compared with the corresponding models with standard hidden nodes: Product Unit Neural Networks (PUNN), Multilayer Perceptrons (MLP) and RBF neural networks. The proposed methodology is tested on six benchmark classification datasets from well-known machine learning problems. Generalized basis functions are found to perform better than the other standard basis functions for the task of classification.

Memetic multiobjective particle swarm optimization-based radial basis function network for classification problems

This paper presents a new multiobjective evolutionary algorithm applied to a radial basis function (RBF) network design based on multiobjective particle swarm optimization augmented with local search features. The algorithm is named the memetic multiobjective particle swarm optimization RBF network (MPSON) because it integrates the accuracy and structure of an RBF network. The proposed algorithm is implemented on two-class and multiclass pattern classification problems with one complex real problem. The experimental results indicate that the proposed algorithm is viable, and provides an effective means to design multiobjective RBF networks with good generalization capability and compact network structure. The accuracy and complexity of the network obtained by the proposed algorithm are compared with the memetic non-dominated sorting genetic algorithm based RBF network (MGAN) through statistical tests. This study shows that MPSON generates RBF networks with an appropriate balance between accuracy and simplicity, outperforming the other algorithms considered.

A multi-objective genetic algorithm for the Protein Structure Prediction

2011 11th International Conference on Intelligent Systems Design and Applications, 2011

The Protein Structure Prediction (PSP) problem consists of predicting the structure of a protein from its amino acid sequence, and has received much attention lately. Being able to predict the structure of a protein would make it possible to infer the function of the protein. In this paper, we propose a multi-objective evolutionary algorithm for the PSP problem. The prediction model consists of a set of rules that determine possible contacts between amino acids. These rules are based on four specific amino acid properties that are involved in the folding process: hydrophobicity, polarity, net charge and residue size. To increase the interpretability of the results, the rules are organized in a 20x20 matrix in which each cell contains the specific rules for a possible pair of residues. The high accuracy values obtained confirm the validity of our proposal.

Multi-Objective Hybrid Evolutionary Algorithms for Radial Basis Function Neural Network Design

2011

This paper presents new multi-objective evolutionary hybrid algorithms for the design of Radial Basis Function Networks (RBFNs) for classification problems. The algorithms are the Memetic Pareto particle swarm optimization based RBFN (MPPSON), the Memetic Elitist Pareto non-dominated sorting genetic algorithm based RBFN (MEPGAN) and the Memetic Elitist Pareto non-dominated sorting differential evolution based RBFN (MEPDEN). The proposed methods optimize the accuracy and the structure of the RBFN simultaneously.

Memetic Elitist Pareto Differential Evolution algorithm based Radial Basis Function Networks for classification problems

Applied Soft Computing, 2011

This paper presents a new multi-objective evolutionary hybrid algorithm for the design of Radial Basis Function Networks (RBFNs) for classification problems. The algorithm, MEPDEN, is a Memetic Elitist Pareto evolutionary approach based on the Non-dominated Sorting Differential Evolution (NSDE) multiobjective evolutionary algorithm, which has been adapted to design RBFNs; the NSDE algorithm is augmented with a local search that uses the back-propagation algorithm. MEPDEN is tested on two-class and multiclass pattern classification problems. The results obtained in terms of Mean Square Error (MSE), number of hidden nodes, accuracy (ACC), sensitivity (SEN), specificity (SPE) and Area Under the receiver operating characteristics Curve (AUC) show that the proposed approach is able to produce higher prediction accuracies with much simpler network structures. The accuracy and complexity of the network obtained by the proposed algorithm are compared with the Memetic Elitist Pareto Non-dominated Sorting Genetic Algorithm based RBFN (MEPGAN) through statistical tests. This study showed that MEPDEN obtains RBFNs with an appropriate balance between accuracy and simplicity, outperforming the other method considered.
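The balance between accuracy and simplicity rests on the idea of non-dominated (Pareto-optimal) solutions. The sketch below shows only that filtering step, for two objectives that are both minimized (classification error and number of hidden nodes). The candidate values are made up for illustration, and the helper names `dominates` and `pareto_front` are hypothetical; this is not the NSDE or NSGA-II procedure itself.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidate networks: (classification error, number of hidden nodes).
candidates = [(0.10, 12), (0.08, 20), (0.12, 6), (0.10, 15), (0.20, 6)]
front = pareto_front(candidates)
print(front)   # [(0.1, 12), (0.08, 20), (0.12, 6)]
```

Here `(0.10, 15)` is dropped because `(0.10, 12)` has the same error with fewer nodes, and `(0.20, 6)` is dropped because `(0.12, 6)` has lower error with the same number of nodes.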