PARTIAL RETRAINING: A NEW APPROACH TO INPUT RELEVANCE DETERMINATION
Related papers
Hidden node activation differential-a new neural network relevancy criteria
Proceedings of 1st International Conference on Conventional and Knowledge Based Intelligent Electronic Systems. KES '97, 1997
Neural networks have been used in many problems such as character recognition, time series forecasting and image coding. The generalisation of the network depends on its internal structure. Network parameters should be set correctly so that data outside the class will not be overfitted. One mechanism to achieve an optimal neural network structure is to identify the essential components (hidden nodes) and to prune off the irrelevant ones. Most of the proposed criteria used for pruning are expensive to compute and impractical for large networks and large training samples. In this paper, a new relevancy criterion is proposed and three existing criteria are investigated. The properties of the proposed criterion are covered in detail and its similarities to the existing criteria are illustrated.
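As a quick illustration of the kind of hidden-node relevancy criterion discussed above, the sketch below scores each hidden node by the spread of its activation over the data set, so that nodes with near-constant activation (which behave like biases) become pruning candidates. It is a minimal, generic sketch in NumPy with assumed weight shapes, not the paper's exact criterion.

```python
import numpy as np

def hidden_node_relevance(X, W1, b1, activation=np.tanh):
    """Score each hidden node by the spread of its activation over the data.
    A node whose activation is nearly constant acts like a bias term and is
    a candidate for pruning.  Generic illustration, not the paper's measure."""
    H = activation(X @ W1 + b1)            # hidden activations, (n_samples, n_hidden)
    return H.max(axis=0) - H.min(axis=0)   # per-node activation spread

# toy example with random data and weights (shapes are illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
W1, b1 = rng.normal(size=(4, 6)), rng.normal(size=6)
scores = hidden_node_relevance(X, W1, b1)
print("least relevant hidden node:", scores.argmin())
```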
Journal of Applied Sciences Research, 2013
Artificial neural networks are considered a powerful statistical modeling technique in the agricultural sciences; however, they provide little information about the contributions of the independent variables in the prediction process. The goal of relative importance analysis is to partition explained variance among multiple predictors to better understand the role played by each predictor. In the present study, a modification to the Connection Weights Algorithm and a novel algorithm are proposed to assess the relative importance of independent variables in a multilayer perceptron neural network, and a comparison in the field of crop production with the Connection Weights Algorithm, Dominance Analysis, Garson's Algorithm, Partial Derivatives, and Multiple Linear Regression is presented. The performance of the two proposed algorithms is studied on empirical data. The Most Squares method (the second proposed algorithm) is found to perform better than the above-mentioned methods, to agree with the results of multiple linear regression in terms of partial R², and consequently it appears to be more reliable.
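For reference, the sketch below shows textbook versions of two of the weight-based measures named above, Garson's Algorithm and the Connection Weights Algorithm (Olden's product of connection weights), for a single-hidden-layer perceptron. The array shapes are assumptions for illustration; the paper's modified algorithm and the Most Squares method are not reproduced here.

```python
import numpy as np

def garson_importance(W1, w2):
    """Garson's algorithm for a single-hidden-layer MLP.
    W1: input-to-hidden weights, shape (n_inputs, n_hidden)
    w2: hidden-to-output weights, shape (n_hidden,)
    Returns the relative importance of each input, summing to 1."""
    C = np.abs(W1) * np.abs(w2)           # contribution of input i via hidden node j
    C = C / C.sum(axis=0, keepdims=True)  # share of each input within every hidden node
    imp = C.sum(axis=1)
    return imp / imp.sum()

def connection_weights_importance(W1, w2):
    """Olden's connection weights product: signed importance, preserving direction."""
    return (W1 * w2).sum(axis=1)

rng = np.random.default_rng(1)
W1, w2 = rng.normal(size=(5, 8)), rng.normal(size=8)
print("Garson:", garson_importance(W1, w2))
print("Connection weights:", connection_weights_importance(W1, w2))
```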
Analyzing and Validating Neural Networks Predictions
Grégoire Montavon
We state some key properties of the recently proposed Layer-wise Relevance Propagation (LRP) method that make it particularly suitable for model analysis and validation. We also review the capabilities and advantages of the LRP method on empirical data that we have observed in several previous works.
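A minimal sketch of the standard LRP epsilon rule for a single dense layer is given below, assuming a layer of the form z = a·W + b. It illustrates how the relevance of the layer outputs is redistributed onto its inputs in proportion to their contributions; it is not the authors' implementation.

```python
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """One step of Layer-wise Relevance Propagation (epsilon rule) through a
    dense layer z = a @ W + b, redistributing the output relevance R_out onto
    the layer inputs in proportion to their contributions."""
    z = a @ W + b                 # pre-activations, shape (n_out,)
    z = z + eps * np.sign(z)      # stabiliser avoids division by zero
    s = R_out / z                 # relevance per unit of pre-activation
    return a * (W @ s)            # relevance assigned to each input

rng = np.random.default_rng(2)
a = rng.random(4)                         # input activations of the layer
W, b = rng.normal(size=(4, 3)), np.zeros(3)
R_out = np.maximum(a @ W + b, 0)          # e.g. start from the output scores
print("input relevances:", lrp_epsilon(a, W, b, R_out))
```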
Ecological Modelling, 2007
Keywords: parameter importance; virtual ecology.
Artificial neural network (ANN) models are powerful statistical tools which are increasingly used in modeling complex ecological systems. For interpretation of ANN models, a means of evaluating how systemic parameters contribute to model output is essential. Developing a robust, systematic method for interpreting ANN models is the subject of much current research. We propose a method using sequential randomization of input parameters to determine the relative proportion to which each input variable contributes to the predictive ability of the ANN model (termed the holdback input randomization method or HIPR method). Validity of the method was assessed using a simulated data set in which the relationship between input parameters and output parameters was completely known.
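The sketch below illustrates the input-randomization idea behind HIPR: permute one input column at a time and record the drop in the model's predictive ability. The `model`, `metric`, and `n_repeats` arguments are illustrative assumptions; the paper's exact protocol may differ.

```python
import numpy as np

def holdback_randomization_importance(model, X, y, metric, n_repeats=5, seed=0):
    """Input-randomization importance in the spirit of the HIPR method:
    permute one input column at a time and record how much the model's
    predictive ability degrades.  `model` only needs a .predict(X) method,
    and `metric(y_true, y_pred)` returns a score where higher is better."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, model.predict(X))
    drops = np.zeros(X.shape[1])
    for i in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, i] = rng.permutation(X_perm[:, i])  # destroy column i's information
            drops[i] += baseline - metric(y, model.predict(X_perm))
    drops /= n_repeats
    return drops / drops.sum()   # relative proportion contributed by each input
```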
Neural network pruning based on input importance
Journal of Intelligent & Fuzzy Systems, 2019
In this paper we present a new algorithm called Neural Network Pruning Based on Input Importance (NNPII) that prunes the neural network based on the importance of its inputs. The algorithm depends on the frequency with which a certain value of an attribute occurs across all the given instances in the dataset. Pruning involves only the links between the input layer and the hidden layer. The algorithm has three phases. The first phase is the preprocessing phase, where the data inputs are replaced with their importance. The second phase is a forward pass, similar to the forward pass in the backpropagation algorithm, but instead of the real inputs we use the input importance obtained in the preprocessing stage. The third phase is the backward pass, again as in the backpropagation algorithm, but here we use the input importance instead of the real inputs, and a factor that measures the change in value for every input attribute is incorporated into the formula for updating the weights between the input layer and the hidden layer. The elimination process is performed based on a criterion that depends on a factor representing a threshold value for a certain input attribute across all instances. It is worth mentioning that the pruning is performed within the usual training phases. The proposed algorithm has been tested through three types of experiments: a comparison between backpropagation and NNPII, applying NNPII with various parameter values, and finally comparing NNPII with various other pruning algorithms. Results show that NNPII performs well and competes with other pruning algorithms. NNPII outperforms all other algorithms when the classes are fairly distributed in the datasets.
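A heavily simplified sketch of two pieces of such a scheme, frequency-based importance preprocessing and threshold-based pruning of input-to-hidden links, is shown below. The importance definition and the `threshold` parameter are assumptions for illustration, not the exact NNPII formulas.

```python
import numpy as np

def value_frequency_importance(X):
    """Preprocessing in the spirit of NNPII: replace every attribute value by
    the relative frequency with which that value occurs in its column.
    Simplified sketch; the paper's exact importance definition may differ."""
    F = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        values, counts = np.unique(X[:, j], return_counts=True)
        freq = dict(zip(values, counts / len(X)))
        F[:, j] = [freq[v] for v in X[:, j]]
    return F

def prune_input_links(W1, importance, threshold):
    """Zero out input-to-hidden links whose input attribute falls below a
    threshold importance (`threshold` is a hypothetical tuning parameter)."""
    mask = importance.mean(axis=0) >= threshold   # one score per input attribute
    return W1 * mask[:, None]
```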
Ranking importance of input parameters of neural networks
Expert Systems with Applications, 1998
Artificial neural networks have been used for simulation, modeling, and control purposes in many engineering applications as an alternative to conventional expert systems. Although neural networks usually do not reach the level of performance exhibited by expert systems, they enjoy the tremendous advantage of very low construction costs.
MEASURING AND IMPROVING NEURAL NETWORK GENERALIZATION FOR MODEL UPDATING
Journal of Sound and Vibration, 2000
This paper compares various techniques of measuring the generalization ability of a neural network used for model-updating purposes. An appropriate metric for measuring generalization ability is suggested, and it is used to investigate and compare various neural network architectures and training algorithms. The effect of noise on generalization ability is considered, and it is shown that the form of the noise does not appear important to the networks. This implies that the optimum training location may be obtained by considering a simple noise model such as Gaussian noise. Various radial basis function neurons and training algorithms are considered. Significant improvements to generalization ability are noted by merging the holdout and training data sets before training the second layer of the network, after the network architecture has been decided. The Gaussian radial basis function is rejected as the radial basis function of choice, due to uncertainty regarding an appropriate value for the spread constant. It is noted that several alternative radial basis functions without spread constants, such as the thin-plate spline, give excellent results. Finally, the use of jitter and committees to improve the generalization ability of networks is considered. It is found that jitter neither improves nor degrades the results. It is also found that a committee of networks performs better than any single network. A good method of generating committee members is to split the available data evenly into multiple random holdout and training data sets.
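The committee-building recipe suggested at the end of this abstract can be sketched as follows, assuming a generic `train_fn` callback that stands in for whatever network and training algorithm is used; it is an illustrative sketch, not the paper's exact setup.

```python
import numpy as np

def train_committee(X, y, train_fn, n_members=5, holdout_frac=0.5, seed=0):
    """Build a committee by training each member on a different random
    training/holdout split, then average their predictions.
    `train_fn(X_train, y_train)` must return an object with a .predict(X) method."""
    rng = np.random.default_rng(seed)
    members = []
    n = len(X)
    for _ in range(n_members):
        idx = rng.permutation(n)
        train_idx = idx[: int(n * (1 - holdout_frac))]   # remainder is held out
        members.append(train_fn(X[train_idx], y[train_idx]))
    # the committee prediction is the mean of the members' predictions
    return lambda X_new: np.mean([m.predict(X_new) for m in members], axis=0)
```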
Sensitivity Based Pruning of Input Variables by Means of Weight Cascaded Retraining
DTEW Research …, 1999
This paper investigates the adoption of a wrapped feature selection approach using neural networks for classification purposes. The presented framework consists of a primary model selection or network construction phase and a subsequent input feature pruning phase, introduced here under the name of Weight Cascaded Retraining (WCR). The theoretical exposition in the first part of the paper is illuminated and validated by means of real-life empirical case material. The main conclusion of the paper can be stated as follows. Feature selection can be very effective in reducing model complexity for classification modelling via neural networks. It allows one to partially circumvent the curse of dimensionality when confronted with a high number of irrelevant/redundant features. Furthermore, by reducing the number of input features in the neural network training phase, both human understanding and computational performance can be vastly enhanced.
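For orientation, a generic wrapper-style backward elimination loop with retraining after every removal is sketched below; the `build_and_score` callback and `n_keep` parameter are assumptions, and the sketch does not reproduce the specific Weight Cascaded Retraining procedure.

```python
import numpy as np

def backward_feature_elimination(X, y, build_and_score, n_keep):
    """Wrapper-style backward elimination: repeatedly drop the feature whose
    removal hurts validation performance least, retraining after every removal.
    `build_and_score(X_subset, y)` trains a network on the given columns and
    returns a validation score (higher is better)."""
    kept = list(range(X.shape[1]))
    while len(kept) > n_keep:
        scores = [(build_and_score(X[:, [c for c in kept if c != f]], y), f)
                  for f in kept]
        _, cheapest_to_remove = max(scores)   # dropping this feature costs least
        kept.remove(cheapest_to_remove)
    return kept
```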
IEEE Access
Artificial Neural Networks (ANNs) are weighted directed graphs of interconnected neurons widely employed to model complex problems. However, selecting the optimal ANN architecture and training its parameters is not enough to obtain reliable models. The data preprocessing stage is fundamental to improving the model's performance. Specifically, Feature Normalisation (FN) is commonly utilised to remove the features' magnitude, aiming at equalising the features' contribution to the model training. Nevertheless, this work demonstrates that the choice of FN method affects the model performance. Also, it is well known that ANNs are commonly considered a "black box" due to their lack of interpretability. In this sense, several works aim to analyse the features' contribution to the network for estimating the output. However, these methods, specifically those based on the network's weights, like Garson's or Yoon's methods, do not consider preprocessing factors, such as dispersion factors, previously employed to transform the input data. This work proposes a new features' relevance analysis method that includes the dispersion factors in the weight matrix analysis methods to infer more precisely each feature's actual contribution to the network output. In addition, the Proportional Dispersion Weights (PWD) are proposed as explanatory factors of similarity between models' performance results. The conclusions from this work improve the understanding of the features' contribution to the model, which enhances the feature selection strategy, fundamental for reliably modelling a given problem.
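One plausible way to fold dispersion factors into a Garson-style weight analysis is sketched below: the input-to-hidden weights learned on normalised data are rescaled by each feature's dispersion before the usual weight-magnitude analysis. This is an illustrative assumption about the general idea only; it is not the paper's PWD formulation.

```python
import numpy as np

def dispersion_adjusted_importance(W1, w2, sigma):
    """Weight-based importance that folds the normalisation dispersion back in.
    W1: input-to-hidden weights learned on normalised inputs, (n_inputs, n_hidden)
    w2: hidden-to-output weights, (n_hidden,)
    sigma: dispersion factor used to normalise each raw feature (e.g. std. dev.)
    NOTE: rescaling by sigma is an assumption for illustration, not the PWD rule."""
    W_eff = W1 * sigma[:, None]               # re-express weights on the raw feature scale
    C = np.abs(W_eff) * np.abs(w2)
    C = C / C.sum(axis=0, keepdims=True)
    imp = C.sum(axis=1)
    return imp / imp.sum()
```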
Assessing the importance of features for multi-layer perceptrons
Neural Networks, 1998
In this paper we establish a mathematical framework in which we develop measures for determining the contribution of individual features to the performance of a classifier. Corresponding to these measures, we design metrics that allow estimation of the importance of features for a specific multi-layer perceptron neural network. It is shown that all measures constitute lower bounds for the correctness that can be obtained when the feature under study is excluded and the classifier rebuilt. We also present a method for pruning input nodes from the network such that most of the knowledge encoded in its weights is retained. The proposed metrics and the pruning method are validated with a number of experiments with artificial classification tasks. The experiments indicate that the metric called replaceability results in the tightest error bounds. Both this metric and the metric called expected influence result in good rankings of the features.
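A common trick for pruning an input node while retaining most of the knowledge encoded in the weights is to replace the pruned feature by its mean value and absorb that fixed contribution into the hidden-layer biases; a minimal sketch follows. It illustrates the general idea only, not the paper's metrics or its specific pruning method.

```python
import numpy as np

def prune_input_node(W1, b1, X, feature):
    """Remove one input node while retaining most of the encoded knowledge:
    the pruned feature is replaced by its mean value, whose constant
    contribution is absorbed into the hidden-layer biases.
    W1: input-to-hidden weights, (n_inputs, n_hidden); b1: hidden biases."""
    mean_val = X[:, feature].mean()
    b1_new = b1 + mean_val * W1[feature, :]   # absorb the constant contribution
    W1_new = np.delete(W1, feature, axis=0)   # drop the input's weight row
    return W1_new, b1_new
```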