Some Theorems for Feed Forward Neural Networks

This paper introduces a new method that employs the concept of "Orientation Vectors" to train a feed forward neural network. The method is shown to be suitable for problems involving large dimensions where the clusters are characteristically sparse; for such cases it does not become NP hard as the problem size increases. We 'derive' the present technique by starting from Kolmogorov's method and then relaxing some of its stringent conditions. It is shown that for most classification problems three layers are sufficient, and that the number of processing elements in the first layer depends on the number of clusters in the feature space. We explicitly demonstrate that, for a large-dimensional space, as the number of clusters increases from N to N + dN the number of processing elements in the first layer increases only by d(log N), and as the number of classes increases, the processing elements increase only proportionately, showing that the method is not NP hard with increasing problem size. Many examples are solved explicitly, demonstrating that the method of Orientation Vectors requires much less computational effort than Radial Basis Function methods and other techniques that rely on distance computations: the cost of the present method grows logarithmically with problem size, compared with the Radial Basis Function method and other distance-based approaches, e.g. statistical methods in which probabilistic distances are calculated. A practical way of applying Occam's razor to choose between two architectures that solve the same classification problem is described. The ramifications of these findings for the field of Deep Learning are also briefly investigated; they lead directly to the existence of certain types of NN architectures that can be used as a "mapping engine" with the property of "invertibility", improving the prospect of their deployment for problems involving Deep Learning and hierarchical classification. The latter possibility has considerable future scope in the areas of machine learning and cloud computing.
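As a rough illustration of the claimed scaling (a sketch under assumed parameters, not the paper's algorithm: the function names, the base-2 logarithm, and the per-sample cost model for RBF networks are choices made here for illustration), the growth of the first layer can be contrasted with the linear per-sample cost of distance-based methods:

    import numpy as np

    def first_layer_size(num_clusters, d=1):
        # Hypothetical size rule reflecting the claim that going from N to
        # N + dN clusters adds only on the order of d(log N) processing elements.
        return int(np.ceil(d * np.log2(max(num_clusters, 2))))

    def rbf_cost_per_sample(num_clusters, dim):
        # An RBF network evaluates one distance per centre, so its per-sample
        # cost grows linearly with the number of clusters.
        return num_clusters * dim

    for n in (10, 100, 1000, 10000):
        print(n, first_layer_size(n, d=4), rbf_cost_per_sample(n, dim=64))

Under these assumptions the first-layer size grows from roughly 14 to 54 units as the clusters grow from 10 to 10,000, while the distance-based cost grows from 640 to 640,000 evaluations per sample.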


Nature of the learning algorithms for feedforward neural networks

1996

The neural network model (NN), comprised of relatively simple computing elements operating in parallel, offers an attractive and versatile framework for exploring a variety of learning structures and processes for intelligent systems. Due to the amount of research developed in the area, many types of networks have been defined. The one of interest here is the multi-layer perceptron, as it is one of the simplest and is considered a powerful representation tool whose complete potential has not been adequately exploited and whose limitations have yet to be specified in a formal and coherent framework. This dissertation addresses the theory of generalisation performance and architecture selection for the multi-layer perceptron; a subsidiary aim is to compare and integrate this model with existing data analysis techniques and exploit its potential by combining it with certain constructs from computational geometry, creating a reliable, coherent network design process which conforms t...

Deep Learning Architecture: A Literature Survey

2018

Deep learning is a subfield of machine learning. Learning can be supervised, semi-supervised, or unsupervised, and there are different types of architectures for deep learning. This paper gives an overview of architectures that are widely used and their application areas. Deep learning is applied in many areas such as image processing, speech recognition, data mining, natural language processing, social network filtering, machine translation, bioinformatics, and drug design. Index Terms: deep learning; deep learning architecture; machine learning.

Notes on Deep Feedforward Network (MLP)

The task is to construct a proper computing method for obtaining the gradients needed to train a graph that describes the interactions of perceptrons, widely known as a multilayer perceptron.
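A minimal numpy sketch of such a gradient computation, assuming a single hidden layer, sigmoid activations, and a squared-error loss (none of which are fixed by the note itself):

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data: 4 samples, 3 features, binary targets.
    X = rng.normal(size=(4, 3))
    y = np.array([[0.0], [1.0], [1.0], [0.0]])

    # One hidden layer of 5 units, sigmoid activations throughout.
    W1, b1 = rng.normal(scale=0.5, size=(3, 5)), np.zeros(5)
    W2, b2 = rng.normal(scale=0.5, size=(5, 1)), np.zeros(1)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 0.5
    for step in range(1000):
        # Forward pass through the computation graph.
        h = sigmoid(X @ W1 + b1)
        p = sigmoid(h @ W2 + b2)

        # Backward pass: the chain rule applied layer by layer
        # (squared-error loss, sigmoid derivative s * (1 - s)).
        dp = (p - y) * p * (1 - p)
        dW2, db2 = h.T @ dp, dp.sum(axis=0)
        dh = (dp @ W2.T) * h * (1 - h)
        dW1, db1 = X.T @ dh, dh.sum(axis=0)

        # Gradient-descent update.
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2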

A SURVEY: RESEARCH SUMMARY ON NEURAL NETWORKS

Neural networks are relatively crude electronic models based on the neural structure of the brain. The brain basically learns from experience; it is natural proof that some problems beyond the scope of current computers are indeed solvable by small, energy-efficient packages. In this paper we present the fundamentals of neural network topologies, activation functions, and learning algorithms based on whether information flows in one direction or in both directions. We outline the main features of a number of popular neural networks and provide an overview of their topologies and learning capabilities.

A Novel Approach in Determining Neural Networks Architecture to Classify Data With Large Number of Attributes

IEEE Access

One of the challenges in the successful implementation of deep neural networks (DNN) lies in the determination of their architecture, in terms of the number of hidden layers and the number of neurons in each hidden layer. In this research, a new approach is proposed to determine the neural network architecture, particularly in the form of a Multi-Layer Perceptron (MLP), which is later used as a machine learning method to classify data with a large number of attributes. The new approach is proposed since previous approaches are no longer applicable as general guidelines for determining the architecture of neural networks. The proposed approach determines the number of hidden layers using principal component analysis (PCA), while the number of neurons for each hidden layer is determined using K-Means clustering. The resulting architecture is used to classify data with a large number of attributes, such as the Gas Sensor Array Drift dataset, which has 128 input attributes and six output classes, and the Parkinson's Disease Classification dataset, which has 754 input attributes and two output classes. The results indicate that the best-performing architecture for the first dataset uses one hidden layer, with a PCA cumulative variance of 69.7%, while the best-performing architecture for the second dataset uses three hidden layers, with a PCA cumulative variance of 38.9%. Increasing the number of hidden layers does not always improve the performance of neural networks; it is therefore essential to determine an appropriate number of hidden layers and neurons to achieve good performance. The use of PCA and K-Means clustering is expected to provide guidelines for determining neural network architectures with good performance. Index Terms: hidden layer, hidden neurons, K-Means clustering, large-attribute data, Multi-Layer Perceptron, neural networks, principal component analysis.
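A minimal sketch of this idea using scikit-learn, not the authors' exact procedure: the variance threshold, the cap on depth, and the inertia-based elbow rule for choosing the width are all assumptions made here for illustration.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    def suggest_architecture(X, variance_threshold=0.7, max_width=64):
        # Depth: assumed rule mapping the number of principal components needed
        # to reach the cumulative-variance threshold onto a small number of layers.
        cumvar = np.cumsum(PCA().fit(X).explained_variance_ratio_)
        n_components = int(np.searchsorted(cumvar, variance_threshold)) + 1
        n_layers = min(n_components, 3)

        # Width: assumed rule taking the K-Means cluster count just after the
        # largest drop in inertia (a crude elbow search over k = 2..10).
        ks = list(range(2, 11))
        inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
                    for k in ks]
        k_elbow = ks[int(np.argmin(np.diff(inertias)))] + 1
        return n_layers, min(k_elbow, max_width)

    # Example on synthetic data with many attributes.
    X = np.random.default_rng(0).normal(size=(300, 128))
    print(suggest_architecture(X))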

Deep Learning Made Easier by Linear Transformations in Perceptrons, Conference on AI and Statistics

We transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. This transformation aims at separating the problems of learning the linear and nonlinear parts of the whole input-output mapping, which has many benefits. We study the theoretical properties of the transformations by noting that they make the Fisher information matrix closer to a diagonal matrix, and thus the standard gradient closer to the natural gradient. We experimentally confirm the usefulness of the transformations by showing that they make basic stochastic gradient learning competitive with state-of-the-art learning algorithms in speed, and that they also seem to help find solutions that generalize better. The experiments include both classification of small images and learning a low-dimensional representation of images using a deep unsupervised auto-encoder network. The transformations were beneficial in all cases, with and without regularization and with networks of two to five hidden layers.
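A minimal numpy sketch of the described transformation, assuming the zero-output and zero-slope corrections are estimated from batch statistics and that a single linear weight matrix carries the shortcut path; the weight shapes and names are illustrative.

    import numpy as np

    def transformed_tanh(x):
        # Replace tanh(x) with tanh(x) - a*x - b, where a and b are chosen
        # from the batch so the output and its slope are zero on average.
        a = np.mean(1.0 - np.tanh(x) ** 2, axis=0)   # average slope of tanh
        h = np.tanh(x) - a * x
        return h - np.mean(h, axis=0)                # centre the output

    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 8))
    W_hidden = rng.normal(scale=0.3, size=(8, 16))
    W_out = rng.normal(scale=0.3, size=(16, 4))
    W_shortcut = rng.normal(scale=0.3, size=(8, 4))  # models the linear part directly

    H = transformed_tanh(X @ W_hidden)
    Y = H @ W_out + X @ W_shortcut                   # nonlinear path + linear shortcut

    print(np.abs(H.mean(axis=0)).max())              # ~0: zero output on average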
