4.5 Convolutional Neural Network Optimization


Deep learning hyper-parameter optimization is a difficult task. Finding an appropriate network configuration is key to success; however, most of the time this labor is done in a rough, ad hoc manner. In this work we introduce a novel library to tackle this problem, the Deep Learning Optimization Library (DLOPT). We briefly describe its architecture and present a set of usage examples. This is an open-source project developed under the GNU GPL v3 license, and it is freely available at this https URL

Choosing the most suitable optimization algorithm (optimizer) for a neural network model is among the most important decisions in deep learning, across all classes of neural networks, and it typically comes down to trial-and-error experimentation. In this paper, we experiment with seven of the most popular optimization algorithms, namely sgd, rmsprop, adagrad, adadelta, adam, adamax, and nadam, on four unrelated datasets, each evaluated separately, to determine which one delivers the best accuracy, efficiency, and performance for our deep neural network. This work provides insightful analysis to help data scientists choose the best optimizer when modelling their deep neural networks.
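
As a concrete illustration of such a comparison, the sketch below trains the same small network under each of the seven optimizers using their Keras defaults; the model, the dataset (MNIST here), and the training budget are assumptions, not the paper's exact setup.

```python
import tensorflow as tf

# Load and normalize MNIST as a stand-in benchmark.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

for name in ["sgd", "rmsprop", "adagrad", "adadelta",
             "adam", "adamax", "nadam"]:
    # Identical architecture for every run, so only the optimizer varies.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    # Each optimizer string maps to its Keras default configuration.
    model.compile(optimizer=name,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=3, verbose=0)
    _, acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"{name:>8}: test accuracy {acc:.4f}")
```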

The aim of this study is to present a new type of optimization algorithm for training deep learning models. To achieve this, we use a convex combination to blend the coefficients of Fletcher-Reeves (FR) and Polak-Ribière-Polyak (PRP) in a three-term conjugate gradient method. The resulting algorithm, called the Three-term PRP-FR Algorithm, was implemented from scratch in the Python programming language alongside some existing optimizers. The optimizers were evaluated and compared in terms of convergence, using accuracy results on the well-known digit recognition dataset (MNIST).
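
The core idea can be sketched as follows; the mixing weight lam and the particular form of the third term are illustrative assumptions, not the paper's exact update.

```python
import numpy as np

def hybrid_cg_direction(g_new, g_old, d_old, lam=0.5):
    """Three-term CG search direction using a convex combination of the
    FR and PRP coefficients (illustrative form, not the paper's exact update)."""
    gg_old = g_old @ g_old
    beta_fr = (g_new @ g_new) / gg_old              # Fletcher-Reeves
    y = g_new - g_old
    beta_prp = (g_new @ y) / gg_old                 # Polak-Ribiere-Polyak
    beta = lam * beta_prp + (1.0 - lam) * beta_fr   # convex combination
    theta = (g_new @ d_old) / gg_old                # assumed third-term weight
    return -g_new + beta * d_old - theta * y
```

With lam = 0 the coefficient reduces to pure FR and with lam = 1 to pure PRP; intermediate values blend the two.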

Neural Networks (NNs) are an effective approach used in many fields, such as medicine, industry, security, stock-market prediction, character recognition, and image processing. The main problem in the use of Artificial Neural Networks (ANNs) is the control of their parameters, since there is no explicit, specific way to determine their optimal values. The aim of this study is to choose the best parameters for modelling and optimization: the number of hidden layers, the number of neurons in each layer, the momentum, the delta rule, the transition functions, and the multidimensional network structure.
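
A minimal sketch of such a parameter search, assuming scikit-learn's MLPClassifier and a made-up grid over the kinds of parameters the study names (hidden-layer structure, neurons per layer, transfer function, momentum):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

# Hypothetical search space; the study's actual ranges are not given.
param_grid = {
    "hidden_layer_sizes": [(32,), (64,), (64, 32)],
    "activation": ["relu", "tanh", "logistic"],
    "momentum": [0.5, 0.9, 0.99],
}
search = GridSearchCV(
    MLPClassifier(solver="sgd", max_iter=300, random_state=0),
    param_grid, cv=3, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```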

Deep learning refers to the flourishing branch of machine learning that is based on learning multiple levels of representation. The Convolutional Neural Network (CNN) is one kind of deep neural network, and its computation lends itself to concurrent execution. In this article, we give a detailed analysis of the CNN algorithm, covering both the forward pass and back-propagation. We then apply a particular convolutional neural network, implemented in Java, to the typical face recognition problem. A parallel strategy is proposed in Section 4. In addition, by measuring the actual time of forward and backward computation, we theoretically analyse the maximal speed-up and parallel efficiency.
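
To make the forward and backward passes concrete, here is a minimal single-channel NumPy sketch (the paper's implementation is in Java; valid padding and the cross-correlation convention are assumptions):

```python
import numpy as np

def conv2d_forward(x, w):
    """Forward pass of a single-channel convolutional layer
    (valid cross-correlation of input x with kernel w)."""
    H, W = x.shape
    kH, kW = w.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kH, j:j+kW] * w)
    return out

def conv2d_backward(x, w, d_out):
    """Back-propagation: gradients of the loss w.r.t. kernel and input,
    given the upstream gradient d_out (same shape as the forward output)."""
    dW = np.zeros_like(w)
    dX = np.zeros_like(x)
    kH, kW = w.shape
    for i in range(d_out.shape[0]):
        for j in range(d_out.shape[1]):
            dW += d_out[i, j] * x[i:i+kH, j:j+kW]
            dX[i:i+kH, j:j+kW] += d_out[i, j] * w
    return dW, dX
```

Note that the output positions in both loops are independent of one another, which is exactly what makes a parallel strategy attractive.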

Typical modern optimization techniques are usually either heuristic or metaheuristic. These techniques have managed to solve optimization problems across science, engineering, and industry. However, implementation strategies for metaheuristics that improve the accuracy of convolutional neural networks (CNNs), a famous deep learning method, are still rarely investigated. Deep learning is a type of machine learning technique whose aim is to move closer to the goal of artificial intelligence: creating a machine that can successfully perform any intellectual task a human can carry out. In this paper, we propose implementation strategies for three popular metaheuristic approaches, namely simulated annealing, differential evolution, and harmony search, to optimize CNNs. The performance of these metaheuristic methods in optimizing CNNs on the MNIST and CIFAR classification datasets was evaluated and compared. Furthermore, the proposed methods were also compared with the original CNN. Although the proposed methods increase computation time, they also improve accuracy (by up to 7.14 percent).
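
As an illustration of one of the three approaches, the sketch below applies generic simulated annealing to a flat parameter vector; the cooling schedule, the Gaussian proposal, and the loss callable (e.g. the validation error of a CNN whose weights are set from x) are assumptions, not the paper's exact configuration.

```python
import numpy as np

def simulated_annealing(loss, x0, sigma=0.1, t0=1.0, decay=0.95,
                        steps=200, rng=None):
    """Minimize `loss` over a flat parameter vector by simulated annealing."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x, fx, t = x0.copy(), loss(x0), t0
    for _ in range(steps):
        cand = x + rng.normal(0.0, sigma, size=x.shape)  # Gaussian proposal
        fc = loss(cand)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if fc < fx or rng.random() < np.exp((fx - fc) / t):
            x, fx = cand, fc
        t *= decay  # geometric cooling schedule
    return x, fx

# Toy usage: minimize a quadratic bowl.
x_best, f_best = simulated_annealing(lambda v: np.sum(v**2), np.ones(4))
print(x_best, f_best)
```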

The human brain is made up of tens of billions of interconnected neurons that process information in parallel. Researchers in the field of artificial intelligence have successfully demonstrated a considerable level of intelligence on chips, and this has been termed Neural Networks (NNs). Neural networks, also known as artificial neural networks (ANNs) or simulated neural networks (SNNs), are a subset of machine learning (ML), and they are at the heart of deep learning algorithms. These subsets of ML derive their names and structures from the human brain and the way biological neurons signal to one another. A class of NNs often used in processing digital image data is the Convolutional Neural Network (CNN or ConvNet). The human brain processes a huge amount of information, with each neuron having its own receptive field and connecting to other neurons in a way that covers the entire visual field. Mimicking this biological mechanism, where neurons respond only to stimuli in the restricted region of the visual field referred to as the receptive field, each neuron in a CNN processes data only within its receptive field. In this review paper, the architecture and applications of CNNs are presented. Their evolution, concepts, and approaches to solving problems related to digital images and computer vision are also examined.
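
The receptive-field idea has a simple quantitative counterpart. The short sketch below uses the standard recurrence for a stack of layers with kernel size k and stride s; the example layer stack is hypothetical.

```python
def receptive_field(layers):
    """Receptive-field size of the final feature map for a list of
    (kernel, stride) pairs, via r <- r + (k - 1) * j and j <- j * s."""
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j   # widen by the kernel extent at the current jump
        j *= s             # stride compounds the jump between samples
    return r

# e.g. two 3x3 convs (stride 1), a 2x2 pool (stride 2), then a 3x3 conv:
print(receptive_field([(3, 1), (3, 1), (2, 2), (3, 1)]))  # -> 10
```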