George Magoulas | Birkbeck College, University of London
Papers by George Magoulas
A new theorem for the development and convergence analysis of supervised training algorithms with an adaptive learning rate for each weight is presented. Based on this theoretical result, a strategy is proposed to automatically adapt the search direction, as well as the stepsize length along the resultant search direction. This strategy is applied to some well-known local learning algorithms to investigate its effectiveness. Keywords and phrases: globally convergent algorithms, local learning rate adaptation, batch training algorithms, gradient descent, feedforward neural networks. 1 Introduction. Supervised learning is a classical method for bringing the weights of a neural network towards optimality. A finite set of arbitrarily ordered examples is presented at the input of the network and associated with appropriate references through an error-correction process. Batch training, which is a special case of supervised learning, is consistent with the theory of unconstrained optimization. This ...
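To make the flavour of such schemes concrete, here is a minimal sketch (illustrative only, not the paper's algorithm; the function names and parameter values are assumptions) of batch gradient descent with one adaptive learning rate per weight, grown while successive partial derivatives keep their sign and shrunk when the sign flips:

```python
import numpy as np

def batch_train_local_rates(grad_fn, w, eta0=0.01, up=1.2, down=0.5,
                            eta_min=1e-6, eta_max=1.0, epochs=100):
    """Batch gradient descent with one adaptive learning rate per weight:
    a rate grows while successive partial derivatives agree in sign and
    shrinks when the sign flips (the weight overshot a minimum)."""
    eta = np.full_like(w, eta0)
    g_prev = np.zeros_like(w)
    for _ in range(epochs):
        g = grad_fn(w)                                  # full-batch gradient
        same = g * g_prev
        eta = np.where(same > 0, np.minimum(eta * up, eta_max), eta)
        eta = np.where(same < 0, np.maximum(eta * down, eta_min), eta)
        w = w - eta * g                                 # per-weight step
        g_prev = g
    return w

# Toy usage: a quadratic with very different curvature per coordinate.
grad = lambda w: np.array([2.0, 20.0]) * w
print(batch_train_local_rates(grad, np.array([5.0, 5.0])))
```

The sign test is a cheap proxy for curvature along each weight axis, which is why per-weight rates can outpace a single global stepsize.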
A novel hybrid evolutionary approach is presented in this paper for improving the performance of neural network classifiers in slowly varying environments. For this purpose, we investigate a coupling of Differential Evolution Strategy and Stochastic Gradient Descent, using both the global search capabilities of Evolutionary Strategies and the effectiveness of on-line gradient descent. The use of the Differential Evolution Strategy relates to the concept of evolution of a number of individuals from generation to generation, and that of on-line gradient descent to the concept of adaptation to the environment by learning. The hybrid algorithm is tested in two real-life image processing applications. Experimental results suggest that the hybrid strategy is capable of training on-line effectively, leading to networks with increased generalization capability.
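A hedged sketch of how such a coupling can look in code (the population handling, parameter values, and function names below are illustrative assumptions, not the paper's formulation): each generation alternates a few gradient steps per individual with a standard DE/rand/1/bin recombination.

```python
import numpy as np

rng = np.random.default_rng(0)

def hybrid_de_sgd(loss, grad, dim, pop_size=10, F=0.5, CR=0.9,
                  sgd_steps=5, lr=0.01, generations=50):
    """Alternates local learning (a few gradient steps per individual)
    with global search (a DE/rand/1/bin mutation-crossover-selection)."""
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(generations):
        for i in range(pop_size):                # local adaptation phase
            for _ in range(sgd_steps):
                pop[i] -= lr * grad(pop[i])
        for i in range(pop_size):                # global evolution phase
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            trial = np.where(rng.random(dim) < CR, a + F * (b - c), pop[i])
            if loss(trial) < loss(pop[i]):       # greedy DE selection
                pop[i] = trial
    return min(pop, key=loss)

# Toy usage: minimise a shifted sphere function.
sphere = lambda w: float(np.sum((w - 3.0) ** 2))
print(hybrid_de_sgd(sphere, lambda w: 2.0 * (w - 3.0), dim=4))
```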
In many cases, supervised neural network training using a backpropagation-based learning rule can become trapped in a local minimum of the error function. These training algorithms are local minimization methods and have no mechanism that allows them to escape the influence of a local minimum. The existence of local minima is due to the fact that the error function is the superposition of nonlinear activation functions that may have minima at different points, which sometimes results in a non-convex error function. In this work, global search methods for feed-forward neural network batch training are investigated. These methods are expected to lead to "optimal" or "near-optimal" weight configurations by allowing the network to escape local minima during training. The paper reviews the fundamentals of simulated annealing, genetic algorithms, as well as some recently proposed deflection procedures. Simulations and comparisons are presented.
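As an illustration of one of the reviewed global methods, here is a minimal simulated annealing loop over a weight vector (a sketch under assumed parameter values, not the paper's experimental setup):

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def anneal_weights(error, w0, T0=1.0, cooling=0.95, steps_per_T=50,
                   T_min=1e-3, sigma=0.1):
    """Random weight perturbations are always accepted when they reduce
    the error and accepted with probability exp(-dE/T) otherwise, so the
    search can climb out of local minima while T is still high."""
    w, E, T = w0.copy(), error(w0), T0
    while T > T_min:
        for _ in range(steps_per_T):
            cand = w + rng.normal(scale=sigma, size=w.shape)
            dE = error(cand) - E
            if dE < 0 or rng.random() < math.exp(-dE / T):
                w, E = cand, E + dE
        T *= cooling                     # geometric cooling schedule
    return w

# Toy usage: a one-dimensional error surface with two minima.
error = lambda w: float((w[0] ** 2 - 1.0) ** 2 + 0.3 * w[0])
print(anneal_weights(error, np.array([2.0])))
```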
Supervised learning in neural networks based on the popular backpropagation method can often become trapped in a local minimum of the error function. The class of backpropagation-type training algorithms includes local minimization methods that have no mechanism allowing them to escape the influence of a local minimum. The existence of local minima is due to the fact that the error function is the superposition of nonlinear activation functions that may have minima at different points, which sometimes results in a nonconvex error function. This work investigates the use of global search methods for batch-mode training of feedforward multilayer perceptrons. Global search methods are expected to lead to "optimal" or "near-optimal" weight configurations by allowing the neural network to escape local minima during training and, in that sense, they improve the efficiency of the learning process. The paper reviews the fundamentals of simulated annealing, genetic and evolu...
Simulation optimization is rapidly becoming a mainstream tool for simulation practitioners, as several simulation packages include add-on optimization tools. In this paper we concentrate on an automated optimization approach based on adapting model parameters in order to handle uncertainty that arises from stochastic elements of the process under study. We particularly investigate the use of global search methods in this context, as these methods allow the optimization strategy to escape from sub-optimal (i.e., local) solutions and, in that sense, they improve the efficiency of the simulation optimization process. The paper compares several global search methods and demonstrates the successful application of the Particle Swarm Optimizer to simulation modeling optimization and design of a steelworks plant, a representative example of the stochastic and unpredictable behavior of a complex discrete event simulation model.
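For reference, a minimal global-best Particle Swarm Optimizer looks like the following (a sketch; the swarm size, coefficients, and toy objective are assumptions, and a real simulation optimization run would replace the objective with a call into the simulation model):

```python
import numpy as np

rng = np.random.default_rng(2)

def pso(objective, dim, swarm=20, iters=200, w_in=0.7, c1=1.5, c2=1.5):
    """Global-best PSO: each particle is pulled toward its own best
    position and the swarm's best position found so far."""
    x = rng.uniform(-5, 5, (swarm, dim))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, swarm, dim))
        v = w_in * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[pbest_val.argmin()].copy()
    return g

# Toy stand-in for a (stochastic) simulation response surface:
print(pso(lambda p: float(np.sum((p - 1.0) ** 2)), dim=3))
```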
Studies in Computational Intelligence
ArXiv, 2015
The Resilient Propagation (Rprop) algorithm has been very popular for backpropagation training of multilayer feed-forward neural networks in various applications. The standard Rprop, however, encounters difficulties in the context of deep neural networks, as typically happens with gradient-based learning algorithms. In this paper, we propose a modification of Rprop that combines standard Rprop steps with a special dropout technique. We apply the method to training Deep Neural Networks as standalone components and in ensemble formulations. Results on the MNIST dataset show that the proposed modification alleviates standard Rprop's problems, demonstrating improved learning speed and accuracy.
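The sign-based step adaptation at the heart of Rprop can be sketched as below (roughly the iRprop- flavour; the parameter values are the commonly used defaults, and the paper's dropout coupling is not reproduced here):

```python
import numpy as np

def rprop_step(w, g, g_prev, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One Rprop update: only the *sign* of the gradient is used.
    Per-weight steps grow when successive gradients agree in sign and
    shrink when they flip (a flip suggests the step jumped a minimum)."""
    sign = g * g_prev
    step = np.where(sign > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(sign < 0, np.maximum(step * eta_minus, step_min), step)
    g = np.where(sign < 0, 0.0, g)   # iRprop-: skip the update after a flip
    w = w - np.sign(g) * step
    return w, g, step

# Toy usage on a quadratic (gradient of ||w||^2 is 2w):
w = np.array([4.0, -3.0]); g_prev = np.zeros_like(w)
step = np.full_like(w, 0.1)
for _ in range(30):
    w, g_prev, step = rprop_step(w, 2.0 * w, g_prev, step)
print(w)
```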
In this work we explore on-line training of neural networks for interpreting colonoscopy images through tracking the changing location of an approximate solution of a pattern-based, and thus dynamically changing, error function. We have developed a memory-based adaptation of the learning rate for on-line Backpropagation (BP), and we investigate the use of this scheme in an on-line evolution process that applies an on-line BP-seeded Differential Evolution Strategy to (re-)adapt the neural network to modified environmental conditions. We compare this hybrid strategy to other standard training methods that have traditionally been used for training neural networks off-line. Preliminary results in interpreting colonoscopy images and frames of video sequences suggest that networks trained with this strategy detect malignant regions of interest with high accuracy. Extensive testing in interpreting more complex regions is necessary to fully investigate the properties, the effect of the...
This paper introduces an efficient modification of the Rprop algorithm for training neural networks. The convergence of the new algorithm can be justified theoretically, and its performance is investigated empirically through simulation experiments using some pattern classification benchmarks. Numerical evidence shows that the algorithm exhibits improved learning speed in all cases, and compares favorably against the Rprop and a recently proposed modification, the iRprop.
In many cases, supervised neural network training using a backpropagation-based learning rule can become trapped in a local minimum of the error function. These training algorithms are local minimization methods and have no mechanism that allows them to escape the influence of a local minimum. The existence of local minima is due to the fact that the error function is the superposition of nonlinear activation functions that may have minima at different points, which sometimes results in a nonconvex error function. In this work, global search methods for feedforward neural network batch training are investigated. These methods are expected to lead to "optimal" or "near-optimal" weight configurations by allowing the network to escape local minima during training. The paper reviews the fundamentals of simulated annealing, genetic algorithms, as well as some recently proposed deflection procedures. Simulations and comparisons are presented.
Artificial neural networks learn how to solve new problems through a computationally intensive and time-consuming process. One way to reduce the amount of time required is to inject pre-existing knowledge into the network. To make use of past knowledge, we can take advantage of techniques that transfer the knowledge learned from one task and reuse it on another (sometimes unrelated) task. In this paper we propose a novel selective breeding technique that extends the transfer learning with behavioural genetics approach proposed by Kohli, Magoulas and Thomas (2013), and evaluate its performance on financial data. Numerical evidence demonstrates the credibility of the new approach. We provide insights on the operation of transfer learning and highlight the benefits of using behavioural principles and selective breeding when tackling a set of diverse financial application problems.
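A generic selective-breeding step over network weight vectors might look like the following sketch (illustrative only; the paper's behavioural-genetics scheme is more involved, and every name and parameter here is an assumption):

```python
import numpy as np

rng = np.random.default_rng(4)

def selective_breed(population, fitness, keep=0.2, mut_sigma=0.05):
    """One generation of selective breeding: the fittest weight vectors
    survive, and lightly mutated crossovers of them refill the rest."""
    order = np.argsort([fitness(ind) for ind in population])[::-1]
    n_elite = max(2, int(keep * len(population)))
    elite = [population[i] for i in order[:n_elite]]
    children = []
    while len(elite) + len(children) < len(population):
        a, b = rng.choice(len(elite), 2, replace=False)
        mask = rng.random(elite[0].shape) < 0.5        # uniform crossover
        child = np.where(mask, elite[a], elite[b])
        children.append(child + rng.normal(0, mut_sigma, child.shape))
    return elite + children

# Toy usage: evolve vectors toward an all-ones weight setting
# (a stand-in for fitness on a source task before transfer).
pop = [rng.normal(size=5) for _ in range(20)]
fit = lambda w: -float(np.sum((w - 1.0) ** 2))
for _ in range(50):
    pop = selective_breed(pop, fit)
print(max(pop, key=fit))
```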
Supervised learning in neural networks based on the popular backpropagation method can often become trapped in a local minimum of the error function. The class of backpropagation-type training algorithms includes local minimization methods that have no mechanism allowing them to escape the influence of a local minimum. The existence of local minima is due to the fact that the error function is the superposition of nonlinear activation functions that may have minima at different points, which sometimes results in a nonconvex error function. This work investigates the use of global search methods for batch-mode training of feedforward multilayer perceptrons. Global search methods are expected to lead to "optimal" or "near-optimal" weight configurations by allowing the neural network to escape local minima during training and, in that sense, they improve the efficiency of the learning process. The paper reviews the fundamentals of simulated annealing, genetic and evolutionary algorithms...
Training a multilayer perceptron (MLP) with algorithms employing global search strategies has been an important research direction in the field of neural networks. Despite a number of significant results, an important matter concerning the bounds of the search region (typically defined as a box) where a global optimization method has to search for a potential global minimizer seems to be unresolved. The approach presented in this paper builds on interval analysis and attempts to define guaranteed bounds in the search space prior to applying a global search algorithm for training an MLP. These bounds depend on the machine precision, and the term "guaranteed" denotes that the defined region surely encloses weight sets that are global minimizers of the neural network's error function. Although the solution set of the bounding problem for an MLP is in general non-convex, the paper presents theoretical results that help derive a box which is a convex set. This box is an...
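The enclosure idea behind interval analysis can be illustrated with a few lines of interval arithmetic (a conceptual sketch only; the paper's guaranteed bounds are derived from machine precision and the network's structure, which this toy does not reproduce):

```python
from dataclasses import dataclass

@dataclass
class Interval:
    lo: float
    hi: float
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __mul__(self, o):
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))

# A neuron's net input over a weight box: every weight in [-b, b] and
# every normalised input in [0, 1]. The resulting interval encloses all
# values the net input can take, so it bounds where a minimiser can lie.
b = 10.0
weights = [Interval(-b, b)] * 3
inputs = [Interval(0.0, 1.0)] * 3
net = Interval(0.0, 0.0)
for w, x in zip(weights, inputs):
    net = net + w * x
print(net)   # Interval(lo=-30.0, hi=30.0)
```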
Sensors (Basel, Switzerland), 2021
Early identification of degenerative processes in the human brain is considered essential for providing proper care and treatment. This may involve detecting structural and functional cerebral changes, such as changes in the degree of asymmetry between the left and right hemispheres. Changes can be detected by computational algorithms and used for the early diagnosis of dementia and its stages (amnestic early mild cognitive impairment (EMCI), Alzheimer's Disease (AD)), and can help to monitor the progress of the disease. In this vein, the paper proposes a data processing pipeline that can be implemented on commodity hardware. It uses features of brain asymmetries, extracted from MRI scans of the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, for the analysis of structural changes and machine learning classification of the pathology. The experiments provide promising results, distinguishing between subjects with normal cognition (NC) and patients with early or progressive dem...
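A skeletal version of such a pipeline, on synthetic stand-in data rather than ADNI features (every variable and the classifier choice below are hypothetical placeholders, not the paper's setup), could look like:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)

def asymmetry_index(left, right):
    """A common left/right asymmetry measure: (L - R) / (L + R) per region."""
    return (left - right) / (left + right)

# Hypothetical stand-in data: per-subject left/right regional volumes.
n_subjects, n_regions = 200, 10
left = rng.uniform(1.0, 2.0, (n_subjects, n_regions))
right = left + rng.normal(0.0, 0.05, left.shape)
y = rng.integers(0, 2, n_subjects)      # toy labels: 0 = NC, 1 = EMCI/AD

X = asymmetry_index(left, right)        # feature matrix of asymmetries
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```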
The efficient supervised training of artificial neural networks is commonly viewed as the minimization of an error function that depends on the weights of the network. This perspective gives some advantage to the development of effective training algorithms, because the problem of minimizing a function is well known in the field of numerical analysis. Typically, deterministic minimization methods are employed; however, in several cases, significant training speed and alleviation of the local minima problem can be achieved when stochastic minimization methods are used. In this paper a method for adapting the learning rate in stochastic gradient descent is presented. The main feature of the proposed learning rate adaptation scheme is that it exploits gradient-related information from the current as well as the two previous pattern presentations. This seems to provide some kind of stabilization in the value of the learning rate and helps the stochastic gradient descent to exhibit fast...
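One plausible realisation of the idea, not the paper's exact update rule (the meta-step size gamma, the clipping bounds, and the averaging of the two stored gradients are all assumptions), is sketched below:

```python
import numpy as np

def sgd_memory_lr(grad_fn, w, patterns, eta=0.01, gamma=1e-4,
                  eta_min=1e-5, eta_max=0.1, epochs=10):
    """On-line SGD with a learning rate adapted from the current gradient
    and the two previous per-pattern gradients: directional agreement
    grows eta, disagreement shrinks it."""
    g1 = g2 = np.zeros_like(w)
    for _ in range(epochs):
        for p in patterns:
            g = grad_fn(w, p)
            # Compare g with the average of the two previous gradients
            # instead of the last one alone, stabilising the adaptation.
            eta = float(np.clip(eta + gamma * g @ ((g1 + g2) / 2.0),
                                eta_min, eta_max))
            w = w - eta * g
            g1, g2 = g, g1
    return w

# Toy usage: least squares on a stream of (x, target) patterns.
patterns = [(np.array([1.0, x]), 2.0 + 3.0 * x) for x in np.linspace(0, 1, 20)]
grad = lambda w, p: 2.0 * (w @ p[0] - p[1]) * p[0]
print(sgd_memory_lr(grad, np.zeros(2), patterns))
```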
2015 International Joint Conference on Neural Networks (IJCNN)
Handbook of Research on Innovations in Technology and Marketing for the Connected Consumer
The internet has transformed the landscape of marketing and consumer behaviour over the last two decades, enabling unprecedented reach to consumers for marketers, keeping costs generally low, providing opportunities to analyse interactions, and facilitating the development of novel strategies in digital marketing. As the information age matures, it is entering the era of the fourth industrial revolution, where the internet's reach is coupled with smart technologies powered by AI, cloud-based scalable infrastructure, and sensor fusion harvesting ever-greater amounts of data. Consumer demands and preferences are ever more sophisticated. In this context, the digital marketing field will go through a period of tremendous change, where established norms and practices will no longer serve the purpose of attracting and engaging consumers. A thought experiment using a persona is discussed to assess the effectiveness of existing digital marketing models. Further, a con...
The Journal of Machine Learning Research
Proceedings of the 26th International Conference on World Wide Web Companion - WWW '17 Companion, 2017