Automatic tuning of hyperparameters using Bayesian optimization

Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules

2020

Deep learning techniques play an increasingly important role in industrial and research environments due to their outstanding results. However, the large number of hyper-parameters to be set may lead to errors if they are set manually. The state-of-the-art hyper-parameter tuning methods are grid search, random search, and Bayesian Optimization. The first two methods are expensive because they try, respectively, all possible combinations and random combinations of hyper-parameters. Bayesian Optimization, instead, builds a surrogate model of the objective function, quantifies the uncertainty in the surrogate using Gaussian Process Regression, and uses an acquisition function to decide where to sample the next set of hyper-parameters. This work addresses the field of Hyper-Parameter Optimization (HPO). The aim is to improve Bayesian Optimization applied to Deep Neural Networks. For this goal, we build a new algorithm for evaluating and analyzing the results of the network on the training a...
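The loop described in this abstract (a Gaussian Process surrogate plus an acquisition function) can be illustrated with a minimal sketch. It is not the authors' algorithm: the objective `toy_objective`, the kernel length scale, the candidate grid, and the choice of expected improvement as the acquisition function are all assumptions made for illustration, and the "validation loss" is a cheap synthetic function rather than a trained network.

```python
import math
import numpy as np


def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k_tt, k_tq)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))


def expected_improvement(mean, std, best):
    """Expected improvement over `best` when minimizing."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def toy_objective(log_lr):
    """Hypothetical validation loss as a function of log10(learning rate)."""
    return (log_lr + 2.5) ** 2 + 0.1 * math.sin(5.0 * log_lr)


def bayes_opt(n_init=3, n_iter=12, seed=0):
    rng = np.random.default_rng(seed)
    candidates = np.linspace(-5.0, 0.0, 201)
    x = rng.choice(candidates, size=n_init, replace=False)
    y = np.array([toy_objective(v) for v in x])
    for _ in range(n_iter):
        # Standardize the observed losses so the unit-variance prior is reasonable.
        y_norm = (y - y.mean()) / (y.std() or 1.0)
        mean, std = gp_posterior(x, y_norm, candidates)
        ei = expected_improvement(mean, std, y_norm.min())
        x_next = candidates[int(np.argmax(ei))]  # sample where EI is largest
        x = np.append(x, x_next)
        y = np.append(y, toy_objective(x_next))
    best = int(np.argmin(y))
    return float(x[best]), float(y[best])


best_log_lr, best_loss = bayes_opt()
print(f"best log10(lr) = {best_log_lr:.3f}, loss = {best_loss:.4f}")
```

In a real setting each call to the objective would train and validate a network, which is why spending a little computation on the surrogate to choose the next point is usually worthwhile.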

Hyperparameter Tuning of Deep Learning Models in Keras

Sparklinglight Transactions on Artificial Intelligence and Quantum Computing

Hyperparameter tuning, or optimization, is one of the fundamental ways to improve the performance of machine learning models. A hyperparameter is a parameter passed to the learning process of the model to correct or adjust that process. To generalise across diverse data patterns, the same machine learning model may require different constraints, weights, or learning rates. Hyperparameters are the term for these kinds of settings. These parameters are tested by trial and error to ensure that the model can solve the machine learning task optimally. This paper focuses on the science of hyperparameter tuning using some tools, with experimental values and the results of each experiment. We have also documented 4 metrics to analyze the hyperparameter tuning results and benchmark the outcome. The experimental results of two tools commonly used for deep learning models, namely Keras tuner and AiSara tuner, are captured in the article. All relevant experimental code is al...

Machine Learning Model Optimization with Hyper Parameter Tuning Approach

2021

Hyper-parameter tuning is a key step in finding the optimal machine learning parameters. Determining the best hyper-parameters takes a good deal of time, especially when the objective functions are costly to evaluate or a large number of parameters must be tuned. In contrast to conventional machine learning algorithms, neural networks require more hyperparameter tuning because they have to process many parameters together, and depending on the fine tuning, the accuracy of the model can vary between 25% and 90%. A few of the most effective techniques for tuning hyper-parameters in deep learning methods are: Grid search, Random forest, Bayesian optimization, etc. Every method has some advantages and disadvantages over others. For example, Grid search has proven to be an effective technique to tune hyper-parameters, but it has drawbacks such as trying too many combinations and performing poorly when many parameters must be tuned at a time. In our work, w...

Hyperparameter Optimization for Deep Neural Network Models: A Comprehensive Study on Methods and Techniques

Research Square, 2023

Advancements in computing and storage technologies have significantly contributed to the adoption of deep learning (DL)-based models among machine learning (ML) experts. Although a generic model can be used in the search for a near-optimal solution in any problem domain, what makes these DL models context-sensitive is the combination of the training data and the hyperparameters. Due to the lack of inherent explainability of DL models, the Hyperparameter Optimization (HPO) or tuning specific to each model is a combination of art, science, and experience. In this article, we explore various existing methods for identifying the optimal set of hyperparameter values specific to DL models, along with the techniques to realize those methods in real-life situations. The article also includes a detailed comparative study of various state-of-the-art HPO techniques using the Keras Tuner tuning toolkit and highlights observations describing how model performance can be improved by applying various HPO techniques.

Sherpa: Hyperparameter Optimization for Machine Learning Models

2018

Sherpa is a free open-source hyperparameter optimization library for machine learning models. It is designed for problems with computationally expensive iterative function evaluations, such as the hyperparameter tuning of deep neural networks. With Sherpa, scientists can quickly optimize hyperparameters using a variety of powerful and interchangeable algorithms. Additionally, the framework makes it easy to implement custom algorithms. Sherpa can be run on either a single machine or a cluster via a grid scheduler with minimal configuration. Finally, an interactive dashboard enables users to view the progress of models as they are trained, cancel trials, and explore which hyperparameter combinations are working best. Sherpa empowers machine learning researchers by automating the tedious aspects of model tuning and providing an extensible framework for developing automated hyperparameter-tuning strategies. Its source code and documentation are available at https://github.com/LarsHH/she...

Speeding up the Hyperparameter Optimization of Deep Convolutional Neural Networks

2018

Most learning algorithms require the practitioner to manually set the values of many hyper-parameters before the learning process can begin. However, with modern algorithms, the evaluation of a given hyperparameter setting can take a considerable amount of time and the search space is often very high-dimensional. We suggest using a lower-dimensional representation of the original data to quickly identify promising areas in the hyperparameter space. This information can then be used to initialize the optimization algorithm for the original, higher-dimensional data. We compare this approach with the standard procedure of optimizing the hyperparameters only on the original input. We perform experiments with various state-of-the-art hyperparameter optimization algorithms such as random search, the tree of Parzen estimators (TPE), sequential model-based algorithm configuration (SMAC), and a genetic algorithm (GA). Our experiments indicate that it is possible to speed up the optimization process by using lower-dimensional data representations at the beginning, while increasing the dimensionality of the input later in the optimization process. This is independent of the underlying optimization procedure, making the approach promising for many existing hyperparameter optimization algorithms.

OptABC: an Optimal Hyperparameter Tuning Approach for Machine Learning Algorithms

2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA)

Hyperparameter tuning in machine learning algorithms is a computationally challenging task due to the large-scale nature of the problem. In order to develop an efficient strategy for hyper-parameter tuning, one promising solution is to use swarm intelligence algorithms. Artificial Bee Colony (ABC) optimization lends itself as a promising and efficient optimization algorithm for this purpose. However, in some cases, ABC can suffer from a slow convergence rate or long execution time due to a poor initial population of solutions and expensive objective functions. To address these concerns, a novel algorithm, OptABC, is proposed to help the ABC algorithm converge faster toward a near-optimum solution. OptABC integrates the artificial bee colony algorithm, K-Means clustering, a greedy algorithm, and an opposition-based learning strategy for tuning the hyper-parameters of different machine learning models. OptABC employs these techniques in an attempt to diversify the initial population, and hence enhance the convergence ability without significantly decreasing the accuracy. In order to validate the performance of the proposed method, we compare the results with previous state-of-the-art approaches. Experimental results demonstrate the effectiveness of OptABC compared to existing approaches in the literature.
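One ingredient named in this abstract, opposition-based learning, can be sketched in a few lines. This is a generic illustration of opposition-based initialization and not the OptABC procedure itself (the K-Means and greedy steps are omitted). The bounds, the objective `toy_loss`, and the function names are hypothetical.

```python
import random


def opposite(point, bounds):
    """Opposite point: each coordinate x in [lo, hi] maps to lo + hi - x."""
    return [lo + hi - x for x, (lo, hi) in zip(point, bounds)]


def obl_initial_population(objective, bounds, size, seed=0):
    """Draw `size` random points, add their opposites, keep the best `size`."""
    rng = random.Random(seed)
    randoms = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(size)]
    pool = randoms + [opposite(p, bounds) for p in randoms]
    pool.sort(key=objective)  # lower objective value is better
    return pool[:size]


def toy_loss(point):
    """Hypothetical loss over (log10 learning rate, dropout rate)."""
    log_lr, dropout = point
    return (log_lr + 3.0) ** 2 + (dropout - 0.2) ** 2


BOUNDS = [(-5.0, -1.0), (0.0, 0.5)]
population = obl_initial_population(toy_loss, BOUNDS, size=8)
print(population[0], toy_loss(population[0]))
```

The starting population costs twice as many objective evaluations as plain random initialization, so the approach only pays off if the better starting points save more evaluations later in the search.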

An effective algorithm for hyperparameter optimization of neural networks

IBM Journal of Research and Development

A major challenge in designing neural network (NN) systems is to determine the best structure and parameters for the network given the data for the machine learning problem at hand. Examples of parameters are the number of layers and nodes, the learning rates, and the dropout rates. Typically, these parameters are chosen based on heuristic rules and manually fine-tuned, which may be very time-consuming, because evaluating the performance of a single parametrization of the NN may require several hours. This paper addresses the problem of choosing appropriate parameters for the NN by formulating it as a box-constrained mathematical optimization problem, and applying a derivative-free optimization tool that automatically and effectively searches the parameter space. The optimization tool employs a radial basis function model of the objective function (the prediction accuracy of the NN) to accelerate the discovery of configurations yielding high accuracy. Candidate configurations explored by the algorithm are trained to a small number of epochs, and only the most promising candidates receive full training. The performance of the proposed methodology is assessed on benchmark sets and in the context of predicting drug-drug interactions, showing promising results. The optimization tool used in this paper is open-source.
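The screening idea in this abstract (train every candidate briefly, then fully train only the most promising) can be sketched as follows. The radial basis function surrogate used by the paper is not reproduced here, and `simulated_accuracy` is a hypothetical stand-in for real training, chosen so that the example runs instantly.

```python
import math
import random


def simulated_accuracy(config, epochs):
    """Hypothetical stand-in for training a network for `epochs` epochs."""
    log_lr, dropout = config
    ceiling = 0.95 - 0.05 * (log_lr + 3.0) ** 2 - 0.5 * (dropout - 0.3) ** 2
    return ceiling * (1.0 - math.exp(-epochs / 5.0))


def screen_then_train(configs, short_epochs=2, full_epochs=30, keep=3):
    # Stage 1: cheap, partial training for every candidate.
    ranked = sorted(configs,
                    key=lambda c: simulated_accuracy(c, short_epochs),
                    reverse=True)
    finalists = ranked[:keep]
    # Stage 2: full training only for the finalists.
    results = [(simulated_accuracy(c, full_epochs), c) for c in finalists]
    cost = short_epochs * len(configs) + full_epochs * len(finalists)
    best_accuracy, best_config = max(results)
    return best_accuracy, best_config, cost


rng = random.Random(0)
configs = [(rng.uniform(-5.0, -1.0), rng.uniform(0.0, 0.6)) for _ in range(20)]
best_accuracy, best_config, cost = screen_then_train(configs)
print(f"accuracy {best_accuracy:.3f} for {cost} epochs "
      f"(full training of all candidates: {30 * len(configs)} epochs)")
```

In this toy model the ranking after two epochs matches the ranking after full training, which real networks do not guarantee. In practice, the number of short epochs trades screening cost against the risk of discarding a slow-starting configuration.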

Deep neural network hyper-parameter tuning through twofold genetic approach

Soft Computing, 2021

In this paper, traditional and meta-heuristic approaches for optimizing deep neural networks (DNN) have been surveyed, and a genetic algorithm (GA)-based approach involving two optimization phases for hyper-parameter discovery and optimal data subset determination has been proposed. The first phase aims to quickly select an optimal combination of the network hyper-parameters to design a DNN. Compared to the traditional grid-search-based method, the optimal parameters have been computed 6.5 times faster for recurrent neural network (RNN) and 8 times faster for convolutional neural network (CNN). The proposed approach is capable of tuning multiple hyper-parameters simultaneously. The second phase finds an appropriate subset of the training data for near-optimal prediction performance, providing an additional speedup of 75.86% for RNN and 41.12% for CNN over the first phase.
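A genetic search over several hyper-parameters at once, as in the first phase described above, might look like the following minimal sketch. It is not the authors' implementation: the search space `SPACE`, the score `toy_fitness`, and the selection, crossover, and mutation choices are all assumptions, and the data-subset phase is not covered.

```python
import random

# Hypothetical discrete search space.
SPACE = {
    "units": [32, 64, 128, 256],
    "layers": [1, 2, 3, 4],
    "lr": [1e-4, 1e-3, 1e-2, 1e-1],
}


def toy_fitness(ind):
    """Hypothetical score (higher is better), standing in for validation accuracy."""
    return (-abs(ind["units"] - 128) / 128
            - abs(ind["layers"] - 2)
            - 10 * abs(ind["lr"] - 1e-3))


def random_individual(rng):
    return {name: rng.choice(values) for name, values in SPACE.items()}


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from one parent at random."""
    return {name: rng.choice((a[name], b[name])) for name in SPACE}


def mutate(ind, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {name: rng.choice(SPACE[name]) if rng.random() < rate else value
            for name, value in ind.items()}


def evolve(pop_size=10, generations=15, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=toy_fitness, reverse=True)
        history.append(toy_fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - 1:
            p1, p2 = rng.sample(parents, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        population = [population[0]] + children  # elitism keeps the best so far
    return max(population, key=toy_fitness), history


best, history = evolve()
print(best, history[-1])
```

Because the best individual is always carried over, the best score in `history` never decreases from one generation to the next.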

Accelerated Bayesian Optimization for Deep Learning

Bayesian optimization for deep learning has a long execution time because it involves many calculations and parameters. To solve this problem, this study aims to reduce the execution time by focusing on the output of the activation function, which is strongly related to accuracy. We developed a technique to accelerate the execution time by stopping the learning model so that the activation function of the first and second layers would become zero. Two experiments were conducted to confirm the effectiveness of the proposed method. First, we implemented the proposed technique and compared its execution time with that of Bayesian optimization. We successfully accelerated the execution time of Bayesian optimization for deep learning. Second, we attempted to apply the proposed method to credit card transaction data. From these experiments, it was confirmed that the purpose of our study was achieved. In particular, we concluded that the proposed method can accelerate the execution time when deep learning is applied to an extremely large amount of data.