Implementation of Query Optimization in Distributed Database System through Genetic Algorithm

An Evolutionary Genetic Algorithm for Optimization of Distributed Database Queries

The Computer Journal, 2011

High-performance, low-cost PC hardware and high-speed LAN/WAN technologies make distributed database (DDB) systems an attractive research area, where query optimization and DDB design are the two important and related problems. Since dynamic programming is not feasible for optimizing queries in a DDB, we propose a new genetic algorithm (GA)-based query optimizer, the new genetic algorithm (NGA), and compare its performance with random and optimal (exhaustive) algorithms. We perform experiments on a synthetic database with replicated relations, but no horizontal or vertical fragmentation. Network links are assumed to be Gigabit Ethernet. Comparisons with optimal results show that our NGA formulation performs within 20% of the optimal results, and we have achieved a 50% improvement over a previous GA-based algorithm.

Distributed Query Processing Plans Generation using Genetic Algorithm

International Journal of Computer Theory and Engineering, 2011

The large amount of information available in distributed databases needs to be exploited by organizations in order to be competitive in the market. To exploit this information, queries are posed against it. These queries require efficient processing, which mandates devising optimal query processing strategies that generate efficient query processing plans for a given distributed query. The number of possible query processing plans grows rapidly with the number of sites used, and relations accessed, by the query. There is a need to generate efficient query processing plans from among all possible query plans. The proposed approach attempts to generate such query processing plans using a genetic algorithm. The approach generates query plans based on the closeness of the data required to answer the user query. Query plans whose required data resides in fewer sites are considered more efficient, and are thus preferred over query plans whose data is spread across a large number of sites. The query plans so generated involve a minimum number of sites for answering the user query, leading to efficient query processing. Further, experimental results show that the GA-based approach converges quickly towards the optimal query processing plans for an observed crossover and mutation rate.
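The "fewer sites is better" idea in this abstract can be illustrated with a small sketch. It is not the paper's encoding or fitness function: the catalog `CATALOG`, the helper names, and the GA parameters below are all assumptions made for illustration. A plan assigns each relation to one site that holds a copy, and fitness is simply the number of distinct sites the plan touches.

```python
import random

# Hypothetical catalog (assumption): relation name -> sites holding a copy.
CATALOG = {
    "R1": [1, 2],
    "R2": [2, 3],
    "R3": [2, 4],
    "R4": [1, 2, 4],
}

def random_plan(rng):
    """Pick one holding site per relation."""
    return {rel: rng.choice(sites) for rel, sites in CATALOG.items()}

def sites_used(plan):
    """Fitness to minimize: number of distinct sites the plan touches."""
    return len(set(plan.values()))

def crossover(a, b, rng):
    """Uniform crossover: each relation inherits its site from one parent."""
    return {rel: (a if rng.random() < 0.5 else b)[rel] for rel in CATALOG}

def mutate(plan, rng, rate=0.1):
    """With probability `rate`, re-draw a relation's site from its copies."""
    return {
        rel: rng.choice(CATALOG[rel]) if rng.random() < rate else site
        for rel, site in plan.items()
    }

def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_plan(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=sites_used)
        parents = population[: pop_size // 2]  # elitist truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return min(population, key=sites_used)
```

Calling `evolve()` returns a plan in which every relation is assigned to a site that actually holds it. In this toy catalog the ideal plan places all four relations on site 2, but the sketch makes no guarantee of reaching it in a given run.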

Novel Distributed Query Optimization Model and Hybrid Query Optimization Algorithm

International Journal of Computer Applications, 2013

Query optimization is the most critical phase in query processing. In distributed databases, distribution must be considered explicitly in many aspects of the optimization process; this not only increases the cost of optimization but also significantly changes the trade-offs involved. This paper synthetically describes the evolution of query optimization methods from uniprocessor relational database systems to parallel database systems. We point out a set of parameters to characterize and compare query optimization methods, mainly: (i) type of algorithm (static or dynamic), (ii) working environment (re-optimization or rescheduling), and (iii) level of modification. The major contributions of this paper are: (i) understanding the mechanisms of query optimization methods with respect to the considered environments and their constraints (e.g. parallelism, distribution, heterogeneity, large scale, dynamicity of nodes); and (ii) studying the problem of query optimization, particularly in heterogeneous environments, and pointing out the main characteristics of the methods, which allows comparing them and helps in implementing a new query optimization algorithm and model. These contributions lead to improved query optimization performance in distributed database systems by classifying different query execution plans (QEPs) and minimizing response time.

Evolutionary Algorithms for Query Optimization in Distributed Database Systems: A Review

ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal

Evolutionary algorithms are bio-inspired optimization and problem-solving approaches that exploit principles of biological evolution, such as natural selection and genetic inheritance. This review paper surveys the application of evolutionary and swarm intelligence based query optimization strategies in distributed database systems. Query optimization in a distributed environment is a challenging and hard problem, and evolutionary approaches are promising for such optimization problems. Several techniques exist and are in use for query optimization in a distributed database. The intention of this research is to focus on how bio-inspired computational algorithms are used in a distributed database environment for query optimization. This paper describes the working of bio-inspired computational algorithms in distributed database query optimization which inc...

An Analysis on Query Optimization in Distributed Database

2014

The query optimizer is a significant element in today’s relational database management systems. It is responsible for translating a user-submitted query, commonly written in a non-procedural language, into an efficient query evaluation program that can be executed against the database. This research paper describes the architectural steps of query processing, as well as optimization time and memory usage. The key goal of this paper is to understand the basic query optimization process and its architecture.

Using Heuristics and Genetic Algorithms for Large-scale Database Query Optimization

2007

Distributed database system technology is one of the major developments in the information technology area. It will continue to have a very significant impact on data processing in the upcoming years because distributed database systems have many potential advantages over centralized systems for geographically distributed organizations. The continuing interest in distributed database systems in the research community and the marketplace, and the introduction of many commercial products, indicate that distributed database systems will play a more important role in data processing and will eventually replace centralized systems as the major database technology. The availability of high-speed communication networks and, especially, the phenomenal popularity of the Internet and intranets will undoubtedly speed up the transition process. Some challenging problems must be solved before the full potential benefits of distributed database technology can be realized. Among them is query processing (including query optimization), one of the most important issues in distributed database system design. The query optimization problem in large-scale distributed databases is NP-hard and difficult to solve. In this study, the query optimization problem is reduced to a join ordering problem similar to a variant of the traveling salesman problem. We explored several heuristics and a genetic algorithm for solving the join ordering problem. Computational experiments on these algorithms were conducted and their solution qualities compared. The experiments show that heuristics and genetic algorithms are viable methods for solving the query optimization problem in large-scale distributed database systems.
... issues related to the problem, to model the problem, taking into consideration the most important factors, to propose some solution methods for these models, and, finally, to conduct computational experiments and compare the results to determine the effectiveness and efficiency of the solution techniques (algorithms). We believe that the development of comprehensive models for query optimization in large-scale systems, as well as finding effective and/or efficient solution techniques to solve the problems that have been identified, are important and will contribute to the use of and research on distributed database technology.
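The reduction of query optimization to join ordering, searched as a permutation problem much like a traveling salesman tour, can be sketched as follows. This is a minimal illustration and not the study's model: the cardinalities in `CARD`, the selectivities in `SEL`, the cost function (sum of intermediate result sizes for a left-deep plan), and the GA parameters are all assumptions. The crossover is a simplified order-preserving variant of order crossover (OX), which keeps every child a valid permutation.

```python
import random

# Hypothetical statistics (assumptions): cardinalities and join selectivities.
CARD = {"A": 1000, "B": 200, "C": 50, "D": 5000}
SEL = {frozenset("AB"): 0.01, frozenset("BC"): 0.05, frozenset("CD"): 0.001}

def plan_cost(order):
    """Sum of intermediate result sizes of a left-deep join in this order."""
    joined = [order[0]]
    size = CARD[order[0]]
    total = 0.0
    for rel in order[1:]:
        size *= CARD[rel]
        for prev in joined:
            # Pairs without a join predicate act as a cross product (1.0).
            size *= SEL.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        total += size
    return total

def order_crossover(p1, p2, rng):
    """Keep a slice of p1 in place; fill the rest in p2's relative order."""
    i, j = sorted(rng.sample(range(len(p1)), 2))
    middle = p1[i:j + 1]
    rest = [rel for rel in p2 if rel not in middle]
    return rest[:i] + middle + rest[i:]

def swap_mutation(order, rng, rate=0.2):
    """With probability `rate`, swap two positions of the permutation."""
    order = list(order)
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order

def optimize(generations=40, pop_size=12, seed=1):
    rng = random.Random(seed)
    rels = list(CARD)
    pop = [rng.sample(rels, len(rels)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=plan_cost)
        elite = pop[: pop_size // 2]  # keep the cheaper half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        pop = elite + children
    return min(pop, key=plan_cost)
```

With only four relations the search space has 24 orderings, so exhaustive enumeration would be trivial here. The permutation encoding matters for large queries, where the number of orderings grows factorially.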

Genetic Algorithm Based Query Execution Plan Generation Using Join Site Mechanism in Heterogeneous Distributed Database

International Journal of Engineering Research and Technology (IJERT), 2013

https://www.ijert.org/genetic-algorithm-based-query-execution-plan-generation-using-join-site-mechanism-in-heterogeneous-distributed-database
https://www.ijert.org/research/genetic-algorithm-based-query-execution-plan-generation-using-join-site-mechanism-in-heterogeneous-distributed-database-IJERTV2IS120438.pdf

A heterogeneous distributed database system (HDDBS) is an attractive research area where query optimization plays an important role. A heterogeneous distributed database is an integration of distribution of data with database schema. Optimization means retrieving the proper result with the best query processing strategy. In this paper, a genetic algorithm (GA)-based query optimizer is used to optimize distributed queries, and we concentrate on the join-site mechanism for finding different execution plans. The main aim of this work is to choose the right set of plans for queries, minimizing the total execution time. Mobile agents are used to perform specific tasks by migrating to and executing on several hosts connected in the network. By implementing this, we obtain the optimized plan for a particular query, so that when a similar query arrives again for execution, we can directly execute it with the optimized plan.
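The reuse of an optimized plan for a later, similar query can be sketched as a small plan cache. The abstract does not say how similarity is decided, so this sketch assumes that two queries are "similar" when they differ only in literal constants, letter case, and whitespace. The class `PlanCache` and the `optimizer` callable are hypothetical names introduced for illustration.

```python
import re

class PlanCache:
    """Caches plans keyed by a normalized form of the query text."""

    def __init__(self, optimizer):
        self._optimizer = optimizer  # hypothetical callable: query -> plan
        self._plans = {}
        self.hits = 0

    @staticmethod
    def _key(query):
        # Replace string and numeric literals with a placeholder so that
        # queries differing only in constants map to the same key.
        q = re.sub(r"'[^']*'", "?", query)
        q = re.sub(r"\b\d+(\.\d+)?\b", "?", q)
        # Normalize case and collapse whitespace.
        return " ".join(q.lower().split())

    def plan_for(self, query):
        key = self._key(query)
        if key in self._plans:
            self.hits += 1
        else:
            # Cache miss: run the (expensive) optimizer once and remember it.
            self._plans[key] = self._optimizer(query)
        return self._plans[key]
```

A real system would also need to invalidate cached plans when statistics or data placement change; that is left out of this sketch.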

Analysis of Query Optimization Components in Distributed Database

Indian Journal of Science and Technology, 2018

Objectives: This paper brings to light the different query optimization components and their optimizing functionalities, which help improve query response time and the efficiency of a distributed database. A cache-based optimization is also analyzed to highlight the query optimization process. Methods: Because data is the most valuable asset of any organization, organizations want to access and use it efficiently and in a timely manner. To evaluate the efficiency of query optimization, its components (search space, search strategy and cost model) are evaluated with the help of examples, tables and diagrams. By comparing the different results, a cache-based optimization technique is also evaluated. Findings: It is observed that the plans generated in the search space are equivalent in the sense that they produce the same results, but their operations, implementation and performance differ. Different search strategy algorithms are also examined for quicker and more accurate results, and it is noted that the movement of the search strategy depends greatly on join ordering and the cost model. It is also observed that the cost model helps select the best query execution plan, but it depends on parameters such as queue length, server distance, server capacity and load. The latest cache-based query optimization technique is also examined, and it is noted that it is key to improving query response time because its computational cost is very low. It is more helpful if it is placed at each site. Applications and Future Improvements: Currently, cache-based query optimization is applicable only to homogeneous distributed databases. In the future, this technique can also be implemented for heterogeneous databases.

Optimizing distributed join queries: A genetic algorithm approach

Annals of Operations Research, 1997

Optimizing join queries is a major problem in distributed database systems, particularly when files are replicated and copies stored at different nodes in the network. A distributed query optimization algorithm must select file copies and determine how and where those files will be processed. Process decisions include which files to reduce via semijoins, if any, the sites at which to perform join operations, and the order in which to perform those join operations. We extend the scope of distributed query optimization research by developing a model that, for the first time, includes all of these design decisions and considers both communication and local processing costs. We develop a genetic algorithm-based solution procedure for this model which quickly determines efficient query processing plans. We demonstrate that ignoring local processing costs or restricting join processing to the result site, as commonly done in prior research, can result in inefficient query execution plans.
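The claim that restricting join processing to the result site can be inefficient can be illustrated with a toy cost model that counts both communication and local processing. This is not the paper's model: semijoins and join ordering are left out, every number below is an assumption, and exhaustive enumeration stands in for the genetic search because the toy space has only twelve plans.

```python
from itertools import product

# All values are hypothetical (assumptions for illustration).
SIZE = {"R": 1000, "S": 100}                    # relation sizes, in units
COPIES = {"R": ["s1", "s2"], "S": ["s2", "s3"]}  # sites holding each copy
SITES = ["s1", "s2", "s3"]
RESULT_SITE = "s3"
RESULT_SIZE = 10
COMM_COST = 1.0                                  # per unit shipped
LOCAL_COST = {"s1": 0.5, "s2": 0.1, "s3": 2.0}   # per unit processed

def plan_cost(copy_r, copy_s, join_site, include_local=True):
    """Cost of joining R and S at join_site using the chosen copies."""
    cost = 0.0
    for rel, src in (("R", copy_r), ("S", copy_s)):
        if src != join_site:          # ship the copy to the join site
            cost += COMM_COST * SIZE[rel]
    if join_site != RESULT_SITE:      # ship the join result to the user
        cost += COMM_COST * RESULT_SIZE
    if include_local:                 # local processing at the join site
        cost += LOCAL_COST[join_site] * (SIZE["R"] + SIZE["S"])
    return cost

def best_plan(join_sites=SITES, include_local=True):
    """Cheapest (copy of R, copy of S, join site) over the allowed sites."""
    candidates = product(COPIES["R"], COPIES["S"], join_sites)
    return min(candidates, key=lambda p: plan_cost(*p, include_local=include_local))
```

Under these assumed numbers, `best_plan()` joins at `s2`, where both relations have copies and processing is cheap, while `best_plan(join_sites=[RESULT_SITE])` is forced to ship `R` to the slow result site and costs far more.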

Query Optimization in Distributed Database: A Review

Distributed databases are emerging as a boon for large organizations, as they provide better flexibility and ease of use compared to centralized databases. As data in distributed environments grows day by day, a better distributed management system is required to manage it. Query optimization is the process of finding a better query execution plan from multiple available options. Because a distributed database has multiple sites, each holding parts of the data, query optimization is one of its challenging tasks. This review paper studies the challenges of query optimization in distributed databases and its basic steps, and reviews some proposed systems.