Query Optimization Strategies in Distributed Databases

Optimization of Distributed Database Queries Using Hybrids of Ant Colony Optimization Algorithm

With the advancement of computer networks and the growth in database size, the decentralization of databases has led to distributed databases spread over multiple machines, where the distribution is transparent to users. The query optimization problem in large-scale distributed databases is NP-hard and difficult to solve. Research is ongoing to find an algorithm that yields an optimal solution, especially as the size of the database increases [4]. The Ant Colony Optimization (ACO) algorithm meets this requirement because of its positive feedback, distributed computation, and ability to be combined with heuristics. However, when ACO is applied to distributed database queries, the initial information that ACO needs to generate an optimal result set is not systematic or organized, which leads to slow convergence at the beginning of the search for an optimal solution. In this paper, hybrids of A...
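The ACO ingredients named in the abstract above (positive feedback through pheromone, combined with a heuristic) can be illustrated with a minimal sketch. This is not the hybrid algorithm of the paper, whose details are truncated here. The relation cardinalities, the uniform selectivity, the toy cost model, the `1 / cardinality` heuristic and all parameter values are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy inputs: relation cardinalities and one uniform join selectivity.
CARD = {"A": 1000, "B": 50, "C": 200, "D": 10}
SELECTIVITY = 0.01


def plan_cost(order):
    """Toy cost model: sum of intermediate result sizes of a left-deep join order."""
    size = CARD[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size = size * CARD[rel] * SELECTIVITY
        cost += size
    return cost


def aco_join_order(n_ants=10, n_iters=30, alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Return (best_order, best_cost) found by a basic ant colony search."""
    rng = random.Random(seed)
    rels = list(CARD)
    # Pheromone on "place relation r right after prev"; None marks the start.
    # The uniform initial values are the unorganized initial information
    # that the abstract identifies as the cause of slow early convergence.
    tau = {(prev, r): 1.0 for prev in rels + [None] for r in rels}
    best_order, best_cost = None, float("inf")

    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            order, prev, remaining = [], None, rels[:]
            while remaining:
                # Choice weight = pheromone^alpha * heuristic^beta,
                # with the assumed heuristic 1 / cardinality.
                weights = [
                    tau[(prev, r)] ** alpha * (1.0 / CARD[r]) ** beta
                    for r in remaining
                ]
                nxt = rng.choices(remaining, weights=weights, k=1)[0]
                order.append(nxt)
                remaining.remove(nxt)
                prev = nxt
            cost = plan_cost(order)
            tours.append((order, cost))
            if cost < best_cost:
                best_order, best_cost = order, cost

        # Evaporation weakens old trails.
        for key in tau:
            tau[key] *= 1.0 - rho
        # Positive feedback: cheaper plans deposit more pheromone.
        for order, cost in tours:
            prev = None
            for r in order:
                tau[(prev, r)] += 1.0 / (1.0 + cost)
                prev = r

    return best_order, best_cost


if __name__ == "__main__":
    print(aco_join_order())
```

A hybrid approach would typically replace the uniform initial pheromone with values seeded by another method; how the paper does this is not stated in the visible part of the abstract.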

Distributed Query Optimization Using Hybrid Ant Colony Algorithm

Distributed databases are emerging as a boon for large organizations because they provide better flexibility and ease than centralized databases. As data in distributed environments grows day by day, a better distributed management system is required to manage it. Query optimization is the process of finding a better query execution plan among the multiple available options. Because a distributed database has multiple sites, each holding part of the data, and the size of the data is not static, a dynamic solution is needed to optimize queries. A combination of the Ant Colony Algorithm and the Genetic Algorithm can provide such a dynamic approach.

A multi-colony ant algorithm for optimizing join queries in distributed database systems

Knowledge and Information Systems, 2013

Distributed database systems provide a new data processing and storage technology for today's decentralized organizations. Query optimization, the process of generating an optimal execution plan for a posed query, is more challenging in such systems because distribution creates a huge search space of alternative plans. As finding an optimal execution plan is computationally intractable, stochastic algorithms have drawn the attention of most researchers. In this paper, for the first time, a multi-colony ant algorithm is proposed for optimizing join queries in a distributed environment where relations can be replicated but not fragmented. In the proposed algorithm, four types of ants collaborate to create an execution plan. Hence, there are four ant colonies in each iteration. Each type of ant makes an important decision in finding the optimal plan. To evaluate the quality of the generated plan, two cost models are used: one based on the total time and the other on the response time. The proposed algorithm is compared with two previous genetic-based algorithms on chain, tree and cyclic queries. The experimental results show that the proposed algorithm saves up to about 80% of optimization time with no significant difference in the quality of generated plans compared with the best existing genetic-based algorithm.
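The difference between the two cost models mentioned above can be shown with a small sketch. It assumes the usual textbook reading: total time sums the cost of every operation at every site, while response time measures elapsed time when independent inputs run in parallel. The `Op` class, the operator names and the numeric costs are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """One operator in a plan tree, with an assumed local cost in time units."""
    name: str
    cost: float
    inputs: List["Op"] = field(default_factory=list)


def total_time(op: Op) -> float:
    """Sum of all operator costs, regardless of where or when they run."""
    return op.cost + sum(total_time(child) for child in op.inputs)


def response_time(op: Op) -> float:
    """Elapsed time assuming all inputs of an operator run in parallel."""
    slowest_input = max((response_time(c) for c in op.inputs), default=0.0)
    return op.cost + slowest_input


# Two scans at different sites feed one join.
scan_r = Op("scan R at site 1", cost=4.0)
scan_s = Op("scan S at site 2", cost=6.0)
join = Op("join at site 3", cost=3.0, inputs=[scan_r, scan_s])

print(total_time(join))     # 13.0
print(response_time(join))  # 9.0
```

A plan that spreads work over more sites can have a higher total time and still a lower response time, which is why an optimizer may rank the same plans differently under the two models.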

Dynamic Programming with Ant Colony Optimization Metaheuristic for Optimization of Distributed Database Queries

In this paper, we introduce and evaluate a new query optimization algorithm for distributed database queries based on Dynamic Programming (DP) and the Ant Colony Optimization (ACO) metaheuristic. The DP algorithm is widely used for relational query optimization; however, its memory and time requirements are very large for query optimization in a distributed database environment, which is an NP-hard combinatorial problem. Our aim is to combine the power of DP with heuristic approaches to obtain a polynomial-time approximation algorithm based on a constructive method. Together, DP and ACO provide execution plans that are very close to the best-performing solutions, and they do so in polynomial time. This makes our algorithm viable for large multi-way join queries.
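To see why plain DP becomes expensive, consider a minimal sketch of classical dynamic programming over subsets of relations for left-deep join ordering. It is not the DP-ACO algorithm of the paper. The cardinalities, the uniform selectivity and the cost model (sum of intermediate result sizes) are assumptions made for illustration.

```python
from itertools import combinations
from math import prod

# Hypothetical toy inputs: relation cardinalities and one uniform join selectivity.
CARD = {"A": 1000, "B": 50, "C": 200, "D": 10}
SELECTIVITY = 0.01


def result_size(rels):
    """Estimated size of joining all relations in rels (toy estimate)."""
    return prod(CARD[r] for r in rels) * SELECTIVITY ** (len(rels) - 1)


def dp_join_order():
    """Return (cost, order) of the cheapest left-deep plan under the toy model."""
    rels = sorted(CARD)
    # best[S] = (cost, order) of the cheapest left-deep plan joining exactly S.
    best = {frozenset([r]): (0.0, [r]) for r in rels}
    for k in range(2, len(rels) + 1):
        for subset in combinations(rels, k):
            s = frozenset(subset)
            size = result_size(subset)
            candidates = []
            for last in subset:
                # Extend the best plan for S minus {last} by joining `last`.
                sub_cost, sub_order = best[s - {last}]
                candidates.append((sub_cost + size, sub_order + [last]))
            best[s] = min(candidates, key=lambda c: c[0])
    return best[frozenset(rels)]


if __name__ == "__main__":
    print(dp_join_order())
```

The table `best` holds one entry per non-empty subset of relations, so it grows as 2^n with the number of relations n. That growth is the memory and time burden the abstract refers to, and it is the part a heuristic such as ACO would be used to prune.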

Query Optimization in Distributed Database: A Review

Distributed databases are emerging as a boon for large organizations because they provide better flexibility and ease than centralized databases. As data in distributed environments grows day by day, a better distributed management system is required to manage it. Query optimization is the process of finding a better query execution plan among the multiple available options. Because a distributed database has multiple sites, each holding part of the data, query optimization is one of its most challenging tasks. This review paper studies the challenges of query optimization in distributed databases and its basic steps, and reviews some proposed systems.

Novel Distributed Query Optimization Model and Hybrid Query Optimization Algorithm

International Journal of Computer Applications, 2013

Query optimization is the most critical phase in query processing. In distributed databases, distribution must be considered explicitly in many aspects of the optimization process; this not only increases the cost of optimization but also significantly changes the trade-offs involved. This paper synthesizes the evolution of query optimization methods from uniprocessor relational database systems to parallel database systems. We point out a set of parameters to characterize and compare query optimization methods, mainly: (i) type of algorithm (static or dynamic), (ii) working environment (re-optimization or rescheduling) and (iii) level of modification. The major contributions of this paper are: (i) understanding the mechanisms of query optimization methods with respect to the considered environments and their constraints (e.g. parallelism, distribution, heterogeneity, large scale, dynamicity of nodes); (ii) studying the problem of query optimization, particularly in heterogeneous environments, and pointing out the main characteristics of the methods, which allows them to be compared and helps in implementing a new query optimization algorithm and model. These contributions lead to performance enhancement of query optimization in distributed database systems by classifying different QEPs and minimizing the response time.

An Analysis on Query Optimization in Distributed Database

2014

The query optimizer is a significant element of today's relational database management systems. It is responsible for translating a user-submitted query, commonly written in a non-procedural language, into an efficient query evaluation program that can be executed against the database. This research paper describes the architectural steps of query processing, as well as optimization time and memory usage. The key goal of this paper is to understand the basic query optimization process and its architecture.

Presenting New Method to Optimize Query in Distributed Database System

Query optimization is one of the essential problems in centralized and distributed databases. In a distributed DMS (Database Management System), data is allocated to different sites before a query is issued in order to decrease subsequent communication costs; producing an optimized design in this way is an ‘NP’ problem. This article examines both the methods for allocating data and producing an optimized design in a distributed system and the query space for query optimization in a distributed environment, and shows the need for an optimization method with respect to different aspects of the optimization process. We introduce a new method for optimization in a distributed database environment, which indicates that our simple optimization design executes relatively well until the database design is physical

A REVIEW OF QUERY OPTIMIZATION IN DISTRIBUTED OBJECT ORIENTED RELATIONAL DATABASE MANAGEMENT SYSTEM

Executing Structured Query Language (SQL) queries in an optimized way in a distributed database is a problem that most database programmers have faced since the inception of database technology. Query optimization over a network is one of the hardest problems in the database area. The commercialization and success of database systems is primarily due to the development of sophisticated query optimization techniques. Database users pose their queries declaratively by means of SQL or Object Query Language (OQL), and the query optimizer of the database system finds the best plan to execute them. The optimizer determines the best indices to use and the order in which the operations of a query should be executed. To achieve this, the optimizer evaluates alternative plans, estimates the cost of each plan by means of a cost model, and then selects the plan with the lowest cost. There has been much research in this field. In this paper, we review the difficulty of distributed query optimization and emphasize the components of the query optimizer required in a distributed environment, i.e. cost model, search space and search strategy. A review of existing work in this field is presented, and future work is highlighted based on recent work that utilizes mobile agent technologies.
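The three optimizer components named above (search space, cost model, search strategy) can be sketched for the simplest distributed decision: at which site to execute a join of two relations stored at different sites. Everything in the sketch is an assumption for illustration: the `Relation` class, the site names, the cost constants, and the simplification that only communication cost matters.

```python
from dataclasses import dataclass

MSG_SETUP = 1.0   # hypothetical fixed cost per transfer
PER_BYTE = 0.001  # hypothetical cost per transferred byte


@dataclass(frozen=True)
class Relation:
    name: str
    site: str
    size_bytes: int


def transfer_cost(rel: Relation, target_site: str) -> float:
    """Cost model piece: shipping a relation is free if it is already there."""
    if rel.site == target_site:
        return 0.0
    return MSG_SETUP + PER_BYTE * rel.size_bytes


def search_space(sites):
    """Search space: one candidate plan per possible join site."""
    return list(sites)


def plan_cost(join_site: str, r: Relation, s: Relation) -> float:
    """Cost model: communication cost of bringing both inputs to join_site."""
    return transfer_cost(r, join_site) + transfer_cost(s, join_site)


def choose_plan(r: Relation, s: Relation, sites) -> str:
    """Search strategy: exhaustive enumeration, keep the cheapest plan."""
    return min(search_space(sites), key=lambda site: plan_cost(site, r, s))


r = Relation("R", site="site1", size_bytes=10_000)
s = Relation("S", site="site2", size_bytes=2_000)
print(choose_plan(r, s, ["site1", "site2", "site3"]))  # site1
```

Under these assumptions the smaller relation is shipped to the site of the larger one. Exhaustive enumeration is feasible here only because the search space has three plans; with many joins and sites the search strategy has to become selective.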

Analysis of Query Optimization Components in Distributed Database

Indian Journal of Science and Technology, 2018

Objectives: This paper highlights the different query optimization components and the optimizing functions that help improve query response time and the efficiency of a distributed database. A cache-based optimization is also analyzed to illustrate the query optimization process. Methods: Data is the most valuable asset of any organization, so organizations want to access and use it efficiently and in a timely manner. To evaluate the efficiency of query optimization, its components (search space, search strategy and cost model) are evaluated with the help of examples, tables and diagrams. By comparing the different results, a cache-based optimization technique is also evaluated. Findings: The plans generated in the search space are equivalent in the sense that they produce the same results, but their operations, implementation and performance differ. Different search strategy algorithms are also examined for quicker and more accurate results; the behavior of the search strategy depends greatly on join ordering and the cost model. The cost model helps select the best query execution plan, but it depends on parameters such as queue length, server distance, server capacity and load. The latest cache-based query optimization technique is also examined and found to be key to improving query response time, as its computational cost is very low. It is more helpful if it is placed at each site. Applications and Future Improvements: Currently, cache-based query optimization is applicable only to homogeneous distributed databases. In the future, this technique could also be implemented for heterogeneous databases.