Comparing A-Star Heuristics with Integer Linear Programming for the Multiple Query Optimization Problem

Integer Linear Programming Approach for the Multiple Query Optimization Problem

Multiple Query Optimization (MQO) is a technique for processing a batch of queries so that tasks shared among the queries are executed only once, which yields significant savings in total evaluation cost. The first phase of MQO produces alternative query execution plans so that the shared tasks between queries are identified and maximized. The second phase is an optimization problem whose goal is to select exactly one of the alternative plans for each query so as to minimize the total execution cost of all queries. A-star, branch-and-bound, dynamic programming (DP), and genetic algorithm (GA) solutions for MQO have been given in the literature. However, the performance of the optimal algorithms, A-star and DP, is not sufficient for solving large MQO problems involving a large number of queries. In this study, we propose an Integer Linear Programming (ILP) formulation to solve the MQO problem exactly for a large number of queries and evaluate its performance. Our results show that ILP outperforms the existing A-star algorithm.
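The second-phase selection problem described above can be written as a small 0-1 program. The following is a sketch of one common way to model it, not necessarily the formulation used in the paper. All symbols are assumptions introduced here for illustration: Q is the set of queries, P_i the set of alternative plans of query i, T the set of all tasks, T_{ij} the tasks used by plan j of query i, c_t the cost of task t, x_{ij} = 1 if plan j is chosen for query i, and y_t = 1 if task t is executed.

```latex
\begin{align*}
\min \quad & \sum_{t \in T} c_t \, y_t \\
\text{s.t.} \quad & \sum_{j \in P_i} x_{ij} = 1 && \forall i \in Q \\
& y_t \ge x_{ij} && \forall i \in Q,\ \forall j \in P_i,\ \forall t \in T_{ij} \\
& x_{ij} \in \{0,1\}, \quad y_t \in \{0,1\}
\end{align*}
```

The first constraint selects exactly one plan per query. The second forces a task to be paid for whenever any chosen plan uses it, and because each y_t appears once in the objective, a task shared by several chosen plans is counted only once.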

Information Sciences and Systems 2014


Genetic algorithm for the multiple-query optimization problem

Systems, Man, and …, 2007

Efficiently producing answers to a set of queries with common tasks is known as the multiple-query optimization (MQO) problem. Each query can have several alternative evaluation plans, each with a different set of tasks. Therefore, the goal of MQO is to choose the set of plans that minimizes the total execution time by performing common tasks only once. Since MQO is an NP-hard problem, several solutions, mostly heuristic-based, have been proposed for solving it. To the best of our knowledge, this correspondence is the first attempt to solve MQO using an evolutionary technique, genetic algorithms.
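To make the objective concrete, the sketch below evaluates a plan selection by charging each distinct task once and then finds the cheapest selection by exhaustive enumeration. This is a brute-force illustration of the problem definition only, not the genetic algorithm from the paper. The names `total_cost` and `best_selection`, and the toy tasks, plans, and costs, are hypothetical.

```python
from itertools import product


def total_cost(selection, plans, task_cost):
    """Cost of a selection {query: plan}; each distinct task is paid once."""
    tasks = set()
    for query, plan in selection.items():
        tasks |= plans[query][plan]
    return sum(task_cost[t] for t in tasks)


def best_selection(plans, task_cost):
    """Enumerate every combination of one plan per query (exponential)."""
    queries = sorted(plans)
    best = None
    for choice in product(*(sorted(plans[q]) for q in queries)):
        selection = dict(zip(queries, choice))
        cost = total_cost(selection, plans, task_cost)
        if best is None or cost < best[0]:
            best = (cost, selection)
    return best


# Hypothetical toy instance: two queries, two alternative plans each.
task_cost = {"t1": 10, "t2": 4, "t3": 5, "t4": 5, "t5": 14}
plans = {
    "Q1": {"P11": {"t1"}, "P12": {"t2", "t3"}},
    "Q2": {"P21": {"t1", "t4"}, "P22": {"t5"}},
}

cost, selection = best_selection(plans, task_cost)
print(cost, selection)
```

In this toy instance the plans that are cheapest in isolation (P12 and P22) share no task, whereas P11 and P21 share t1, so the globally best selection differs from the per-query best. The enumeration grows exponentially with the number of queries, which is consistent with the abstract's motivation for heuristic methods.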

Using Heuristics and Genetic Algorithms for Large-scale Database Query Optimization

2007

Distributed database system technology is one of the major developments in the information technology area. It will continue to have a very significant impact on data processing in the upcoming years because distributed database systems have many potential advantages over centralized systems for geographically distributed organizations. The continuing interest in distributed database systems in the research community and the marketplace, and the introduction of many commercial products, indicate that distributed database systems will play a more important role in data processing and will eventually replace centralized systems as the major database technology. The availability of high-speed communication networks and, especially, the phenomenal popularity of the Internet and intranets will undoubtedly speed up the transition. Some challenging problems must be solved before the full potential benefits of distributed database technology can be realized. Among them is query processing (including query optimization), one of the most important issues in distributed database system design. The query optimization problem in large-scale distributed databases is NP-hard and difficult to solve. In this study, the query optimization problem is reduced to a join ordering problem similar to a variant of the traveling salesman problem. We explored several heuristics and a genetic algorithm for solving the join ordering problem. Computational experiments on these algorithms were conducted and their solution qualities compared. The experiments show that heuristics and genetic algorithms are viable methods for solving the query optimization problem in large-scale distributed database systems.
… issues related to the problem, to model the problem, taking into consideration the most important factors, to propose some solution methods for these models, and, finally, to conduct computational experiments and compare the results to determine the effectiveness and efficiency of the solution techniques (algorithms). We believe that the development of comprehensive models for query optimization in large-scale systems, as well as finding effective and/or efficient solution techniques for the problems that have been identified, is important and will contribute to the use of and research on distributed database technology.

On Multi Query Optimization Algorithms Problem

Without multi query optimization, Relational Database Management Systems for online and analytical decision support would be inefficient and hence impractical. It is an expensive process because it relies to a great extent on evaluating the different plans (access paths) and choosing an optimal one among them. In multi query optimization, queries are executed in batches, and many different algorithms ensure that, when some queries have a common sub-expression, that sub-expression is executed once and its output is shared. We studied the basic multi query optimization algorithms, including Basic Volcano, Volcano-SH and Volcano RU, identified their strengths and weaknesses, and recommend strategies for developing a new, improved multi query optimization algorithm that reduces the weaknesses and integrates the strengths of the different basic algorithms into one efficient algorithm.

Multi Query Optimization Algorithm Using Semantic and Heuristic Approaches

Multi Query Optimization is one of the most important tasks in a Relational Database Management System (RDBMS), and it has become common due to the heavy use of online decision support management systems in every industry. In multi query optimization, queries are optimized and executed in batches. Many algorithms are used to detect common sub-expressions among multiple queries and unify them, so that the more encompassing sub-expression is executed and the other sub-expressions are derived from it. In this work, a multi query optimization algorithm using heuristic and semantic approaches was proposed and implemented on SQL Server version 10.0.1600, and three queries were used in an experiment comparing the proposed algorithm with the most recent basic multi query optimization algorithm (Volcano RU). The experiment showed that the proposed algorithm gave better plans than Volcano RU across all three queries, in terms of both execution time and CPU time.

Query Scheduling in Multi Query Optimization

2001

Complex queries are becoming commonplace with the growing use of decision support systems. Decision support queries often have many common sub-expressions within each query, and queries are often run as a batch. Multi query optimization aims to exploit common sub-expressions to reduce the evaluation cost of queries, by computing them once and then caching them for future use, both within individual queries and across queries in a batch. When cache space is limited, the total size of the sub-expressions that are worth caching may exceed the available cache space. Prior work in multi query optimization involves choosing a set of common sub-expressions that fit in the available cache space and, once they are computed, retaining their results across the execution of all queries in a batch. Such optimization algorithms do not consider the possibility of dynamically changing the cache contents. This may lead to sub-expressions occupying cache space even if they are not used by subsequent queries. The available cache space can be best utilized by evaluating the queries in an appropriate order and changing the cache contents as queries are executed. We present several algorithms that consider these factors in order to reduce the cost of query evaluation.

A framework for multi-query optimization

1996

In some key database applications, a sequence of interdependent queries may be posed simultaneously to the DBMS. The optimization of such sequences is called multi-query optimization, and it attempts to exploit these dependencies in the derivation of a query evaluation plan (qep).

Efficient Algorithm for Multi Query Optimization

Multi Query Optimization is an important process in database systems, and it has become commonplace due to the frequent use of decision support systems in almost all multinational enterprises. Multiple queries from different users addressed to one schema often have many common sub-expressions, and it is the function of multi query optimization algorithms such as Basic Volcano, Volcano RU and Volcano SH to optimize such queries together, execute each common operation once, and share the output among the queries. In this work, a multi query shareability algorithm that can efficiently detect the common sub-expressions among multiple queries and share the output among those queries was proposed, along with an algorithm for the optimal ordering of those queries. The proposed algorithm has a time complexity of O(n^2 + 9n + 6), while the most recent basic algorithm, Volcano RU, has O(2n^2 + 20n + 12); both are O(n^2), that is, quadratic. However, the proposed algorithm is more efficient than Volcano RU even as n approaches infinity.
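The two operation counts quoted in the abstract can be compared numerically. The sketch below treats the expressions inside the O(·) as exact step counts, which is an assumption made here for illustration; the function names are hypothetical. Since 2n^2 + 20n + 12 = 2(n^2 + 9n + 6) + 2n, the ratio of the two counts approaches 2 from above as n grows, so the difference is a constant factor within the same quadratic class.

```python
def proposed_ops(n: int) -> int:
    """Step count quoted for the proposed algorithm: n^2 + 9n + 6."""
    return n * n + 9 * n + 6


def volcano_ru_ops(n: int) -> int:
    """Step count quoted for Volcano RU: 2n^2 + 20n + 12."""
    return 2 * n * n + 20 * n + 12


def ratio(n: int) -> float:
    """How many times more steps Volcano RU takes at input size n."""
    return volcano_ru_ops(n) / proposed_ops(n)


# Tabulate both counts and their ratio for a few input sizes.
for n in (1, 10, 100, 10_000):
    print(n, proposed_ops(n), volcano_ru_ops(n), round(ratio(n), 4))
```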