Multidatabase global query optimization

On global query optimization in multidatabase systems

[1992 Proceedings] Second International Workshop on Research Issues on Data Engineering: Transaction and Query Processing, 1992

Multidatabase management systems (MDBMS) enable data sharing among heterogeneous local databases (component databases) and thus provide the interoperability required by diverse applications. In multidatabase systems, users request data from the multidatabase by posing non-procedural queries. For a query involving more than one database, global optimization should be performed to achieve good overall system performance. Among the research topics in multidatabase systems, little work has been reported on global query optimization. One reason is that the significant differences between the optimization problems for multidatabase systems and for distributed homogeneous systems are not well recognized.

A New Framework For Query Optimization In Multidatabase System Environment

2005

Due to the changing business environment, the need arises to integrate pre-existing database systems in a federated way. Great effort has been made to enhance the performance of the multidatabase management system (MDBMS). The most critical issue in this area is query optimization, which can be considered the backbone of any successful database. A review of recent related work on query optimization shows that a major challenge for global query optimization in a multidatabase system is that some required local information about local database components (DBCs), such as the local cost model, may not be available at the global level due to local autonomy. The main objective of this paper is to introduce a framework that manages communication between the MDBMS and the DBCs, which will be the basis for achieving its main goal of solving the query optimization problem.

Source-Aware multidatabase query processing

1997

We introduce a multidatabase model to represent the information that derives from different local databases. This model, known as the Tuple-Source (TS) relational model, accommodates tuples from different local databases by attaching their source information to them in the global relations, which are also known as TS-relations. In other words, a source attribute is implicit in every TS-relation.
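The idea of attaching source information to every tuple can be illustrated with a minimal sketch. The dictionary-based row representation, the function name `to_ts_relation`, and the attribute name `source` are assumptions made for illustration only; the abstract does not specify how the TS model is implemented.

```python
def to_ts_relation(local_relations):
    """Union local relations, tagging each tuple with its source database.

    `local_relations` maps a (hypothetical) database name to a list of rows,
    where each row is a dict of attribute name to value.
    """
    ts_relation = []
    for source, rows in local_relations.items():
        for row in rows:
            # The source attribute is added to every tuple, so tuples with
            # identical values from different databases remain distinguishable.
            ts_relation.append({**row, "source": source})
    return ts_relation


# Hypothetical local databases holding overlapping employee data.
local = {
    "db1": [{"name": "Ann", "dept": "R&D"}],
    "db2": [{"name": "Ann", "dept": "Sales"}, {"name": "Bob", "dept": "R&D"}],
}
ts = to_ts_relation(local)
```

Because the source travels with each tuple, a global query could later filter or group by it, for example to see which database reported a conflicting value.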

Solving Local Cost Estimation Problem for Global Query Optimization in Multidatabase Systems

1998

To meet users' growing needs for accessing pre-existing heterogeneous databases, a multidatabase system (MDBS) integrating multiple databases has attracted many researchers recently. A key feature of an MDBS is local autonomy. For a query retrieving data from multiple databases, global query optimization should be performed to achieve good system performance. There are a number of new challenges for global query optimization in an MDBS. Among them, a major one is that some local optimization information, such as local cost parameters, may not be available at the global level because of local autonomy. It creates difficulties for finding a good decomposition of a global query during query optimization. To tackle this challenge, a new query sampling method is proposed in this paper. The idea is to group component queries into homogeneous classes, draw a sample of queries from each class, and use observed costs of sample queries to derive a cost formula for each class by multiple regression. The derived formulas can be used to estimate the cost of a query during query optimization. The relevant issues, such as query classification rules, sampling procedures, and cost model development and validation, are explored in this paper. To verify the feasibility of the method, experiments were conducted on three commercial database management systems supported in an MDBS. Experimental results demonstrate that the proposed method is quite promising in estimating local cost parameters in an MDBS.
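The query sampling idea described above (group queries into classes, observe the cost of sample queries, fit a cost formula per class by multiple regression) can be sketched as follows. This is only an illustration under stated assumptions: the class names, the two explanatory variables (operand and result sizes in thousands of tuples), the observed costs, and all function names are hypothetical, and the fit uses ordinary least squares via the normal equations rather than whatever procedure the paper actually employs.

```python
from collections import defaultdict


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_cost_formula(samples):
    """Least-squares fit of cost = b0 + b1*x1 + ... from (features, cost) pairs."""
    rows = [[1.0] + list(features) for features, _ in samples]
    costs = [cost for _, cost in samples]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * c for r, c in zip(rows, costs)) for i in range(k)]
    return solve_linear(xtx, xty)


def estimate_cost(coeffs, features):
    """Evaluate a fitted cost formula for a new query of the same class."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], features))


# Hypothetical sample observations:
# (query class, (operand size, result size) in thousands of tuples, seconds)
observed = [
    ("unary", (1.0, 0.1), 0.35), ("unary", (2.0, 0.05), 0.50),
    ("unary", (5.0, 0.5), 1.55), ("unary", (8.0, 0.2), 1.85),
    ("join", (1.0, 0.2), 1.1), ("join", (3.0, 0.1), 1.9),
    ("join", (4.0, 1.0), 4.2), ("join", (6.0, 0.5), 4.2),
]

# Group the samples by query class and derive one formula per class.
by_class = defaultdict(list)
for query_class, features, seconds in observed:
    by_class[query_class].append((features, seconds))
formulas = {cls: fit_cost_formula(samples) for cls, samples in by_class.items()}

unary_estimate = estimate_cost(formulas["unary"], (4.0, 0.3))
```

A global optimizer could call `estimate_cost` with the formula of the class a component query falls into, without needing access to the local system's internal cost parameters.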

An algebraic transformation framework for multidatabase queries

Distributed and Parallel Databases, 1995

The existence of semantic conflicts between component databases severely impacts query processing in a multidatabase system. In this paper, we describe two types of semantic conflicts that have to be dealt with in the integration of databases modeling information about related sets of real-world entities. These are the entity identification problem and the attribute value conflict problem. While the two-way outerjoin operation has been commonly used for resolving the entity identification problem between two component relations, outerjoins using regular equality comparisons between component relation keys are shown to produce counter-intuitive entity identification results. We remedy this by defining a new key-equality comparator, in place of the regular equality comparator, for outerjoins. For the attribute value conflict problem, we define a Generalized Attribute Derivation (GAD) operation which allows user-defined attribute derivation functions to be used to compute new attributes from the component relations' attributes. With two-way outerjoin and GAD added to the set of relational operations, the traditional algebraic transformation framework for relational queries is no longer adequate for multidatabase query processing and optimization. As a result, we introduce the constrained query tree as the multidatabase query representation. We show that some knowledge about query predicates and attribute derivation functions can be used to simplify queries. Such knowledge is modeled as an outerjoin graph attached to every outerjoin operation in the query tree. Based on this, we further extend the traditional algebraic transformation framework to include two-way outerjoins and GAD operations. Our framework demonstrates that properties of selection/join predicates and attribute derivation functions can be used to provide interesting transformation alternatives. This framework also serves as a formal ground for developing optimization strategies for multidatabase queries.
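As a rough illustration of how a two-way outerjoin followed by an attribute derivation step might look, consider the sketch below. It rests on several assumptions: rows are dicts, keys are unique within each relation, and the function names `full_outerjoin`, `gad`, and `merged_salary` are hypothetical. It also uses plain equality on the key, not the key-equality comparator proposed in the paper, which the abstract does not define.

```python
def full_outerjoin(r, s, key):
    """Two-way outerjoin of two lists of dicts on `key` using plain equality.

    Returns (left_row, right_row) pairs; a missing side is None.
    Assumes `key` values are unique within each relation.
    """
    s_by_key = {row[key]: row for row in s}
    matched = set()
    pairs = []
    for row in r:
        other = s_by_key.get(row[key])
        if other is not None:
            matched.add(row[key])
        pairs.append((row, other))
    for row in s:
        if row[key] not in matched:
            pairs.append((None, row))
    return pairs


def gad(pairs, key, derivations):
    """Compute new attributes from each pair with user-defined functions."""
    result = []
    for left, right in pairs:
        derived = {key: (left or right)[key]}
        for attr, fn in derivations.items():
            derived[attr] = fn(left, right)
        result.append(derived)
    return result


def merged_salary(left, right):
    """Example derivation: average the salaries that are present."""
    values = [row["salary"] for row in (left, right) if row is not None]
    return sum(values) / len(values)


# Hypothetical component relations with a conflicting value for id 2.
db1 = [{"id": 1, "salary": 100.0}, {"id": 2, "salary": 200.0}]
db2 = [{"id": 2, "salary": 220.0}, {"id": 3, "salary": 300.0}]

merged = gad(full_outerjoin(db1, db2, "id"), "id", {"salary": merged_salary})
```

The derivation function is passed in by the caller, so a different conflict resolution policy (for example, preferring one database over the other) only requires swapping that function.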

Reformulating query plans for multidatabase systems

… of the second international conference on …, 1993

A practical heterogeneous, distributed multidatabase system must answer queries efficiently. Conventional query optimization techniques are not adequate here because these techniques are dependent on the database structure, and rely on limited information which is not sufficient in complicated multidatabase queries. This paper presents an automated approach to reformulating query plans to improve the efficiency of multidatabase

Providing multidatabase access: an association approach

1993

One of the major tasks in the design of a multidatabase system (MDBS) is the definition and maintenance of the global schema. Traditionally, this is accomplished by requiring the local databases participating in the MDBS to provide "export schemas" that are merged into a global schema. Resolution of schema and data incompatibilities, and mapping between local and global schemas are, in general, very difficult tasks that must be performed at the multidatabase level. We believe that a solution to this formidable problem may lie in the shifting of responsibility for these tasks to the local level. We propose a model in which the MDBS administrator defines the global schema as a view that is to be maintained by each of the participating databases. The MDBS layer supports submission and processing of (global) queries expressed over a union of such views. Each participating database must provide a view of its database that conforms to the global specification and must promise to respond to queries formulated over this view. We discuss the architecture of such systems and the problems involved in the processing of global queries.

Data Base Management Systems Query Optimization Techniques for Distributed Database Systems

IJRASET, 2021

The main goal of this thesis is to present different models for single as well as multiple query processing in a distributed database system that result in lower query processing cost. One of the major issues in the design and implementation of Distributed Database Management Systems (DDBMS) is efficient query processing. The objective of distributed query optimization reduces to minimizing the amount of data to be transmitted among sites for processing a given query. The problem of query processing in DDBS (11) has been studied extensively in the literature. In most algorithms, the qualification of the query contains a sequence of operations. In such cases, when operations are executed from right to left, according to their order in the sequence, the result of one operation may be an operand of the next operation. Since the operations depend on one another, at any instant only one operation at one site is executed even though the environment is distributed. The systems at all other sites are then idle for this query. A new model, the Completely Reducible Relation Model (CRR Model), which permits parallelism and processes multiple operations simultaneously at all relevant sites, is presented. It is assumed that the operations are in the form of conjunctions, so every operation can be processed independently. In this model, at some instant, the relations at all relevant sites are completely reduced by the corresponding sets of all applicable operations (selections, semijoins and joins) simultaneously. Thus, every relation is scanned only once to process all applicable operations, reducing I/O cost.
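The claim that a relation needs to be scanned only once when the applicable operations form a conjunction can be illustrated with a small sketch. This is not the model from the abstract itself; it only shows single-pass reduction at one site, and the function name `reduce_relation`, the row format, and the sample data are assumptions. Parallel execution across sites is not modeled.

```python
def reduce_relation(rows, selections, semijoins):
    """Scan `rows` once, keeping tuples that satisfy every conjunct.

    `selections` is a list of predicates on a row.
    `semijoins` is a list of (attribute, allowed_values) pairs, where
    allowed_values is the set of join-attribute values received from
    another site.
    """
    kept = []
    for row in rows:  # single pass over the relation
        if all(pred(row) for pred in selections) and all(
            row[attr] in values for attr, values in semijoins
        ):
            kept.append(row)
    return kept


# Hypothetical relation stored at one site.
orders = [
    {"order_id": 1, "cust": 10, "amount": 50},
    {"order_id": 2, "cust": 11, "amount": 500},
    {"order_id": 3, "cust": 12, "amount": 700},
    {"order_id": 4, "cust": 10, "amount": 900},
]

# Conjunctive qualification: amount > 100 AND cust matches a customer
# whose id was shipped from another (hypothetical) site.
reduced = reduce_relation(
    orders,
    selections=[lambda row: row["amount"] > 100],
    semijoins=[("cust", {10, 11})],
)
reduced_ids = [row["order_id"] for row in reduced]
```

Because every conjunct is checked during the same pass, only the reduced relation would need to be shipped to another site for the final join, which is where the savings in transmitted data and I/O would come from.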