Parallel query processing for OLAP in grids

OLAP query processing is critical for enterprise grids. Capitalizing on our experience with the ParGRES database cluster, we propose a middleware solution, GParGRES, which exploits database replication and inter- and intra-query parallelism to efficiently support OLAP queries in a grid. GParGRES is designed as a wrapper that enables the use of ParGRES in the PC clusters of a grid (in our case, Grid5000). Our approach has two levels of query splitting: grid-level splitting, implemented by GParGRES, and node-level splitting, implemented by ParGRES. GParGRES has been partially implemented as database grid services compatible with existing grid solutions such as the Open Grid Services Architecture (OGSA) and the Web Services Resource Framework (WSRF). We give preliminary experimental results obtained with two clusters of Grid5000 using queries of the TPC-H benchmark. In all tested configurations, the results show linear or almost linear speedup in query execution as more nodes are added.

... databases using Web services and provide transparent support for database queries. Ideally, a grid database solution must respect database autonomy (i.e. avoid database or application migration) while taking advantage of distributed and parallel computing. This can be achieved by developing a middleware layer between the user applications and the databases. Such a middleware should provide distributed and parallel query processing with non-intrusive techniques, treating each DBMS as a black-box component; hence, there is no need for database or application migration.
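As an illustration only, the two levels of query splitting mentioned in the abstract might be sketched as follows. The sketch assumes that a query is split by restricting a key attribute to contiguous ranges and that the grid-level split is even across clusters; the text above specifies neither. The names `KeyRange`, `split_range`, `two_level_split`, `subquery`, `cluster_a` and `cluster_b` are hypothetical and are not part of GParGRES or ParGRES. Only the table and column names come from the TPC-H schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyRange:
    """Half-open interval [low, high) over a partitioning key (assumed)."""
    low: int
    high: int


def split_range(key_range: KeyRange, parts: int) -> list[KeyRange]:
    """Split a key range into `parts` contiguous, non-overlapping subranges."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    size = key_range.high - key_range.low
    # Integer boundaries; the last boundary always equals key_range.high.
    bounds = [key_range.low + (size * i) // parts for i in range(parts + 1)]
    return [KeyRange(bounds[i], bounds[i + 1]) for i in range(parts)]


def two_level_split(
    key_range: KeyRange, nodes_per_cluster: dict[str, int]
) -> dict[str, list[KeyRange]]:
    """Grid-level split across clusters, then node-level split inside each."""
    # Level 1 (grid level): one subrange per cluster, split evenly (assumption).
    cluster_ranges = split_range(key_range, len(nodes_per_cluster))
    plan: dict[str, list[KeyRange]] = {}
    for (cluster, nodes), cluster_range in zip(
        nodes_per_cluster.items(), cluster_ranges
    ):
        # Level 2 (node level): one subrange per node of that cluster.
        plan[cluster] = split_range(cluster_range, nodes)
    return plan


def subquery(template: str, key_range: KeyRange) -> str:
    """Instantiate a SQL template with the bounds of one subrange."""
    return template.format(low=key_range.low, high=key_range.high)


# Hypothetical aggregate over the TPC-H lineitem table.
TEMPLATE = (
    "SELECT SUM(l_extendedprice) FROM lineitem "
    "WHERE l_orderkey >= {low} AND l_orderkey < {high}"
)

plan = two_level_split(KeyRange(0, 1000), {"cluster_a": 2, "cluster_b": 3})
subqueries = {
    cluster: [subquery(TEMPLATE, r) for r in ranges]
    for cluster, ranges in plan.items()
}
```

Under these assumptions, each node evaluates one subquery against its local replica, and the partial sums are added to obtain the final result. A real deployment would presumably weight the grid-level split by cluster capacity rather than splitting evenly.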