Adaptable cache service and application to grid caching

Distributed Semantic Caching in Grid Middleware

Lecture Notes in Computer Science, 2007

This paper proposes a flexible caching solution to improve query evaluation in grids. It reduces both data transfer and query computation by adopting a distributed semantic caching approach. Our proposal introduces multi-scale cache cooperation, including single-site cooperation between object caches and distributed, context-aware cooperation between several query caches. Different cache miss resolution protocols are introduced for query evaluation and evaluated experimentally in grid data management for bioinformatics applications.
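As an illustration of the general idea behind semantic caching (not the protocol of this paper, which the abstract does not detail), a query over a range predicate can be split into a part answerable from a cached query's result and a part that must still be fetched remotely. The sketch below assumes integer keys and closed intervals; the name `split_query` and the single-cached-range setting are hypothetical simplifications.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # closed interval [lo, hi] over integer keys (assumption)


def split_query(query: Range, cached: Range) -> Tuple[Optional[Range], List[Range]]:
    """Split `query` into a probe range covered by `cached` and the
    remainder ranges that are not covered and must be evaluated remotely."""
    q_lo, q_hi = query
    c_lo, c_hi = cached

    # Overlap between the query and the cached range.
    lo, hi = max(q_lo, c_lo), min(q_hi, c_hi)
    if lo > hi:
        # No overlap: the whole query is a cache miss.
        return None, [query]

    remainders: List[Range] = []
    if q_lo < lo:
        remainders.append((q_lo, lo - 1))  # part left of the cached range
    if q_hi > hi:
        remainders.append((hi + 1, q_hi))  # part right of the cached range
    return (lo, hi), remainders


probe, remainders = split_query((10, 50), (20, 40))
print(probe, remainders)
```

Only the remainder ranges need to travel to a remote site, which is where the savings in data transfer and query computation would come from in a setting like this.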

Intelligent cache management for data grid

2005

A data grid is a composition of different services for accessing and manipulating data in a distributed environment. One of its major problems is the high access time of remote queries. In this paper we present a scheme for a data cache management system for data-intensive applications on a data grid, which provides faster access to remote data by intelligently managing the copies in the local cache. In our architecture, we emphasize data cache management for a multi-DB heterogeneous database environment using basic Grid services. We give a detailed view of how the intelligent cache is manipulated for structured queries. We also cover, to some extent, the concept of intelligent query distribution in our homogeneous intra-organization data grid, with example scenarios.

Management of a cooperative cache in grids with grid cache services

… and Computation: Practice …, 2007

Distributed systems like grids support diverse models of distributed computation and need to operate on large data entities in a distributed way. A significant quantity of this data is used only for a limited period of time. Caching is recognized as one of the most effective techniques for managing temporary data, and collaborative caching is traditionally proposed to scale cache capabilities in distributed environments. Grids need to dynamically manage different models of computation with different data access patterns. In this paper, we propose a basic infrastructure for the management of collaborative caches that makes it possible to operate and control different cache mechanisms and cache schemes dynamically in a grid. Unlike traditional collaborative caching, where cooperation is often limited to data resolution, our infrastructure extends the collaborative cache capabilities to operate on and manage this distributed temporary data. Our proposal is composed of a reference cache model that defines four layers for the management of a collaborative cache; an information model that represents the main cache elements and their activity; and a set of operations to request specific tasks to monitor, operate, and coordinate a generic collaborative cache system. Implementation issues of a prototype in Globus Toolkit 4 are discussed.

Optimization of Distributed Queries in Grid Via Caching

Lecture Notes in Computer Science, 2005

Caching can greatly improve the performance of query processing in distributed databases. In this paper we show how this technique can be used in a grid architecture where data integration is implemented by means of updatable views. Views integrate data from heterogeneous sources and present it to users in an integrated form. The whole process of integration is transparent, i.e. users need not be aware that the data are not located in one place. In data grids, caching can be used at different levels of the architecture. We focus on caching at the middleware layer, where the cache is stored in the database of the integrating unit. The cached results can then be used when answering queries from grid users, so there is no need to reevaluate whole queries. In this way caching can greatly increase the performance of applications operating on a grid. In the paper we also present an example of how a query can be optimized by rewriting it to make use of cached results.

Data Cache with Distributed Cache: A Design Approach

International Journal of Computer Science and Engineering, 2017

Caching techniques have helped developers deliver applications with fast turnaround times that would otherwise have been much slower, under-performing software solutions less worthy of users' appreciation. Caching can be used at both the hardware and software levels with the same ultimate goal of achieving higher throughput, lower latency, or both. Limiting the subject to software-level caches, caching techniques can be further divided into two categories, namely web cache and data cache. While a web cache is often defined in the context of a browser, which is a client-side application, a data cache is defined in the context of the caching needs of a data-intensive application. In terms of a database management system, it means a cache provisioned at the database service itself, whereas in the context of the application it means a cache that spans the layers of the application (more precisely, the tiers of a multi-tier application) and is designed to cache already queried data. The requirement of frequent, high-volume data access in distributed applications drives the need for more capable infrastructure for building a caching framework. In this paper, we focus our discussion on the data cache requirements of a distributed application and the key design factors that distinguish a distributed cache as an elegant cache service provider plugin for such distributed applications. We also propose a simple design that could be used to implement the core of a custom distributed cache.

A Modular Approach to Distributed Caching

2020

It is essential today for organizations to have fast and stable access to information stored in different sources. The latest generation of in-memory databases demonstrates much better data processing performance than classical relational database management systems. In this paper, an approach to modular distributed caching is proposed. The approach is based on a modular layered architecture that extends the primary relational-based systems and makes it possible to increase the speed of query processing. Some tests based on a prototype are performed and discussed.

Towards a distributed scalable data service for the grid

Parallel Computing: Current & Future Issues of High-End Computing (Proc. of PARCO 2005, Malaga, Spain), Germany, 2006, pp. 73-80

ADHOC (Adaptive Distributed Herd of Object Caches) is a Grid-enabled, fast, scalable object repository providing programmers with a general storage module. We present three different software tools based on ADHOC: a parallel cache for Apache, a DSM, and a main memory parallel file system. We also show that these tools exhibit considerable performance and speedup, both in absolute figures and w.r.t. other software tools exploiting the same features.

DICE: An Effective Query Result Cache for Distributed Storage Systems

2010

Due to the proliferation of the Internet and intranets, distributed storage systems have received a lot of attention. These systems span a large number of machines and store huge amounts of data for many users. In distributed storage systems, a row can be accessed directly using a row key. We concentrate on the problem of efficiently processing queries whose predicate is on a column rather than on the row key. In this paper, we present a cache management technique called DICE, which maintains the results of range queries to support subsequent range queries. To accelerate searches over the cached query results, we use modified Interval Skip Lists. In addition, we devise a novel cache replacement policy, since DICE maintains intervals rather than individual data items. Because our cache replacement policy considers the properties of intervals, our proposed technique is more efficient than traditional buffer replacement algorithms. Our experimental results demonstrate the efficiency of our proposed technique.
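To make the idea of reusing cached range-query results concrete, here is a minimal sketch under stated assumptions; it is not the DICE implementation. A linear scan over cached intervals stands in for the indexed interval structure the abstract mentions, plain LRU eviction stands in for the paper's interval-aware replacement policy (which the abstract does not specify), and the class name `RangeResultCache` and the `(column value, row key)` row layout are hypothetical.

```python
from collections import OrderedDict
from typing import List, Optional, Tuple

Row = Tuple[int, str]        # (column value, row key): an assumed layout
Interval = Tuple[int, int]   # closed interval [lo, hi] on the column value


class RangeResultCache:
    """Caches results of range queries, keyed by the queried interval."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Insertion order doubles as recency order (oldest first).
        self._entries: "OrderedDict[Interval, List[Row]]" = OrderedDict()

    def put(self, lo: int, hi: int, rows: List[Row]) -> None:
        """Store the result of the range query [lo, hi]."""
        self._entries[(lo, hi)] = sorted(rows)
        self._entries.move_to_end((lo, hi))
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def get(self, lo: int, hi: int) -> Optional[List[Row]]:
        """Answer [lo, hi] from any cached interval that fully contains it."""
        hit = next(
            (k for k in self._entries if k[0] <= lo and hi <= k[1]), None
        )
        if hit is None:
            return None  # miss: the caller must query the storage system
        self._entries.move_to_end(hit)  # mark as recently used
        return [row for row in self._entries[hit] if lo <= row[0] <= hi]


cache = RangeResultCache(capacity=2)
cache.put(10, 20, [(12, "a"), (18, "b"), (15, "c")])
print(cache.get(14, 19))  # served from the cached interval [10, 20]
print(cache.get(5, 12))   # not contained in any cached interval
```

A query is answered only when a single cached interval fully contains it; combining several overlapping cached intervals, or serving partial hits, would need additional logic that this sketch leaves out.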