Practical Wide-Area Database Replication

Testing the Dependability and Performance of Group Communication Based Database Replication Protocols

2005 International Conference on Dependable Systems and Networks (DSN'05), 2005

Database replication based on group communication systems has recently been proposed as an efficient and resilient solution for large-scale data management. However, its evaluation has been conducted either on simplistic simulation models, which fail to assess concrete implementations, or on complete system implementations which are costly to test with realistic large-scale scenarios.

On the performance of wide-area synchronous database replication

CNDS-Johns Hopkins …, 2002

A fundamental challenge in database replication is to maintain a low cost of updates while assuring global system consistency. The difficulty of the problem is magnified in wide-area network settings due to the high latency and the increased likelihood of network partitions. As a consequence, most of the research in the area has focused either on improving the performance of local transaction execution or on replication models with weaker semantics, which rely on application knowledge to resolve potential conflicts. In this work we identify the performance bottleneck of existing synchronous replication schemes as residing in the update synchronization algorithm. We compare the performance of several such synchronization algorithms and highlight the large performance gap between various methods. We design a generic, synchronous replication scheme that uses an enhanced synchronization algorithm and demonstrate its practicality by building a prototype that replicates a PostgreSQL database system. We claim that the use of an optimized synchronization engine is the key to building a practical synchronous replication system for wide-area network settings.
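
To make the role of the "update synchronization algorithm" concrete, here is a minimal sketch of the common group-communication pattern such schemes build on: a transaction executes locally, its writeset is broadcast through a total-order primitive, and every replica applies writesets strictly in delivery order. The class and method names are assumptions for illustration, not the paper's code.

```java
import java.util.List;

// A transaction's collected updates, shipped to all replicas.
final class WriteSet {
    final String txId;
    final List<String> updates;            // e.g. SQL UPDATE statements
    WriteSet(String txId, List<String> updates) { this.txId = txId; this.updates = updates; }
}

// Assumed group-communication API: every replica delivers messages in the same order.
interface TotalOrderBroadcast {
    void broadcast(WriteSet ws);
    WriteSet deliver() throws InterruptedException;   // blocks until the next ordered message
}

final class Replica {
    private final TotalOrderBroadcast tob;
    Replica(TotalOrderBroadcast tob) { this.tob = tob; }

    /** Execute locally, then hand the writeset to the synchronization engine. */
    void commitRequest(WriteSet ws) { tob.broadcast(ws); }

    /** Apply writesets in the global total order; this loop is the synchronization
     *  path whose cost the abstract identifies as the performance bottleneck. */
    void applierLoop() throws InterruptedException {
        while (true) {
            WriteSet ws = tob.deliver();
            for (String update : ws.updates) {
                // apply the update to the local database here (e.g. via JDBC)
            }
        }
    }
}
```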

Enhancing the Availability of Networked Database Services by Replication and Consistency Maintenance

2003

We describe an operational middleware platform for maintaining the consistency of replicated data objects, called COPla (Common Object Platform). It supports both eager and lazy update propagation for replicated data in networked relational databases. The purpose of replication is to enhance the availability of data objects and services in distributed database networks. Orthogonal to recovery strategies such as backed-up snapshots, logs, and other measures that alleviate database downtime, COPla caters for high availability during outages of parts of the network by supporting a range of consistency modes for distributed replicas of critical data objects.
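
The eager/lazy distinction mentioned above can be illustrated with a small sketch: eager propagation pushes an update to every replica inside the commit path, while lazy propagation acknowledges locally and drains a queue later. This is not COPla's API; all names below are invented for exposition.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Transport to a single remote replica (assumed abstraction).
interface ReplicaLink { void apply(String update); }

final class Propagator {
    private final List<ReplicaLink> replicas;
    private final Queue<String> deferred = new ConcurrentLinkedQueue<>();

    Propagator(List<ReplicaLink> replicas) { this.replicas = replicas; }

    /** Eager mode: the update reaches every replica before the commit is acknowledged. */
    void commitEager(String update) {
        for (ReplicaLink r : replicas) r.apply(update);   // synchronous fan-out
    }

    /** Lazy mode: commit returns immediately; propagation happens in the background. */
    void commitLazy(String update) { deferred.add(update); }

    /** Run periodically (or on a trigger) to ship deferred updates. */
    void propagateDeferred() {
        String u;
        while ((u = deferred.poll()) != null)
            for (ReplicaLink r : replicas) r.apply(u);
    }
}
```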

RJDBC: A Simple Database Replication Engine

International Conference on Enterprise Information Systems, 2004

Providing fault-tolerant services is a key concern for many service providers, so enterprises usually acquire complex and expensive replication engines. This paper offers an interesting alternative for organizations which cannot afford such costs. RJDBC is a simple, easy-to-install middleware placed between the application and the database management system, intercepting all database operations and forwarding them among all the …
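
A hedged sketch of the interception idea described in this abstract: a thin layer sits between the application and JDBC, serves reads from one node and forwards every update to all nodes. This is an illustration under those assumptions, not RJDBC's actual code, and it assumes a list of already-open JDBC connections.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

final class ReplicatedStatement {
    private final List<Connection> replicas;   // assumed already-open connections, one per node

    ReplicatedStatement(List<Connection> replicas) { this.replicas = replicas; }

    /** Reads can be served by a single replica. */
    ResultSet executeQuery(String sql) throws SQLException {
        return replicas.get(0).createStatement().executeQuery(sql);
    }

    /** Writes are intercepted and forwarded to every replica. */
    int executeUpdate(String sql) throws SQLException {
        int rows = 0;
        for (Connection c : replicas) {
            try (Statement s = c.createStatement()) {
                rows = s.executeUpdate(sql);
            }
        }
        return rows;
    }
}
```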

SIPRe: a partial database replication protocol with SI replicas

2008

Database replication has been researched as a way to overcome the performance and availability problems of distributed systems. Full database replication based on group communication systems is one approach to enhancing performance, and it works well for a small number of sites. If application locality is taken into account, partial replication, i.e. not every site stores the full database, also enhances scalability. At the same time, all copies must be kept consistent. If each DBMS provides snapshot isolation (SI), the execution of transactions has to be coordinated so as to obtain Generalized SI (GSI). In this paper, a partial replication protocol providing GSI is introduced that gives a consistent view of the database, provides an adaptive replication technique, and supports the failure and recovery of replicas.
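
As background for the SI/GSI coordination mentioned above, the usual building block is a certification step: a transaction carries the snapshot version it started from and aborts if any item it wrote was committed by a later transaction (a write-write conflict). The sketch below shows only that generic check; it is not the SIPRe protocol, and all names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class Certifier {
    private final Map<String, Long> lastCommitted = new HashMap<>(); // item -> commit version
    private long currentVersion = 0;

    /** Returns true and installs the writeset if it passes the write-write check. */
    synchronized boolean certify(long snapshotVersion, Set<String> writeSet) {
        for (String item : writeSet) {
            Long v = lastCommitted.get(item);
            if (v != null && v > snapshotVersion) return false;   // conflict: abort
        }
        currentVersion++;
        for (String item : writeSet) lastCommitted.put(item, currentVersion);
        return true;
    }
}
```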

Database replication in large scale systems

Proceedings of the 2009 EDBT/ICDT Workshops on - EDBT/ICDT '09, 2009

In distributed systems, replication is used to ensure availability and increase performance. However, the heavy workload of distributed systems such as Web 2.0 applications or Global Distribution Systems limits the benefit of replication if its degree (i.e., the number of replicas) is not controlled. Since every replica must eventually apply all updates, there is a point beyond which adding more replicas does not increase throughput, because every replica is saturated by applying updates. Moreover, if the replication degree exceeds the optimal threshold, the useless replicas generate overhead due to extra communication messages. In this paper, we propose a replication management solution that avoids such useless replicas. To this end, we define two mathematical models that approximate the number of replicas needed to achieve a given level of performance. We also demonstrate the feasibility of our replication management model through simulation. The results show the effectiveness and accuracy of our models.
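
The saturation effect described above can be seen with a back-of-the-envelope model (not the paper's models): assume each replica handles C operations per second, a fraction w of transactions are updates that must be applied at all n replicas, and reads are load-balanced. Then throughput is roughly T(n) = nC / (1 + w(n-1)), which flattens toward C/w as n grows. The numbers and the 5% cut-off below are purely illustrative.

```java
final class ReplicationDegreeModel {
    /** T(n) = n*C / (1 + w*(n-1)): each transaction costs 1 unit at its origin
     *  plus, with probability w, one application at each of the other n-1 replicas. */
    static double throughput(int n, double capacity, double writeFraction) {
        return n * capacity / (1.0 + writeFraction * (n - 1));
    }

    public static void main(String[] args) {
        double c = 1000.0, w = 0.2;        // illustrative per-replica capacity and write ratio
        double prev = 0.0;
        for (int n = 1; n <= 20; n++) {
            double t = throughput(n, c, w);
            double gain = t - prev;
            System.out.printf("n=%2d  throughput=%7.1f  marginal gain=%6.1f%n", n, t, gain);
            if (n > 1 && gain < 0.05 * c) {  // an extra replica adds less than 5% of one node
                System.out.println("diminishing returns past n=" + n);
                break;
            }
            prev = t;
        }
    }
}
```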

GORDA: An Open Architecture for Database Replication (Extended Report)

Recently, third-party solutions for database replication have been enjoying increasing popularity. Such proposals address a diversity of user requirements, namely preventing conflicting updates without the overhead of synchronous replication; clustering for scalability and availability; and heterogeneous replicas for specialized queries.

Scalable replication in database clusters

2000

In this paper, we explore data replication protocols that provide both fault tolerance and good performance without compromising consistency. We do this by combining transactional concurrency control with group communication primitives. In our approach, transactions are executed at only one site, so that not all nodes incur the overhead of producing results. To further reduce latency, we use an optimistic multicast technique that overlaps transaction execution with total order message delivery. The protocols presented in the paper provide correct executions while minimizing overhead and improving scalability.
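
A minimal sketch of the optimistic-delivery overlap mentioned above, under assumptions and not the paper's implementation: processing starts as soon as a message is tentatively delivered, and the result becomes visible only once the final total order confirms the tentative position; on a mismatch, the speculative work is discarded and redone in the definitive order.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class OptimisticDeliveryProcessor {
    /** Start executing on tentative (optimistic) delivery, overlapping with ordering. */
    CompletableFuture<String> onOptimisticDeliver(Supplier<String> transaction) {
        return CompletableFuture.supplyAsync(transaction);
    }

    /** Called once the total order for this message is final. */
    String onFinalDeliver(boolean orderMatchedTentative,
                          CompletableFuture<String> speculative,
                          Supplier<String> transaction) {
        if (orderMatchedTentative) {
            return speculative.join();    // speculation paid off: latency of ordering is hidden
        }
        speculative.cancel(true);         // tentative order was wrong: discard and redo
        return transaction.get();
    }
}
```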

Wide-Area Replication Support for Global Data Repositories

2005

Wide-area replication improves the availability and performance of globally distributed data repositories. The protocols needed to maintain replica consistency can cause undesirable overhead. Different use cases of repositories suggest the use of different replication protocols, each requiring different metadata. The WADIS middleware for wide-area replication support of distributed data repositories makes use of ready-made database resources such as triggers and views, employing the underlying database management system to support replication protocols, whose implementation thus becomes more succinct and much simpler. WADIS enables the simultaneous maintenance of multiple sets of metadata for different protocols, so that protocols can be chosen, plugged in, and exchanged on the fly to best fit different use cases.
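
To illustrate the idea of letting the DBMS itself maintain replication metadata via triggers and views, here is a hedged sketch: a trigger records each change in a metadata table, and a view exposes the recent changes a protocol might consume. The table, column, and trigger names are invented, the example assumes an "accounts" table with an "id" column, and the DDL uses MySQL-style syntax purely for illustration; WADIS's actual schema is not described in the abstract.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class MetadataSetup {
    static void install(Connection db) throws SQLException {
        try (Statement s = db.createStatement()) {
            // Metadata table maintained by the DBMS itself.
            s.execute("CREATE TABLE IF NOT EXISTS repl_meta (" +
                      "tbl VARCHAR(64), pk BIGINT, changed_at TIMESTAMP)");
            // Trigger records every update so a replication protocol can pick it up later.
            s.execute("CREATE TRIGGER capture_accounts AFTER UPDATE ON accounts " +
                      "FOR EACH ROW INSERT INTO repl_meta VALUES ('accounts', NEW.id, NOW())");
            // A view can expose just the slice of metadata a given protocol needs.
            s.execute("CREATE VIEW pending_changes AS " +
                      "SELECT * FROM repl_meta WHERE changed_at > NOW() - INTERVAL 1 MINUTE");
        }
    }
}
```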

On the Path from Total Order to Database Replication

2004

We introduce ZBCast, a primitive that provides Persistent Global Total Order for messages delivered to a group of participants that can crash and subsequently recover, and that can become temporarily partitioned and then remerge due to network conditions. The paper presents in detail, and proves the correctness of, an efficient algorithm that implements ZBCast on top of existing group communication infrastructure. The algorithm minimizes the number of required forced disk writes and avoids the need for application-level (end-to-end) acknowledgments per message. We also present an extension of the algorithm that allows dynamic addition or removal of participants. We indicate how ZBCast can be employed to build a generic data replication engine that provides consistent synchronous database replication. We provide experimental results that indicate the efficiency of the approach.
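
A minimal sketch, under assumptions, of the persistence side of such a primitive: messages are appended to a log in their global delivery order, and the forced disk write is amortized over a batch rather than issued per message, which is the cost the abstract says the algorithm minimizes. This is not the ZBCast algorithm itself; the record format and batch size are invented for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PersistentOrderLog {
    private static final int SYNC_EVERY = 64;      // batch size is an assumption
    private final FileChannel log;
    private long nextSeq = 0;
    private int unsynced = 0;

    PersistentOrderLog(Path file) throws IOException {
        log = FileChannel.open(file, StandardOpenOption.CREATE,
                               StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    /** Called in total-order delivery order; returns the persistent sequence number. */
    synchronized long append(byte[] message) throws IOException {
        long seq = nextSeq++;
        String record = seq + " " + new String(message, StandardCharsets.UTF_8) + "\n";
        log.write(ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8)));
        if (++unsynced >= SYNC_EVERY) {            // amortize the forced write over a batch
            log.force(false);
            unsynced = 0;
        }
        return seq;
    }
}
```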