On the Path from Total Order to Database Replication
Summary. We introduce ZBCast, a primitive that provides Persistent Global Total Order for messages delivered to a group of participants that can crash and subsequently recover, and that can become temporarily partitioned and then remerge due to network conditions. The paper presents in detail, and proves the correctness of, an efficient algorithm that implements ZBCast on top of existing group communication infrastructure. The algorithm minimizes the number of forced disk writes required and avoids the need for application-level (end-to-end) acknowledgments per message. We also present an extension of the algorithm that allows dynamic addition or removal of participants. We indicate how ZBCast can be employed to build a generic data replication engine that can be used to provide consistent synchronous database replication. We provide experimental results that indicate the efficiency of the approach.