Dynamic fault-tolerant clock synchronization

Self-stabilizing clock synchronization in the presence of Byzantine faults

Journal of the ACM, 2004

We initiate a study of bounded clock synchronization under a more severe fault model than that proposed by Lamport and Melliar-Smith [1985]. Realistic aspects of the problem of synchronizing clocks in the presence of faults are considered. One aspect is that clock synchronization is an on-going task, thus the assumption that some of the processors never fail is too optimistic. To cope with this reality, we suggest self-stabilizing protocols that stabilize in any (long enough) period in which less than a third of the processors are faulty. Another aspect is that the clock value of each processor is bounded. A single transient fault may cause the clock to reach the upper bound. Therefore, we suggest a bounded clock that wraps around when appropriate. We present two randomized self-stabilizing protocols for synchronizing bounded clocks in the presence of Byzantine processor failures. The first protocol assumes that processors have a common pulse, while the second protocol does not. A ne...
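The idea of a bounded clock that wraps around can be illustrated with a minimal sketch. The modulus `M` and the function name `tick` are hypothetical choices made for illustration only; the abstract does not give the bound or the update rule used in the paper's protocols.

```python
# Hypothetical bound on the clock value; the paper's actual bound is not given here.
M = 64


def tick(clock, modulus=M):
    """Advance a bounded clock by one step, wrapping to 0 at the bound."""
    return (clock + 1) % modulus


# Even if a transient fault pushes the clock to its largest value,
# the next tick yields a legal value instead of overflowing.
corrupted = M - 1
recovered = tick(corrupted)
```

Because every value in the range 0 to M - 1 is a legal state, a clock corrupted by a transient fault remains inside the state space the protocol is designed to handle.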

Brief announcement: linear time byzantine self-stabilizing clock synchronization

2004

Awareness of the need for robustness in distributed systems increases as distributed systems become an integral part of day-to-day systems. Tolerating Byzantine faults and possessing self-stabilizing features are sensible and important requirements of distributed systems in general, and of a fundamental task such as clock synchronization in particular. There are efficient solutions for Byzantine non-stabilizing clock synchronization as well as for non-Byzantine self-stabilizing clock synchronization. In contrast, current Byzantine self-stabilizing clock synchronization algorithms have exponential convergence time and are thus impractical. We present a linear time Byzantine self-stabilizing clock synchronization algorithm, which thus makes this task feasible. Our deterministic clock synchronization algorithm is based on the observation that all clock synchronization algorithms require events for re-synchronizing the clock values. These events usually need to happen synchronously at the different nodes. In these solutions this is fulfilled or aided by having the clocks initially close to each other and thus the actual clock values can be used for synchronizing the events. This implies that the clock values cannot differ arbitrarily, which necessarily renders these solutions to be non-stabilizing. Our scheme suggests using a tight pulse synchronization that is uncorrelated to the actual clock values. The synchronized pulses are used as the events for re-synchronizing the clock values.

Clock synchronization with faults and recoveries (extended abstract)

Proceedings of the nineteenth annual ACM symposium on Principles of distributed computing - PODC '00, 2000

We present a convergence-function based clock synchronization algorithm, which is simple, efficient and fault-tolerant. The algorithm is tolerant of failures and allows recoveries, as long as less than a third of the processors are faulty 'at the same time'. Arbitrary (Byzantine) faults are tolerated, without requiring awareness of failure or recovery. In contrast, previous clock synchronization algorithms limited the total number of faults throughout the execution, which is not realistic, or assumed fault detection.
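As an illustration of what a convergence function does, the sketch below implements the classic fault-tolerant midpoint: discard the f lowest and f highest readings, then take the midpoint of what remains. This is one well-known choice and is not necessarily the function used in the paper; the name `fault_tolerant_midpoint` and the sample readings are assumptions made for the example.

```python
def fault_tolerant_midpoint(readings, f):
    """Drop the f smallest and f largest readings, return the midpoint of the rest.

    Requires more than 3f readings, matching the "less than a third faulty" bound.
    """
    if len(readings) <= 3 * f:
        raise ValueError("need more than 3f readings to tolerate f faults")
    trimmed = sorted(readings)[f:len(readings) - f]
    return (trimmed[0] + trimmed[-1]) / 2.0


# Hypothetical round: four clock readings, one of them wildly wrong (Byzantine).
readings = [100.0, 101.0, 102.0, 9999.0]
new_clock = fault_tolerant_midpoint(readings, f=1)  # 101.5
```

Trimming f values from each end means the extremes that survive are bracketed by correct readings, so a faulty processor cannot pull the result outside the range of the correct clocks.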

Linear Time Byzantine Self-Stabilizing Clock Synchronization

Lecture Notes in Computer Science, 2004

Awareness of the need for robustness in distributed systems increases as distributed systems become an integral part of day-to-day systems. Tolerating Byzantine faults and possessing self-stabilizing features are sensible and important requirements of distributed systems in general, and of a fundamental task such as clock synchronization in particular. There are efficient solutions for Byzantine non-stabilizing clock synchronization as well as for non-Byzantine self-stabilizing clock synchronization. In contrast, current Byzantine self-stabilizing clock synchronization algorithms have exponential convergence time and are thus impractical. We present a linear time Byzantine self-stabilizing clock synchronization algorithm, which thus makes this task feasible. Our deterministic clock synchronization algorithm is based on the observation that all clock synchronization algorithms require events for re-synchronizing the clock values. These events usually need to happen synchronously at the different nodes. In these solutions this is fulfilled or aided by having the clocks initially close to each other and thus the actual clock values can be used for synchronizing the events. This implies that the clock values cannot differ arbitrarily, which necessarily renders these solutions to be non-stabilizing. Our scheme suggests using a tight pulse synchronization that is uncorrelated to the actual clock values. The synchronized pulses are used as the events for re-synchronizing the clock values.

Self-stabilizing Byzantine Digital Clock Synchronization

Lecture Notes in Computer Science, 2006

We present a scheme that achieves self-stabilizing Byzantine digital clock synchronization assuming a "synchronous" system. This synchronous system is established by the assumption of a common external "beat" delivered with a regularity in the order of the network message delay, thus enabling the nodes to execute in lock-step. The system can be subjected to severe transient failures with a permanent presence of Byzantine nodes. Our algorithm guarantees eventually synchronized digital clock counters, i.e. common increasing integer counters associated with each beat. We then show how to achieve regular clock synchronization, progressing at real-time rate and with high granularity, from the synchronized digital clock counters. There is one previous self-stabilizing Byzantine clock synchronization algorithm that also converges in linear time (relying on an underlying distributed pulse mechanism), but it requires executing and terminating Byzantine agreement between consecutive pulses. That algorithm, although it does not assume a synchronous system, cannot be easily transformed to benefit from the stronger synchronous system model in which the pulses (beats) are in the order of the message delay time apart. The only other previous self-stabilizing Byzantine digital clock synchronization algorithm operating in a similar synchronous model converges in expected exponential time. Our algorithm converges (deterministically) in linear time and utilizes the synchronous model to achieve optimal precision and a simpler algorithm. Furthermore, it does not require the use of local physical timers in order to synchronize the clock counters.

Clock Synchronization in Distributed Environment

Accurate clock synchronization is difficult to achieve in a distributed environment. Over the last decade, most transactions have moved online, e.g. online banking applications, database queries, and real-time applications. Time synchronization in a distributed system is important because time-based queries can be answered only if all nodes of the distributed system share a common notion of time. In a distributed system, clocks do not remain well synchronized without periodic synchronization. To maintain global time, the clocks of the nodes must be resynchronized periodically. The aim of this paper is to study existing time synchronization approaches and analyze the need for a new class of clock synchronization protocols that are scalable, topology independent, fast to converge, and less application dependent in a distributed environment. The reviews presented in this paper are based on different clock synchronization techniques proposed by various researchers. This work will help in selecting appropriate methods of clock synchronization in a distributed environment.

A fault-resistant asynchronous clock function

2010

Consider an asynchronous network in a shared-memory environment consisting of n nodes. Assume that up to f of the nodes might be Byzantine (n > 12f), where the adversary is full-information and dynamic (sometimes called adaptive). In addition, the non-Byzantine nodes may undergo transient failures. Nodes advance in atomic steps, which consist of reading all registers, performing some calculation and writing to all registers. The three main contributions of the paper are: first, the clock-function problem is defined, which is a generalization of the clock synchronization problem. This generalization encapsulates previous clock synchronization problem definitions while extending them to the current paper's model. Second, a randomized asynchronous self-stabilizing Byzantine tolerant clock synchronization algorithm is presented. In the construction of the clock synchronization algorithm, a building block that ensures different nodes advance at similar rates is developed. This feature is the third contribution of the paper. It is self-stabilizing and Byzantine tolerant and can be used as a building block for different algorithms that operate in an asynchronous self-stabilizing Byzantine model. The convergence time of the presented algorithm is exponential. Observe that in the asynchronous setting the best known full-information dynamic Byzantine agreement also has an expected exponential convergence time.

Clock Synchronization in Distributed System

In distributed systems, most end-to-end delay fluctuations, especially delay fluctuations in the network, are bounded provided that the global network traffic load is manually kept light. Ethernet-based communication protocols have recently gained importance. The IEEE 1588 standard is the most widely used synchronization algorithm for Ethernet-based communication protocols. The IEEE 1588 standard uses a centralized synchronization method. In this case, the malfunction of the specific node that sends the master clock can degrade synchronization for every node on the network. Poor synchronization among the nodes can cause malfunctions of the synchronous control system. For this reason, we believe the distributed synchronization method is more suitable for industrial communication protocols.

A Unified Fault-Tolerance Protocol

Lecture Notes in Computer Science, 2004

Davies and Wakerly show that Byzantine fault tolerance can be achieved by a cascade of broadcasts and middle value select functions. We present an extension of the Davies and Wakerly protocol, the unified protocol, and its proof of correctness. We prove that it satisfies validity and agreement properties for communication of exact values. We then introduce bounded communication error into the model. Inexact communication is inherent for clock synchronization protocols. We prove that validity and agreement properties hold for inexact communication, and that exact communication is a special case. As a running example, we illustrate the unified protocol using the SPIDER family of fault-tolerant architectures. In particular we demonstrate that the SPIDER interactive consistency, distributed diagnosis, and clock synchronization protocols are instances of the unified protocol.
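The middle value select function mentioned in this abstract can be sketched as taking the median of the values a node receives. The function name, the sample values, and the restriction to an odd number of inputs are assumptions made for illustration; the paper's formal definition and the SPIDER protocols are not reproduced here.

```python
def middle_value_select(values):
    """Return the middle element of the sorted inputs (assumes an odd count)."""
    if len(values) % 2 == 0:
        raise ValueError("this sketch assumes an odd number of inputs")
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


# Hypothetical single stage: three sources broadcast a value, one of them faulty.
# The selected value stays within the range of the two good values.
received = [10, 12, -7]
selected = middle_value_select(received)  # 10
```

With three inputs and at most one faulty source, the selected value is always bounded by the two good values, which is why cascading such stages can mask faults.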

Comparative Study of Clock Synchronization Algorithms in Distributed Systems

In distributed systems (DS), nodes communicate with each other by message passing. In many real-time applications implemented on distributed systems, such as banking and reservation systems, it is important to execute each transaction or event in order. Ordering of events is essential for proper allocation of available resources and mutual allocation. This can be implemented using clock synchronization. The paper presents a comparative study of clock synchronization algorithms in distributed systems. The paper also discusses time protocols such as the Network Time Protocol and the Simple Network Time Protocol.
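For context on the Network Time Protocol mentioned above, the sketch below shows the standard clock offset and round-trip delay calculation from the four timestamps of one request/response exchange. The function name and the sample timestamps are hypothetical, and the calculation assumes the network delay is roughly symmetric in both directions.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one exchange.

    t0: client send time (client clock)
    t1: server receive time (server clock)
    t2: server send time (server clock)
    t3: client receive time (client clock)
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Hypothetical timestamps in milliseconds: the server clock is 500 ms ahead,
# each direction takes 5 ms, and the server spends 1 ms processing.
offset, delay = ntp_offset_and_delay(1000, 1505, 1506, 1011)
```

With these sample values the estimated offset is 500.0 ms and the round-trip delay is 10 ms; when the two directions have unequal delay, the offset estimate is off by half the difference.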