Dynamic Byzantine Reliable Broadcast

Oracular Byzantine Reliable Broadcast [Extended Version]

arXiv (Cornell University), 2022

Byzantine Reliable Broadcast (BRB) is a fundamental distributed computing primitive, with applications ranging from notifications to asynchronous payment systems. Motivated by practical considerations, we study Client-Server Byzantine Reliable Broadcast (CSB), a multi-shot variant of BRB whose interface is split between broadcasting clients and delivering servers. We present Draft, an optimally resilient implementation of CSB. Like most implementations of BRB, Draft guarantees both liveness and safety in an asynchronous environment. Under good conditions, however, Draft achieves unparalleled efficiency. In a moment of synchrony, free from Byzantine misbehaviour, and at the limit of infinitely many broadcasting clients, a Draft server delivers a b-bit payload at an asymptotic amortized cost of 0 signature verifications and (log2(c) + b) bits exchanged, where c is the number of clients in the system. This is the information-theoretical minimum number of bits required to convey the payload (b bits, assuming it is compressed), along with an identifier for its sender (log2(c) bits, necessary to enumerate any set of c elements, and optimal if broadcasting frequencies are uniform or unknown). These two achievements have profound practical implications. Real-world BRB implementations are often bottlenecked either by expensive signature verifications or by communication overhead. For Draft, instead, the network is the limit: a server can deliver payloads as quickly as it would receive them from an infallible oracle.
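As a rough worked example of the bound quoted in this abstract, the Python sketch below evaluates log2(c) + b for a given client count and payload size. The function name and the sample values are illustrative assumptions, not taken from the paper; the sketch only restates the arithmetic and says nothing about how Draft reaches the bound.

```python
import math


def min_bits_per_payload(num_clients: int, payload_bits: int) -> float:
    """Return log2(c) + b: sender identifier bits plus payload bits.

    num_clients (c) and payload_bits (b) are hypothetical parameter names.
    """
    if num_clients < 1 or payload_bits < 0:
        raise ValueError("need c >= 1 and b >= 0")
    return math.log2(num_clients) + payload_bits


# Example: 1024 clients need 10 identifier bits, so a 64-bit payload
# costs 74 bits in total under this bound.
print(min_bits_per_payload(1024, 64))
```

With c = 1024 and b = 64 the identifier accounts for 10 of the 74 bits; the identifier's share shrinks as payloads grow.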

Self-stabilizing Byzantine Fault-tolerant Repeated Reliable Broadcast

arXiv (Cornell University), 2022

We study a well-known communication abstraction called Byzantine Reliable Broadcast (BRB). This abstraction is central in the design and implementation of fault-tolerant distributed systems, as many fault-tolerant distributed applications require communication with provable guarantees on message deliveries. Our study focuses on fault-tolerant implementations for message-passing systems that are prone to process failures, such as crashes and malicious behavior. At PODC 1983, Bracha and Toueg (BT for short) solved the BRB problem. BT has optimal resilience since it can deal with t < n/3 Byzantine processes, where n is the number of processes. The present work aims at the design of an even more robust solution than BT by expanding its fault model with self-stabilization, a vigorous notion of fault-tolerance. In addition to tolerating Byzantine and communication failures, self-stabilizing systems can recover after the occurrence of arbitrary transient faults. These faults represent any violation of the assumptions according to which the system was designed to operate (provided that the algorithm code remains intact). We propose, to the best of our knowledge, the first self-stabilizing Byzantine fault-tolerant (BFT) solution for repeated BRB in signature-free message-passing systems (that follows BT's problem specifications). Our contribution includes a self-stabilizing variation on BT that solves single-instance BRB for asynchronous systems. We also consider the problem of recycling instances of single-instance BRB. Our self-stabilizing BFT recycling for time-free systems facilitates the concurrent handling of a predefined number of BRB invocations and, in this way, can serve as the basis for self-stabilizing BFT consensus.

Broadcast protocols for distributed systems

IEEE Transactions on Parallel and Distributed Systems, 1990

We present an innovative approach to the design of fault-tolerant distributed systems that avoids the several rounds of message exchange required by current protocols for consensus agreement. The approach is based on broadcast communication over a local area network, such as an Ethernet or a token ring, and on two novel protocols, the Trans protocol, which provides efficient reliable broadcast communication, and the Total protocol, which with high probability promptly places a total order on messages and achieves distributed agreement even in the presence of fail-stop, omission, timing, and communication faults. Reliable distributed operations such as locking, update and commitment typically require only a single broadcast message rather than the several tens of messages required by current algorithms. [...] processors agree on exactly the same sequence of broadcast messages. It is easy to demonstrate that placing a total order on broadcast messages, so that every working processor processes the same messages in the same order, provides an immediate solution to the agreement problem. Once this total order is determined, distributed actions can be carried out using simple sequential fault-tolerant algorithms. The strategy is very efficient: for example, locking records in a distributed database typically requires only a single broadcast message to claim a lock and a single broadcast message to release it.
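The claim that a total order on broadcast messages yields agreement can be illustrated with a small Python sketch. It assumes the total order has already been established (here it is simply a list) and replays it through a deterministic lock table; the message format, the function name, and the first-claim-wins rule are hypothetical choices for illustration, not the Trans or Total protocols themselves.

```python
from typing import Dict, List, Tuple

# (operation, record, processor); a hypothetical message format.
Message = Tuple[str, str, str]


def apply_in_order(log: List[Message]) -> Dict[str, str]:
    """Replay a totally ordered log of lock messages.

    Returns a mapping from each locked record to the processor holding it.
    """
    holders: Dict[str, str] = {}
    for op, record, proc in log:
        if op == "claim":
            # The first claim in the total order wins; later claims on a
            # held record are ignored.
            holders.setdefault(record, proc)
        elif op == "release":
            # Only the current holder can release the record.
            if holders.get(record) == proc:
                del holders[record]
    return holders


ordered_log: List[Message] = [
    ("claim", "r1", "p1"),
    ("claim", "r1", "p2"),    # ignored: r1 is already held by p1
    ("claim", "r2", "p2"),
    ("release", "r1", "p1"),
    ("claim", "r1", "p3"),
]

# Two replicas that process the same sequence reach the same lock table.
replica_a = apply_in_order(ordered_log)
replica_b = apply_in_order(list(ordered_log))
print(replica_a == replica_b, replica_a)
```

Because the replay is deterministic, every processor that sees the same sequence computes the same lock table without exchanging further messages, which is why one broadcast to claim a lock and one to release it suffice in this sketch.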

From Consensus to Atomic Broadcast: Time-Free Byzantine-Resistant Protocols Without Signatures

The Computer Journal, 2006

This paper proposes a stack of three Byzantine-resistant protocols aimed to be used in practical distributed systems: multi-valued consensus, vector consensus and atomic broadcast. These protocols are designed as successive transformations from one to another. The first protocol, multi-valued consensus, is implemented on top of a randomized binary consensus and a reliable broadcast protocol. The protocols share a set of important structural properties. First, they do not use digital signatures constructed with public-key cryptography, a well-known performance bottleneck in this kind of protocol. Second, they are time-free, i.e. they make no synchrony assumptions, since these assumptions are often vulnerable to subtle but effective attacks. Third, they are completely decentralized, thus avoiding the cost of detecting corrupt leaders. Fourth, they have optimal resilience, i.e. they tolerate the failure of f = ⌊(n − 1)/3⌋ out of a total of n processes. In terms of time complexity, the multi-valued consensus protocol terminates in a constant expected number of rounds, while the vector consensus and atomic broadcast protocols have O(f) complexity. The paper also proves the equivalence between multi-valued consensus and atomic broadcast in the Byzantine failure model without signatures. A similar proof is given for the equivalence between multi-valued consensus and vector consensus. These two results have theoretical relevance since they show once more that consensus is a fundamental problem in distributed systems.
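The optimal-resilience bound in this abstract, reading (n − 1)/3 as rounded down so that n ≥ 3f + 1, can be checked with a short Python sketch. The function name is a hypothetical choice for illustration.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f with n >= 3f + 1, i.e. f = floor((n - 1) / 3)."""
    if n < 1:
        raise ValueError("need at least one process")
    return (n - 1) // 3


# Tolerated faults for a few system sizes.
for n in (3, 4, 7, 10):
    print(n, max_byzantine_faults(n))
```

Under this bound a system of 4 processes tolerates 1 Byzantine process, while 3 processes tolerate none.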

Efficient Byzantine-Resilient Reliable Multicast on a Hybrid Failure Model

2002

The paper presents a new reliable multicast protocol that tolerates arbitrary faults, including Byzantine faults. This protocol is developed using a novel way of designing secure protocols which is based on a well-founded hybrid failure model. Despite our claim of arbitrary failure resilience, the protocol need not necessarily incur the cost of "Byzantine agreement", in number of participants and round/message complexity. It can rely on the existence of a simple distributed security kernel, the TTCB, where the participants only execute crucial parts of the protocol operation, under the protection of a crash failure model. Otherwise, participants follow an arbitrary failure model.

Simple and efficient oracle-based consensus protocols for asynchronous Byzantine systems

IEEE Transactions on Dependable and Secure Computing, 2000

This paper is on the Consensus problem in asynchronous distributed systems where (up to f) processes (among n) can exhibit a Byzantine behavior, i.e., can deviate arbitrarily from their specification. A way to solve the consensus problem in such a context consists of enriching the system with additional oracles that are powerful enough to cope with the uncertainty and unpredictability created by the combined effect of Byzantine behavior and asynchrony. Considering two types of such oracles, namely, an oracle that provides processes with random values, and a failure detector oracle, the paper presents two families of Byzantine asynchronous consensus protocols. Two of these protocols are particularly noteworthy: they allow the processes to decide in one communication step in favorable circumstances. The first is a randomized protocol that assumes n > 5f. The second one is a failure detector-based protocol that assumes n > 6f. These protocols are designed to be particularly simple and efficient in terms of communication steps, the number of messages they generate in each step, and the size of messages. So, although they are not optimal in the number of Byzantine processes that can be tolerated, they are particularly efficient when we consider the number of communication steps they require to decide, and the number and size of the messages they use. In that sense, they are practically appealing.

An ordered and reliable broadcast protocol for distributed systems

Computer Communications, 1997

The purpose of a reliable broadcast protocol is to allow groups of nodes on unreliable broadcast networks to reliably broadcast messages. A reliable broadcast protocol must guarantee two properties: (1) all of the receivers in a group receive the broadcast messages, and (2) each of the receivers orders the messages in the same sequence. In the optimistic approach to reliable broadcast, a batch acknowledgement is employed for a sequence of broadcast messages, instead of the one or more acknowledgements per broadcast message used in the pessimistic approach. In this paper, based on the optimistic approach, we propose a counter-based reliable broadcast protocol. In this protocol, the unique token ownership is circulated among all the nodes in an order specified by a token-passing list. The system state, which records related information about messages broadcast by each node, is included in the token message. By appropriately updating the counter information recorded in the system state included in the token message, instead of using explicit acknowledgement messages, the proposed protocol needs fewer control messages to commit a broadcast message than other protocols, no matter whether the rate of transmission errors is high or low. Moreover, we show how to handle the flow control problem and describe the token update technique.

Asynchronous Byzantine consensus with 2f+1 processes

Proceedings of the ACM Symposium on Applied Computing, 2010

Byzantine consensus in asynchronous message-passing systems has been shown to require at least 3f + 1 processes to be solvable in several system models (e.g., with failure detectors, partial synchrony or randomization). Recently a couple of solutions to implement Byzantine fault-tolerant state-machine replication using only 2f + 1 replicas have appeared. This reduction from 3f + 1 to 2f + 1 is possible with a hybrid system model, i.e., by extending the system model with trusted/trustworthy components that constrain the power of faulty processes to have certain behaviors. Despite these important results, the problem of solving Byzantine consensus with only 2f + 1 processes is still far from being well understood. In this paper we present a methodology to transform crash consensus algorithms into Byzantine consensus algorithms with different characteristics, with the assistance of a reliable broadcast primitive that requires trusted/trustworthy components to be implemented. We exemplify the methodology with two algorithms, one that uses failure detectors and one that is randomized. We also define a new flavor of consensus and use it to solve atomic broadcast, showing the practical interest of the transformations.

Real-Time, Byzantine-Tolerant Information Dissemination in Unreliable and Untrustworthy Distributed Systems

2008

In unreliable and untrustworthy systems, information dissemination may suffer network failures and attacks from Byzantine nodes, which are controlled by traitors or adversaries and can perform destructive behaviors. Typically, Byzantine nodes, together or individually, "swallow" messages or fake disseminated information. In this paper, we present an authentication-free, gossip-based real-time information dissemination mechanism called RT-LASIRC, in which "healthy" nodes utilize Byzantine features to defend against Byzantine attacks. We show that RT-LASIRC is robust against blackhole and message-faking attacks. Our experimental studies verify RT-LASIRC's effectiveness.