Marko Vukolic - Profile on Academia.edu
Papers by Marko Vukolic
SOFSEM 2011: Theory and Practice of Computer Science
Key management is concerned with operations to manage the lifecycle of cryptographic keys, for creating, storing, distributing, deploying, and deleting keys. An important aspect is to manage the attributes of keys that govern their usage and their relation to other keys. Multiple efforts are currently underway to build and standardize key-management systems accessible over open networks: the W3C XML Key Management Specification (XKMS) [18], the IEEE P1619.3 Key Management Project [12], the OASIS Key Management Interoperability Protocol (KMIP) standardization effort [14], and the Sun Crypto Key Management System [16] are some of the most prominent ones. Cover [9] gives an up-to-date summary of the current developments. Many proprietary key-management systems are on the market, including HP StorageWorks Secure Key Manager, IBM Distributed Key Management System (DKMS), IBM Tivoli Key Lifecycle Manager (TKLM), NetApp Lifetime Key Management, and Thales/nCipher keyAuthority. The need for enterprise-wide key management systems has been recognized widely [4], and NIST, an agency of the US Government, has issued a general recommendation for key management [3].
Content cloud systems, e.g., CloudFront [1] and CloudBurst [2], in which content items are retrieved by end-users from the edge nodes of the cloud, are becoming increasingly popular. The retrieval latency in content clouds depends on content availability in the edge nodes, which in turn depends on the caching policy at the edge nodes. In case of local content unavailability (i.e., a cache miss), edge nodes resort to source selection strategies to retrieve the content items either vertically from the central server, or horizontally from other edge nodes. Consequently, managing the latency in content clouds needs to take into account several interrelated issues: asymmetric bandwidth and caching capacity for both source types, as well as edge node heterogeneity in terms of the caching policies and source selection strategies applied.
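The cache-miss handling sketched in this abstract can be illustrated with a small, hypothetical source-selection routine: on a local miss, an edge node retrieves the item horizontally from a peer that caches it, and otherwise vertically from the central server. The class names, latency model, and selection rule below are illustrative assumptions, not the paper's actual policy.

```python
# Hypothetical sketch of source selection on a cache miss in a content cloud.
# Names and the latency-based rule are assumptions made for illustration.

class EdgeNode:
    def __init__(self, name, cache, peer_latency, server_latency):
        self.name = name
        self.cache = cache                    # set of locally cached item ids
        self.peer_latency = peer_latency      # assumed latency to other edge nodes
        self.server_latency = server_latency  # assumed latency to the central server

    def retrieve(self, item, peers):
        if item in self.cache:                           # local hit
            return self.name, 0.0
        holders = [p for p in peers if item in p.cache]
        if holders and self.peer_latency < self.server_latency:
            return holders[0].name, self.peer_latency    # horizontal retrieval
        return "central-server", self.server_latency     # vertical retrieval


edge_a = EdgeNode("edge-a", {"video1"}, peer_latency=20.0, server_latency=80.0)
edge_b = EdgeNode("edge-b", set(), peer_latency=20.0, server_latency=80.0)
print(edge_b.retrieve("video1", peers=[edge_a]))   # ('edge-a', 20.0)
```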
Robust data sharing with key-value stores
... It is important for ensuring termination of concurrent read operations that the writer first stores the value under the eternal key and later under the temporary key. 4.3 Atomic Register: Atomicity is achieved by having a client write back its read value before returning it. ...
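As a rough illustration of the two ideas quoted in this excerpt (the write ordering across an "eternal" and a "temporary" key, and the read write-back), here is a minimal single-store sketch. The DictKV class, the key names, and the put/get interface are assumptions made for illustration; the paper's actual protocol replicates data over several key-value stores and handles failures, which this sketch does not.

```python
# Minimal single-store sketch, assuming a key-value store with put/get.
# Illustrates only the write ordering and the read write-back mentioned above.

ETERNAL_KEY = "eternal"

class DictKV:
    """Stand-in for one key-value store; real stores expose similar put/get."""
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value
    def get(self, key):
        return self.data[key]

class Register:
    def __init__(self, kv):
        self.kv = kv

    def write(self, ts, value):
        # Store under the eternal key first, then under a temporary per-timestamp
        # key, so that concurrent readers can always terminate.
        self.kv.put(ETERNAL_KEY, (ts, value))
        self.kv.put("tmp-%d" % ts, (ts, value))

    def read(self):
        ts, value = self.kv.get(ETERNAL_KEY)
        # Write back before returning: this write-back is what yields atomicity.
        self.kv.put(ETERNAL_KEY, (ts, value))
        return value


kv = DictKV()
reg = Register(kv)
reg.write(1, "v1")
assert reg.read() == "v1"
```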
Key management is the Achilles' heel of cryptography. This work presents a novel Key-Lifecycle Management System (KLMS), which addresses two issues that have not been addressed comprehensively so far.
The fair exchange problem is key to trading electronic items in systems of mutually untrusted parties. We consider modern variants of such systems where each party is equipped with a tamper-proof security module. The security modules trust each other but can only communicate by exchanging messages through their host parties. These hosts are untrusted and could intercept and drop those messages. We show that the fair exchange problem at the level of untrusted parties can be reduced to an atomic commit problem at the level of trusted security modules. This reduction offers a new perspective with which fair exchange protocols can be designed. In particular, we present a new atomic commit protocol, called Monte Carlo NBAC, which helps build a new and practical fair exchange solution. The exchange always terminates and no party commits the exchange with the wrong items. Furthermore, there is an upper bound on the probability that the exchange ends up being unfair, and this bound is out of the control of the untrusted parties.
The fair exchange problem is key to trading electronic items in systems of mutually untrusted parties. In modern variants of such systems, each party is equipped with a tamper-proof security module. The security modules trust each other but can only communicate by exchanging messages through their host parties. These hosts are untrusted and could intercept and drop those messages. We describe a synchronous algorithm that ensures deterministic fair exchange if a majority of parties are honest, which is optimal in terms of resilience. If there is no honest majority, our algorithm degrades gracefully: it ensures that the probability of violating fairness can be made arbitrarily low. We prove that this probability is inversely proportional to the average complexity of the algorithm in terms of its number of communication rounds, and we supply the corresponding optimal probability distribution. Our algorithm uses, as an underlying building block, an early stopping subprotocol that solves, in a model with general omission failures, a specific variant of consensus we call biased consensus. Our modular approach contributes to bridging the gap between modern security (i.e., based on security modules) and traditional distributed computing (i.e., agreement with omission failures).
IEEE Computer, 2009
A distributed storage service lets clients abstract a single reliable shared storage device using a collection of possibly unreliable computing units. Algorithms that implement this abstraction offer certain tradeoffs and vary according to dimensions such as complexity, the consistency semantics provided, and the types of failures tolerated.
Distributed Computing, 2010
It is considered good distributed computing practice to devise object implementations that tolerate contention, periods of asynchrony and a large number of failures, but perform fast if few failures occur, the system is synchronous and there is no contention. This paper initiates the first study of quorum systems that help design such implementations by encompassing, at the same time, optimal resilience and optimal best-case complexity. We introduce the notion of a refined quorum system (RQS) of some set S as a set of three classes of subsets (quorums) of S: first class quorums are also second class quorums, which in turn are also third class quorums. First class quorums have large intersections with all other quorums; second class quorums typically have smaller intersections with those of the third class; the latter simply correspond to traditional quorums. Intuitively, under uncontended and synchronous conditions, a distributed object implementation would expedite an operation if a quorum of the first class is accessed, and degrade gracefully depending on whether a quorum of the second or the third class is accessed. Our notion of refined quorum system is devised assuming a general adversary structure, which allows algorithms relying on refined quorum systems to relax the assumption of independent process failures, often questioned in practice. We illustrate the power of refined quorums by introducing two new optimal Byzantine-resilient distributed object implementations: an atomic storage and a consensus algorithm. Both match previously established resilience and best-case complexity lower bounds, closing open gaps, as well as new complexity bounds we establish here. Each of our algorithms is representative of a different class of architectures, highlighting the generality of the refined quorum abstraction.
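A small structural sketch of the refined quorum system definition given above: the three classes are nested, and third class quorums behave like traditional quorums (pairwise intersection). The intersection thresholds required of first and second class quorums depend on the adversary structure and are not modeled here; the example sets are purely illustrative.

```python
# Structural sketch only: nesting of the three quorum classes plus the
# traditional pairwise-intersection property of third class quorums.
from itertools import combinations

def is_refined_quorum_system(class1, class2, class3):
    """class1, class2, class3: sets of frozensets over some process set S."""
    nested = set(class1) <= set(class2) <= set(class3)
    pairwise_intersect = all(a & b for a, b in combinations(list(class3), 2))
    return nested and pairwise_intersect

S = range(5)
class3 = {frozenset(q) for q in combinations(S, 3)}   # traditional majority-style quorums
class2 = {q for q in class3 if 0 in q}                # illustrative: quorums containing process 0
class1 = {q for q in class2 if 1 in q}                # illustrative: quorums containing 0 and 1
assert is_refined_quorum_system(class1, class2, class3)
```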
Distributed Computing, 2007
It is considered good distributed computing practice to devise object implementations that tolerate contention, periods of asynchrony and a large number of failures, but perform fast if few failures occur, the system is synchronous and there is no contention. This paper initiates the first study of quorum systems that help design such implementations by encompassing, at the same time, optimal resilience and optimal best-case complexity. We introduce the notion of a refined quorum system (RQS) of some set S as a set of three classes of subsets (quorums) of S: first class quorums are also second class quorums, which in turn are also third class quorums. First class quorums have large intersections with all other quorums; second class quorums typically have smaller intersections with those of the third class; the latter simply correspond to traditional quorums. Intuitively, under uncontended and synchronous conditions, a distributed object implementation would expedite an operation if a quorum of the first class is accessed, and degrade gracefully depending on whether a quorum of the second or the third class is accessed. Our notion of refined quorum system is devised assuming a general adversary structure, which allows algorithms relying on refined quorum systems to relax the assumption of independent process failures, often questioned in practice. We illustrate the power of refined quorums by introducing two new optimal Byzantine-resilient distributed object implementations: an atomic storage and a consensus algorithm. Both match previously established resilience and best-case complexity lower bounds, closing open gaps, as well as new complexity bounds we establish here. Each of our algorithms is representative of a different class of architectures, highlighting the generality of the refined quorum abstraction.
This paper establishes tight bounds on the best-case time-complexity of distributed atomic read/write storage implementations that tolerate worst-case conditions. We study asynchronous robust implementations where a writer and a set of reader processes (clients) access an atomic storage implemented over a set of 2t + b + 1 server processes of which t can fail: b of these can be malicious and the rest can fail by crashing. We define a lucky operation (read or write) as one that runs synchronously and without contention. We determine the exact conditions under which a lucky operation can be fast, namely expedited in one communication round-trip with no data authentication. We show that every lucky write (resp., read) can be fast despite fw (resp., fr) actual failures, if and only if fw + fr ≤ t − b.
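The fast-operation condition stated at the end of this abstract can be captured by a one-line check; the helper below is hypothetical and only restates the bound fw + fr ≤ t − b for a system of 2t + b + 1 servers.

```python
# Illustrative check of the fast-operation condition above; parameter names
# follow the abstract, the helper itself is a hypothetical convenience.

def lucky_ops_can_be_fast(fw: int, fr: int, t: int, b: int) -> bool:
    """True iff every lucky write with fw actual failures and every lucky read
    with fr actual failures can complete in one communication round-trip,
    per the bound fw + fr <= t - b (over 2t + b + 1 servers)."""
    return fw + fr <= t - b

# Example: t = 3 failures tolerated in total, b = 1 of them Byzantine.
assert lucky_ops_can_be_fast(fw=1, fr=1, t=3, b=1)      # 2 <= 2
assert not lucky_ops_can_be_fast(fw=2, fr=1, t=3, b=1)  # 3 > 2
```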
Asynchronous Byzantine Consensus: Complexity, Resilience and Authentication (Preliminary Version
We present a consensus algorithm that tolerates Byzantine process failures and arbitrarily long periods of network asynchrony. Our algorithm is the first to match the general time-complexity lower bound of [14], for which we give a complete proof. When the necessary conditions ...
This paper presents SOAR: the first oblivious atomicity assertion with polynomial complexity. SOAR enables checking the atomicity of a single-writer multi-reader register implementation. The basic idea underlying the low overhead induced by SOAR lies in greedily checking, in a backward manner, specific points of an execution where register operations could be linearized, rather than exploring all possible precedence relations among them. We illustrate the use of SOAR by implementing it in +CAL. The performance of the resulting automatic verification outperforms comparable approaches by more than an order of magnitude already in executions with only 6 read/write operations. This difference increases to 3-4 orders of magnitude in the “negative” scenario, i.e., when checking some non-atomic execution, with only 5 operations. For example, checking atomicity of every possible execution of a single-writer single-reader (SWSR) register with at most 2 write and 3 read operations with the state-of-the-art oblivious assertion takes more than 58 hours to complete, whereas SOAR takes just 9 seconds.
This paper establishes the first theorem relating resilience, time complexity and authentication in distributed computing. We study consensus algorithms that tolerate Byzantine failures and arbitrarily long periods of asynchrony. We measure the ability of processes to reach a consensus decision in a minimal number of rounds of information exchange, as a function of (a) their ability to use authentication and (b) the number of actual process failures in those rounds, as well as of (c) the total number of failures tolerated and (d) the system constellation. The constellations considered distinguish different roles of processes, such that we can directly derive a meaningful bound on the time complexity of implementing robust general services using several replicas coordinated through consensus. To prove our theorem, we establish certain lower bounds and we give algorithms that match these bounds. The algorithms are all variants of the same generic asynchronous Byzantine consensus algorithm, which is interesting in its own right.
This paper establishes the first theorem relating resilience, round complexity and authentication in distributed computing. We give an exact measure of the time complexity of consensus algorithms that tolerate Byzantine failures and arbitrarily long periods of asynchrony, as in the Internet. The measure expresses the ability of processes to reach a consensus decision in a minimal number of rounds of information exchange, as a function of (a) the ability to use authentication and (b) the number of actual process failures in those rounds, as well as of (c) the total number of failures tolerated and (d) the system configuration. The measure holds for a framework where the different roles of processes are distinguished, such that we can directly derive a meaningful bound on the time complexity of implementing robust general services in practical distributed systems. To prove our theorem, we establish certain lower bounds and we give algorithms that match these bounds. The algorithms are all variants of the same generic asynchronous Byzantine consensus algorithm, which is interesting in its own right.
Siam Journal on Computing, 2010
We study efficient and robust implementations of an atomic read-write data structure over an asynchronous distributed message-passing system made of reader and writer processes, as well as a number of servers implementing the data structure. We determine the exact conditions under which every read and write involves one round of communication with the servers. These conditions relate the number of readers to the tolerated number of faulty servers and the nature of these failures. The well-known implementation from [4] considers the single-writer multi-reader case, also called a SWMR register implementation [18]. There, readers and servers are the same set, the writer is one of the servers, and a minority of processes may fail by crashing, i.e., halting all their activities without warning, whereas the rest of the processes execute the algorithm assigned to them. This implementation ensures atomicity by associating timestamps with every written value. To write some value v, the writer increments its local timestamp, and sends v with the new timestamp ts to all servers. Every server, on receiving such a message, stores v and ts and then sends an acknowledgment (an ack) to the writer. On receiving acks from a majority, the writer terminates the write. In a read operation, the reader first gathers value and timestamp pairs from a majority of servers, and selects the value v with the largest timestamp ts. Then the reader sends v and ts to all servers, and returns v on receiving acks from a majority of processes. Given the single-writer setting, and since only the writer introduces new timestamps in the system, the writer always knows the latest timestamp. Unlike the writer, a reader does not know the latest timestamp in the system and hence needs to spend one communication round-trip to discover the latest value, and then another round-trip to propagate that value to a majority of servers. The second round-trip is "required" because the latest value learned in the first round-trip might be present at only a minority of servers: a subsequent read might thus miss this value. In a sense, every read includes, in its second communication round-trip, a "write phase", with the input parameter being the value selected in the first round-trip.
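The write and read round-trips described above can be sketched as follows. This is a simplified, synchronous simulation: servers are plain objects, "send to all and wait for a majority of acks" becomes a local loop, and failures are not modeled; the names are illustrative and the real protocol runs over an asynchronous, failure-prone network.

```python
# Minimal sketch of the majority-based SWMR register pattern described above,
# under the simplifying assumptions stated in the lead-in.

class Server:
    def __init__(self):
        self.ts, self.value = 0, None

    def store(self, ts, value):          # handle a write-back / write message, return an ack
        if ts > self.ts:
            self.ts, self.value = ts, value
        return True

    def report(self):                    # return the locally stored (timestamp, value) pair
        return self.ts, self.value


class SWMRRegister:
    def __init__(self, servers):
        self.servers = servers
        self.majority = len(servers) // 2 + 1
        self.write_ts = 0                # only the single writer increments this

    def write(self, value):
        self.write_ts += 1
        acks = sum(1 for s in self.servers if s.store(self.write_ts, value))
        assert acks >= self.majority     # write completes on a majority of acks

    def read(self):
        # Round-trip 1: gather (ts, value) pairs from a majority, pick highest timestamp.
        ts, value = max(s.report() for s in self.servers[:self.majority])
        # Round-trip 2: write back so that a subsequent read cannot miss this value.
        acks = sum(1 for s in self.servers[:self.majority] if s.store(ts, value))
        assert acks >= self.majority
        return value


reg = SWMRRegister([Server() for _ in range(5)])
reg.write("v1")
assert reg.read() == "v1"
```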
This paper studies the time complexity of reading unauthenticated data from a distributed storage made of a set of failure-prone base objects. More specifically, we consider the abstraction of a robust read/write storage that provides wait-free access to unauthenticated data over a set of base storage objects with t possible failures, out of which at most b are arbitrary and the rest are simple crash failures. We prove a 2 communication round-trip lower bound for reading from a safe storage that uses at most 2t + 2b base objects, independently of the number of round-trips needed by the writer. We then show that the lower bound is tight by exhibiting a regular storage that uses 2t + b + 1 base objects (optimal resilience) and features 2 communication round-trips for both read and write operations.
We study the time-complexity of robust atomic read/write storage built from fault-prone storage components in asynchronous message-passing systems. Robustness here means wait-freedom while tolerating the largest possible number t of Byzantine storage component failures (optimal resilience) without relying on data authentication. We show that no single-writer multiple-reader (SWMR) robust atomic storage implementation exists if (a) read operations complete in less than four communication round-trips (rounds), and (b) the time-complexity of write operations is constant. More precisely, we present two lower bounds. The first is a read lower bound stating that three rounds of communication are necessary to read from a SWMR robust atomic storage. The second is a write lower bound, showing that Ω(log(t)) write rounds are necessary to read in three rounds from such a storage. Applied to known results, our lower bounds close a fundamental gap: we show that time-optimal robust atomic storage can be obtained using well-known transformations from regular to atomic storage and existing time-optimal regular storage implementations.
Several manufacturers have recently started to equip their hardware with security modules. These typically consist of smart cards or special microprocessors. Examples include the "Embedded Security Subsystem" within the recent IBM ThinkPad or the IBM 4758 secure co-processor board. In fact, a large body of computer and device manufacturers has founded the Trusted Computing Group (TCG) [9] to promote this idea.