João Antunes | Universidade de Lisboa
Papers by João Antunes
This paper presents Worm-IT, a new intrusion-tolerant group communication system with a membership service and a view-synchronous atomic multicast primitive. The system is intrusion-tolerant in the sense that it behaves correctly even if some nodes are corrupted and become malicious. It is based on a novel approach that enhances the environment with a special secure distributed component used by the protocols to securely execute a few crucial operations.
In this paper, we present the middleware architecture of MAFTIA, an ESPRIT project aiming at developing an open architecture for transactional operations on the Internet. The middleware is a modular and scalable cryptographic group-oriented suite, suitable for supporting reliable multi-party interactions under partial synchrony models, and subject to malicious as well as accidental faults.
The application of dependability concepts and techniques to the design of secure distributed systems is raising a considerable amount of interest in both communities under the designation of intrusion tolerance. However, practical intrusion-tolerant replicated systems based on the state machine approach (SMA) can handle at most f Byzantine components out of a total of n = 3f + 1, which is the maximum resilience in asynchronous systems.
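As a small illustration of the n = 3f + 1 bound quoted in the abstract above, the following sketch computes how many Byzantine replicas a group of n replicas can tolerate, and how many replicas are needed for a given f. The function names are hypothetical and are not taken from the paper; the code only restates the arithmetic of the bound.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f that satisfies n >= 3f + 1 (hypothetical helper name)."""
    if n < 1:
        raise ValueError("a replicated system needs at least one replica")
    return (n - 1) // 3


def min_replicas(f: int) -> int:
    """Smallest n that tolerates f Byzantine replicas under n = 3f + 1."""
    if f < 0:
        raise ValueError("the number of faults cannot be negative")
    return 3 * f + 1


if __name__ == "__main__":
    # Print the bound for a few small group sizes.
    for n in (4, 6, 7, 10):
        print(f"n={n}: tolerates f={max_byzantine_faults(n)}")
```

Note that adding replicas only raises the tolerated f when the next multiple of three is crossed: going from 4 to 6 replicas still tolerates a single Byzantine replica under this bound.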
Mobile computing allows ubiquitous and continuous access to computing resources while the users travel or work at a client's site. The flexibility introduced by mobile computing brings new challenges to the area of fault tolerance. Failures that were rare with fixed hosts become common, and host disconnection makes fault detection and message coordination difficult. This paper describes a new checkpoint protocol that is well adapted to mobile environments.
This paper describes and evaluates a coordinated checkpoint protocol that uses time to eliminate several performance overheads that are present in traditional protocols. The time-based protocol does not have to exchange coordination messages, does not need to add information to the processes' messages, and only accesses stable storage when checkpoints are saved. This protocol uses a simple initialization procedure to set checkpoint timers at the different processes.
The paper starts by introducing a new dimension along which distributed systems resilience may be evaluated: exhaustion-safety. A node-exhaustion-safe intrusion-tolerant distributed system is a system that assuredly does not suffer more than the assumed number of node failures (e.g., crash, Byzantine). We show that it is not possible to build this kind of system under the asynchronous model.
This paper describes the design, implementation, and evaluation of a run-time system for clusters of workstations that allows the rapid testing of checkpoint protocols with standard benchmarks. To achieve this goal, RENEW provides a flexible set of operations that facilitates the integration of a protocol in the system with reduced programming effort. To support a broad range of applications, RENEW exports, as its external interface, the industry-endorsed Message Passing Interface (MPI).
Genetic algorithms have been used successfully as a global optimization method when the search space is very large. To characterize and analyze the performance of genetic algorithms on a cluster of workstations, a parallel version of GENESIS 5.0 was developed using PVM 3.3. This version, called VMGENESIS, was used to study a nonlinear least-squares problem.
Previous work has studied how to use proactive recovery to build intrusion-tolerant replicated systems that are resilient to any number of faults, as long as recoveries are faster than an upper bound on fault production assumed at system deployment time. In this paper, we propose a complementary approach that combines proactive recovery with services that allow correct replicas to react and recover replicas that they detect or suspect to be compromised.
In the past few decades, critical infrastructures have become largely computerised and interconnected all over the world. This generated the problem of achieving resilience of critical information infrastructures against computer-borne attacks and severe faults. Governments and industry have been pushing an immense research effort in information and systems security, but we believe the complexity of the problem prevents it from being solved using classical security methods.
This document describes the complete specification of the APIs and protocols for the MAFTIA middleware. The architecture of the middleware subsystem was described in a previous document, which introduced its modules and services: Activity Services; Communication Services; Network Abstraction; Trusted and Untrusted Components.
Today's critical infrastructures like the power grid are essentially physical processes controlled by computers connected by networks. They are usually as vulnerable as any other interconnected computer system, but their failure has a high socio-economic impact. The report describes a new construct for the protection of these infrastructures, based on distributed algorithms and mechanisms implemented between a set of devices called CIS.
Consensus is a classical distributed systems problem of both theoretical and practical interest. Asynchronous Byzantine consensus is currently at the core of some solutions for the implementation of highly resilient computing services. This paper surveys Byzantine consensus in message-passing distributed systems by presenting the main theoretical results in the area and the main classes of algorithms, and by discussing important issues like the performance and resilience of these algorithms.
This paper proposes a simple reliable multicast protocol that tolerates arbitrary faults, including malicious faults such as intrusions. The goal is to show a novel way of designing intrusion-tolerant protocols based on a well-founded hybrid fault model. This model is based on a simple distributed security kernel, the TTCB, which is used by the processes only to securely execute critical steps of the protocol. Otherwise, the processes and their communication can be attacked in unlimited ways.
In many emerging wireless scenarios, consensus among nodes represents an important task that must be accomplished in a timely and dependable manner. However, the sharing of the radio medium and the typical communication failures of such environments may seriously hinder this operation. In the paper, we perform a practical evaluation of an existing randomized consensus protocol that is resilient to message collisions and omissions.
Currently, a large number of electronic transactions are performed with credit or debit cards at terminals located in merchant stores, such as point-of-sale devices. The success of this form of payment, however, has an associated cost due to the management and maintenance of the many devices from different generations and manufacturers. In particular, there is an important cost related to the deployment of new software upgrades for the devices, since in most cases human intervention is required.
This document briefly describes three prototypes, each one corresponding to a subset of the MAFTIA middleware architecture. Together, these prototypes represent the most important components of the architecture, and constitute Deliverable D25: Running Lab Prototype of MAFTIA Middleware. The code distribution of the prototypes is available for review, and it includes more extensive documentation.
Replication has been used to build intrusion-tolerant systems, which are able to tolerate a limited number of intrusions before the system is compromised. An important limitation of intrusion-tolerant systems is that if the system's replicas are similar, once a flaw is discovered and exploited in one replica, it is easy to exploit it on the other replicas, compromising the whole system. To circumvent this limitation, one must find a way to make these exploits occur independently.
Static analysis tools facilitate the detection of anomalies or coding errors in an application. These tools help eliminate mistakes made by programmers and can have a significant impact on a product's development cycle, saving time and money.
The communication between computer systems is dictated by network protocols, which determine how the network components interact with each other. Knowing the specification of a network protocol can greatly improve the security and dependability of both the design of the protocol and the applications implementing it.