Robert Strom - Academia.edu

Papers by Robert Strom

Extending typestate checking using conditional liveness analysis

IEEE Transactions on Software Engineering, 1993

We present a practical extension to typestate checking which is capable of proving programs free of uninitialized variable errors even when these programs contain conditionally initialized variables where the initialization of a variable depends upon the equality of one or more "tag" variables to a constant. The user need not predeclare the relationship between a conditionally initialized variable and its tags, and this relationship may change from one point in the program to another. Our technique generalizes liveness analysis to conditional liveness analysis. Like typestate checking, our technique incorporates a dataflow analysis algorithm in which each point in a program is labeled with a lattice point describing statically tracked information, including the initialization of variables. The labeling is then used to check for programming errors such as referencing a variable which may be uninitialized. Our technique incorporates a more expressive lattice, including predicates of the form: "I is initialized if y equals 2." Because the number of tags per variable is small, the added complexity of the analysis is usually small. The efficiency of our technique is due, to a large extent, to the fact that we use a backwards analysis of the program (instead of the forward analysis used in the original typestate checking algorithm). Our results suggest that backwards analysis (tracking only those properties which need to hold to make the subsequent statements correct) can be more efficient than forward analysis (tracking all properties which are made true by the preceding statements). We conclude with some additional applications of our techniques to program checking. Index Terms: Conditionals, dataflow analysis, liveness analysis, program correctness, typestate checking.
These annotations indicate the change of typestate that a parameter will undergo in the function body (e.g., will become initialized or become uninitialized). Without these annotations, one could not typestate check a module without seeing the code body of the function being called. With these annotations, one can prove a module to be typestate correct independent of the function bodies being invoked.

High-level language support for programming distributed systems

Proceedings of the 1992 International Conference on Computer Languages

This paper presents a strategy to simplify the programming of heterogeneous distributed systems. Our approach is based on integrating a high-level distributed programming model, called the process model, directly into programming languages. Distributed applications written in such languages are portable across different environments, are shorter, and are simpler to develop than similar applications developed using conventional approaches. In this paper, we discuss the process model, and present overviews of Hermes and Concert C, two languages that implement this model. Hermes is a secure, representation-independent language designed explicitly around the process model. Concert C is the C language augmented with a small set of extensions to support the process model while allowing reuse of existing C code. Hermes has been prototyped; an implementation of Concert C is in development.

Exactly-once delivery in a content-based publish-subscribe system

Proceedings International Conference on Dependable Systems and Networks

This paper presents a general knowledge model for propagating information in a content-based publish-subscribe system. The model is used to derive an efficient and scalable protocol for exactly-once delivery to large numbers (tens of thousands per broker) of content-based subscribers in either publisher order or uniform total order. Our protocol allows intermediate content filtering at each hop, but requires persistent storage only at the publishing site. It is tolerant of message drops, message reorderings, node failures, and link failures, and maintains only "soft" state at intermediate nodes. We evaluate the performance of our implementation both under failure-free conditions and with fault injection.

Information flow based event distribution middleware

Proceedings. 19th IEEE International Conference on Distributed Computing Systems. Workshops on Electronic Commerce and Web-based Applications. Middleware

Event distribution middleware supports the integration of distributed applications by accepting events from information producers and disseminating applicable events to interested consumers. In this paper we present a flexible new model, the Information Flow Graph (IFG), for specifying the flow of information in such a system. We illustrate the use of the IFG for: (1) content-based publish/subscribe; (2) stateless event transformations that consolidate events from diverse sources; and (3) stateful event interpretation functions for deriving trends, summaries, and alarms from published events and for defining equivalent event sequences. We introduce two techniques for efficient implementation of such systems: (1) a flow graph rewriting optimization which allows stateless IFGs to be converted to a form which can exploit efficient multicast technology developed for content-based publish/subscribe systems; and (2) an algorithm for converting a sequence of events to the shortest equivalent sequence of events with respect to an event interpretation function.

An efficient multicast protocol for content-based publish-subscribe systems

Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003)

The publish/subscribe (or pub/sub) paradigm is a simple and easy to use model for interconnecting applications in a distributed environment. Many existing pub/sub systems are based on pre-defined subjects, and hence are able to exploit multicast technologies to provide scalability and availability. An emerging alternative to subject-based systems, known as content-based systems, allows information consumers to request events based on the content of published messages. This model is considerably more flexible than subject-based pub/sub; however, it was previously not known how to efficiently multicast published messages to interested content-based subscribers within a network of broker (or router) machines. This shortcoming limits the applicability of content-based pub/sub in large or geographically distributed settings. In this paper, we develop and evaluate a novel and efficient technique for multicasting within a network of brokers in a content-based subscription system, thereby showing that content-based pub/sub can be deployed in large or geographically distributed settings.

Typestate: A programming language concept for enhancing software reliability

IEEE Transactions on Software Engineering, 1986

Typestate: A programming language concept for enhancing software reliability. R. E. Strom, S. Yemini. IEEE Transactions on Software Engineering 12, 157-171, 1/1986. The programming concept designated 'typestate', which ...

Placement Strategies for Internet-Scale Data Stream Systems

IEEE Internet Computing, 2008

Optimally assigning streaming tasks to network machines is a key factor that influences a large data-stream-processing system's performance. Although researchers have prototyped and investigated various algorithms for task placement in data stream management systems, taxonomies and surveys of such algorithms are currently unavailable. To tackle this knowledge gap, the authors identify a set of core placement design characteristics and use them to compare eight placement algorithms. They also present a heuristic decision tree that can help designers judge how suitable a given placement solution might be to specific problems.

A recoverable object store

[1988] Proceedings of the Twenty-First Annual Hawaii International Conference on System Sciences. Volume II: Software track

Transparent Recovery of Mach Applications

We have built a software layer on top of Mach 2.5 that recovers multitask Mach applications from fail-stop failures. The layer implements Optimistic Recovery (OR), a mechanism for transparent recovery from failing tasks and processors, based on asynchronous checkpointing and logging of inter-task messages. OR recovers from failure by restoring a checkpoint and replaying the logged messages. The current prototype supports message communication via sends and receives, simple port operations, and task interactions through the environment manager. This paper discusses the issues Mach raised for this implementation, the structure of the OR layer, the design of future enhancements, and comparisons with other recovery techniques.
1 Introduction
During the lifetime of a distributed system one or more of its tasks may fail. Rather than abort the entire system due to a single failure, it is preferable to make the computation fault-tolerant; that is, the computation should continue to funct...
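
The recovery idea in this abstract (restore a checkpoint, then replay the logged messages) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `RecoverableTask` class, its integer messages, and its synchronous in-memory checkpoint and log are hypothetical, and the sketch does not model the asynchronous checkpointing, optimistic logging, or Mach interfaces of the actual OR layer.

```python
import copy


class RecoverableTask:
    """Hypothetical task whose state is a running total of integer messages."""

    def __init__(self):
        self.state = {"total": 0, "count": 0}
        # Assumption: checkpoint and log live in memory and survive the "crash".
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []  # messages received since the last checkpoint

    def _apply(self, msg):
        # Deterministic state transition, so replay reproduces the same state.
        self.state["total"] += msg
        self.state["count"] += 1

    def receive(self, msg):
        self._log.append(msg)  # log the message, then process it
        self._apply(msg)

    def checkpoint(self):
        # Save the current state; earlier log entries are no longer needed.
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def crash(self):
        self.state = None  # simulate losing the volatile state

    def recover(self):
        # Restore the checkpoint, then replay the logged messages in order.
        self.state = copy.deepcopy(self._checkpoint)
        for msg in self._log:
            self._apply(msg)


task = RecoverableTask()
for m in (1, 2, 3):
    task.receive(m)
task.checkpoint()
for m in (4, 5):
    task.receive(m)
state_before_crash = copy.deepcopy(task.state)
task.crash()
task.recover()
```

Replay only reproduces the pre-failure state because `_apply` is deterministic; any nondeterministic input would itself have to be logged.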

Smart Middleware and Light Ends (SMILE) for Simplifying Data Integration

SMILE is a stateful publish-subscribe system that allows subscribers to request continually updated derived views, specified as relational algebraic expressions over published data histories. The derived views can be specified using aggregations, joins, and other transforms. We achieve these for applications that do not require ACID properties but only require that the information they receive is never false and arrives eventually. We formalize this by introducing an “eventual correctness” guarantee and our implementation enforces it using a monotonic type system. We present preliminary performance results of our implementation.

Exploiting Event Stream Interpretation in Publish-Subscribe Systems

Publish-subscribe messaging middleware typically offers limited and low-level options for quality of service, such as best-effort delivery versus reliable delivery, or ordered versus unordered. We propose a new, high-level approach to specifying quality of service, in which the consumer specifies an event stream interpretation function that maps an event stream into a state that represents the consumer's semantics of the stream. Under this approach, the system may deliver either the subscribed event stream, or any alternative stream whose image under the interpretation function yields the same state. Event stream interpretation gives consumers the ability to more accurately specify the tolerable distortions of perfect message delivery, and gives middleware implementations the flexibility to use more efficient protocols for message delivery and failure recovery while preserving application safety.
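
As a rough illustration of the approach described in this abstract, the sketch below assumes one particular interpretation function, "latest value per key", which is not taken from the paper. The names `interpret`, `compact`, and `equivalent` and the sample events are hypothetical. Under that assumption, a shorter stream is an acceptable substitute for the subscribed stream whenever both map to the same state.

```python
def interpret(events):
    """Assumed interpretation function: the latest value seen for each key."""
    state = {}
    for key, value in events:
        state[key] = value
    return state


def compact(events):
    """Return a shorter stream with the same image under `interpret`.

    Only the last event for each key is kept, in original relative order.
    """
    last_index = {key: i for i, (key, _) in enumerate(events)}
    return [ev for i, ev in enumerate(events) if last_index[ev[0]] == i]


def equivalent(stream_a, stream_b):
    """Two streams are interchangeable if they yield the same state."""
    return interpret(stream_a) == interpret(stream_b)


subscribed = [("IBM", 100), ("HP", 40), ("IBM", 101), ("IBM", 102)]
delivered = compact(subscribed)
```

With a different interpretation function (for example a running sum), dropping events in this way would not preserve the state, so the set of acceptable alternative streams depends entirely on what the consumer specifies.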

Edinburgh, Scotland, UK Workshop Organization Program Co-Chairs

Deterministic Replay for Transparent Recovery in Component-Oriented Middleware

2009 29th IEEE International Conference on Distributed Computing Systems

We present and evaluate a low-overhead approach for achieving high availability in distributed event-processing middleware systems consisting of networks of stateful software components that communicate by either one-way (send) or two-way (call) messages. The approach is based on transparently augmenting each component to produce a deterministic component whose state can be recovered by checkpoint and replay. Determinism is achieved by augmenting messages with virtual times, and by scheduling message handling in virtual time order. Scheduling delays are reduced by computing virtual times with estimators: deterministic functions that approximate the expected real times of arrival. We describe our algorithms, show how Java components can be transparently augmented with checkpointing code and with good estimators, discuss how our deterministic runtime can be tuned to reduce overhead, and provide experimental results to measure the overhead of determinism relative to non-determi...
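
The scheduling rule stated in this abstract, handling messages in virtual time order, can be sketched as follows. The message layout `(virtual_time, sender, payload)`, the function names, and the tie-break on sender name are assumptions for illustration. The sketch also assumes that all messages are already available before any is handled, which sidesteps the question of when a message is safe to process that a real runtime would have to answer.

```python
import heapq


def deliver_in_virtual_time_order(arrivals):
    """Yield messages ordered by (virtual_time, sender), ignoring arrival order.

    Each message is a hypothetical (virtual_time, sender, payload) tuple.
    """
    heap = list(arrivals)
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)


def run_component(arrivals):
    """Hypothetical component: its state is the list of payloads it handled."""
    return [payload for _, _, payload in deliver_in_virtual_time_order(arrivals)]


# The same three messages arriving in two different real-time orders.
first_run = [(5, "B", "y"), (3, "A", "x"), (5, "A", "z")]
second_run = list(reversed(first_run))
```

Because the handling order depends only on the virtual times and sender names, both runs leave the component in the same state, which is the property that makes replay after a failure reproduce the original execution.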

Restoring Consistent Global States of Distributed Computations

We present a mechanism for restoring any consistent global state of a distributed computation. This capability can form the basis of support for rollback and replay of computations, an activity we view as essential in a comprehensive environment for debugging distributed programs. Our mechanism records occasional state checkpoints and logs all messages communicated between processes. Our mechanism offers flexibility in the following ways: any consistent global state of the computation can be restored; execution can be replayed either exactly as it occurred initially or with user-controlled variations; there is no need to know a priori what states might be of interest. In addition, if checkpoints and logs are written to stable storage, our mechanism can be used to restore states of computations that cause the system to crash.

Point

ACM SIGPLAN Notices, 1996

Performance Modeling and Placement of Transforms for Stateful Mediations

In this paper we propose a new technique for placing large delivery plans for streaming systems on a network of machines to optimize efficiency measures such as latency. In the model we consider, there is a large network of machines and the different fixed end-points of the network act as publishers and subscribers of information. Information demanded by subscribers is a transformed view of the information published by the publishers. The transformed view is the outcome of an acyclic network of simple transformations operating on the publishers’ information or some intermediate transformed view of it. We propose algorithms for the optimal placement of the acyclic transform network on the network of machines. As an example scenario to evaluate the efficacy of our algorithms we consider SQL queries on streaming relational tables. The transform network in this case is the SQL operator tree for the query. We first show how to model the performance of individual operators acting on distri...

Research paper thumbnail of Extending typestate checking using conditional liveness analysis

IEEE Transactions on Software Engineering, 1993

We present a practical extension to typestate checking which is capable of proving programs free ... more We present a practical extension to typestate checking which is capable of proving programs free of uninitialized variable errors even when these programs contain conditionally initialized variables where the initialization of a variable depends upon the equality of one or more '@tagn variables to a constant. The user need not predeclare the relationship between a conditionally initialized variable and its tags, and this relationship may change from one point in the progrqm to another. Our technique generalizes liveness analysis to conditional liveness analysis. Like typestate checking, our technique incorporates a dataflow analysis algorithm in which each point in a program is labeled with a lattice point describing statically tracked information, including the initialization of variables. The labeling is then used to check for programming errors such as referencing a variable which may be uninitialized. Our technique incorporates a more expressive lattice, including predicates of the form: "I is initialized if y equals 2." Because the number of tags per variable is small, the added complexity of the analysis is usually small. The efficiency of our technique is due, to a large extent, to the fact that we use a backwards analysis of the program (instead of the forward analysis used in the original typestate checking algorithm). Our results suggest that backwards analysis-tracking only those properties which need to hold to make the subsequent statements correct-can be more efficient than forward analysis-tracking all properties which are made true by the preceding statements. We conclude with some additional applications of our techniques to program checking. Index r e m-Conditionals, dataflow analysis, liveness analysis, program correctness, typestate checking. 
'These annotations indicate the change of typestate that a parameter will occur in the function body (e.g., will become initialized or become uninitialized). Without these annotations, one could not typestate check a module without seeing the code body of the function being called. With these annotations, one can prove a module to be typestate correct independent of the function bodies being invoked.

Research paper thumbnail of High-level language support for programming distributed systems

Proceedings of the 1992 International Conference on Computer Languages

This paper presents a strategy to simplify the programming of heterogeneous distributed systems. ... more This paper presents a strategy to simplify the programming of heterogeneous distributed systems. Our approach is based on integrating a high-level distributed programming model, called the process model, directly into programming languages. Distributed applications written in such languages are portable across di erent e n vironments, are shorter, and are simpler to develop than similar applications developed using conventional approaches. In this paper, we discuss the process model, and present o verviews of Hermes and Concert C, t wo languages that implement this model. Hermes is a secure, representation-independent language designed explicitly around the process model. Concert C is the C language augmented with a small set of extensions to support the process model while allowing reuse of existing C code. Hermes has been prototyped; an implementation of Concert C is in development.

Research paper thumbnail of High-level language support for programming distributed systems

Proceedings of the 1992 International Conference on Computer Languages

This paper presents a strategy to simplify the programming of heterogeneous distributed systems. ... more This paper presents a strategy to simplify the programming of heterogeneous distributed systems. Our approach is based on integrating a high-level distributed programming model, called the process model, directly into programming languages. Distributed applications written in such languages are portable across di erent e n vironments, are shorter, and are simpler to develop than similar applications developed using conventional approaches. In this paper, we discuss the process model, and present o verviews of Hermes and Concert C, t wo languages that implement this model. Hermes is a secure, representation-independent language designed explicitly around the process model. Concert C is the C language augmented with a small set of extensions to support the process model while allowing reuse of existing C code. Hermes has been prototyped; an implementation of Concert C is in development.

Research paper thumbnail of Exactly-once delivery in a content-based publish-subscribe system

Proceedings International Conference on Dependable Systems and Networks

This paper presents a general knowledge model for propagating information in a content-based publ... more This paper presents a general knowledge model for propagating information in a content-based publish-subscribe system. The model is used to derive an efficient and scalable protocol for exactly-once delivery to large numbers (tens of thousands per broker) of content-based subscribers in either publisher order or uniform total order. Our protocol allows intermediate content filtering at each hop, but requires persistent storage only at the publishing site. It is tolerant of message drops, message reorderings, node failures, and link failures, and maintains only "soft" state at intermediate nodes. We evaluate the performance of our implementation both under failure-free conditions and with fault injection.

Research paper thumbnail of Exactly-once delivery in a content-based publish-subscribe system

Proceedings International Conference on Dependable Systems and Networks

This paper presents a general knowledge model for propagating information in a content-based publ... more This paper presents a general knowledge model for propagating information in a content-based publish-subscribe system. The model is used to derive an efficient and scalable protocol for exactly-once delivery to large numbers (tens of thousands per broker) of content-based subscribers in either publisher order or uniform total order. Our protocol allows intermediate content filtering at each hop, but requires persistent storage only at the publishing site. It is tolerant of message drops, message reorderings, node failures, and link failures, and maintains only "soft" state at intermediate nodes. We evaluate the performance of our implementation both under failure-free conditions and with fault injection.

Research paper thumbnail of Information flow based event distribution middleware

Proceedings. 19th IEEE International Conference on Distributed Computing Systems. Workshops on Electronic Commerce and Web-based Applications. Middleware

Event distribution middleware supports the integration of distributed applications by accepting e... more Event distribution middleware supports the integration of distributed applications by accepting events from information producers and disseminating applicable events to interested consumers. In this paper we present a flexible new model, the Information Flow Graph (IFG), for specifying the flow of information in such a system. We illustrate the use of the IFG for: (1) content-based publish/subscribe; (2) stateless event transformations that consolidate events from diverse sources; and (3) stateful event interpretation functions for deriving trends, summaries, and alarms from published events and for defining equivalent event sequences. We introduce two techniques for efficient implementation of such systems: (1) a flow graph rewriting optimization which allows stateless IFGs to be converted to a form which can exploit efficient multicast technology developed for content-based publish/subscribe systems; and (2) an algorithm for converting a sequence of events to the shortest equivalent sequence of events with respect to an event interpretation function.

Research paper thumbnail of An efficient multicast protocol for content-based publish-subscribe systems

Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003)

The publish/subscribe (or pub/sub) paradigm is a simple and easy to use model for interconnecting... more The publish/subscribe (or pub/sub) paradigm is a simple and easy to use model for interconnecting applications in a distributed environment. Many existing pub/sub systems are based on pre-defined subjects, and hence are able to exploit multicast technologies to provide scalability and availability. An emerging alternative to subject-based systems, known as content-based systems, allow information consumers to request events based on the content of published messages. This model is considerably more flexible than subject-based pub/sub, however it was previously not known how to efficiently multicast published messages to interested content-based subscribers within a network of broker (or router) machines. This shortcoming limits the applicability of content-based pub/sub in large or geographically distributed settings. In this paper, we develop and evaluate a novel and efficient technique for multicasting within a network of brokers in a content-based subscription system, thereby showing that content-based pub/sub can be deployed in large or geographically distributed settings.

Research paper thumbnail of Typestate: A programming language concept for enhancing software reliability

IEEE Transactions on Software Engineering, 1986

Typestate-A programming language concept for enhancing software reliability. RE STROM, S YEMINI I... more Typestate-A programming language concept for enhancing software reliability. RE STROM, S YEMINI IEEE Transactions on Software Engineering 12, 157-171, 1/1986. The programming concept designated 'typestate', which ...

Research paper thumbnail of Placement Strategies for Internet-Scale Data Stream Systems

IEEE Internet Computing, 2008

Optimally assigning streaming tasks to network machines is a key factor that influences a large d... more Optimally assigning streaming tasks to network machines is a key factor that influences a large data-stream-processing system's performance. Although researchers have prototyped and investigated various algorithms for task placement in data stream management systems, taxonomies and surveys of such algorithms are currently unavailable. To tackle this knowledge gap, the authors identify a set of core placement design characteristics and use them to compare eight placement algorithms. They also present a heuristic decision tree that can help designers judge how suitable a given placement solution might be to specific problems.

Research paper thumbnail of A recoverable object store

[1988] Proceedings of the Twenty-First Annual Hawaii International Conference on System Sciences. Volume II: Software track

Research paper thumbnail of Transparent Recovery of Mach Applications

We have built a software layer on top of Mach 2.5 that recovers multitask Mach applications from ... more We have built a software layer on top of Mach 2.5 that recovers multitask Mach applications from fail-stop failures. The layer implements Optimistic Recovery (OR), a mechanism for transparent recovery from failing tasks and processors, based on asynchronous checkpointing and logging of inter--task messages. OR recovers from failure by restoring a checkpoint and replaying the logged messages. The current prototype supports message communication via sends and receives, simple port operations, and task interactions through the environment manager. This paper discusses the issues Mach raised for this implementation, the structure of the OR layer, the design of future enhancements, and comparisons with other recovery techniques. 1 Introduction During the lifetime of a distributed system one or more of its tasks may fail. Rather than abort the entire system due to a single failure, it is preferable to make the computation fault--tolerant---that is, the computation should continue to funct...

Research paper thumbnail of Smart Middleware and Light Ends (SMILE) for Simplifying Data Integration

SMILE is a stateful publish-subscribe system that allows subscribers to request continually updated derived views, specified as relational algebraic expressions over published data histories. The derived views can be specified using aggregations, joins, and other transforms. We achieve these for applications that do not require ACID properties but only require that the information they receive is never false and arrives eventually. We formalize this by introducing an “eventual correctness” guarantee and our implementation enforces it using a monotonic type system. We present preliminary performance results of our implementation.

Research paper thumbnail of Exploiting Event Stream Interpretation in Publish-Subscribe Systems

Publish-subscribe messaging middleware typically offers limited and low-level options for quality of service, such as best-effort delivery versus reliable delivery, or ordered versus unordered. We propose a new, high-level approach to specifying quality of service, in which the consumer specifies an event stream interpretation function that maps an event stream into a state that represents the consumer's semantics of the stream. Under this approach, the system may deliver either the subscribed event stream, or any alternative stream whose image under the interpretation function yields the same state. Event stream interpretation gives consumers the ability to more accurately specify the tolerable distortions of perfect message delivery, and gives middleware implementations the flexibility to use more efficient protocols for message delivery and failure recovery while preserving application safety.
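The delivery rule in this abstract can be illustrated with a minimal sketch. The event shape, the `interpret` function (latest price per symbol), and the `is_acceptable` helper are all hypothetical names chosen for illustration; the abstract does not give the paper's actual interpretation functions or API.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical event shape: (stock symbol, price).
Event = Tuple[str, float]


def interpret(events: Iterable[Event]) -> Dict[str, float]:
    """Assumed interpretation function: the consumer only cares about
    the latest price seen for each symbol."""
    state: Dict[str, float] = {}
    for symbol, price in events:
        state[symbol] = price
    return state


def is_acceptable(subscribed: Iterable[Event], delivered: Iterable[Event]) -> bool:
    """A delivered stream is acceptable if it maps to the same state
    as the subscribed stream under the interpretation function."""
    return interpret(subscribed) == interpret(delivered)


subscribed = [("IBM", 100.0), ("IBM", 101.0), ("SUN", 5.0), ("IBM", 102.0)]
compacted = [("SUN", 5.0), ("IBM", 102.0)]  # intermediate IBM prices dropped
lossy = [("IBM", 101.0), ("SUN", 5.0)]      # final IBM price lost
```

Under this assumed interpretation, `compacted` is an acceptable substitute for `subscribed` because both yield the same state, while `lossy` is not.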

Research paper thumbnail of Edinburgh, Scotland, UK Workshop Organization Program Co-Chairs

Research paper thumbnail of Deterministic Replay for Transparent Recovery in Component-Oriented Middleware

2009 29th IEEE International Conference on Distributed Computing Systems

We present and evaluate a low-overhead approach for achieving high availability in distributed event-processing middleware systems consisting of networks of stateful software components that communicate by either one-way (send) or two-way (call) messages. The approach is based on transparently augmenting each component to produce a deterministic component whose state can be recovered by checkpoint and replay. Determinism is achieved by augmenting messages with virtual times, and by scheduling message handling in virtual time order. Scheduling delays are reduced by computing virtual times with estimators: deterministic functions that approximate the expected real times of arrival. We describe our algorithms, show how Java components can be transparently augmented with checkpointing code and with good estimators, discuss how our deterministic runtime can be tuned to reduce overhead, and provide experimental results to measure the overhead of determinism relative to non-determi...

Research paper thumbnail of Restoring Consistent Global States of Distributed Computations

We present a mechanism for restoring any consistent global state of a distributed computation. This capability can form the basis of support for rollback and replay of computations, an activity we view as essential in a comprehensive environment for debugging distributed programs. Our mechanism records occasional state checkpoints and logs all messages communicated between processes. Our mechanism offers flexibility in the following ways: any consistent global state of the computation can be restored; execution can be replayed either exactly as it occurred initially or with user-controlled variations; there is no need to know a priori what states might be of interest. In addition, if checkpoints and logs are written to stable storage, our mechanism can be used to restore states of computations that cause the system to crash.
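The combination of occasional checkpoints and a complete message log can be sketched for a single process as follows. This is a simplified illustration only: the `RecordedProcess` class, the `handle` function, and the checkpoint interval are hypothetical, the handler is assumed to be deterministic, and the sketch does not address how states of several processes are combined into a consistent global state.

```python
from typing import Dict, List


def handle(state: int, message: int) -> int:
    """Assumed deterministic message handler: the state is a running sum."""
    return state + message


class RecordedProcess:
    """Toy process that logs every message and checkpoints occasionally."""

    def __init__(self, checkpoint_every: int = 3) -> None:
        self.state = 0
        self.log: List[int] = []                # every message, in arrival order
        self.checkpoints: Dict[int, int] = {0: 0}  # message count -> saved state
        self.checkpoint_every = checkpoint_every

    def receive(self, message: int) -> None:
        self.log.append(message)
        self.state = handle(self.state, message)
        if len(self.log) % self.checkpoint_every == 0:
            self.checkpoints[len(self.log)] = self.state

    def restore(self, n: int) -> int:
        """Return the state after the first n messages by loading the
        latest checkpoint at or before n and replaying the logged messages."""
        start = max(k for k in self.checkpoints if k <= n)
        state = self.checkpoints[start]
        for message in self.log[start:n]:
            state = handle(state, message)
        return state


proc = RecordedProcess()
for m in [5, 1, 4, 2, 8]:
    proc.receive(m)
```

Because every message is logged, any intermediate state can be reconstructed after the fact without knowing in advance which states will be of interest; for example, `proc.restore(4)` replays one message on top of the checkpoint taken after the third.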

Research paper thumbnail of Point

ACM SIGPLAN Notices, 1996

Research paper thumbnail of Performance Modeling and Placement of Transforms for Stateful Mediations

In this paper we propose a new technique for placing large delivery plans for streaming systems on a network of machines to optimize efficiency measures such as latency. In the model we consider, there is a large network of machines and the different fixed end-points of the network act as publishers and subscribers of information. Information demanded by subscribers is a transformed view of the information published by the publishers. The transformed view is the outcome of an acyclic network of simple transformations operating on the publishers’ information or some intermediate transformed view of it. We propose algorithms for the optimal placement of the acyclic transform network on the network of machines. As an example scenario to evaluate the efficacy of our algorithms we consider SQL queries on streaming relational tables. The transform network in this case is the SQL operator tree for the query. We first show how to model the performance of individual operators acting on distri...

Research paper thumbnail of Smart Middleware and Light Ends ( SMILE ) for Simplifying Data Integration

SMILE is a stateful publish-subscribe system that allows subscribers to request continually updated derived views, specified as relational algebraic expressions over published data histories. The derived views can be specified using aggregations, joins, and other transforms. We achieve these for applications that do not require ACID properties but only require that the information they receive is never false and arrives eventually. We formalize this by introducing an “eventual correctness” guarantee and our implementation enforces it using a monotonic type system. We present preliminary performance results of our