Joseph L Hellerstein - Academia.edu

Papers by Joseph L Hellerstein

YSCOPE: A Shell for Building Expert Systems for Solving Computer-Performance Problems

Int. CMG Conference, 1985

Expert Systems in Data Processing: Applications Using IBM's Knowledge Tool

The CHAMPS system: change management with planning and scheduling

2004 IEEE/IFIP Network Operations and Management Symposium (IEEE Cat. No.04CH37507)

Rules of thumb for selecting metrics for detecting performance problems

System and Method for Delivering an Integrated Server Administration Platform

Expert Systems in Data Processing Applications Using IBM Knowledge Tool

Computer Measurement Group Conference, 1989

Systematic Analysis of Challenge-Driven Improvements in Molecular Prognostic Models for Breast Cancer

Science Translational Medicine, 2013

An open challenge to model breast cancer prognosis revealed that collaboration and transparency enhanced the power of prognostic models.

YES/MVS and the automation of operations for large computer complexes

IBM Systems Journal, 1986

The Yorktown Expert System/MVS Manager (known as YES/MVS) is an experimental expert system that assists with the operation of a large computer complex. The first version of YES/MVS (called YES/MVS I) was used regularly in the computing center of IBM's Thomas J. Watson Research Center for most of a year. Based on the experience gained in developing and us-

Adapting Modeling and Simulation Credibility Standards to Computational Systems Biology

arXiv (Cornell University), Jan 14, 2023

Computational models are increasingly used in high-impact decision making in science, engineering, and medicine. The National Aeronautics and Space Administration (NASA) uses computational models to perform complex experiments that are otherwise prohibitively expensive or require a microgravity environment. Similarly, the Food and Drug Administration (FDA) and European Medicines Agency (EMA) have begun accepting models and simulations as a form of evidence for pharmaceutical and medical device approval. It is crucial that computational models meet a standard of credibility when they are used in high-stakes decision making. For this reason, institutions including NASA, the FDA, and the EMA have developed standards to promote and assess the credibility of computational models and simulations. However, due to the breadth of models these institutions assess, these credibility standards are mostly qualitative and avoid making specific recommendations. On the other hand, modeling and simulation in systems biology is a narrow domain, and several standards are already in place. As systems biology models increase in complexity and influence, the development of a credibility assessment system is crucial. Here we review existing standards in systems biology and credibility standards in other science, engineering, and medical fields, and propose the development of a credibility standard for systems biology models.

1 Current Standards in Systems Biology

Klipp et al. describe standards as agreed-upon formats used to enhance information exchange and mutual understanding [10]. In the field of systems biology, standards are a means to share information about experiments, models, data formats, nomenclature, and graphical representations of biochemical systems. Standardized means of information exchange improve model reuse, expandability, and integration, as well as allowing communication between tools. In a survey of 125 systems biologists, most thought of standards as essential to their field, primarily for the purpose of reproducing and checking simulation results, both essential aspects of credibility [10]. A multitude of standards exist in systems biology for processes from annotation to dissemination. Although there is

Building and Optimizing Declarative Networked Systems

The Role of Quantitative Models in Building Scalable Cloud Infrastructures

2010 Seventh International Conference on the Quantitative Evaluation of Systems, 2010

Obfuscatory obscanturism: Making workload traces of commercially-sensitive systems safe to release

2012 IEEE Network Operations and Management Symposium, 2012

Cloud providers such as Google are interested in fostering research on the daunting technical challenges they face in supporting planetary-scale distributed systems, but no academic organizations have similar-scale systems on which to experiment. Fortunately, good research can still be done using traces of real-life production workloads, but there are risks in releasing such data, including inadvertently disclosing confidential or proprietary information, as happened with the Netflix Prize data. This paper discusses these risks and our approach to them, which we call systematic obfuscation. It protects proprietary and personal data while leaving it possible to answer interesting research questions. We explain and motivate some of the risks and concerns and propose how they can best be mitigated, using as an example our recent publication of a month-long trace of a production system workload on an 11k-machine cluster.
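
The abstract gives only the high-level approach, but a minimal sketch can illustrate the two transformations systematic obfuscation implies: replacing identifiers with keyed one-way hashes, and rescaling resource values so relative comparisons survive while absolute capacities are hidden. The field names, key handling, and normalization below are assumptions for illustration, not Google's actual release pipeline.

```python
# Sketch: keyed hashing of identifiers plus rescaling of resource values.
# SECRET_KEY, the field names, and the normalization are assumptions.
import hashlib
import hmac

SECRET_KEY = b"site-specific secret kept by the trace publisher"

def obfuscate_id(raw_id: str) -> str:
    """Replace a user/job name with a stable one-way token; without the
    key, the original name cannot be recovered or dictionary-tested."""
    return hmac.new(SECRET_KEY, raw_id.encode(), hashlib.sha256).hexdigest()[:16]

def rescale(value: float, max_observed: float) -> float:
    """Express a measurement as a fraction of the largest observed value,
    hiding absolute machine capacities but preserving comparisons."""
    return value / max_observed if max_observed else 0.0

record = {"user": "alice", "job": "frontend-batch", "cpu_cores": 2.5}
safe = {
    "user": obfuscate_id(record["user"]),
    "job": obfuscate_id(record["job"]),
    "cpu": rescale(record["cpu_cores"], max_observed=32.0),
}
print(safe)
```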

Using MIMO feedback control to enforce policies for interrelated metrics with application to the Apache Web server

NOMS 2002. IEEE/IFIP Network Operations and Management Symposium 'Management Solutions for the New Communications World' (Cat. No.02CH37327)

Policy-based management provides a means for IT systems to operate according to business needs. Unfortunately, there is often an "impedance mismatch" between the policies administrators want and the controls they are given. Consider the Apache web server: administrators want to control CPU and memory utilizations, but this must be done indirectly by manipulating tuning parameters such as MaxClients and KeepAlive. There has been much interest in using feedback control to bridge this impedance mismatch. However, these efforts have focused on a single metric manipulated by a single control, and hence have not considered interactions between controls, which are common in computing systems. This paper shows how multiple-input, multiple-output (MIMO) control theory can be used to enforce policies for interrelated metrics. MIMO is used both to model the target system, Apache in our case, and to design feedback controllers. The MIMO model captures the interactions between KeepAlive and MaxClients and can be used to identify infeasible metric policies. In addition, MIMO control techniques can provide considerable benefit in handling trade-offs between speed of metric convergence and sensitivity to random fluctuations while enforcing the desired policies.
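
As a rough illustration of the MIMO idea, the sketch below pairs a first-order state-space model x(k+1) = A x(k) + B u(k), with x = (CPU, MEM) and u = (KeepAlive, MaxClients), with an integral controller that uses the inverse DC gain to decouple the two loops. The matrices, gains, and knob values are illustrative placeholders, not the paper's identified Apache model.

```python
# Sketch: MIMO integral control with static (DC-gain) decoupling.
# A and B are illustrative placeholders; in practice they come from
# system identification against the live server.
import numpy as np

A = np.array([[0.50, 0.10],     # state x = (CPU util, MEM util)
              [0.05, 0.60]])
B = np.array([[-0.010, 0.004],  # inputs u = (KeepAlive, MaxClients)
              [-0.002, 0.003]])
G = np.linalg.inv(np.eye(2) - A) @ B   # steady-state (DC) gain of the model

def integral_step(u, x, r, gain=0.2):
    """Adjust both knobs at once; inverting G decouples the interactions
    so each metric's error mostly drives its own loop."""
    return u + gain * np.linalg.solve(G, r - x)

x = np.array([0.2, 0.2])               # measured (CPU, MEM)
u = np.array([11.0, 600.0])            # illustrative knob settings
r = np.array([0.6, 0.5])               # policy targets for (CPU, MEM)
for _ in range(100):
    u = integral_step(u, x, r)
    x = A @ x + B @ u                  # plant response under the model
print(np.round(x, 3))                  # approaches r when the model holds
```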

Optimizing Quality of Service Using Fuzzy Control

Lecture Notes in Computer Science, 2002

The rapid growth of eCommerce increasingly means business revenues depend on providing good quality of service (QoS) for web site interactions. Traditionally, system administrators have been responsible for optimizing tuning parameters, a process that is time-consuming, skills-intensive, and therefore costly. This paper describes an approach to automating parameter tuning using a fuzzy controller that employs rules incorporating qualitative knowledge of the effect of tuning parameters. An example of such qualitative knowledge in the Apache web server is "MaxClients has a concave upward effect on response times." Our studies using a real Apache web server suggest that such a scheme can improve performance without human intervention. Further, we show that the controller can automatically adapt to changes in workloads.
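
The concave-upward knowledge translates into a hill-climbing rule base: keep moving MaxClients in a direction that improves response time, reverse and shrink the step otherwise. The sketch below implements those two rules crisply against a synthetic response-time curve; the paper's controller uses genuine fuzzy membership functions and measures a live Apache server, so the curve, step sizes, and noise level here are assumptions.

```python
# Sketch: qualitative hill-climbing rules for a knob with a concave
# effect on performance. The response-time curve is synthetic.
import random

def response_time(max_clients: float) -> float:
    """Stand-in for measuring the live server: concave-upward response
    time with its minimum near MaxClients = 600, plus noise."""
    return 2e-5 * (max_clients - 600) ** 2 + 0.2 + random.gauss(0, 0.005)

max_clients, change = 200.0, 50.0
prev_rt = response_time(max_clients)
for _ in range(40):
    max_clients += change
    rt = response_time(max_clients)
    # R1: if the last change improved response time, keep that direction.
    # R2: if it worsened response time, reverse and shrink the step.
    if rt >= prev_rt:
        change = -change / 2
        if abs(change) < 5:                      # keep probing despite noise
            change = 5.0 if change > 0 else -5.0
    prev_rt = rt
print(f"MaxClients settled near {max_clients:.0f}")
```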

A first-principles approach to constructing transfer functions for admission control in computing systems

Proceedings of the 41st IEEE Conference on Decision and Control, 2002.

This paper develops a first principles approach to constructing parameterized transfer function models for an abstraction of admission control, the M/M/1/K queueing system. We linearize this system using the first order model y(k + 1) = ay(k) + bu(k), where y is the output (e.g., number in system) and u is buffer size. The pole a is estimated as the lag 1 autocorrelation of y at steady state, and b is estimated using dy/du. With these analytic models for a and b, we study the effects of workload (i.e., arrival and service rates) and sample times. We show that a and b move in opposite directions at large utilizations, an effect that can have significant implications on closed loop poles. Further, the DC gain for response time and number in system drops to 0 as buffer size increases, and the DC gain of number in system converges to 0.5 as workload intensity becomes large. These insights may aid in designing robust and/or adaptive controllers for computing systems. Last, our models provide insight into why the integral control of a Lotus Notes email server has an oscillatory response to a change in reference value.
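
One consistent reading of the recipe: since the steady-state gain of y(k+1) = ay(k) + bu(k) is b/(1-a), the pole a can be estimated as the lag-1 autocorrelation of the sampled number-in-system, and b as (1-a) times a finite-difference estimate of dȳ/du. The sketch below applies this to a simulated M/M/1/K queue; the rates, sample interval, and run lengths are illustrative, and the finite-difference estimate of b is noisy at these run lengths.

```python
# Sketch: estimate a and b for y(k+1) = a*y(k) + b*u(k) from a simulated
# M/M/1/K queue. Rates and the sample interval T are assumptions.
import random

def simulate_mm1k(lam, mu, K, T, n_samples, seed=0):
    """Event-driven M/M/1/K; returns number-in-system sampled every T."""
    rng = random.Random(seed)
    t, n, samples, next_sample = 0.0, 0, [], T
    t_arr, t_dep = rng.expovariate(lam), float("inf")
    while len(samples) < n_samples:
        t_next = min(t_arr, t_dep)
        while next_sample <= t_next and len(samples) < n_samples:
            samples.append(n)                 # record state before the event
            next_sample += T
        t = t_next
        if t_arr <= t_dep:                    # arrival (blocked if n == K)
            if n < K:
                n += 1
                if n == 1:
                    t_dep = t + rng.expovariate(mu)
            t_arr = t + rng.expovariate(lam)
        else:                                 # departure
            n -= 1
            t_dep = t + rng.expovariate(mu) if n > 0 else float("inf")
    return samples

def lag1_autocorr(y):
    m = sum(y) / len(y)
    num = sum((y[i] - m) * (y[i + 1] - m) for i in range(len(y) - 1))
    return num / sum((v - m) ** 2 for v in y)

lam, mu, T, N = 0.8, 1.0, 5.0, 20000
y1 = simulate_mm1k(lam, mu, K=20, T=T, n_samples=N)
y2 = simulate_mm1k(lam, mu, K=22, T=T, n_samples=N, seed=1)
a = lag1_autocorr(y1)
dy_du = (sum(y2) / N - sum(y1) / N) / 2       # finite difference over K
b = (1 - a) * dy_du                           # from DC gain = b / (1 - a)
print(f"a = {a:.3f}, b = {b:.4f}")
```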

Self-Managing Systems: A Control Theory Foundation

12th IEEE International Conference and Workshops on the Engineering of Computer-Based Systems (ECBS'05)

The high cost of operating large computing installations has motivated a broad interest in reducing the need for human intervention by making systems self-managing. This paper explores the extent to which control theory can provide an architectural and analytic foundation for building self-managing systems, either from new components or by layering on top of existing components. Further, we propose a deployable testbed for autonomic computing (DTAC) that we believe will reduce the barriers to addressing key research problems in autonomic computing. The initial DTAC architecture is described along with several problems that it can be used to investigate.

Generic Online Optimization of Multiple Configuration Parameters with Application to a Database Server

Lecture Notes in Computer Science, 2003

Optimizing configuration parameters is time-consuming and skills-intensive. This paper proposes a generic approach to automating this task. By generic, we mean that the approach is relatively independent of the target system for which the optimization is done. Our approach uses online adjustment of configuration parameters to discover the system's performance characteristics. Doing so creates two challenges: (1) handling interdependencies between configuration parameters and (2) minimizing the deleterious effects on production workload while the optimization is underway. Our approach addresses (1) by including in the architecture a rule-based component that handles interdependencies between configuration parameters. For (2), we use a feedback mechanism for online optimization that searches the parameter space in a way that generally avoids poor performance at intermediate steps. Our studies of a DB2 Universal Database Server under an e-commerce workload indicate that our approach can be effective in practice.
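
A minimal sketch of such a loop, assuming hypothetical parameter names, a made-up interdependency rule, and a synthetic performance surface in place of live throughput measurements (the paper's actual DB2 knobs and rules differ): probe one parameter at a time with small steps, keep only improving moves, and pass every trial configuration through the rule component first.

```python
# Sketch: rule-filtered coordinate search over configuration parameters.
# Parameter names, the rule, and measure() are illustrative assumptions.

def enforce_rules(cfg: dict) -> dict:
    """Example interdependency rule: sort_heap must stay below a fixed
    fraction of buffer_pool."""
    cfg["sort_heap"] = min(cfg["sort_heap"], cfg["buffer_pool"] // 4)
    return cfg

def measure(cfg: dict) -> float:
    """Stand-in for measuring live throughput (higher is better);
    a synthetic concave surface with its optimum at (4000, 900)."""
    bp, sh = cfg["buffer_pool"], cfg["sort_heap"]
    return -((bp - 4000) ** 2) / 1e5 - ((sh - 900) ** 2) / 1e4

cfg = {"buffer_pool": 1000, "sort_heap": 200}
steps = {"buffer_pool": 200, "sort_heap": 50}
best = measure(cfg)
for _ in range(100):
    for p, step in steps.items():
        trial = enforce_rules({**cfg, p: cfg[p] + step})
        score = measure(trial)
        if score > best:                 # keep only improving moves, so
            cfg, best = trial, score     # intermediate states stay good
        else:
            steps[p] = -step // 2 or -1  # reverse and shrink the probe
print(cfg, round(best, 2))
```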

Managing the Performance Impact of Administrative Utilities

Lecture Notes in Computer Science, 2003

Administrative utilities (e.g., filesystem and database backups, garbage collection in the Java Virtual Machine) are an essential part of the operation of production systems. Since production work can be severely degraded by the execution of such utilities, it is desirable to have policies of the form "There should be no more than an x% degradation of production work due to utility execution." Two challenges arise in providing such policies: (1) providing an effective mechanism for throttling the resource consumption of utilities and (2) continuously translating from policy expressions of "degradation units" into the appropriate settings for the throttling mechanism. We address (1) by using self-imposed sleep, a technique that forces utilities to slow down their processing by a configurable amount. We address (2) by employing an online estimation scheme in combination with a feedback loop. This throttling system is autonomous and adaptive, and allows the system to self-manage its utilities to limit their performance impact with only high-level policy input from the administrator. We demonstrate the effectiveness of these approaches in a prototype system that incorporates these capabilities into IBM's DB2 Universal Database server.
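
A minimal sketch of self-imposed sleep under integral feedback, with a synthetic degradation estimator standing in for the paper's online estimation scheme; the work function, cycle length, and gain are assumptions for illustration.

```python
# Sketch: a utility sleeps a fraction of each work cycle; integral
# feedback adjusts that fraction so measured degradation of production
# work tracks the policy target.
import random
import time

def do_backup_slice(seconds: float):
    """Stand-in for one slice of the utility's real work (e.g., copying
    a few pages of a database backup)."""
    time.sleep(seconds)

def measure_degradation(sleep_fraction: float) -> float:
    """Synthetic stand-in for the online estimator: here an unthrottled
    utility degrades production work by 40%, scaling with its busy
    fraction, plus measurement noise."""
    return 0.40 * (1 - sleep_fraction) + random.gauss(0, 0.01)

TARGET = 0.10                            # policy: at most 10% degradation
sleep_fraction, gain, cycle = 0.5, 0.5, 0.01
for _ in range(100):                     # one iteration per work cycle
    do_backup_slice(cycle * (1 - sleep_fraction))
    time.sleep(cycle * sleep_fraction)   # the self-imposed sleep
    observed = measure_degradation(sleep_fraction)
    # Integral feedback: sleep more when degradation exceeds the policy,
    # less when there is headroom.
    sleep_fraction += gain * (observed - TARGET)
    sleep_fraction = min(0.95, max(0.0, sleep_fraction))
print(f"settled sleep fraction = {sleep_fraction:.2f}")
```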

Control of large scale computing systems

ACM SIGBED Review, 2006

The rapidly increasing scale of computing systems means that it is vitally important to address the scaling challenges in the control of computing systems. We introduce a framework for describing the control problems for large scale computing systems that expand along two dimensions: the scale of the target system and the scale of the policy. Using this framework, we present control architectures that span a range from centralized schemes to distributed solutions. We further identify several research challenges related to issues such as target system latencies and policy decomposition.
