Dylan J McNamee - Academia.edu
Papers by Dylan J McNamee
Parallel and Distributed Processing Techniques and Applications, 1997
Reservation-based scheduling delivers a proportion of the CPU to jobs over a period of time. In this paper we argue that automatically determining and assigning this period is both possible and useful in general purpose soft real-time environments such as personal computers and information appliances. The goal of period adaptation is to select the period over which a job is guaranteed to receive its portion of the CPU dynamically and automatically. The choice of period represents a trade-off between the amount of jitter observed by the job and the overall efficiency of the system. Secondary effects of period include quantization error, job priority, changes in memory behavior, and battery life of portable devices. In addition to discussing these issues in detail, we present the design and evaluation of a mechanism for period adaptation based on feedback control. Together with an existing proportion allocation mechanism, this period adapter merges the benefits of best-effort and reservation-based systems by providing the fine-grain control of reservation-based scheduling without requiring applications to specify their own resource needs in advance.
Proceedings of SPIE, Dec 29, 1997
Device independent I/O has been a holy grail to operating system designers since the early days of UNIX. Unfortunately, existing operating systems fall short of this goal for multimedia applications. Techniques such as caching and sequential read-ahead can help mask I/O latency in some cases, but in others they increase latency and add substantial jitter. Multimedia applications, such as video players, are sensitive to vagaries in performance since I/O latency and jitter affect the quality of presentation. Our solution uses adaptive prefetching to reduce both latency and jitter. Applications submit file access plans to the prefetcher, which then generates I/O requests to the operating system and manages the buffer cache to isolate the application from variations in device performance. Our experiments show device independence can be achieved: an MPEG video player sees the same latency when reading from a local disk or an NFS server. Moreover, our approach reduces jitter substantially.
Proceedings of SPIE, Dec 22, 2000
A packet scheduler is an operating system component that controls the allocation of network interface bandwidth to outgoing network flows. By deciding which packet to send next, packet schedulers not only determine how bandwidth is shared among flows, but also play a key role in determining the rate and timing behavior of individual flows. The recent explosion of rate and timing-sensitive flows, particularly in the context of multimedia applications, has focused new interest on packet schedulers. Next generation packet schedulers must not only ensure separation among flows and meet real-time performance constraints, they must also support dynamic fine-grain reallocation of bandwidth for flows with variable-bit-rate requirements. Unfortunately, today's packet schedulers either do not support rate and timing sensitive flows, or do so with reservation systems that are relatively coarse-grain and inflexible. This paper makes two contributions. First it shows how bandwidth requirements can be inferred directly from real-rate flows, without requiring explicit specifications from the application. Second, it presents the design, implementation and performance evaluation of a rate-matching packet scheduler that uses these inferred requirements to automatically and dynamically control the bandwidth allocation to flows.
Achieving maximum performance in message-passing programs requires that calculation and communication be overlapped. However, the program transformations required to achieve this overlap are error-prone and add significant complexity to the application program. We argue that calculation/communication overlap can be achieved easily and consistently by executing multiple threads of control on each processor, and that this approach is practical on message-passing architectures without any special hardware support. We present timing data for a typical message-passing application, to demonstrate the advantages of our scheme.
Proceedings of SPIE, Mar 1, 1998
This paper describes the design and implementation of a real-time, streaming, Internet video and audio player. The player has a number of advanced features including dynamic adaptation to changes in available bandwidth, latency and latency variation; a multi-dimensional media scaling capability driven by user-specified quality of service (QoS) requirements; and support for complex content comprising multiple synchronized video and audio streams. The player was developed as part of the QUASAR project at Oregon Graduate Institute, is freely available, and serves as a testbed for research in adaptive resource management and QoS control.
In this paper we propose changing the decades-old practice of allocating CPU to threads based on priority to a scheme based on proportion and period. Our scheme allocates to each thread a percentage of CPU cycles over a period of time, and uses a feedback-based adaptive scheduler to assign automatically both proportion and period. Applications with known requirements, such as isochronous software devices, can bypass the adaptive scheduler by specifying their desired proportion and/or period. As a result, our scheme provides reservations to applications that need them, and the benefits of proportion and period to those that do not. Adaptive scheduling using proportion and period has several distinct benefits over either fixed or adaptive priority based schemes: finer grain control of allocation, lower variance in the amount of cycles allocated to a thread, and avoidance of accidental priority inversion and starvation, including defense against denial-of-service attacks. This paper describes our design of an adaptive controller and proportion-period scheduler, its implementation in Linux, and presents experimental validation of our approach.
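As a rough illustration of what a proportion-and-period reservation means, the sketch below computes the CPU time a thread would receive in each period and checks whether a set of reservations fits on one CPU. The `Reservation` class, its field names, and the `admissible` rule (total proportion must not exceed the CPU's capacity) are assumptions made for this example; the abstract does not describe the paper's data structures or admission rule, and the feedback controller is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """Hypothetical record for one thread's reservation (not from the paper)."""
    name: str
    proportion: float  # fraction of one CPU, between 0.0 and 1.0
    period_ms: float   # interval over which the proportion is delivered

    def budget_ms(self) -> float:
        # CPU time the thread should receive within each period.
        return self.proportion * self.period_ms


def admissible(reservations, capacity=1.0):
    # Assumed rule for a single CPU: the proportions must not exceed capacity.
    return sum(r.proportion for r in reservations) <= capacity


if __name__ == "__main__":
    audio = Reservation("audio", proportion=0.25, period_ms=40.0)
    video = Reservation("video", proportion=0.5, period_ms=100.0)
    for r in (audio, video):
        print(r.name, r.budget_ms(), "ms of CPU per", r.period_ms, "ms")
    print("fits on one CPU:", admissible([audio, video]))
```

A shorter period with the same proportion delivers the same share of the CPU in smaller, more frequent slices, which is the lever the adaptive scheduler in the abstract is said to control alongside proportion.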
This paper presents an algorithm for scheduling parallel applications in large-scale, multiuser, heterogeneous distributed systems. The approach is primarily targeted at systems that harvest idle cycles in general-purpose workstation networks, but is also applicable to clustered computer systems and massively parallel processors. The algorithm handles unequal processor capacities, multiple architecture types and dynamic variations in the number of processes and available processors. Scheduling decisions are driven by the desire to minimize turnaround time while maintaining fairness among competing applications. For efficiency, the virtual processors (VPs) of each application are gang scheduled on some subset of the available physical processors.
In this paper we propose to use feedback control to automatically allocate disk bandwidth in order to match the rate of disk I/O to the real-rate [13] needs of applications. We describe a model for adaptive resource management based on measuring the relative progress of stages in a producer-consumer pipeline. We show how to use prefetching to transform a passive disk into an active data producer whose progress can be controlled via feedback. Our progress-based framework allows the integrated control of multiple resources. The resulting system automatically adapts to varying application rates as well as to varying device latencies.
Lecture Notes in Computer Science, 2016
The Software Analysis Workbench (SAW) is a system for translating programs into logical expressions, transforming these expressions, and using external reasoning tools (such as SAT and SMT solvers) to prove properties about them. In the implementation of this translation, SAW combines efficient symbolic execution techniques in a novel way. It has been used most extensively to prove that implementations of cryptographic algorithms are functionally equivalent to reference specifications, but can also be used to identify inputs to programs that will lead to outputs with particular properties, and prove other properties about programs. In this paper, we describe the structure of the SAW system and present experimental results demonstrating the benefits of its implementation techniques.
Application domains such as multimedia, databases, and parallel computing, require operating system services with high performance and high functionality. Existing operating systems provide fixed interfaces and implementations to system services and resources. This makes them inappropriate for applications whose resource demands and usage patterns are poorly matched by the services provided. The SPIN operating system enables system services to be defined in an application-specific fashion, through an extensible microkernel. It offers applications fine-grained control over a machine's logical and physical resources through run-time adaptation of the system to application requirements.
Springer eBooks, 1997
Multimedia applications are sensitive to I/O latency and jitter when accessing data in secondary storage. Transparent adaptive prefetching (TAP) uses software feedback to provide multimedia applications with file system quality of service (QoS) guarantees. We are investigating how QoS requirements can be communicated and how they can be met by adaptive resource management. A preliminary test of adaptive prefetching is presented.
CPU scheduling and admission testing for multimedia applications have been extensively studied, and various solutions have been proposed using assorted simplifying assumptions. However, we believe that the complexity and dynamic behavior of multimedia applications and systems make static solutions hard to apply in real-world situations. We are analyzing the difficulties that arise when applying the rate-monotonic (RM) scheduling algorithm and the corresponding admission tests for CPU management, in the context of real multimedia applications running on real systems. RM requires statically predictable, periodic workloads, and while multimedia applications appear to be periodic, in practice they exhibit numerous variabilities in workload. Our study suggests the need for more adaptive scheduling mechanisms, which would allow complex applications to dynamically respond to variations in workload and resource availability. Furthermore, we believe there is a need for a more abstract characterization of applications and resources for admission testing purposes. We conclude that adaptive CPU scheduling policies should address the needs of CPU scheduling and reservation for current multimedia applications.
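For readers unfamiliar with rate-monotonic admission testing, the sketch below shows the classic Liu and Layland utilization bound, under which n periodic tasks are accepted if their total utilization is at most n(2^(1/n) - 1). The abstract does not say which admission tests the study examined, so this is an assumption chosen for illustration; the function names and task values are made up. The test is sufficient but not necessary, and it presumes fixed compute times and periods, which is the static predictability the abstract says real multimedia workloads lack.

```python
def rm_utilization_bound(n: int) -> float:
    """Liu-Layland bound for n periodic tasks under rate-monotonic priorities."""
    return n * (2 ** (1 / n) - 1)


def rm_admit(tasks) -> bool:
    """tasks: list of (compute_time, period) pairs in the same time unit.

    Returns True when the sufficient utilization test passes. A False result
    means the test is inconclusive, not that the task set must miss deadlines.
    """
    if not tasks:
        return True
    utilization = sum(compute / period for compute, period in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


if __name__ == "__main__":
    light = [(1, 4), (1, 5), (2, 10)]  # utilization 0.65
    heavy = [(2, 4), (2, 5), (1, 10)]  # utilization 1.00
    print(round(rm_utilization_bound(3), 4))
    print(rm_admit(light), rm_admit(heavy))
```

If a video decoder's per-frame compute time varies from frame to frame, the `compute_time` fed to such a test must be a worst-case estimate, which tends to over-reserve the CPU.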
The Mach external pager interface allows applications to supply their own routines for moving pages to and from second-level store. Mach doesn't allow applications to choose their own page replacement policy, however. Some applications have access patterns that may make least recently used page replacement inappropriate. In this paper, we describe an extension to the external pager interface that allows the programmer to specify the page replacement policy as well as the backing storage for a region of virtual memory.
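To illustrate why an access pattern can make least recently used (LRU) replacement inappropriate, the sketch below simulates a small page cache with a pluggable victim-selection function. It is a toy simulation written for this listing, not the Mach external pager interface or the extension described in the paper; `count_faults`, `lru`, and `mru` are hypothetical names. A cyclic scan over one more page than there are frames is a well-known bad case for LRU.

```python
def count_faults(accesses, frames, choose_victim):
    """Count page faults for an access trace under a given replacement policy.

    `resident` is kept ordered from least to most recently used, and
    `choose_victim` picks which resident page to evict when memory is full.
    """
    resident = []
    faults = 0
    for page in accesses:
        if page in resident:
            # Hit: move the page to the most-recently-used end.
            resident.remove(page)
            resident.append(page)
            continue
        faults += 1
        if len(resident) >= frames:
            resident.remove(choose_victim(resident))
        resident.append(page)
    return faults


def lru(resident):
    return resident[0]   # evict the least recently used page


def mru(resident):
    return resident[-1]  # evict the most recently used page


if __name__ == "__main__":
    # Four passes over five pages, with room for only four of them.
    trace = [0, 1, 2, 3, 4] * 4
    print("LRU faults:", count_faults(trace, 4, lru))
    print("MRU faults:", count_faults(trace, 4, mru))
```

On this trace LRU faults on every access because it always evicts the page that will be needed soonest, while the MRU policy keeps most of the working set resident. An application that knows it scans data cyclically is in a position to pick the better policy, which is the kind of choice the abstract argues for exposing.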
When user-level threads are built on top of traditional kernel threads, they can exhibit poor performance or even incorrect behavior in the face of blocking kernel operations such as I/O, page faults, and processor preemption. This problem can be solved by building user-level threads on top of a new kernel entity, the scheduler activation. The goal of the effort described in this paper was to implement scheduler activations in the Mach 3.0 operating system. We describe the design decisions made, the kernel modifications required, and our additions to the CThreads thread library to take advantage of the new kernel structure. We also isolate the performance costs incurred due to scheduler activations support, and empirically demonstrate that these costs are outweighed by the benefits of this approach.
Proceedings of the 6th workshop on ACM SIGOPS European workshop Matching operating systems to application needs - EW 6, 1994
Files are a tried and true operating system abstraction. They present a simple byte-stream model of I/O that has proven intuitive for application programmers and efficient for operating system builders. However, current file systems do not provide good support for adaptive continuous media (CM) applications – an increasingly important class of applications that exhibit complex access patterns and are particularly sensitive to variations in I/O performance. To address these problems we propose synthetic files. Synthetic files are specialized views of underlying regular files, and convert complex file access patterns into simple sequential synthetic file access patterns. Synthetic file construction can be viewed as a declarative meta-interface for I/O, enabling application-driven prefetching strategies that can hide device access latency even for applications with complex access patterns. Synthetic files can be realized dynamically, incrementally, or even optimistically. In this paper we outline a feedback-driven, incremental creation strategy that hides variations in device access latency for QoS-adaptive CM applications. This project was supported in part by DARPA contracts/grants N66001-97-C-8522, N66001-97-C-8523, and F19628-95-C-0193, and by Tektronix, Inc. and Intel Corporation.
This document explains how to use BibFrame on machines at the University of Washington. BibFrame is an interface between BibTeX and Frame. If you use Frame now (for anything more than reading this document) BibFrame will be a huge time-saver. If you still use LaTeX, BibFrame could just be that “last straw” that encourages you to move from the 1970’s technology of TeX, and into the 1980’s. As an utter and total aside, this document can also serve as a template for you to use to make your own Frame documents that roughly resemble LaTeX’s document format.