David Nagle - Academia.edu

Papers by David Nagle

Designing Computer Systems with MEMS-based Storage (CMU-CS-00-137)

Towards higher disk head utilization: extracting free bandwidth from busy disk drives

Operating Systems Design and Implementation, Oct 22, 2000

A cost-effective, high-bandwidth storage architecture

Sigplan Notices, Oct 1, 1998

Monster: a tool for analyzing the interaction between operating systems and computer architectures

"Fine-grained measurements" involve the monitoring of events that can change as frequently as once every machine cycle.

Sharding the shards: managing datastore locality at scale with Akkio

Operating Systems Design and Implementation, Oct 8, 2018

Building firewalls with intelligent network interface cards

Abstract: "The primary method for protecting networks today is to use a firewall: a boundary separating the protected network from the untrusted Internet. However, these firewalls offer no protection from internal attacks, scale poorly due to limited firewall processing capacity, and do not support mobile computing. Distributing a firewall to each network host avoids many of these problems, but weakens the security guarantees of the network since it places the firewall under the control of the host OS. Leveraging the increasing capability of embedded-VLSI, including network-specific processors, we propose a Network Interface Card (NIC) based distributed firewall. Supporting the same (and more) functions as a centralized firewall, NIC-based firewalls provide significant benefits including: scalability, easier client customization, sharing application/OS state to enable application-level filtering, and the ability to block misbehaving hosts at the source, the host itself. We desc...

Optimal allocation of on-chip memory for multiple-API operating systems

Proceedings of the 21st International Symposium on Computer Architecture

Integrity and performance in network attached storage

Lecture Notes in Computer Science, 1999

Vertical benchmarks for CAD

Proceedings of the 36th annual ACM/IEEE Design Automation Conference, 1999

Instruction fetching

Proceedings of the 22nd annual international symposium on Computer architecture - ISCA '95, 1995

Design tradeoffs for software-managed TLBs

ACM Transactions on Computer Systems, 1994

An increasing number of architectures provide virtual memory support through software-managed TLBs. However, software management can impose considerable penalties that are highly dependent on the operating system's structure and its use of virtual memory. This work explores software-managed TLB design tradeoffs and their interaction with a range of monolithic and microkernel operating systems. Through hardware monitoring and simulation, we explore TLB performance for benchmarks running on a MIPS R2000-based workstation running Ultrix, OSF/1, and three versions of Mach 3.0.

Kernel-based memory simulation (extended abstract)

ACM SIGMETRICS Performance Evaluation Review, 1994

Trap-driven simulation with Tapeworm II

ACM SIGOPS Operating Systems Review, 1994

Tapeworm II is a software-based simulation tool that evaluates the cache and TLB performance of multiple-task and operating system intensive workloads. Tapeworm resides in an OS kernel and causes a host machine's hardware to drive simulations with kernel traps instead of with address traces, as is conventionally done. This allows Tapeworm to quickly and accurately capture complete memory referencing behavior with a limited degradation in overall system performance. This paper compares trap-driven simulation, as implemented in Tapeworm, with the more common technique of trace-driven memory simulation with respect to speed, accuracy, portability and flexibility.

Spanner

ACM Transactions on Computer Systems, 2013

Minimizing floating-point power dissipation via bit-width reduction

... We compare our variable bit-width multiplier with a baseline fixed-width 24x24 bit Wallace Tree multiplier. The layout of this Wallace Tree multiplier was generated by Epoch's cell generator in the same 0.5u process as used in the design of the digit-serial multiplier. ...

Monster: A tool for analyzing the interaction between operating systems and computer architectures

"Fine-grained measurements" involve the monitoring of events that can change as frequently as once every machine cycle.

ObjectStorage: Scalable Bandwidth for HPC Clusters

Trap-driven memory simulation with Tapeworm II

ACM Transactions on Modeling and Computer Simulation, 1997

Filesystems for Network-Attached Secure Disks

Network-attached storage enables network-striped data transfers directly between client and storage to provide clients with scalable bandwidth on large transfers. Network-attached storage also decouples policy and enforcement of access control, avoiding unnecessary reverification of protection checks, reducing file manager work and increasing scalability. It eliminates the expense of a server computer devoted to copying data between peripheral network and client network. This architecture better matches storage technology's sustained data rates, now 80 Mb/s and growing at 40% per year. Finally, it enables self-managing storage to counter the increasing cost of data management. The availability of cost-effective network-attached storage depends on it becoming a storage commodity, which in turn depends on its utility to a broad segment of the storage market. Specifically, multiple distributed and parallel filesystems must benefit from network-attached storage's requirement for secure, direct access between client and storage, for reusable, asynchronous access protection checks, and for increased license to efficiently manage underlying storage media. In this paper, we describe a prototype network-attached secure disk interface and filesystems adapted to network-attached storage implementing Sun's NFS, Transarc's AFS, a network-striped NFS variant, and an informed prefetching NFS variant. Our experimental implementations demonstrate bandwidth and workload scaling and aggressive optimization of application access patterns. Our experience with applications and filesystems adapted to run on network-attached secure disks emphasizes the much greater cost of client network messaging relative to peripheral bus messaging, which offsets some of the expected scaling results.

Modeling and Scheduling of MEMS-Based Storage Devices
