john bent | University of Wisconsin Milwaukee (original) (raw)
Papers by john bent
Los Alamos National Laboratory, an affirmative action/equal opportunity employer, is operated by the Los Alamos National Security, LLC for the National Nuclear Security Administration of the US Department of Energy under contract DE-AC52-06NA25396. By acceptance of ...
ABSTRACT The exponential growth of datasets has been well-known for many years and has been observed across a wide range of disciplines, including checkpoints for tightly-coupled parallel scientific simulations, genetic databases for bioinformatics research, and many others. With disk capacity rapidly outpacing disk bandwidth, traditional methods of handling I/O may become ineffective. To address these trends, we propose a new architecture for parallel computing with multiple avenues for improving data management. To this end, we analyze and demonstrate the performance benefits of one such avenue: taking advantage of information about upcoming tasks to asynchronously prestage data prior to job execution. We discuss the hardware necessary for implementing it in a real environment and present a prototype implementation using the Makeflow workflow language and a custom hierarchical master-worker driver.
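The core idea of the abstract above, overlapping the staging of an upcoming task's inputs with the execution of the current task, can be sketched in a few lines. This is a minimal illustration, not the paper's Makeflow-based prototype; the file names, cache directories, and `run_task` stand-in are all hypothetical.

```python
import os
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor

def prestage(src, cache_dir):
    """Copy an input file into a node-local cache ahead of job execution."""
    dst = os.path.join(cache_dir, os.path.basename(src))
    shutil.copy(src, dst)
    return dst

def run_task(task_id, staged_inputs):
    """Stand-in for real job execution over already-local data."""
    return (task_id, [os.path.getsize(p) for p in staged_inputs])

# Hypothetical workload: two known upcoming tasks, one input file each.
src_dir = tempfile.mkdtemp()
inputs = []
for name in ("a.dat", "b.dat"):
    path = os.path.join(src_dir, name)  # create dummy inputs for the sketch
    with open(path, "wb") as f:
        f.write(b"x" * 1024)
    inputs.append(path)

stage_dir = tempfile.mkdtemp()
with ThreadPoolExecutor(max_workers=2) as pool:
    # Stage task 2's input asynchronously while task 1 runs,
    # so task 2 finds its data already local when it starts.
    future = pool.submit(prestage, inputs[1], stage_dir)
    t1 = run_task(1, [prestage(inputs[0], stage_dir)])
    t2 = run_task(2, [future.result()])

print(t1, t2)
```

The benefit comes from the overlap: the copy for task 2 hides behind task 1's compute time instead of adding to the critical path.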
In the petascale era, the storage stack used by the extreme-scale high performance computing community is fairly homogeneous across sites. On the compute edge of the stack, file system clients or IO forwarding services direct IO over an interconnect network to a relatively small set of IO nodes. These nodes forward the requests over a secondary storage network to a spindle-based parallel file system. Unfortunately, this architecture will become unviable in the exascale era.
We describe NeST, a flexible software-only storage appliance designed to meet the storage needs of the Grid. NeST has three key features that make it well-suited for deployment in a Grid environment. First, NeST provides a generic data transfer architecture that supports multiple data transfer protocols (including GridFTP and NFS), and allows for the easy addition of new protocols. Second, NeST is dynamic, adapting itself on-the-fly so that it runs effectively on a wide range of hardware and software platforms. Third, NeST is Grid-aware, implying that features that are necessary for integration into the Grid, such as storage space guarantees, mechanisms for resource and data discovery, user authentication, and quality of service, are a part of the NeST infrastructure. We include a practical discussion about building grid tools using the NeST software.
ABSTRACT The I/O bottleneck in high-performance computing is becoming worse as application data continues to grow. In this work, we explore how patterns of I/O within these applications can significantly affect the effectiveness of the underlying storage systems and how these same patterns can be utilized to improve many aspects of the I/O stack and mitigate the I/O bottleneck. We offer three main contributions in this paper. First, we develop and evaluate algorithms by which I/O patterns can be efficiently discovered and described. Second, we implement one such algorithm to reduce the metadata quantity in a virtual parallel file system by up to several orders of magnitude, thereby increasing the performance of writes and reads by up to 40 and 480 percent, respectively. Third, we build a prototype file system with pattern-aware prefetching and evaluate it to show a 46 percent reduction in I/O latency. Finally, we believe that efficient pattern discovery and description, coupled with the observed predictability of complex patterns within many high-performance applications, offers significant potential to enable many additional I/O optimizations.
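To make the metadata-reduction claim above concrete: if a process issues thousands of regularly strided writes, the per-write index entries can collapse into a single compact descriptor. The following is a toy version of that kind of pattern description, assuming a simple fixed-stride detector; it is not the paper's actual algorithm.

```python
def describe_pattern(offsets, size):
    """Collapse a regularly strided access list into one
    (start, stride, count, size) tuple; fall back to the raw
    offset list when no single stride fits every gap."""
    if len(offsets) < 2:
        return offsets
    stride = offsets[1] - offsets[0]
    if all(b - a == stride for a, b in zip(offsets, offsets[1:])):
        return (offsets[0], stride, len(offsets), size)
    return offsets

# 1000 strided 4 KiB writes shrink to a single 4-tuple of metadata,
# illustrating the "orders of magnitude" metadata reduction.
writes = [i * 8192 for i in range(1000)]
compact = describe_pattern(writes, 4096)
print(compact)  # (0, 8192, 1000, 4096)
```

A reader consulting the index then resolves any logical offset arithmetically from the tuple instead of scanning a thousand entries.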
We present a study of six batch-pipeline scientific workloads that are candidates for execution on computational grids. Whereas other studies focus on the behavior of single applications, this study characterizes workloads composed of pipelines of sequential processes that use file storage for communication. We share measurements of the memory, CPU, and I/O requirements of individual components, as well as analyses of I/O sharing within complete batches, and conclude with a discussion of the ramifications of these workloads for end-to-end scalability and overall system design.
We introduce Shear, a user-level software tool that characterizes RAID storage arrays. Shear employs a set of controlled algorithms combined with statistical techniques to automatically determine the important properties of a RAID system, including the number of disks, chunk size, level of redundancy, and layout scheme. We illustrate the correctness of Shear by running it upon numerous simulated configurations, and then verify its real-world applicability by running Shear on both software-based and hardware-based RAID systems. Finally, we demonstrate the utility of Shear through three case studies. First, we show how Shear can be used in a storage management environment to verify RAID construction and detect failures. Second, we demonstrate how Shear can be used to extract detailed characteristics about the individual disks within an array. Third, we show how an operating system can use Shear to automatically tune its storage subsystems to specific RAID configurations.
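The flavor of the parameter inference described above can be sketched against a simulated array. In the real tool, probes are timed concurrent reads (requests landing on the same disk are slow); here a `same_disk` oracle stands in for that timing signal. This is a toy analogue assuming a plain striped (RAID-0) layout with no redundancy, not Shear's actual statistical algorithm.

```python
NUM_DISKS, CHUNK = 4, 65536  # hidden array parameters the probe must recover

def disk_of(offset):
    """Simulated RAID-0 layout: which disk serves a given byte offset."""
    return (offset // CHUNK) % NUM_DISKS

def same_disk(a, b):
    """Stand-in for a timing probe: concurrent reads at offsets a and b
    are slow when both land on the same spindle."""
    return disk_of(a) == disk_of(b)

def detect_chunk_size(limit=1 << 24, step=4096):
    """Walk forward from offset 0 until a probe lands on a different
    disk; that boundary is the chunk size."""
    off = step
    while off < limit and same_disk(0, off):
        off += step
    return off

print(detect_chunk_size())  # 65536
```

Analogous probe loops can recover the disk count (how many distinct boundaries appear before the pattern repeats), which is the spirit of the case studies above.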
Due to the explosive growth in the size of scientific data sets, data-intensive computing is an emerging trend in computational science. Many application scientists are looking to integrate data-intensive computing into compute-intensive High Performance Computing facilities, particularly for data analytics. We have observed several scientific applications which must migrate their data from an HPC storage system to a data-intensive one. There is a gap between the data semantics of HPC storage and data-intensive systems; hence, once migrated, the data must be further refined and reorganized. This reorganization requires at least two complete scans through the data set and then at least one MapReduce program to prepare the data before analyzing it. Running multiple MapReduce phases causes significant overhead for the application in the form of excessive I/O operations: for every MapReduce application that must be run to complete the desired data analysis, a distributed read and write operation on the file system must be performed. Our contribution is to extend MapReduce to eliminate the multiple scans and also reduce the number of pre-processing MapReduce programs. We have added additional expressiveness to the MapReduce language to allow users to specify the logical semantics of their data such that 1) the data can be analyzed without running multiple data pre-processing MapReduce programs, and 2) the data can be simultaneously reorganized as it is migrated to the data-intensive file system. Using our augmented MapReduce system, MapReduce with Access Patterns (MRAP), we have demonstrated up to 33% throughput improvement in one real application, and up to 70% in an I/O kernel of another application.
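The single-pass idea behind MRAP, reorganizing records according to user-declared semantics during migration rather than in separate pre-processing jobs, can be illustrated with a stream copy that applies a user-supplied record transform in the same pass. This is a toy sketch only; the real system extends the MapReduce API, and the record size and byte-swap transform here are invented for the example.

```python
import io
import struct

def migrate_with_semantics(src, dst, record_size, transform):
    """Stream fixed-size records from the HPC-side source into the
    analytics-side destination, applying the user-declared
    reorganization in the same pass (no migrate-then-rescan)."""
    while True:
        rec = src.read(record_size)
        if len(rec) < record_size:
            break
        dst.write(transform(rec))

# Hypothetical semantics: big-endian 4-byte integers on the HPC side,
# little-endian wanted for analysis; reversing each record's bytes
# during migration replaces a separate pre-processing MapReduce job.
src = io.BytesIO(struct.pack(">3I", 1, 2, 3))
dst = io.BytesIO()
migrate_with_semantics(src, dst, 4, lambda r: r[::-1])
print(struct.unpack("<3I", dst.getvalue()))  # (1, 2, 3)
```

The saving is the eliminated round trip: the data crosses the file system once instead of being written, re-read, transformed, and written again.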
In this work we present a scientific application that has been given a Hadoop MapReduce implementation. We also discuss other scientific fields of supercomputing that could benefit from a MapReduce implementation. We recognize in this work that Hadoop has potential benefit for more applications than simply data mining, but that it is not a panacea for all data-intensive applications.
There is high demand for I/O tracing in High Performance Computing (HPC). It enables in-depth analysis of distributed applications and file system performance tuning. It also aids distributed application debugging. Finally, it facilitates collaboration within and between government, industrial, and academic institutions by enabling the generation of replayable I/O traces, which can be easily distributed and anonymized as necessary to protect confidential or sensitive information. As a response to this demand for tracing tools, various means of I/O trace generation exist. We first survey the I/O tracing framework landscape, exploring three popular frameworks: LANL-Trace [3], Tracefs [1], and //TRACE [2]. We next develop an I/O tracing framework taxonomy. The purpose of this taxonomy is to assist users in formalizing their tracing requirements, and to provide developers a language with which to categorize the functionality and performance of their frameworks. The taxonomy categorizes framework features such as the type of data captured, trace replayability, and anonymization, and also considers elapsed-time and performance overhead. Finally, we provide a case study in the use of our new taxonomy, revisiting all three frameworks explored in our survey to formally classify the features of each.
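Two of the taxonomy's feature axes, replayability and anonymization, can be shown in a tiny trace-record sketch: each event carries enough to replay the operation, but the path is replaced by a stable hash so the trace can be shared. The record layout, field names, and salt are invented for illustration and belong to no particular framework.

```python
import hashlib
import json

def anonymize(path, salt=b"site-secret"):
    """Replace a file path with a stable salted hash: the same path always
    maps to the same token, but the original name is not recoverable."""
    return hashlib.sha256(salt + path.encode()).hexdigest()[:12]

def trace_record(op, path, offset, length):
    """One replayable trace event: the operation, an anonymized target,
    and the byte range touched. A toy record format for illustration."""
    return json.dumps({"op": op, "path": anonymize(path),
                       "offset": offset, "length": length})

rec = trace_record("read", "/projects/secret/input.dat", 0, 4096)
print(rec)
```

Because the hash is stable, a replayer can still reproduce per-file access ordering and sizes even though the real names never leave the site.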