Jonathan C Hardwick | Minnesota State University, Mankato
Papers by Jonathan C Hardwick
In this paper we construct families of bit sequences using combinatorial methods. Each sequence is derived by converting a collection of numbers encoding certain combinatorial numerics from objects exhibiting symmetry in various dimensions. Using the algorithms first described in [1] we show that the NIST testing suite described in publication 800-22 does not detect these symmetries hidden within these sequences.
In this paper we develop the notion of an operation on an Erdős-Rényi graph to generate higher dimensional complexes that exhibit a high degree of randomness. In particular, the constructed combinatorial objects have a face structure encoded by their f-vector, which is then converted into a random sequence of bits. Empirical results are obtained by implementing the National Institute of Standards and Technology suite 800-22 revision 1a.
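As a rough illustration of the kind of check the abstract refers to, the sketch below turns a list of integers into a bit sequence and applies the frequency (monobit) test, the first test in the NIST SP 800-22 suite. The encoding used here (concatenating plain binary expansions) and the function names are assumptions made for the example; the paper's actual conversion from f-vectors to bits is not given in this abstract.

```python
import math


def ints_to_bits(values):
    """Concatenate the binary expansions of non-negative integers.

    This encoding is an assumption for illustration only.
    """
    bits = []
    for v in values:
        bits.extend(int(ch) for ch in format(v, "b"))
    return bits


def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # map 1 -> +1, 0 -> -1 and add up
    s_obs = abs(s) / math.sqrt(n)           # normalised test statistic
    return math.erfc(s_obs / math.sqrt(2))  # small p-value suggests non-randomness
```

A sequence is usually taken to pass this test when the p-value is at least 0.01; passing a single test says nothing about structure the test is not designed to see, which is the point the abstracts on this page make about hidden symmetries.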
Sigplan Notices, Jul 1, 1993
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well as nested data-parallel function calls. These features allow the concise description of parallel algorithms on irregular data, such as sparse matrices and graphs. In addition, they maintain the advantages of data-parallel languages: a simple programming model and portability. The current NESL implementation is based on an intermediate language called VCODE and a library of vector routines called CVL. It runs on the Connection Machine CM-2, the Cray Y-MP C90, and serial machines. We compare initial benchmark results of NESL with those of machine-specific code on these machines for three algorithms: least-squares line-fitting, median finding, and a sparse-matrix vector product. These results show that NESL's performance is competitive with that of machine-specific codes for regular dense data, and is often superior for irregular data.
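To make the idea of nested data parallelism on irregular data more concrete, here is a minimal sketch of the sparse-matrix vector product benchmark mentioned above. It is written in Python rather than NESL, runs serially, and the row representation (one list of column/value pairs per row) and the name `sparse_matvec` are assumptions for the example. The outer comprehension stands for the parallel loop over rows and the inner sum for the parallel reduction within each row, whose length varies from row to row.

```python
def sparse_matvec(rows, x):
    """Multiply a sparse matrix by a dense vector.

    rows: one list of (column, value) pairs per matrix row (assumed layout).
    x:    dense vector indexed by column.
    """
    # Outer level: independent work per row.
    # Inner level: independent products per nonzero, combined by a sum.
    return [sum(v * x[j] for j, v in row) for row in rows]


# Rows have 2, 0, and 1 nonzeros: an irregular structure.
A = [[(0, 2.0), (2, 1.0)], [], [(1, 3.0)]]
x = [1.0, 2.0, 3.0]
result = sparse_matvec(A, x)
```

Both levels of the comprehension are free of dependencies between iterations, which is the property a nested data-parallel language can exploit even when the rows have very different lengths.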
This paper describes the design and implementation in MPI of the parallel vector library CVL, which is used as the basis for implementing nested data-parallel languages such as NESL and Proteus. We outline the features of CVL, and compare the ease of writing and debugging the portable MPI implementation with our experiences writing previous versions in CM-2 Paris, CM-5 CMMD, and PVM 3.0. We give initial performance results for MPI CVL running on the SP-1, Paragon, and CM-5, and compare them with previous versions of CVL running on the CM-2, CM-5, and Cray C90. We discuss the features of MPI that helped and hindered the effort, and make a plea for better support for certain primitives. Finally, we discuss the design limitations of CVL when implemented on current RISC-based MPP architectures, and outline our plans to overcome this by using MPI as a compiler target. CVL and associated languages are available via FTP. 1 CVL overview CVL (C Vector Library [6]) is a library of over 220 low-level vector functions callable from C. It provides an abstract vector memory model that is independent of the underlying architecture, and was designed so that efficient implementations could be developed for a wide variety of parallel machines. Machine-specific versions currently exist for the Connection Machines CM-2 and CM-5, the Cray Y-MP and Y-MP/C90, and the MasPar MP-1 and
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalized communication (AAPC) over communication networks such as meshes, hypercubes and omega switches. However, the constant factors of these algorithms are often an obscure function of system parameters such as link speed, processor clock rate, and memory access time. In this paper we investigate these architectural factors, showing the impact of the communication style, the network routing table, and most importantly, the local memory system, on AAPC performance and permutation routing on the Cray T3D. The fast hardware barriers on the T3D permit a straightforward AAPC implementation using routing phases separated by barriers, which improve performance by controlling congestion. However, we found that a practical implementation was difficult, and the resulting AAPC performance was less than expected. After detailed analysis, several corrections were made to the AAPC algorithm and to the machine's routing table, raising the performance from 41% to 74% of the nominal bisection bandwidth of the network. Most AAPC performance measurements are for permuting large, contiguous blocks of data (i.e., every processor has an array of P contiguous elements to be sent to every other processor). In practice, sorting and true h,h permutation routing require data elements to be gathered from their source location into a buffer, transferred over the network, and scattered into their final location in a destination array. We obtain an optimal T3D implementation by chaining local and remote memory operations together. We quantify the implementation's efficiency both experimentally and theoretically, using the recently-introduced copy transfer model, and present results for a counting sort based on this AAPC implementation.
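The phased structure of AAPC described above can be sketched with a small serial simulation. This is only an illustration under assumptions: the cyclic-shift schedule, the list-of-lists block layout, and the name `aapc_phased` are choices made for the example and are not taken from the paper, which concerns a real implementation on the Cray T3D.

```python
def aapc_phased(send):
    """Simulate all-to-all personalized communication in P phases.

    send[p][q] is the block that processor p holds for processor q.
    Returns recv, where recv[q][p] is the block q received from p.
    """
    P = len(send)
    recv = [[None] * P for _ in range(P)]
    for phase in range(P):
        # Assumed schedule: in each phase, processor p sends to (p + phase) mod P,
        # so every phase is a permutation and no destination is hit twice.
        for p in range(P):
            q = (p + phase) % P
            recv[q][p] = send[p][q]
        # On a real machine a barrier would separate this phase from the next.
    return recv


# Four simulated processors; block (p, q) travels from p to q.
send = [[(p, q) for q in range(4)] for p in range(4)]
recv = aapc_phased(send)
```

Because each phase is a permutation, at most one block is bound for any processor at a time, which is one simple way to picture how barriers between phases can help control congestion.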
A common problem that sales consultants face in the field is the selection of an appropriate hardware and software configuration for web farms. Over-provisioning means that the tender will be expensive, while under-provisioning will lead to a configuration that does not meet the customer's criteria. Indy is a performance modeling environment which allows developers to create custom modeling applications. We have constructed an Indy-based application for defining web farm workloads and topologies. The paper presents an optimization framework that allows the consultant to easily find configurations that meet customers' criteria. The system searches the solution space creating possible configurations, using the web farm models to predict their performance. The optimization tool is then employed to select an optimal configuration. Rather than using a fixed algorithm, the framework provides an infrastructure for implementing multiple optimization algorithms. In this way, the appropriate algorithm can be selected to match the requirements of different types of problem. The framework incorporates a number of novel techniques, including caching results between problem runs, an XML-based configuration language, and an effective method of comparing configurations. We have applied the system to a typical web farm configuration problem and results have been obtained for three standard optimization algorithms.
This manual is a supplement to the language definition of Nesl version 3.1. It describes how to use the Nesl system interactively and covers features for accessing on-line help, debugging, profiling, executing programs on remote machines, using Nesl with GNU Emacs, and installing and customizing the Nesl system.
These are the lecture notes for CS 15-840B, a hands-on class in programming parallel algorithms. The class was taught in the fall of 1992 by Guy Blelloch, using the programming language NESL. It stressed the clean and concise expression of a variety of parallel algorithms. About 35 graduate students attended the class, of whom 28 took it for credit.
This paper describes the derivation of an empirically efficient parallel two-dimensional Delaunay triangulation program from a theoretically efficient CREW PRAM algorithm. Compared to previous work, the resulting implementation is not limited to datasets with a uniform distribution of points, achieves significantly better speedups over good serial code, and is widely portable due to its use of MPI as a communication mechanism. Results are presented for a loosely-coupled cluster of workstations, a distributed-memory multicomputer, and a shared-memory multiprocessor. The Machiavelli toolkit used to transform the nested data parallelism inherent in the divide-and-conquer algorithm into achievable task and data parallelism is also described and compared to previous techniques.
This poster covers my ongoing experiment in applying Linda Nilson’s concept of specifications grading to the upper-level courses in information technology. I lay out my objections to traditional methods of grading, describe the core techniques of specifications grading and how they may potentially solve these problems, and explain the choices that I’ve made in using these techniques in my classroom. The experiment is not finished, so don’t expect final answers, but initial results are promising – ask me again after student evaluations are in!
Journal of Parallel and Distributed Computing, Apr 1, 1994
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well as nested data-parallel function calls. These features allow the concise description of parallel algorithms on irregular data structures, such as sparse matrices and graphs. In addition, they maintain the advantages of data-parallel languages: a simple programming model and portability. The current NESL implementation is based on an intermediate language called VCODE and a library of vector routines called CVL. It runs on the Connection Machines CM-2 and CM-5, the Cray C90, and serial workstations. We compare initial benchmark results of NESL with those of machine-specific code on these machines for three algorithms: least-squares line-fitting, median finding, and a sparse-matrix vector product. These results show that NESL's performance is competitive with that of machine-specific code for regular dense data, and is often superior for irregular data.
Proceedings of the 53rd ACM Technical Symposium on Computer Science Education, 2022
Building on known best practices, this paper describes a newly implemented project-based curriculum for undergraduate computer science majors. The program, designed to provide all students the known benefits of the project-based approach, is being implemented in a context that focuses heavily on drawing students of color, women, and other underrepresented groups into STEM careers. Our Fall 2021 upper-division entering student cohort is 70% female and 80% people of color. Our new computer science program builds on our strong and internationally recognized track record in applying project-based learning (PBL) in engineering. We believe our PBL CS program to be the first program of its type in North America. Key components of the program are: project-based learning across multiple courses in the upper division, project-based learning within lower-division courses, straightforward articulation with regional and national community colleges that are minority-serving institutions (MSIs), industry and research connections that drive projects, and vertical integration of teams across upper-division semesters. Core to this is a commitment to addressing access and equity issues. The foundation for the new upper-division program is a revamped lower-division core curriculum that also feeds into the information technology and information systems majors offered at our school. The core curriculum is designed to reduce barriers to a degree in computing while encouraging more students to explore a degree in the math-intensive computer science program. This curricula initiative paper describes the curriculum, the supporting theory, implementation challenges, and current outcomes.
This paper describes the design and implementation in MPI of the parallel vector library CVL, which is used as the basis for implementing nested data-parallel languages such as NESL and Proteus. We compare the ease of writing and debugging the portable MPI implementation of CVL with our experiences writing previous versions in CM-2 Paris, CM-5 CMMD, and PVM, and give initial performance results for MPI CVL running on an IBM SP-1, Intel Paragon, and TMC CM-5.