Partitioning Hardware and Software for Reconfigurable Supercomputing Applications: A Case Study

Promises and Pitfalls of Reconfigurable Supercomputing

Reconfigurable supercomputing (RSC) combines programmable logic chips with high-performance microprocessors, all communicating over a high-bandwidth, low-latency interconnection network. Reconfigurable hardware has demonstrated an order-of-magnitude speedup on compute-intensive kernels in science and engineering. However, translating high-level algorithms to programmable hardware is a formidable barrier to the use of these resources by scientific programmers. A library-based approach has been suggested, in which the software application calls standard library functions that have been optimized for hardware. The potential benefits of this approach are evaluated on several large scientific supercomputing applications. It is found that hardware linear algebra libraries would be of little benefit to the applications analyzed. To maximize the performance of supercomputing applications on RSC, it is necessary to identify kernels of high computational density that can be mapped to hardware, carefully partition software and hardware to reduce communication overhead, and optimize memory bandwidth on the FPGAs. Two case studies that follow this approach are summarized, and, based on experience with these applications, directions for future reconfigurable supercomputing architectures are outlined.
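The partitioning trade-off described above can be illustrated with an Amdahl's-law style estimate. This is a minimal sketch and is not taken from the paper: the function name `overall_speedup`, its parameters, and the numbers in the usage lines are hypothetical and chosen only for illustration. It assumes that runtime splits cleanly into a kernel moved to hardware, the remaining software, and an added host-to-FPGA transfer cost.

```python
def overall_speedup(kernel_fraction, kernel_speedup, comm_fraction=0.0):
    """Amdahl-style estimate of whole-application speedup (illustrative only).

    kernel_fraction: share of the original runtime spent in the kernel
                     that is moved to hardware (between 0 and 1)
    kernel_speedup:  how many times faster that kernel runs in hardware
    comm_fraction:   added host<->FPGA transfer time, expressed as a
                     share of the original runtime (assumed, >= 0)
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    if comm_fraction < 0.0:
        raise ValueError("comm_fraction must be non-negative")

    # New runtime, normalized so that the original runtime is 1.0:
    # untouched software + accelerated kernel + communication overhead.
    new_time = (1.0 - kernel_fraction) + kernel_fraction / kernel_speedup + comm_fraction
    return 1.0 / new_time


# A kernel that takes 10% of the runtime gains little, even at 10x in hardware.
small_kernel = overall_speedup(0.10, 10.0)        # 1 / 0.91, about 1.10x

# A kernel that takes 90% of the runtime pays off, even with 5% transfer overhead.
dense_kernel = overall_speedup(0.90, 10.0, 0.05)  # 1 / 0.24, about 4.17x
```

Under these assumptions, the overall gain is bounded by the share of runtime the accelerated kernel actually occupies, and communication overhead lowers it further, which is one way to see why kernel selection and partitioning matter more than the raw hardware speedup.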

Virtualizing and sharing reconfigurable resources in High-Performance Reconfigurable Computing systems

2008

High-Performance Reconfigurable Computers (HPRCs) are parallel computers augmented with FPGA chips. Examples of such systems are the Cray XT5h and Cray XD1, the SRC-7 and SRC-6, and the SGI Altix/RASC. The execution of parallel applications on HPRCs mainly follows the Single-Program Multiple-Data (SPMD) model, as is largely the case in traditional High-Performance Computers (HPCs). In addition, the prevailing usage of FPGAs in such systems has been as coprocessors. The overall system resources, however, are often underutilized because of the asymmetric distribution of the reconfigurable processors relative to the conventional processors. This asymmetry is often a challenge for using the SPMD programming model on these systems. In this work, we propose a resource virtualization solution based on Partial Run-Time Reconfiguration (PRTR). This technique allows the reconfigurable processors to be shared among the underutilized processors. We present our virtualization infrastructure, augmented with an analytical investigation, and verify the proposed concepts with experimental implementations using the Cray XD1 as a testbed. It will be shown that this approach is quite promising and allows full exploitation of the system resources, with fair sharing of the reconfigurable processors among the microprocessors. Our approach is general and can be applied to any of the available HPRC systems.