An overview of the BlueGene/L Supercomputer (original) (raw)
Related papers
X/02 $17.00 (c) 2002 IEEE 1 An Overview of the BlueGene/L Supercomputer
This paper gives an overview of the BlueGene/L Supercomputer. This is a jointly funded research partnership between IBM and the Lawrence Livermore National Laboratory as part of the United States Department of Energy ASCI Advanced Architecture Research Program. Application performance and scaling studies have recently been initiated with partners at a number of academic and government institutions, including the San Diego Supercomputer Center and the California Institute of Technology. This massively parallel system of 65,536 nodes is based on a new architecture that exploits system-on-a-chip technology to deliver target peak processing power of 360 teraFLOPS (trillion floating-point operations per second). The machine is scheduled to be operational in the 2004-2005 time frame, at price/performance and power consumption/performance targets unobtainable with conventional architectures.
Unlocking the performance of the BlueGene/L supercomputer
2004
Abstract The BlueGene/L supercomputer is expected to deliver new levels of application performance by providing a combination of good single-node computational performance and high scalability. To achieve good single-node performance, the BlueGene/L design includes a special dual floating-point unit on each processor and the ability to use two processors per node. BlueGene/L also includes both a torus and a tree network to achieve high scalability.
Nuclear Physics B - Proceedings Supplements, 2003
The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating point units, 4 MB of embedded DRAM as cache, a memory controller for external memory, six 1.4 Gbit/s bi-directional ports for a 3-dimensional torus network connection, three 2.8 Gbit/s bi-directional ports for connecting to a global tree network and a Gigabit Ethernet for I/O. 65,536 of such nodes are connected into a 3-d torus with a geometry of 32×32×64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 TeraBytes.
The Blue Gene/L Supercomputer: A Hardware and Software Story
International Journal of Parallel Programming, 2007
The Blue Gene/L system at the Department of Energy Lawrence Livermore National Laboratory in Livermore, California is the world’s most powerful supercomputer. It has achieved groundbreaking performance in both standard benchmarks as well as real scientific applications. In that process, it has enabled new science that simply could not be done before. Blue Gene/L was developed by a relatively small team of dedicated scientists and engineers. This article is both a description of the Blue Gene/L supercomputer as well as an account of how that system was designed, developed, and delivered. It reports on the technical characteristics of the system that made it possible to build such a powerful supercomputer. It also reports on how teams across the world worked around the clock to accomplish this milestone of high-performance computing.
A Performance and Scalability Analysis of the BlueGene/L Architecture
2004
This paper is structured as follows. Section 2 gives an architectural description of BlueGene/L. Section 3 analyzes the issue of "computational noise" -the effect that the operating system has on the system and application performance. Section 4 describes the performance characteristics of the communication networks. Section 5 deals with single processor performance. Section 6 addresses application performance and scalability, including performance prediction. Most of the results are taken from a 512-node machine running at 500MHz. Also included is a comparison of the predicted performance of BlueGene/L against the performance of ASCI Q and early results from a larger 2048 node BlueGene/L machine clocked at 700MHz. Finally the analysis is summarized in section 7.