An Overview of the BlueGene/L Supercomputer

An overview of the BlueGene/L supercomputer

…, ACM/IEEE 2002 …, 2002

This paper gives an overview of the BlueGene/L Supercomputer, which is being developed through a jointly funded research partnership between IBM and the Lawrence Livermore National Laboratory as part of the United States Department of Energy's ASCI Advanced Architecture Research Program. Application performance and scaling studies have recently been initiated with partners at a number of academic and government institutions, including the San Diego Supercomputer Center and the California Institute of Technology. This massively parallel system of 65,536 nodes is based on a new architecture that exploits system-on-a-chip technology to deliver a target peak processing power of 360 teraFLOPS (trillion floating-point operations per second). The machine is scheduled to be operational in the 2004-2005 time frame, at price/performance and power-consumption/performance targets unobtainable with conventional architectures.

Unlocking the performance of the BlueGene/L supercomputer

2004

The BlueGene/L supercomputer is expected to deliver new levels of application performance by providing a combination of good single-node computational performance and high scalability. To achieve good single-node performance, the BlueGene/L design includes a special dual floating-point unit on each processor and the ability to use two processors per node. BlueGene/L also includes both a torus and a tree network to achieve high scalability.
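
To make the single-node arithmetic concrete, here is a minimal back-of-the-envelope sketch (not taken from any of the papers above) that derives the per-node peak from the dual floating-point unit and the two processors per node, assuming the 700 MHz clock and 2.8 Gflop/s-per-FPU figures quoted in the architecture description below; all names are illustrative.

```python
# Illustrative sketch of BlueGene/L per-node peak performance.
# Assumptions (from the architecture description later in this section):
#   - 700 MHz core clock
#   - each dual floating-point unit performs 2 fused multiply-adds per cycle
#     (4 flops/cycle), i.e. 2.8 Gflop/s per FPU
#   - 2 processors, and hence 2 FPUs, per node

CLOCK_HZ = 700e6              # PowerPC 440 core clock
FLOPS_PER_CYCLE_PER_FPU = 4   # 2 fused multiply-adds per cycle
FPUS_PER_NODE = 2

peak_per_fpu = CLOCK_HZ * FLOPS_PER_CYCLE_PER_FPU    # 2.8e9 flop/s
peak_per_node = peak_per_fpu * FPUS_PER_NODE         # 5.6e9 flop/s

print(f"per-FPU peak : {peak_per_fpu / 1e9:.1f} Gflop/s")
print(f"per-node peak: {peak_per_node / 1e9:.1f} Gflop/s")
```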

The BlueGene/L supercomputer

Nuclear Physics B - Proceedings Supplements, 2003

The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflop/s floating-point units, 4 MB of embedded DRAM used as cache, a memory controller for the external memory, six 1.4 Gbit/s bidirectional ports for the 3-dimensional torus network connection, three 2.8 Gbit/s bidirectional ports for connecting to a global tree network, and a Gigabit Ethernet port for I/O. 65,536 such nodes are connected into a 3-dimensional torus with a geometry of 32×32×64. The total peak performance of the system is 360 teraflops and the total amount of memory is 16 terabytes.
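
The system totals quoted in this abstract follow directly from the per-node figures. The short sketch below (again illustrative, not from the paper) reproduces the 360 teraflops and 16 terabytes totals from the 32×32×64 torus geometry, and also computes the worst-case hop count of the torus, assuming shortest-path routing over the wrap-around links.

```python
# Illustrative system totals for the 32x32x64 BlueGene/L torus
# (figures taken from the abstract above).

TORUS_DIMS = (32, 32, 64)      # 3D torus geometry
PEAK_PER_NODE_GFLOPS = 5.6     # two 2.8 Gflop/s FPUs per node
MEMORY_PER_NODE_MB = 256       # external memory per compute node

nodes = 1
for d in TORUS_DIMS:
    nodes *= d                 # 32 * 32 * 64 = 65,536 nodes

peak_tflops = nodes * PEAK_PER_NODE_GFLOPS / 1e3        # ~367 Tflop/s (quoted as 360)
memory_tb = nodes * MEMORY_PER_NODE_MB / (1024 * 1024)  # 16 TB

# With wrap-around links and minimal routing, the longest shortest path
# is half of each dimension, summed.
max_hops = sum(d // 2 for d in TORUS_DIMS)              # 16 + 16 + 32 = 64 hops

print(f"nodes         : {nodes:,}")
print(f"peak          : {peak_tflops:.0f} Tflop/s")
print(f"total memory  : {memory_tb:.0f} TB")
print(f"max torus hops: {max_hops}")
```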

The Blue Gene/L Supercomputer: A Hardware and Software Story

International Journal of Parallel Programming, 2007

The Blue Gene/L system at the Department of Energy's Lawrence Livermore National Laboratory in Livermore, California is the world's most powerful supercomputer. It has achieved groundbreaking performance on both standard benchmarks and real scientific applications. In the process, it has enabled new science that simply could not be done before. Blue Gene/L was developed by a relatively small team of dedicated scientists and engineers. This article is both a description of the Blue Gene/L supercomputer and an account of how that system was designed, developed, and delivered. It reports on the technical characteristics of the system that made it possible to build such a powerful supercomputer. It also reports on how teams across the world worked around the clock to accomplish this milestone of high-performance computing.

Future generation supercomputers I

ACM SIGARCH Computer Architecture News, 2007

As a result of the increasing requirements of present and future computation-intensive applications, many fundamentally divergent approaches, such as Blue Gene, TRIPS, HERO, and Cascade, have been spurred to provide increased performance at the node level in supercomputing clusters. The node architecture should be designed so that 'Cost-Effective Supercomputing' is realized without compromising on the requirements of the ever performance-hungry grand-challenge applications. However, to increase performance at the cluster level, scalability becomes critical, as does tackling the mapping complexity across the large cluster of nodes. The potential of such a node architecture can be fully exploited only with an appropriate cluster architecture. In an attempt to address these issues for efficient and Cost-Effective Supercomputing, we propose a novel paradigm for designing High Performance Clusters, in two papers. In Paper II, we discuss the design of operating...