Chip multi-processor scalability for single-threaded applications

Extending multicore architectures to exploit hybrid parallelism in single-thread applications

2007

Chip multiprocessors with multiple simpler cores are gaining popularity because they have the potential to drive future performance gains without exacerbating the problems of power dissipation and complexity. Current chip multiprocessors increase throughput by utilizing multiple cores to perform computation in parallel. These designs provide real benefits for server-class applications that are explicitly multi-threaded.

Multi-Core Processors: New Way to Achieve High System Performance

International Symposium on Parallel Computing in Electrical Engineering (PARELEC'06)

Multi-core processors represent an evolutionary change in conventional computing as well as setting the new trend for high performance computing (HPC), but parallelism is nothing new. Intel has a long history with the concept of parallelism and the development of hardware-enhanced threading capabilities. Intel has been delivering threading-capable products for more than a decade. The move toward chip-level multiprocessing architectures with a large number of cores continues to offer dramatically increased performance and power characteristics. Nonetheless, this move also presents significant challenges. This paper describes how far the industry has progressed and evaluates some of the challenges we are facing with multi-core processors and some of the solutions that have been developed.

Piranha: a scalable architecture based on single-chip multiprocessing

2000

The microprocessor industry is currently struggling with higher development costs and longer design times that arise from exceedingly complex processors that are pushing the limits of instruction-level parallelism. Meanwhile, such designs are especially ill suited for important commercial applications, such as on-line transaction processing (OLTP), which suffer from large memory stall times and exhibit little instruction-level parallelism. Given that commercial applications constitute by far the most important market for high-performance servers, the above trends emphasize the need to consider alternative processor designs that specifically target such workloads. The abundance of explicit thread-level parallelism in commercial workloads, along with advances in semiconductor integration density, identify chip multiprocessing (CMP) as potentially the most promising approach for designing processors targeted at commercial servers.

A survey of processors with explicit multithreading

ACM Computing Surveys, 2003

Hardware multithreading is becoming a generally applied technique in the next generation of microprocessors. Several multithreaded processors have been announced by industry or are already in production in the areas of high-performance microprocessors, media processors, and network processors.

Development of a simultaneously threaded multi-core processor

… Technologies for the …, 2005

Simultaneous Multithreading (SMT) is becoming one of the major trends in the design of future generations of microarchitectures. Its key strength comes from its ability to exploit both thread-level and instruction-level parallelism; it uses hardware resources efficiently. Nevertheless, SMT has its limitations: contention between threads may cause conflicts, and it suffers from a lack of scalability, additional pipeline stages, and inefficient handling of long-latency operations. Chip Multiprocessors (CMP), by contrast, are highly scalable and easy to program. On the other hand, they are expensive and suffer from cache coherence and memory consistency problems. This paper proposes a microarchitecture that exploits parallelism at the instruction, thread, and processor levels. It merges the concepts of SMT and CMP. As in a CMP, multiple cores are used on a single chip. Hardware resources are replicated in each core except for the secondary-level cache, which is shared amongst all cores. The processor applies the SMT technique within each core to make full use of available hardware resources. Moreover, the communication overhead is reduced due to the independence between cores. Results show that the proposed microarchitecture outperforms both SMT and CMP. In addition, resources are more evenly distributed amongst running threads.

Chip Multiprocessors–A Cost-effective Alternative to Simultaneous Multithreading

In this paper we describe the principles of the chip multiprocessor architecture, give an overview of design alternatives, and present some example processors of this type. We discuss the results of several simulations in which a chip multiprocessor was compared to other advanced processor architectures, including superscalars and simultaneous multithreading processors. Although simultaneous multithreading seems to be the most efficient when the compared architectures have equal total issue bandwidth, a chip multiprocessor may outperform simultaneous multithreading when both are implemented with an equal number of transistors.

Simultaneous multithreading: Maximizing on-chip parallelism

Proceedings of the 22nd annual …, 1995

This paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar's multiple functional units in a single cycle. We present several models of simultaneous multithreading and compare them with alternative organizations: a wide superscalar, a fine-grain multithreaded processor, and single-chip, multiple-issue multiprocessing architectures. Our results show that both (single-threaded) superscalar and fine-grain multithreaded architectures are limited in their ability to utilize the resources of a wide-issue processor. Simultaneous multithreading has the potential to achieve 4 times the throughput of a superscalar, and double that of fine-grain multithreading. We evaluate several cache configurations made possible by this type of organization and evaluate tradeoffs between them. We also show that simultaneous multithreading is an attractive alternative to single-chip multiprocessors; simultaneous multithreaded processors with a variety of organizations outperform corresponding conventional multiprocessors with similar execution resources.

Multi-core processors - An overview

arXiv preprint arXiv:1110.3535, 2011

Microprocessors have revolutionized the world we live in, and continuous efforts are being made to manufacture not only faster chips but also smarter ones. A number of techniques, such as data-level parallelism, instruction-level parallelism, and hyper-threading (Intel's HT), already exist and have dramatically improved the performance of microprocessor cores. [1, 2] This paper briefly reviews the evolution of multi-core processors, then introduces the technology and its advantages in today's world. The paper concludes by detailing the challenges currently faced by multi-core processors and how the industry is trying to address these issues.