Compiler Integrated Multiprocessor Simulation

Shaman: a distributed simulator for shared memory multiprocessors

Proceedings. 10th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunications Systems

This paper describes Shaman, our distributed architectural simulator for shared memory multiprocessors. The simulator runs on a PC cluster that consists of multiple front-end nodes, which simulate the instruction-level behavior of a target multiprocessor in parallel, and a back-end node, which simulates the target memory system. The front-end also simulates the logical behavior of the shared memory using a software DSM technique and generates memory references to drive the back-end. A remarkable feature of our simulator is reference filtering, which reduces the number of references transferred from the front-end to the back-end by utilizing the DSM mechanism and coherent cache simulation on the front-end. This technique and the DSM implementation discussed in this paper give the Shaman simulator very high performance. We achieved 335 million and 392 million simulation clocks per second for LU decomposition and FFT in the SPLASH-2 kernel benchmarks, respectively, when we used 16 front-end nodes to simulate a 16-way target SMP.
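To make the idea of reference filtering concrete, here is a minimal sketch of the generic technique: a small cache model on the front-end absorbs hits, and only misses are forwarded to the back-end. This is not Shaman's actual mechanism, which the abstract says also relies on the DSM layer and on coherent cache simulation. The `FilterCache` class, the `filter_references` function, and the cache geometry are all hypothetical, and coherence between nodes is ignored entirely.

```python
class FilterCache:
    """Direct-mapped tag store used only to decide which references to forward.

    Hypothetical illustration: a single private cache with no coherence actions.
    """

    def __init__(self, num_lines=256, line_size=64):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines

    def access(self, addr):
        """Return True on a hit; on a miss, install the line and return False."""
        line = addr // self.line_size
        index = line % self.num_lines
        if self.tags[index] == line:
            return True
        self.tags[index] = line
        return False


def filter_references(addresses, cache):
    """Keep only the references that miss in the front-end filter cache."""
    return [addr for addr in addresses if not cache.access(addr)]


if __name__ == "__main__":
    cache = FilterCache(num_lines=4, line_size=64)
    # Addresses 0 and 256 map to the same cache index and evict each other.
    print(filter_references([0, 8, 64, 0, 256, 0], cache))
```

In this toy run, the second and fourth references hit in the filter cache and are dropped, so four of the six references would reach the back-end. A real multiprocessor filter would also have to forward or invalidate on writes by other nodes, which is where the coherent cache simulation mentioned in the abstract comes in.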

Clown: a Microprocessor Simulator for Operating System Studies

2012

In this paper, I present the design and implementation of Clown, a simulator of a microprocessor-based computer system specifically optimized for teaching operating system courses at the undergraduate or graduate level. The package includes the simulator itself, as well as a collection of basic I/O devices, an assembler, a linker, and a disk formatter. The simulated machine architecturally resembles mainstream microprocessors from the Intel 80386 family, but is much easier to learn and program. The simulator is fast enough to be used as an emulator, in direct user interaction mode.

A simulation platform for multi-threaded architectures

Proceedings of MASCOTS '96 - 4th International Workshop on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 1996

Simulation is a powerful tool for studying the behavior of novel architectures and improving their performance. However, the time, effort and resources invested in developing a reliable simulator with the required level of detail may become prohibitively large. In this paper, we present a simulation platform specifically designed to simulate the class of multithreaded architectures. The most important features of this simulator are its flexibility and ease of use. The simulation model provides the user with a wide range of design criteria, architectural parameters and workload characteristics. The simulation platform includes several tools, such as an experiment planner, an interface to Matlab for processing and displaying results, and an interface to PVM for the execution of independent experiments in parallel. The simulation model is validated by comparison of analytical and experimental results.

Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset

ACM SIGARCH Computer Architecture News, 2005

The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers. We leverage an existing full-system functional simulation infrastructure (Simics [14]) as the basis around which to build a set of timing simulator modules for modeling the timing of the memory system and microprocessors. This simulator infrastructure enables us to run architectural experiments using a suite of scaled-down commercial workloads. To enable other researchers to more easily perform such research, we have released these timing simulator modules as the Multifacet General Execution-driven Multiprocessor Simulator (GEMS) Toolset, release 1.0, under the GNU GPL.

Profiling High Level Abstraction Simulators of Multiprocessor Systems

2012

Simulation has become one of the most time-consuming tasks in Electronic System Level design, required in both the design and verification phases. As the complexity of modelled systems increases, so does the need for adequate use of the available computational resources in multiprocessor computers or clusters. SystemC simulator models are designed to use only one core, even if the hardware is multi-core. In this paper, we analyse 20 platforms designed in SystemC, varying from 1 to 16 cores with 4 different processor models (ISAs), and evaluate the SystemC kernel overhead for a set of 12 programs running on those platforms, totaling 240 configurations. We split the execution time into the simulation components and found that the major contributor to simulation time is the SystemC kernel, which consumes around 50% of the total simulator execution time. This finding opens space for new research focusing on improving SystemC kernel performance.

A Simulation Environment for Microprocessor Architecture Evaluation

1989

The task of choosing the right architecture for a VLSI chip is becoming more difficult as the complexity of microprocessors increases. The process of designing such an architecture, examining the various alternatives and obtaining initial performance estimates requires a flexible and fast simulation environment that addresses those issues. This paper describes a performance simulation environment for the evaluation process of an advanced microprocessor, its use and our experience during definition and design time.

An architecture workbench for multicomputers

… Processing Symposium, 1997. …, 1997

The large design space of modern computer architectures calls for performance modelling tools to facilitate the evaluation of different alternatives. In this paper, we give an overview of the Mermaid multicomputer simulation environment. This environment allows the evaluation of a wide range of architectural design tradeoffs while delivering reasonable simulation performance. To achieve this, simulation takes place at a level of abstract machine instructions rather than at the level of real instructions. Moreover, a less detailed mode of simulation is also provided, so when accuracy is not the primary objective, this simulation mode can yield high simulation efficiency. As a consequence, Mermaid makes both fast prototyping and accurate evaluation of multicomputer architectures feasible.

Table 1. Trace events or operations.

Communication operations            Description
send(message-size, destination)     Synchronous communication
recv(source)                        Synchronous communication
asend(message-size, destination)    Asynchronous communication
arecv(source)                       Asynchronous communication
compute(duration)                   Computation

… accurate simulation of, for example, the processor pipelines. This means that Mermaid is only of limited use for purposes like application debugging or compiler testing. The operations that act as input for the communication model are based on straightforward message passing. Both synchronous (blocking) and asynchronous (non-blocking) communication are supported. Computation performed within the communication model is simulated at task level by means of the compute operation.
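As a rough illustration of how trace operations like those in Table 1 can drive a communication model, the sketch below replays per-node traces of `compute`, `send`, `asend` and `recv` and reports when each node finishes. It is not Mermaid's implementation. The `Op` record, the `simulate` function, the fixed `LATENCY` and `BANDWIDTH` cost model, and the rule that a synchronous sender resumes at the moment the receiver has taken the message are all assumptions made for this example, and `arecv` is left out for brevity.

```python
from collections import defaultdict, deque
from dataclasses import dataclass

LATENCY = 10.0    # assumed fixed per-message cost, in time units
BANDWIDTH = 1.0   # assumed message-size units transferred per time unit


@dataclass
class Op:
    kind: str         # "compute", "send", "asend" or "recv"
    arg: float = 0.0  # duration for compute, message size for send/asend
    peer: int = -1    # destination for send/asend, source for recv


def simulate(traces):
    """Replay one trace per node and return each node's finish time."""
    n = len(traces)
    clock = [0.0] * n
    pc = [0] * n
    msgs = defaultdict(deque)  # (src, dst) -> queue of (arrival_time, is_sync)
    acks = defaultdict(deque)  # (src, dst) -> times the receiver took a sync message
    waiting = [False] * n      # True while a synchronous send awaits its ack

    def step(i):
        """Advance node i by one action; return False if it is blocked."""
        op = traces[i][pc[i]]
        if op.kind == "compute":
            clock[i] += op.arg
        elif op.kind in ("send", "asend"):
            if not waiting[i]:
                arrival = clock[i] + LATENCY + op.arg / BANDWIDTH
                msgs[(i, op.peer)].append((arrival, op.kind == "send"))
                if op.kind == "send":
                    waiting[i] = True
                    return True  # message is in flight; stay on this op until acked
            else:
                if not acks[(i, op.peer)]:
                    return False
                clock[i] = max(clock[i], acks[(i, op.peer)].popleft())
                waiting[i] = False
        elif op.kind == "recv":
            if not msgs[(op.peer, i)]:
                return False
            arrival, is_sync = msgs[(op.peer, i)].popleft()
            clock[i] = max(clock[i], arrival)
            if is_sync:
                acks[(op.peer, i)].append(clock[i])
        else:
            raise ValueError(f"unknown operation: {op.kind}")
        pc[i] += 1
        return True

    progress = True
    while progress:
        progress = False
        for i in range(n):
            while pc[i] < len(traces[i]) and step(i):
                progress = True
    if any(pc[i] < len(traces[i]) for i in range(n)):
        raise RuntimeError("deadlock: at least one node is blocked forever")
    return clock


if __name__ == "__main__":
    traces = [
        [Op("compute", 5), Op("send", 20, 1), Op("compute", 3)],
        [Op("compute", 2), Op("recv", peer=0), Op("compute", 4)],
    ]
    print(simulate(traces))
```

Replacing `send` with `asend` in the first trace lets node 0 continue immediately after posting the message, which is the blocking versus non-blocking distinction the text draws. Because each node keeps a local clock and messages carry their arrival time, the result does not depend on the order in which nodes are stepped.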

A framework for simulating heterogeneous virtual processors

Simulation Symposium, …, 1999

This paper examines the layered software modules of a heterogeneous multiprocessor simulator and debugger, and the design patterns that span these modules. Lucent's LUxWORKS simulator and debugger works with multiple processor architectures. Its modeling infrastructure, processor models, processor monitor / control, hardware control, vendor simulator interface and Tcl/Tk extension layers are spanned by the following design patterns: 1) build and extend abstract virtual processors, 2) build reflective entities, and 3) build a covariant extensible system. Together these modules and patterns define a processor execution architecture that encourages reuse and dynamic extensibility.