Gaurav Kaul - Academia.edu

Gaurav Kaul

Papers by Gaurav Kaul

Abstractions for Parallelism: Patterns, Performance and Correctness

cs.manchester.ac.uk

Despite rapid advances in parallel hardware performance, the full potential of processing power is not being exploited in the software community for one clear reason: the difficulty of designing efficient and effective parallel applications. Identifying sub-tasks within the application, designing parallel algorithms, and balancing load among the processing units is a daunting task for novice programmers, and even experienced programmers are often trapped by design decisions that fall short of potential peak performance. Design patterns have been used as a notation to capture how experts in a given domain think about and approach their work. Over the last decade there have been several approaches to identifying common patterns that are used repeatedly in the parallel software design process. Documentation of these design patterns helps programmers by providing definitions, solutions and guidelines for common parallelization problems. A convenient way to further raise the level of abstraction and make it easier for programmers to write legible code is the philosophy of 'Separation of Concerns'. This separation is achieved by the Aspect-Oriented Programming (AOP) paradigm, which allows programmers to specify the concerns independently and lets the compiler 'weave' (AOP terminology for unification of modules) them at compile time. However, abstraction by its very nature often produces unoptimized code, as it frames the solution to a problem without much thought to the underlying machine architecture. Indeed, in the current phase of the multicore era, where chip manufacturers are continuously experimenting with processor architectures, an optimization on one architecture might not yield any benefit on another from a different chip manufacturer.
Using an auto-tuner, one can automatically explore the optimization space for a particular computational kernel on a given processor architecture. The last relevant aspect of concern in this project is the formal specification and verification of properties of parallel programs. It is a well-known fact that parallel programs are particularly prone to insidious defects such as deadlocks and race conditions due to shared variables and locking. Using tools from formal verification, it is however possible to guarantee certain safety properties (such as deadlock and data race avoidance) while refining successive abstractions down to code level. The interplay of abstractions, auto-tuning and correctness in the context of parallel software development is considered in this project report.

Implementing Scientific Computing Kernels on Manycore Architectures: A Design Pattern Based Approach

sadie.cs.manchester.ac.uk

Abstract: Multicores are now the only way to utilize the transistor budget provided by Moore's Law. The triple barriers of power dissipation, memory latency and limited parallelism in instruction streams have compelled programmers to look at other sources of parallelism. ...

