Reducing overhead in the Uintah framework to support short-lived tasks on GPU-heterogeneous architectures

Importance of explicit vectorization for CPU and GPU software performance

Kamran Karimi

Journal of Computational Physics, 2011

Fast heterogeneous computing with CUDA compatible Tesla GPU computing processor (personal supercomputing)

Mohammed Qadeer

2010

Profiling general purpose GPU applications

Rafael Sachetto

Proceedings - Symposium on Computer Architecture and High Performance Computing, 2009

A combined GPGPU-FPGA high-performance desktop

An Braeken

GPU Programming in Rust: Implementing High-Level Abstractions in a Systems-Level Language

Andrew Lumsdaine

2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, 2013

Mont-Blanc 2020: Towards Scalable and Power Efficient European HPC Processors

Pierre-Axel Lagadec

2021

The GPU used as a math co-processor in real time applications

Paulo Pagliosa

Proceedings of the VI …, 2007

Parallelism Support in SIMD/VLIW Image Processing Architectures

Richard Kleihorst

2005

Datacenter-Scale Analysis and Optimization of GPU Machine Learning Workloads

حفصة خمقاني

IEEE Micro, 2021

Fast Operations on Raster Images with SIMD Machine Architectures

Hamid Arabnia

Computer Graphics Forum, 1986

Current Trends in Parallel Computing

FIROJ ALI SK

International Journal of Computer Applications, 2012

Toward Supporting Multi-GPU Targets via Taskloop and User-Defined Schedules

Abid Muslim Malik

OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

GEMTC: GPU Enabled Many-Task Computing

Ioan Raicu

Improving the GPU space of computation under triangular domain problems

Nancy Hitschfeld

Design and evaluation of the GeMTC framework for GPU-enabled many-task computing

Justin Wozniak, Ioan Raicu

Proceedings of the 23rd international symposium on High-performance parallel and distributed computing - HPDC '14, 2014

Real-time data-intensive computing

Keith Beattie

AIP Conference Proceedings, 2016

Enabling Extremely Fine-grained Parallelism via Scalable Concurrent Queues on Modern Many-core Architectures

Poornima Karri

2021 29th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2021

Smith-Waterman Acceleration in Multi-GPUs: A Performance per Watt Analysis

Edans Sandes

Lecture Notes in Computer Science, 2017

Towards efficient GPU sharing on multicore processors

Tarek El-Ghazawi

Proceedings of the second international workshop on Performance modeling, benchmarking and simulation of high performance computing systems - PMBS '11, 2011

Specification and verification of GPGPU programs

Matej Mihelčić

Science of Computer Programming, 2014

CNN-based language and interpreter for image processing on GPUs

Guilherme DeSouza

International Journal of Parallel, Emergent and Distributed Systems, 2011

GPU Tensor Cores for Fast Arithmetic Reductions

Roberto Carrasco

IEEE Transactions on Parallel and Distributed Systems, 2021

Large-scale compute-intensive analysis via a combined in-situ and co-scheduling workflow approach

Salman Habib

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2015

Blaze-DEMGPU: Modular high performance DEM framework for the GPU architecture

Daniel Nico Wilke

SoftwareX, 2016

Combining high productivity and high performance in image processing using Single Assignment C on multi-core CPUs and many-core GPUs

Clemens Grelck

Journal of Electronic Imaging, 2012
