On Automatic Parallelization of Irregular Reductions on Scalable Shared Memory Systems
Related papers
Scalable Automatic Parallelization of Irregular Reductions on Shared Memory Multiprocessors
Improving parallel irregular reductions using partial array expansion
Proceedings of the 2001 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '01, 2001
Data partitioning-based parallel irregular reductions
Concurrency and Computation: Practice and Experience, 2004
Optimization techniques for parallel irregular reductions
Journal of Systems Architecture, 2003
Proceedings of the 14th international conference on Supercomputing - ICS '00, 2000
On improving the performance of data partitioning oriented parallel irregular reductions
Proceedings 10th Euromicro Workshop on Parallel, Distributed and Network-based Processing, 2002
An analytical model of locality-based parallel irregular reductions
Parallel Computing, 2008
Run-time characterization of irregular accesses applied to parallelization of irregular reductions
Proceedings International Conference on Parallel Processing Workshops, 2001
Architectural support for parallel reductions in scalable shared-memory multiprocessors
2001
Compiler and runtime support for irregular reductions on a multithreaded architecture
2002
Simultaneous parallel reduction on SIMD machines
1995
A Fast and Generic GPU-Based Parallel Reduction Implementation
2018 Symposium on High Performance Computing Systems (WSCAD)
Scaling irregular array-type reductions in OmpSs
2015
An Efficient Parallel Approach To Reduce Sparse Matrices With Invariant Entries
WIT Transactions on Information and Communication Technologies, 1970
Hardware for speculative reduction parallelization and optimization in DSM multiprocessors
1999
Balanced, Locality-Based Parallel Irregular Reductions
Lecture Notes in Computer Science, 2003
Reduction on arrays: comparison of performances among different algorithms
Automatic data and computation decomposition on distributed memory parallel computers
ACM Transactions on Programming Languages and Systems, 2002
Compiler and runtime support for enabling reduction computations on heterogeneous systems
Concurrency and Computation: Practice and Experience, 2012
2010
A Proposal for User-Defined Reductions in OpenMP
Lecture Notes in Computer Science, 2010
On algorithmic reductions in task-parallel programming models
2017
RICH: implementing reductions in the cache hierarchy
Proceedings of the 34th ACM International Conference on Supercomputing, 2020
Efficient shared-memory support for parallel graph reduction
Future Generation Computer Systems, 1997
SAC — FROM HIGH-LEVEL PROGRAMMING WITH ARRAYS TO EFFICIENT PARALLEL EXECUTION
Parallel Processing Letters, 2003
The dutch parallel reduction machine project
Future Generation Computer Systems, 1987