Locality-Conscious Nested-Loops Parallelization

Scheduling and partitioning for multiple loop nests

Edwin Sha

Proceedings of the 14th international symposium on Systems synthesis - ISSS '01, 2001

Run Time Parallelization

Raja Das

Encyclopedia of Parallel Computing, 2011

Readers Are Parallel Processors

Jonathan Grainger

Trends in Cognitive Sciences, 2019

Optimization of Nest-Loop Software Pipelining

Edwin Sha

Performance Technology for Complex Parallel and Distributed Systems

Sameer Shende

Scalable Computing: Practice and Experience, 2001

Optimizing nested loops with iterational and instructional retiming

Edwin Sha

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2005

Data wrangling at scale

Flavio Maria De Paoli

Proceedings of the 12th European Conference on Software Architecture: Companion Proceedings, 2018

Time-constrained loop scheduling with minimal resources

Edwin Sha

Using Retiming to Minimize Inter-Iteration Dependencies

Edwin Sha

2001

Parallel Experimentation: A Basic Scheme for Dynamic Efficiency

David P Ellerman

Social Science Research Network, 2004

Performance Evaluation of an Irregular Application Parallelized in Java

Michelle Strout

2010 39th International Conference on Parallel Processing Workshops, 2010

Timing optimization of nested loops considering code size for DSP applications

Edwin Sha

International Conference on Parallel Processing, 2004. ICPP 2004., 2004

Combined partitioning and data padding for scheduling multiple loop nests

Edwin Sha

Proceedings of the international conference on Compilers, architecture, and synthesis for embedded systems - CASES '01, 2001

An ordered heuristic for the allocation of resources in unrelated parallel-machines

Maria Madureira

International Journal of Industrial Engineering Computations, 2015

Optimizing Data Distribution for Loops on Embedded Multicore with Scratch-Pad Memory

Edwin Sha

Journal of Computers, 2014

Optimizing Timing and Code Size Using Maximum Direct Loop Fusion

Edwin Sha

Scaling alltoall collective on multi-core systems

Rahul Kumar

2008 IEEE International Symposium on Parallel and Distributed Processing, 2008

Optimizing overall loop schedules using prefetching and partitioning

Edwin Sha

IEEE Transactions on Parallel and Distributed Systems, 2000

Implementing parallelism and scheduling data flow graphs on Java virtual machine

Edwin Sha

2002

Loop Distribution and Fusion with Timing and Code Size Optimization

Edwin Sha

Journal of Signal Processing Systems, 2010

Minimizing Inter-Iteration Dependencies in Multi-Dimensional Loops

Sukumar Anapalli

cs.uakron.edu

SUPPLE: An efficient run-time support for non-uniform parallel loops

mightyhero370 For Loops

Journal of Systems Architecture, 1999

Optimizing synchronous systems for multi-dimensional applications

Edwin Sha

Proceedings the European Design and Test Conference. ED&TC 1995

General loop fusion technique for nested loops considering timing and code size

Edwin Sha

2004
