Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC

Submitted by webmaster on Sat, 06/01/2013 - 11:46

Title Implementing a systolic algorithm for QR factorization on multicore clusters with PaRSEC
Publication Type Tech Report
Year of Publication 2013
Authors Aupy, G., M. Faverge, Y. Robert, J. Kurzak, P. Luszczek, and J. Dongarra
Technical Report Series Title LAWN 277
Number UT-CS-13-709
Date Published 2013-05
Abstract This article introduces a new systolic algorithm for QR factorization and its implementation on a supercomputing cluster of multicore nodes. The algorithm targets a virtual 3D array and requires only local communications. The implementation uses threads at the node level and MPI for inter-node communications. The complexity of the implementation is addressed with the PaRSEC software, which takes as input a parametrized dependence graph derived from the algorithm and only requires the user to decide, at a high level, the allocation of tasks to nodes. We show that the new algorithm performs competitively with state-of-the-art QR routines on a supercomputer called Kraken, which shows that high-level programming environments such as PaRSEC provide a viable alternative to enhance the production of quality software on complex and hierarchical architectures.
