Object-Oriented Techniques for Sparse Matrix Computations in Fortran 2003
Related papers
Sparse Matrix Libraries in C++ for High Performance Architectures
We describe an object oriented sparse matrix library in C++ designed for portability and performance across a wide class of machine architectures. Besides simplifying the subroutine interface, the object oriented design allows the same driving code to be used for various sparse matrix formats, thus addressing many of the difficulties encountered with the typical approach to sparse matrix libraries. We also discuss the design of a C++ library for implementing various iterative methods for solving linear systems of equations. Performance results indicate that the C++ codes are competitive with optimized Fortran.
ACM Transactions on Mathematical Software, 2002
We discuss the interface design for the Sparse Basic Linear Algebra Subprograms (BLAS), the kernels in the recent standard from the BLAS Technical Forum that are concerned with unstructured sparse matrices. The motivation for such a standard is to encourage portable programming while allowing for library-specific optimizations. In particular, we show how this interface can shield one from concern over the specific storage scheme for the sparse matrix. This design makes it easy to add further functionality to the sparse BLAS in the future.
Design patterns for scientific computations on sparse matrices
2012
We discuss object-oriented software design patterns in the context of scientific computations on sparse matrices. Design patterns arise when multiple independent development efforts produce very similar designs, yielding an evolutionary convergence onto a good solution: a flexible, maintainable, high-performance design. We demonstrate how to engender these traits by implementing an interface for sparse matrix computations on NVIDIA GPUs starting from an existing sparse matrix library. We also present initial performance results.
A Linear Algebra Framework for Static High Performance Fortran Code Distribution
Scientific Programming, 1997
High Performance Fortran (HPF) was developed to support data parallel programming for single-instruction multiple-data (SIMD) and multiple-instruction multiple-data (MIMD) machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors, and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds, and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, and overlap analysis. The systematic use of an affine framework makes it possible to prove the compilation scheme correct.
A sparse matrix library in C++ for high performance architectures
Proceedings of the …
We describe an object oriented sparse matrix library in C++ built upon the Level 3 Sparse BLAS proposal [5] for portability and performance across a wide class of machine architectures. The C++ library includes algorithms for various iterative methods and supports the most common ...
Level 3 Basic Linear Algebra Subprograms for sparse matrices: a user level interface
ACM Transactions on Mathematical Software
This paper proposes a set of Level 3 Basic Linear Algebra Subprograms and associated kernels for sparse matrices. A major goal is to design and develop a common framework to enable efficient, and portable, implementations of iterative algorithms for sparse matrices on high-performance computers. We have designed the routines to shield the developer of mathematical software from most of the complexities of the various data structures used for sparse matrices. We have kept the interface and suite of codes as simple as possible while at the same time including sufficient functionality to cover most of the requirements of iterative solvers, and sufficient flexibility to cover most sparse matrix data structures. An important aspect of our framework is that it can be easily extended to incorporate new kernels if the need arises. We discuss the design, implementation and use of subprograms for the multiplication of a full matrix by a sparse one and for the solution of sparse triangu...
A Framework for Sparse Matrix Code Synthesis from High-level Specifications
ACM/IEEE SC 2000 Conference (SC'00), 2000
We present compiler technology for synthesizing sparse matrix code from (i) dense matrix code, and (ii) a description of the index structure of a sparse matrix. Our approach is to embed statement instances into a Cartesian product of statement iteration and data spaces, and to produce efficient sparse code by identifying common enumerations for multiple references to sparse matrices. The approach works for imperfectly-nested codes with dependences, and produces sparse code competitive with handwritten library code for the Basic Linear Algebra Subroutines (BLAS).
Design patterns for sparse-matrix computations on hybrid CPU/GPU platforms
Scientific Programming, 2014
We apply object-oriented software design patterns to develop code for scientific software involving sparse matrices. Design patterns arise when multiple independent developments produce similar designs which converge onto a generic solution. We demonstrate how to use design patterns to implement an interface for sparse matrix computations on NVIDIA GPUs starting from PSBLAS, an existing sparse matrix library, and from existing sets of GPU kernels for sparse matrices. We also compare the throughput of the PSBLAS sparse matrix-vector multiplication on two platforms exploiting the GPU with that obtained by a CPU-only PSBLAS implementation. Our experiments exhibit encouraging results regarding the comparison between CPU and GPU executions in double precision, obtaining a speedup of up to 35.35 on NVIDIA GTX 285 with respect to AMD Athlon 7750, and up to 10.15 on NVIDIA Tesla C2050 with respect to Intel Xeon X5650.
Aspect-Oriented Programming of Sparse Matrix Code
1997
The expressiveness conferred by high-level and object-oriented languages is often impaired by concerns that cross-cut a program's basic functionality. Execution time, data representation, and numerical stability are three such concerns that are of great interest to numerical analysts. Using aspect-oriented programming we have created AML, a system for sparse matrix computation that deals with these concerns separately and explicitly while preserving the expressiveness of the original functional language. The resulting code maintains the efficiency of highly tuned low-level code, yet is ten times shorter.
An object oriented design for high performance linear algebra on distributed memory architectures
1993
We describe the design of ScaLAPACK++, an object oriented C++ library for implementing linear algebra computations on distributed memory multicomputers. This package, when complete, will support distributed matrix operations for symmetric, positive-definite, and non-symmetric cases. In ScaLAPACK++ we have employed object oriented design methods to enhance scalability, portability, flexibility, and ease-of-use. We illustrate some of these points by describing the implementation of basic algorithms and comment on tradeoffs between elegance, generality, and performance.