A direct approach for finding loop transformation matrices
Abstract
Loop transformations such as loop interchange, reversal, and skewing have been unified under linear matrix transformations. A legal transformation matrix is usually generated from distance vectors or direction vectors. Unfortunately, for some nested loops distance vectors may not be computable, while direction vectors may not contain useful information. We propose using linear equations or inequalities of distance vectors to approximate data dependence. This approach is advantageous because (1) many loops that have no constant distance vectors have very simple equations of distance vectors, and (2) these equations contain more information than direction vectors do, which improves the chance of exploiting potential parallelism.
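As background for the notion of a legal transformation matrix, the following is a minimal sketch of the standard legality test from the unimodular-transformation literature: a matrix T is legal for a set of distance vectors if every transformed vector T·d remains lexicographically positive. The function names, the two matrices, and the sample distance vectors are hypothetical illustrations, not taken from the paper.

```python
from typing import Sequence


def lex_positive(v: Sequence[int]) -> bool:
    """Return True if the first nonzero component of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False  # the zero vector is not lexicographically positive


def apply(T, d):
    """Multiply matrix T (list of rows) by the column vector d."""
    return [sum(t * x for t, x in zip(row, d)) for row in T]


def is_legal(T, distances) -> bool:
    """T is legal if it keeps every distance vector lexicographically positive."""
    return all(lex_positive(apply(T, d)) for d in distances)


# Hypothetical 2-deep loop nest with two constant distance vectors.
distances = [(1, -1), (0, 1)]

INTERCHANGE = [[0, 1], [1, 0]]  # swaps the two loops
SKEW = [[1, 0], [1, 1]]         # skews the inner loop by the outer index

print(is_legal(INTERCHANGE, distances))  # (1, -1) maps to (-1, 1)
print(is_legal(SKEW, distances))         # (1, -1) maps to (1, 0)
```

Under these assumed distance vectors, interchange is rejected because it reverses the dependence (1, -1), while the skew keeps both dependences in order.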
In general, the equations or inequalities that approximate the data dependence of a given nested loop are not unique; hence their classification is discussed for the purpose of loop transformation. Efficient algorithms are developed to generate all kinds of linear equations of distance vectors for a given nested loop. The issue of how to obtain a desired transformation matrix from those equations is also addressed.
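To make the idea of an equation of distance vectors concrete, here is a small brute-force sketch on an assumed loop nest, `A[i][j] = A[j][i]` for `0 <= i, j < N`. This example, the loop bound, and the matrix T are illustrative assumptions and are not taken from the paper, whose algorithms derive such equations analytically rather than by enumeration. The distance vectors of this loop are not constant, yet they all satisfy d1 + d2 = 0.

```python
from itertools import product

N = 5  # hypothetical loop bound


def write_index(i, j):
    return (i, j)  # element written by iteration (i, j): A[i][j]


def read_index(i, j):
    return (j, i)  # element read by iteration (i, j): A[j][i]


def distance_vectors(n):
    """Enumerate all dependence distance vectors by brute force."""
    iters = list(product(range(n), repeat=2))  # lexicographic execution order
    found = set()
    for a, p in enumerate(iters):
        for q in iters[a + 1:]:  # q executes after p
            conflict = (
                write_index(*p) == read_index(*q)      # flow dependence
                or read_index(*p) == write_index(*q)   # anti dependence
                or write_index(*p) == write_index(*q)  # output dependence
            )
            if conflict:
                found.add((q[0] - p[0], q[1] - p[1]))
    return found


def transform(T, d):
    """Multiply matrix T (list of rows) by the column vector d."""
    return tuple(sum(t * x for t, x in zip(row, d)) for row in T)


D = distance_vectors(N)
print(sorted(D))                          # distances vary with the iteration
print(all(d1 + d2 == 0 for d1, d2 in D))  # but all satisfy d1 + d2 = 0

# Assumed unimodular matrix whose first row is the equation's coefficients.
T = [[1, 1], [1, 0]]
print(sorted(transform(T, d) for d in D))
```

Because the first row of T is (1, 1), the first component of every transformed distance is d1 + d2 = 0, so in this sketch the new outer loop carries no dependence and its iterations could run in parallel. A direction vector (<, >) alone does not fix the sign of d1 + d2, so it could not justify this choice of T.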
Author information
Authors and Affiliations
- Electrical Engineering Department, Texas A&M University, College Station, TX 77843, U.S.A.
  Lin Hua & Lü Mi
- Hewlett-Packard Lab, P.O. Box 10490, Palo Alto, CA 94303, U.S.A.
  Jesse Z. Fang
Authors
- Lin Hua
- Lü Mi
- Jesse Z. Fang
Corresponding author
Correspondence to Lin Hua.
Additional information
This research was supported by the Texas Advanced Technology Program under Grant No. 999903-165.
LIN Hua received his B.S. and M.S. degrees in electrical engineering from Fudan University, People’s Republic of China, in 1983 and 1986, respectively. Beginning in 1986, he taught for three years in the Department of Electronics Engineering at Fudan University as a Lecturer, and he is currently a Ph.D. candidate in the Department of Electrical Engineering at Texas A&M University, College Station, TX, USA. His research interests include the design and analysis of parallel algorithms for combinatorial optimization problems, and parallelizing compilers.
Lü Mi received her M.S. and Ph.D. degrees in electrical engineering from Rice University, Houston, TX, USA, in 1984 and 1987, respectively.
She joined the Department of Electrical Engineering, Texas A&M University in 1987 where she is currently an Associate Professor. Her research interests include parallel computing, distributed processing, parallel computer architectures and applications, computational geometry and VLSI algorithms. She has published over 60 technical papers in these areas. Her research has been funded by the National Science Foundation and the Texas Advanced Technology Program.
Dr. LÜ is a senior member of the IEEE Computer Society. She is an Associate Editor of a number of professional journals, and the Stream Chairman of the 7th International Conference on Computing and Information. She served on the panels of NSF and the IEEE Workshop on Imprecise and Approximate Computation’92, and on the Program Committees of the International Conference on Computing and Information’94, the Joint Conference on Information Science’94 and the IASTED/ISMM International Conference on Parallel and Distributed Computing and Systems’95. She is a registered professional engineer.
Jesse Z. FANG received his B.S. degree in mathematics from Fudan University, Shanghai, China, and his M.S. and Ph.D. degrees in computer science from The University of Nebraska, Lincoln, in 1982 and 1984, respectively.
After graduation, he taught in the Computer Science Department at Wichita State University and was a visiting senior computer scientist in the Center for Supercomputing Research and Development at the University of Illinois, Urbana-Champaign. From 1986 to 1989, he was a consultant member of technical staff at the Concurrent Computer Corp. From 1989 to 1991, he worked on parallel/vectorized compilers and supercomputing system design for CONVEX Computer Corp. as a program manager in the Software Development Department. He is currently working at Hewlett-Packard Laboratories as a project manager, developing compilers for Hewlett-Packard's new-generation RISC architecture. His research interests are instruction-level parallel compiler technologies for RISC architectures, superscalar and VLIW RISC architectures, parallel processing systems, parallel/vectorized compilers, scheduling and synchronization.
About this article
Cite this article
Lin, H., Lü, M. & Fang, J.Z. A direct approach for finding loop transformation matrices. J. of Comput. Sci. & Technol. 11, 237–256 (1996). https://doi.org/10.1007/BF02943132
- Received: 15 July 1995
- Issue date: May 1996
- DOI: https://doi.org/10.1007/BF02943132