The Effect of Communication Time Delays in Parallel Computations

Parallel Redistribution of Multidimensional Data

Parallel Computing, 2007

On a parallel computer with distributed memory, multidimensional arrays are usually mapped onto the nodes such that only one or more of the indexes are distributed. Global computation on data associated with the remaining indexes may then be done without communication. However, when global communication is needed on all indexes, a complete redistribution of the data is needed. In ...
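The redistribution described in this abstract can be illustrated with a small single-process simulation. The sketch below is not the paper's algorithm: it assumes a two-dimensional array, a contiguous block distribution, and hypothetical helper names (`split_rows`, `redistribute_rows_to_cols`), and it models the all-to-all exchange with plain Python lists rather than real message passing.

```python
def split_rows(matrix, nprocs):
    """Distribute the first index: node p holds a contiguous block of rows."""
    n = len(matrix)
    bounds = [n * p // nprocs for p in range(nprocs + 1)]
    return [matrix[bounds[p]:bounds[p + 1]] for p in range(nprocs)]


def redistribute_rows_to_cols(row_blocks, ncols):
    """Simulated all-to-all exchange.

    Before: node p holds some rows and every column.
    After:  node q holds every row and a contiguous block of columns.
    """
    nprocs = len(row_blocks)
    bounds = [ncols * q // nprocs for q in range(nprocs + 1)]

    # messages[p][q] is the sub-block that node p would send to node q.
    messages = [
        [[row[bounds[q]:bounds[q + 1]] for row in block] for q in range(nprocs)]
        for block in row_blocks
    ]

    # Each receiving node stacks the pieces it got, in sender rank order.
    col_blocks = []
    for q in range(nprocs):
        received = []
        for p in range(nprocs):
            received.extend(messages[p][q])
        col_blocks.append(received)
    return col_blocks


matrix = [[10 * r + c for c in range(6)] for r in range(4)]
row_blocks = split_rows(matrix, 2)                  # rows distributed
col_blocks = redistribute_rows_to_cols(row_blocks, 6)  # columns distributed
```

After the exchange, each simulated node can operate along the full first index of its column block without further communication, which is the situation the abstract describes for the non-distributed indexes.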

On the massively parallel solution of the assignment problem

Journal of Parallel and Distributed Computing, 1991

In this paper we discuss the design, implementation, and effectiveness of massively parallel algorithms for the solution of large-scale dense assignment problems. In particular, we study the auction algorithm of Bertsekas, an algorithm based on the method of multipliers of Hestenes and Powell, and an algorithm based on the alternating direction method of multipliers of Eckstein. We discuss alternative approaches to the massively parallel implementation of the auction algorithm, including Jacobi, Gauss-Seidel, and a hybrid scheme. The hybrid scheme, in particular, exploits two different levels of parallelism and an efficient way of communicating the data between them without the need to perform general router operations across the hypercube network. We then study the performance of massively parallel implementations of the two methods of multipliers. Implementations are carried out on the Connection Machine CM-2, and the algorithms are evaluated empirically with the solution of large-scale problems. The hybrid scheme significantly outperforms all of the other methods and gives the best computational results to date for a massively parallel solution to this problem.
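For readers unfamiliar with the auction algorithm named above, here is a minimal sequential sketch in which one unassigned person bids at a time (the Gauss-Seidel flavor). It is an illustration under stated assumptions, not the paper's CM-2 implementation: the function name `auction_assignment`, the benefit-maximization form, and the fixed `eps` are choices made for this example. With integer benefits and `eps` below `1/n`, the resulting assignment is optimal.

```python
def auction_assignment(benefit, eps):
    """Assign n persons to n objects, maximizing total benefit.

    benefit[i][j] is the benefit of giving object j to person i.
    Returns (assigned, prices), where assigned[i] is person i's object.
    """
    n = len(benefit)
    prices = [0.0] * n
    owner = [None] * n       # owner[j] = person currently holding object j
    assigned = [None] * n    # assigned[i] = object currently held by person i
    unassigned = list(range(n))

    while unassigned:
        i = unassigned.pop()
        # Net value of each object to person i at current prices.
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        best_value = values[best]
        second_value = max(
            (v for j, v in enumerate(values) if j != best), default=best_value
        )
        # Bid: raise the price by the value gap plus eps.
        prices[best] += best_value - second_value + eps
        previous = owner[best]
        if previous is not None:
            assigned[previous] = None
            unassigned.append(previous)
        owner[best] = i
        assigned[i] = best

    return assigned, prices


benefit = [[10, 5, 1], [8, 9, 2], [3, 7, 6]]
assigned, prices = auction_assignment(benefit, eps=0.25)
total = sum(benefit[i][assigned[i]] for i in range(len(benefit)))
```

In a Jacobi-style variant, all unassigned persons would compute bids from the same price vector simultaneously and each object would accept the highest bid; the sketch above keeps to one bidder per iteration for clarity.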

Thinking in Parallel: Some Basic Data-Parallel Algorithms and Techniques

2008

Copyright 2007, Uzi Vishkin. These class notes reflect the theoretical part of the Parallel Algorithms course at UMD. The parallel programming part and its computer architecture context within the PRAM-On-Chip Explicit Multi-Threading (XMT) platform are provided through the XMT home page www.umiacs.umd.edu/users/vishkin/XMT and the class home page. Comments are welcome: please write to me using my last name at umd.edu

On the Impact of Communication Complexity on the Design of Parallel Numerical Algorithms

IEEE Transactions on Computers, 1984

This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In this second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm-independent upper bounds on system performance are derived for several problems that are important to scientific computation. Research supported by the National Aeronautics and Space Administration under NASA Contract Nos. NAS1-17070 and NAS1-17130 while the authors were in residence at the Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, Hampton, VA 23665. Primary support for the first author was provided by an IBM Faculty Development Grant. Report No. NASA CR-172436; ICASE Report No. 84-41; Report Date: August 1984; Authors: Dennis Gannon and John Van Rosendale; Performing Organization: Institute for Computer Applications in Science and Engineering.
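As background for the first model, the classic Hockney characterization of a vector operation uses two parameters: the asymptotic rate r_inf and the half-performance length n_half, the vector length at which half of r_inf is achieved. The sketch below shows only that textbook form; the paper's generalization is not reproduced here, and the function names and numeric values are hypothetical.

```python
def hockney_time(n, r_inf, n_half):
    """Time to process a vector of length n: t(n) = (n + n_half) / r_inf."""
    return (n + n_half) / r_inf


def hockney_rate(n, r_inf, n_half):
    """Achieved rate n / t(n); approaches r_inf as n grows."""
    return n / hockney_time(n, r_inf, n_half)


# Illustrative (made-up) parameters: 100 results per time unit, n_half = 50.
rate_at_half = hockney_rate(50, r_inf=100.0, n_half=50)
rate_long = hockney_rate(5000, r_inf=100.0, n_half=50)
```

The fixed term n_half / r_inf acts as a startup cost, so short vectors run well below the asymptotic rate. That is why a model of this kind is useful for reasoning about the cost of data movement.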

Algorithms sequential and parallel: a unified approach

Microelectronics Journal, 2001

Creating this document (i.e., typing this document into MS Word) took approximately 1 staff day by one of the authors. Therefore, while this document is somewhat extensive, we anticipate that the changes to the text will require no more than 0.5 staff days of effort by the publisher, excluding figures. There are some revisions required to several figures, which we anticipate should take an additional 0.5 staff days of effort by the publisher. Therefore, we anticipate that with minimal effort on the part of the publisher, a significantly enhanced version of the text will be available.

Randomized parallel algorithms for the multidimensional assignment problem

2004

The multidimensional assignment problem (MAP) is a combinatorial optimization problem arising in diverse applications such as computer vision and motion tracking. In the MAP, the objective is to match tuples of objects with minimum total cost. Randomized parallel algorithms are proposed to solve MAPs appearing in multi-sensor multi-target applications. A parallel construction heuristic is described, together with some variations, as well as a parallel local search heuristic.
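To make the idea of a randomized construction heuristic concrete, here is a small sketch for the three-dimensional case with cost[i][j][k]. It is not the heuristic from the paper: the restricted-candidate rule, the function names, and the `candidates` parameter are assumptions made for illustration. The restarts are independent of each other, so they could be spread across processors; here they run sequentially.

```python
import random


def random_greedy_solution(cost, rng, candidates=2):
    """Build one feasible solution: each i gets a distinct j and a distinct k.

    For each i, pick at random among the `candidates` cheapest free (j, k) pairs.
    """
    n = len(cost)
    free_j, free_k = set(range(n)), set(range(n))
    solution = []
    for i in range(n):
        pairs = sorted((cost[i][j][k], j, k) for j in free_j for k in free_k)
        _, j, k = rng.choice(pairs[:candidates])
        free_j.remove(j)
        free_k.remove(k)
        solution.append((i, j, k))
    return solution


def total_cost(cost, solution):
    return sum(cost[i][j][k] for i, j, k in solution)


def best_of_restarts(cost, restarts, seed=0):
    """Run independent randomized constructions and keep the cheapest one."""
    rng = random.Random(seed)
    trials = (random_greedy_solution(cost, rng) for _ in range(restarts))
    return min(trials, key=lambda s: total_cost(cost, s))


n = 4
gen = random.Random(1)
cost = [[[gen.randint(1, 20) for _ in range(n)] for _ in range(n)] for _ in range(n)]
single = best_of_restarts(cost, restarts=1)
best = best_of_restarts(cost, restarts=20)
```

A local search step, such as swapping the j or k entries of two tuples when that lowers the cost, would typically follow the construction. It is omitted to keep the sketch short.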