SCHEDULING STRATEGIES FOR MIXED DATA AND TASK PARALLELISM ON HETEROGENEOUS CLUSTERS
Related papers
Steady-state scheduling of task graphs on heterogeneous computing platforms
2003
In this paper, we consider the execution of a complex application on a heterogeneous "grid" computing platform. The complex application consists of a suite of identical, independent problems to be solved. In turn, each problem consists of a set of tasks. There are dependences (precedence constraints) between these tasks. A typical example is the repeated execution of the same algorithm on several distinct data samples. We use a non-oriented graph to model the grid platform, where resources have different speeds of computation and communication. We show how to determine the optimal steady-state scheduling strategy for each processor (the fraction of time spent computing and the fraction of time spent communicating with each neighbor) and how to build such a schedule. This result holds for a quite general framework, allowing for cycles and multiple paths in the platform graph.
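To make the steady-state approach concrete, the sketch below solves the throughput-maximizing linear program for a toy star-shaped platform with a single task type and a single-port master; the worker speeds and link costs are illustrative values, not the general platform graphs handled in the paper.

import numpy as np
from scipy.optimize import linprog

# Hypothetical platform: a master sends tasks to 3 workers (single-port model).
w = np.array([2.0, 3.0, 5.0])   # seconds of computation per task on each worker
c = np.array([1.0, 0.5, 2.0])   # seconds to send one task to each worker

n = len(w)
# Variables rho_i: tasks per second processed by worker i in steady state.
# Maximize sum(rho)  <=>  minimize -sum(rho).
objective = -np.ones(n)

# Each worker computes at most 100% of the time: rho_i * w_i <= 1.
A_compute = np.diag(w)
# The master communicates at most 100% of the time: sum_i rho_i * c_i <= 1.
A_comm = c.reshape(1, n)

A_ub = np.vstack([A_compute, A_comm])
b_ub = np.ones(n + 1)

res = linprog(objective, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * n)
rho = res.x
print("per-worker rates:", rho, "total throughput:", rho.sum(), "tasks/s")

The optimal rho_i give exactly the fractions of time each worker spends computing (rho_i * w_i) and the master spends sending to it (rho_i * c_i), which is the steady-state schedule the paper then turns into a periodic schedule.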
2004
In this paper, we consider steady-state scheduling techniques for mapping a collection of task graphs onto heterogeneous systems, such as clusters and grids. We advocate the use of steady-state scheduling to solve this difficult problem. Due to space limitations, we concentrate on complexity results. We show that the problem of optimizing the steady-state throughput is NP-Complete in the general case. We formulate a compact version of the problem that belongs to the NP complexity class but which does not restrict the optimality of the solution.
STEADY-STATE SCHEDULING ON HETEROGENEOUS CLUSTERS
International Journal of Foundations of Computer Science, 2005
In this paper, we consider steady-state scheduling techniques for heterogeneous systems, such as clusters and grids. We advocate the use of steady-state scheduling to solve a variety of important problems, which would be too difficult to tackle with the objective of makespan minimization. We give a few successful examples before discussing the main limitations of the approach.
The work addresses the problem of allocating parallel application tasks for execution on heterogeneous computing resources on the Grid. The proposed allocation paradigm considers issues pertinent to the Grid environment. Basically, our model considers the relationship between the clients and the environment on one side, and the relationship between the system providers and the environment on the other. This consideration is reflected in utilizing the client and the system specification to determine the objective function and the constraints of the mapping problem. The paradigm adopts a multilevel graph partitioning and mapping approach. The objective of the mapping is to minimize the parallel application execution time, subject to the specified constraints. The paradigm introduces an efficient heuristic for the coarsening step, called the VHEM method. The simulation study shows that the heuristic can achieve a very high reduction factor when the ratio of the number of tasks to the number of processors exceeds a threshold value. The paradigm also introduces an efficient heuristic for the refinement phase, in which the space of processor preference for remapping includes the subset of processors on the shortest paths from the currently allocated processor to all other processors to which adjacent vertices are allocated.
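The coarsening step can be illustrated with classic heavy-edge matching, of which the paper's VHEM heuristic is a Grid-oriented variant; the sketch below implements only the generic matching pass, and the edge-dictionary representation is an assumption made for brevity.

import random
from collections import defaultdict

def heavy_edge_matching(num_vertices, edges):
    """One coarsening pass: visit vertices in random order and match each
    unmatched vertex with its unmatched neighbour along the heaviest edge.
    `edges` is a dict {(u, v): weight} with u < v.
    Returns a map from fine vertex id to coarse vertex id.
    (Generic heavy-edge matching sketch; the paper's VHEM adds further
    Grid-specific criteria.)"""
    adj = defaultdict(list)
    for (u, v), weight in edges.items():
        adj[u].append((v, weight))
        adj[v].append((u, weight))

    matched = {}
    order = list(range(num_vertices))
    random.shuffle(order)
    for u in order:
        if u in matched:
            continue
        candidates = [(weight, v) for v, weight in adj[u] if v not in matched]
        if candidates:
            _, v = max(candidates)      # heaviest edge to an unmatched neighbour
            matched[u] = v
            matched[v] = u

    # Matched pairs collapse into a single coarse vertex.
    coarse_id, next_id = {}, 0
    for u in range(num_vertices):
        if u in coarse_id:
            continue
        coarse_id[u] = next_id
        if u in matched:
            coarse_id[matched[u]] = next_id
        next_id += 1
    return coarse_id

# Example: heavy_edge_matching(4, {(0, 1): 5.0, (1, 2): 1.0, (2, 3): 4.0})
# collapses {0, 1} and {2, 3} into two coarse vertices.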
Clustering-Based Task Scheduling in a Large Number of Heterogeneous Processors
IEEE Transactions on Parallel and Distributed Systems, 2016
Parallelization paradigms for effective execution in a Directed Acyclic Graph (DAG) application have been widely studied in the area of task scheduling. Schedule length can be varied depending on task assignment policies, scheduling policies, and heterogeneity in terms of each processor and each communication bandwidth in a heterogeneous system. One disadvantage of existing task scheduling algorithms is that the schedule length cannot be reduced for a data intensive application. In this paper, we propose a clustering-based task scheduling algorithm called Clustering for Minimizing the Worst Schedule Length (CMWSL) to minimize the schedule length in a large number of heterogeneous processors. First, the proposed method derives the lower bound of the total execution time for each processor by taking both the system and application characteristics into account. As a result, the number of processors used for actual execution is regulated to minimize the Worst Schedule Length (WSL). Then, the actual task assignment and task clustering are performed to minimize the schedule length until the total execution time in a task cluster exceeds the lower bound. Experimental results indicate that CMWSL outperforms both existing list-based and clustering-based task scheduling algorithms in terms of the schedule length and efficiency, especially in data-intensive applications.
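A minimal sketch of the clustering step described above, assuming a generic per-processor execution-time bound; it is not the paper's CMWSL algorithm, whose lower-bound derivation and worst-schedule-length analysis are more involved.

def cluster_until_threshold(tasks, threshold):
    """Greedily pack tasks, taken in priority order, into clusters whose total
    execution time stays below a per-processor bound `threshold`; each cluster
    is then assigned to one processor. Illustrative only."""
    clusters = []
    current, current_time = [], 0.0
    for task_id, exec_time in tasks:              # tasks sorted by priority
        if current and current_time + exec_time > threshold:
            clusters.append(current)
            current, current_time = [], 0.0
        current.append(task_id)
        current_time += exec_time
    if current:
        clusters.append(current)
    return clusters

# Example: cluster_until_threshold([("t1", 3.0), ("t2", 2.0), ("t3", 4.0)], 5.0)
# yields [["t1", "t2"], ["t3"]], i.e. two processors are used instead of three.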
Dynamic scheduling of a batch of parallel task jobs on heterogeneous clusters
Parallel Computing, 2011
This paper addresses the problem of minimizing the scheduling length (make-span) of a batch of jobs with different arrival times. A job is described by a direct acyclic graph (DAG) of parallel tasks. The paper proposes a dynamic scheduling method that adapts the schedule when new jobs are submitted and that may change the processors assigned to a job during its execution. The scheduling method is divided into a scheduling strategy and a scheduling algorithm. We also propose an adaptation of the Heterogeneous Earliest-Finish-Time (HEFT) algorithm, called here P-HEFT, to handle parallel tasks in heterogeneous clusters with good efficiency without compromising the makespan. The results of a comparison of this algorithm with another DAG scheduler using a simulation of several machine configurations and job types shows that P-HEFT gives a shorter makespan for a single DAG but scores worse for multiple DAGs. Finally, the results of the dynamic scheduling of a batch of jobs using the proposed scheduler method showed significant improvements for more heavily loaded machines when compared to the alternative resource reservation approach.
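The split between a scheduling strategy and a scheduling algorithm suggests an event-driven loop of the following shape; this is only a sketch, and the scheduler object with its add_job, mark_done, reschedule, and duration methods is a hypothetical placeholder, not the paper's P-HEFT implementation.

import heapq
import itertools

def run_batch(arrivals, scheduler):
    """Event-driven loop: every job arrival or task completion triggers a
    re-invocation of the scheduling algorithm, so tasks of running jobs may be
    re-mapped to other processors, as the dynamic method above allows.
    `arrivals` is a list of (time, job) pairs, each job being a DAG of
    parallel tasks."""
    seq = itertools.count()                       # tie-breaker for the heap
    events = [(t, next(seq), "job_arrival", job) for t, job in arrivals]
    heapq.heapify(events)
    while events:
        now, _, kind, payload = heapq.heappop(events)
        if kind == "job_arrival":
            scheduler.add_job(payload)
        else:                                     # "task_finished"
            scheduler.mark_done(payload)
        for task, procs, start in scheduler.reschedule(now):
            finish = start + scheduler.duration(task, procs)
            heapq.heappush(events, (finish, next(seq), "task_finished", task))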
Task scheduling for heterogeneous computing systems
The Journal of Supercomputing, 2016
Efficient scheduling of tasks in heterogeneous computing systems is of primary importance for high-performance execution of programs. The programs are considered as multiple sequences of tasks that are represented as directed acyclic graphs (DAGs). Each task has its own execution time, which differs across processors, and each edge of the graph represents a constraint between the sequenced tasks. In this paper, we propose a new list-scheduling algorithm that schedules the tasks represented in the DAG to the processor that best minimizes the total execution time, taking into consideration the restriction of crossover between processors. This objective is achieved in two major phases: (a) computing priorities of each task that will be executed, and (b) selecting the processor that will handle each task. The first phase, priority computation, focuses on finding the execution sequence that minimizes the makespan of the overall execution. In a list-scheduling algorithm, the quality of the solution is very sensitive to the priorities assigned to the tasks; therefore, we include an enhanced weight calculation used in the ranking equation that determines task priorities. The second phase, processor selection, focuses on allocating the processor that is the best fit for each task to be executed. We enhance processor selection by introducing a randomized decision mechanism based on a threshold, which decides whether the task should be assigned to the processor with the lowest execution time or to the processor that produces the lowest finish time. This mechanism considers a balanced combination of the local and global optimal results to explore the search space efficiently to optimize the over-
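To illustrate the randomized decision mechanism described above, a minimal sketch is given below; the function names, the callable cost estimators, and the 0.5 default threshold are assumptions made for illustration, not the paper's exact formulation.

import random

def select_processor(task, processors, exec_time, earliest_finish, threshold=0.5):
    """With probability `threshold`, pick the processor with the lowest
    execution time for `task` (locally greedy); otherwise pick the processor
    with the lowest estimated finish time (globally aware). `exec_time` and
    `earliest_finish` are callables (task, processor) -> seconds supplied by
    the caller."""
    if random.random() < threshold:
        return min(processors, key=lambda p: exec_time(task, p))
    return min(processors, key=lambda p: earliest_finish(task, p))

Mixing the two criteria is what lets the heuristic trade off a locally optimal placement against the schedule-wide finish time, which is the balance the abstract refers to.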
Scheduling Strategies for Master-Slave Tasking on Heterogeneous Processor Grids
Workshop on Applied Parallel Computing, 2002
In this paper, we consider the problem of allocating a large number of independent, equal-sized tasks to a heterogeneous "grid" computing platform. We use a non-oriented graph to model a grid, where resources can have different speeds of computation and communication, as well as different overlap capabilities. We show how to determine the optimal steady-state scheduling strategy for each
Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing
Parallel and Distributed Systems, …, 2002
Efficient application scheduling is critical for achieving high performance in heterogeneous computing environments. The application scheduling problem has been shown to be NP-complete in general cases as well as in several restricted cases. Because of its key importance, this problem has been extensively studied and various algorithms have been proposed in the literature, which are mainly for systems with homogeneous processors. Although there are a few algorithms in the literature for heterogeneous processors, they usually require significantly high scheduling costs and they may not deliver good quality schedules with lower costs. In this paper, we present two novel scheduling algorithms for a bounded number of heterogeneous processors with an objective to simultaneously meet high performance and fast scheduling time, which are called the Heterogeneous Earliest-Finish-Time (HEFT) algorithm and the Critical-Path-on-a-Processor (CPOP) algorithm. The HEFT algorithm selects the task with the highest upward rank value at each step and assigns the selected task to the processor which minimizes its earliest finish time with an insertion-based approach. On the other hand, the CPOP algorithm uses the summation of upward and downward rank values for prioritizing tasks. Another difference is in the processor selection phase, which schedules the critical tasks onto the processor that minimizes the total execution time of the critical tasks. In order to provide a robust and unbiased comparison with the related work, a parametric graph generator was designed to generate weighted directed acyclic graphs with various characteristics. The comparison study, based on both randomly generated graphs and the graphs of some real applications, shows that our scheduling algorithms significantly surpass previous approaches in terms of both quality and cost of schedules, which are mainly presented with schedule length ratio, speedup, frequency of best results, and average scheduling time metrics.
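To make the priority phase concrete, the sketch below computes the upward rank that HEFT uses to order tasks; the small DAG and the averaged cost dictionaries are illustrative inputs, and the processor-selection step (insertion-based earliest finish time) is only indicated in a comment.

def upward_rank(dag, avg_exec, avg_comm):
    """Upward rank as defined for HEFT:
    rank_u(t) = avg_exec[t] + max over successors s of (avg_comm[(t, s)] + rank_u(s)),
    where avg_exec and avg_comm are costs averaged over the heterogeneous
    processors and links. `dag` maps each task to its list of successors."""
    memo = {}

    def rank(t):
        if t not in memo:
            memo[t] = avg_exec[t] + max(
                (avg_comm.get((t, s), 0.0) + rank(s) for s in dag.get(t, [])),
                default=0.0,
            )
        return memo[t]

    for t in dag:
        rank(t)
    return memo

# Tasks are then scheduled in decreasing rank_u order, each on the processor
# giving the smallest insertion-based earliest finish time.
dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
avg_exec = {"A": 10.0, "B": 8.0, "C": 6.0, "D": 4.0}
avg_comm = {("A", "B"): 2.0, ("A", "C"): 3.0, ("B", "D"): 1.0, ("C", "D"): 1.0}
print(sorted(upward_rank(dag, avg_exec, avg_comm).items(), key=lambda kv: -kv[1]))

On this toy DAG the ranks come out as A: 25, B: 13, C: 11, D: 4, so the list-scheduling order is A, B, C, D.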
From Heterogeneous Task Scheduling to Heterogeneous Mixed Data and Task Parallel Scheduling
Mixed parallelism, the combination of data- and task-parallelism, is a powerful way of increasing the scalability of entire classes of parallel applications on platforms comprising multiple compute clusters. While multi-cluster platforms are predominantly heterogeneous, previous work on mixed-parallel application scheduling targets only homogeneous platforms. In this paper we develop a method for extending existing scheduling algorithms for task-parallel applications on heterogeneous platforms to the mixed-parallel case.