Workflow scheduling for Grid environment
Before describing workflow scheduling, it is necessary to understand what a workflow and a workflow management system are in a grid environment.
A workflow can be viewed as a systematic plan of activities in which the files, data sets or tasks of a scientific experiment or project are passed from one machine to another in a grid environment, according to a set of rules, and executed there to achieve an overall goal [1].
The system that defines, creates, specifies, monitors and coordinates the execution of workflows within a distributed environment is called a workflow management system. A workflow management system automates a workflow, in whole or in part. Its functions can be divided into build-time functions and run-time functions [2]. The build-time functions are concerned with defining and modeling workflow tasks and their dependencies, while the run-time functions are concerned with managing workflow execution and the interactions with grid resources needed to process workflow applications.
Workflow scheduling is one of the important components of a workflow management system. It defines the set of procedural rules that direct the various tasks and sub-tasks to different machines in the grid network, with throughput as one of the distinct measures [3]. The objective of this paper is to define workflow scheduling, outline a taxonomy of workflow scheduling, and review the workflow scheduling strategies implemented by some well-known existing workflow systems.
Introduction:
Workflow scheduling is a form of task scheduling that decides where in a grid environment each task is executed. It focuses on mapping and managing the execution of inter-dependent tasks on shared resources that are not directly under the control of the workflow system [2]. Put another way, workflow scheduling is the process of assigning tasks to resources over time [4]. The objective of workflow scheduling is therefore to find, using various algorithms, an efficient route for the tasks and sub-tasks through the participating machines of a grid environment, judged by throughput, performance efficiency and other distinct measures.
Workflow scheduling can also be broadly described as meta-scheduling or global task scheduling. Producing an efficient workflow schedule is very difficult because the grid environment is distributed, dynamic, unreliable and heterogeneous.
There are three major steps in workflow scheduling [5]: workflow planning, advance reservation, and workflow execution with run-time rescheduling. Workflow planning selects a service for every task in the workflow and generates a schedule before the workflow executes. Advance reservation is important to workflow scheduling, especially for long-lasting workflow executions. Finally, during execution the workflow scheduler must be able to adapt and update the schedule in response to resource dynamics.
Approaches in Workflow Scheduling algorithm:
Workflow scheduling algorithms are resource allocation algorithms and form the fundamental part of workflow scheduling. Such algorithms are based on various approaches. Some of them are:
a. Task based
b. Workflow based
c. QoS based
d. MatchMaking based
The task-based approach [5] considers only the tasks or jobs that are ready to run. Its algorithms are greedy: they decide locally, job by job, which resources to allocate. The Min-min task scheduling algorithm, for example, uses the task-based approach and allocates resources using time estimates derived from performance modeling techniques.
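As an illustration of the greedy, task-based idea, here is a minimal Python sketch of a Min-min style heuristic. It is not taken from any of the cited systems. The function name min_min, the table exec_time and the resource names are hypothetical, and the time estimates are assumed to be supplied by some performance model. Only independent, ready-to-run tasks are considered.

```python
def min_min(exec_time):
    """Greedy Min-min sketch.

    exec_time[task][resource] is the estimated run time of `task` on
    `resource` (assumed to come from a performance model).
    Returns {task: (resource, finish_time)}.
    """
    # Time at which each resource becomes free again.
    ready_at = {r: 0.0 for times in exec_time.values() for r in times}
    unscheduled = set(exec_time)
    schedule = {}
    while unscheduled:
        # For every ready task, find the resource with the earliest completion time.
        best = {}
        for t in unscheduled:
            r = min(exec_time[t], key=lambda res: ready_at[res] + exec_time[t][res])
            best[t] = (ready_at[r] + exec_time[t][r], r)
        # Schedule the task whose earliest completion time is the smallest.
        task = min(best, key=lambda t: (best[t][0], t))
        finish, resource = best[task]
        schedule[task] = (resource, finish)
        ready_at[resource] = finish
        unscheduled.remove(task)
    return schedule


estimates = {
    "t1": {"A": 4, "B": 7},
    "t2": {"A": 2, "B": 3},
    "t3": {"A": 5, "B": 3},
}
print(min_min(estimates))
```

Each decision looks only at the jobs that are ready at that moment, which is why the approach is local: it never reconsiders an earlier assignment.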
The workflow-based approach [5] considers the overall workflow rather than just the ready-to-run jobs. In this approach all the jobs in the workflow are mapped a priori to resources in order to minimize the makespan [6] of the whole workflow, which may require remapping if the environment changes. A number of iterations are made to find the best possible mapping of jobs to resources for a given workflow. The main difference from the task-based approach is that the workflow-based approach (WBA) creates and compares many alternative whole-workflow schedules before the final schedule is chosen. A local search algorithm based on GRASP [8] uses the workflow-based approach for workflow allocation.
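The idea of comparing whole-workflow schedules can be sketched as follows. This is a deliberately simplified stand-in and not the GRASP-based algorithm itself: it only draws random complete mappings and keeps the one with the smallest makespan, with no greedy construction or local search phase. The names makespan and best_whole_schedule, the dependency table deps and the run-time table exec_time are hypothetical, data transfer times are ignored, and order is assumed to list the tasks in a valid topological order.

```python
import random


def makespan(order, deps, exec_time, mapping):
    """Finish time of the last task when tasks run in `order` under `mapping`."""
    ready_at = {}  # resource -> time at which it becomes free
    finish = {}    # task -> finish time
    for t in order:
        r = mapping[t]
        # A task starts once its resource is free and all its parents are done.
        start = max([ready_at.get(r, 0.0)] + [finish[p] for p in deps.get(t, ())])
        finish[t] = start + exec_time[t][r]
        ready_at[r] = finish[t]
    return max(finish.values())


def best_whole_schedule(order, deps, exec_time, iterations=200, seed=0):
    """Generate many complete mappings and keep the one with the lowest makespan."""
    rng = random.Random(seed)
    best_map, best_span = None, float("inf")
    for _ in range(iterations):
        candidate = {t: rng.choice(sorted(exec_time[t])) for t in order}
        span = makespan(order, deps, exec_time, candidate)
        if span < best_span:
            best_map, best_span = candidate, span
    return best_map, best_span


# Diamond-shaped workflow: a -> (b, c) -> d
order = ["a", "b", "c", "d"]
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
exec_time = {
    "a": {"A": 2, "B": 3},
    "b": {"A": 4, "B": 4},
    "c": {"A": 4, "B": 4},
    "d": {"A": 1, "B": 2},
}
print(best_whole_schedule(order, deps, exec_time))
```

Unlike the task-based approach, every candidate here is a mapping of the entire workflow, so the comparison is made on the makespan of the whole workflow rather than on individual job completion times.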
The QoS-based approach allocates resources according to the user's QoS constraints. A QoS constraint may be task specific, in terms of time and cost, or it may be specified for the overall execution of the workflow.
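A task-level QoS constraint can be illustrated with a small Python sketch. It assumes, purely for illustration, that each candidate service advertises an estimated time and a cost, and that the user gives a deadline for the task. The function cheapest_within_deadline and the service tuples are hypothetical and do not come from any of the cited systems.

```python
def cheapest_within_deadline(services, deadline):
    """Pick the cheapest service whose estimated time meets the deadline.

    services: list of (name, estimated_time, cost) tuples (hypothetical data).
    Returns the chosen tuple, or None if no service meets the deadline.
    """
    feasible = [s for s in services if s[1] <= deadline]
    if not feasible:
        return None
    # Lowest cost first; ties are broken in favour of the faster service.
    return min(feasible, key=lambda s: (s[2], s[1]))


services = [("fast", 5, 10), ("medium", 8, 6), ("slow", 12, 3)]
print(cheapest_within_deadline(services, deadline=9))   # ('medium', 8, 6)
print(cheapest_within_deadline(services, deadline=4))   # None
```

A constraint on the overall workflow would need an extra step that divides the total deadline or budget among the tasks, which this sketch does not attempt.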
The MatchMaking-based approach simply matches jobs with the available resources and then allocates the resource with the highest rank value.
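The match-then-rank idea can be sketched in a few lines of Python. The job is assumed to carry a requirements predicate and a rank function. Both, together with the resource attributes memory_mb and mips, are hypothetical names chosen for this example and not the interface of any particular matchmaker.

```python
def matchmake(job, resources):
    """Return the matching resource with the highest rank, or None."""
    # Step 1: keep only the resources that satisfy the job's requirements.
    candidates = [r for r in resources if job["requirements"](r)]
    if not candidates:
        return None
    # Step 2: among the matches, allocate the one with the highest rank value.
    return max(candidates, key=job["rank"])


resources = [
    {"name": "node1", "memory_mb": 1024, "mips": 900},
    {"name": "node2", "memory_mb": 4096, "mips": 600},
    {"name": "node3", "memory_mb": 8192, "mips": 750},
]
job = {
    "requirements": lambda r: r["memory_mb"] >= 2048,
    "rank": lambda r: r["mips"],
}
print(matchmake(job, resources)["name"])  # node3
```

Here node1 is the fastest machine but is filtered out because it does not meet the memory requirement, so the rank only decides between node2 and node3.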
How Workflow scheduling works:
The workflow scheduler component interacts with the Enactment Engine, a service that supervises the reliable and fault-tolerant execution of the tasks and the transfer of the files [9]. The resource broker and the performance predictor are auxiliary services that provide information about the resources available on the Grid and predictions of the expected execution times and data transfer times. The performance monitoring service provides up-to-date status information on the application execution and on the Grid environment. The scheduler can use this information to decide whether to reschedule. All these service components are important parts of the workflow management system.
The workflow scheduler itself consists of several components. The workflow evaluator transforms the dynamic and compact representation of the workflow so that it can be scheduled. The scheduling engine performs the actual scheduling, applying one of the alternative scheduling algorithms. The event generator generates rescheduling events to cope with the dynamic nature of the workflow.
References
[1] Workflow: An Introduction, Rob Allen
[2] A Taxonomy of workflow Management Systems for Grid Computing, Jia Yu and R Buyya.
[3] http://en.wikipedia.org/wiki/Workflow
[4] Grid scheduling, Marek Wieczorek, Institute of computer science, Innsbruck .
[5] QoS-based Scheduling of Workflow Applications on Service Grids, Jia Yu, Rajkumar Buyya and Chen Khong Tham.
[6] Task scheduling Strategies for Workflow-based Application in Grids, Jim Blythe, Sonal Jain, Ewa Deelman….
[7] Minimum makespan algorithm http://www.diku.dk/undervisning/2003e/404/msappr.pdf
[8] GRASP http://en.wikipedia.org/wiki/Greedy_randomized_adaptive_search_procedure
[9] Scheduling of Scientific Workflows in the ASKALON Grid Environment, Marek Wieczorek, Radu Prodan and Thomas Fahringer
[10] DAGMan http://www.cs.wisc.edu/condor/dagman/
[11] Condor http://www.cs.wisc.edu/condor/
[12] http://www.cs.wisc.edu/condor/manual/v6.4/2_11DAGMan_Applications.html
[13] The GrADS Project: Software Support for High-Level Grid Application Development, Francine Berman, Andrew Chien, Keith Cooper, Jack Dongarra, Ian Foster, Dennis Gannon, Lennart Johnsson, Ken Kennedy, Carl Kesselman, John Mellor-Crummey, Dan Reed, Linda Torczon, and Rich Wolski
[14] GridFlow: Workflow management for grid computing, Junwei Cao, Stephen A. Jarvis….
[15] ARMS: Agent-based Resource Management System for Grid Computing, Jarvis, saini,…
[16] Gridbus: The Gridbus Toolkit for Service Oriented Grid and Utility Computing: An Overview and Status Report.
[17] tuple space http://en.wikipedia.org/wiki/Tuple_space
[18] Scheduling of Scientific Workflows in the ASKALON Grid Environment, Marek Wieczorek, Radu Prodan and Thomas Fahringer, Institute of Computer Science, University of Innsbruck
[19] http://www.mygrid.org.uk/wiki/Mygrid/WorkflowDefinitions
[20] Workflow Management Coalition,The Workflow Reference Model