Adaptive Computing on the Grid Using AppLeS
Francine Berman, Richard Wolski, Henri Casanova, Walfredo Cirne, Holly Dail, Marcio Faerman, Silvia Figueira, Jim Hayes, Graziano Obertelli, Jennifer Schopf, Gary Shao, Shava Smallen, Neil Spring, Alan Su, and Dmitrii Zagorodnov
IEEE Transactions on Parallel and Distributed Systems, Vol. 14, No. 5, May 2003
Agenda
Introduction
Problems
AppLeS and its components
Result products
Related works
Discussions
Conclusions
Introduction
What is a Grid?
A collection of resources that can be used as an ensemble
What are resources?
Computational devices, networks, online instruments, storage archives, etc.
Problems
Heterogeneity
Different performance
Inconsistency
Shared
Fail
Upgraded
AppLeS Project
Application Level Scheduling
Goals
Investigate adaptive scheduling for Grid computing
Apply research results to applications to validate the efficacy of the approach and to extract Grid performance for the end user
Steps
(1) Resource Discovery
(2) Resource Selection
(3) Schedule Generation
(4) Schedule Selection
(5) Application Execution
(6) Schedule Adaptation
Resource Discovery
Depends on the Grid
A list of the user's logins
Resource discovery services of each Grid
Resource Selection
Simple SARA
Synthetic Aperture Radar Atlas
Developed by JPL and SDSC
Provides access to satellite images distributed across various repositories
End-to-end available bandwidth is predicted using NWS
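The selection step for a replicated image can be sketched as follows. This is a minimal illustration, not the SARA AppLeS code: the function name, the repository names, and the bandwidth numbers are all hypothetical, and the bandwidth forecasts are assumed to have been obtained already (for example from NWS) and are passed in as plain numbers.

```python
def pick_repository(file_size_mb, predicted_bw_mbps):
    """Pick the repository with the smallest estimated transfer time.

    file_size_mb      -- size of the requested image in megabytes
    predicted_bw_mbps -- dict: repository name -> predicted end-to-end
                         bandwidth in megabits per second (assumed input)
    """
    estimates = {
        name: file_size_mb * 8 / bw       # megabits / (megabits per second)
        for name, bw in predicted_bw_mbps.items()
        if bw > 0                          # skip unreachable repositories
    }
    if not estimates:
        raise ValueError("no reachable repository")
    best = min(estimates, key=estimates.get)
    return best, estimates[best]


# Hypothetical repositories and forecasts, for illustration only.
site, seconds = pick_repository(100, {"repo_a": 10, "repo_b": 40})
print(site, seconds)
```

The point of the sketch is that the choice is driven by the forecast at request time, so the same request may be served from a different repository when network conditions change.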
Performance Modeling
Jacobi 2D
Main loop
Loop until convergence
For all matrix entries A_{i,j}
A_{i,j} = (1/4)(A_{i,j} + A_{i+1,j} + A_{i-1,j} + A_{i,j+1} + A_{i,j-1})
Compute local error
Model
T_i = Area_i * Oper_i * AvailCPU_i + C_i, 1 <= i <= p
[Figure: five-point stencil around entry (i,j) with neighbors (i-1,j), (i+1,j), (i,j-1), and (i,j+1)]
Area - the size of the strip, Oper - execution time to compute one entry
AvailCPU - percentage of available CPU, C - Communication time
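A minimal sketch of how such a model could be evaluated for a candidate strip decomposition. The function and parameter names are hypothetical. As a simplifying assumption of this example, the Oper and AvailCPU terms are combined into a single effective time per entry for each host, so the sketch makes no claim about how the two are measured or combined in practice.

```python
def strip_times(areas, per_entry_time, comm_times):
    """Predicted time per strip: T_i = Area_i * e_i + C_i.

    areas          -- number of matrix entries in each strip
    per_entry_time -- effective seconds per entry on each host
                      (assumed to fold Oper_i and AvailCPU_i together)
    comm_times     -- predicted communication time C_i for each strip
    """
    if not (len(areas) == len(per_entry_time) == len(comm_times)):
        raise ValueError("one value per strip is required in every list")
    return [a * e + c for a, e, c in zip(areas, per_entry_time, comm_times)]


def iteration_time(areas, per_entry_time, comm_times):
    """One Jacobi iteration finishes when the slowest strip finishes."""
    return max(strip_times(areas, per_entry_time, comm_times))


# Illustrative numbers only: two strips on two hosts.
print(strip_times([4, 8], [0.5, 0.25], [1.0, 0.5]))
print(iteration_time([4, 8], [0.5, 0.25], [1.0, 0.5]))
```

Comparing `iteration_time` across candidate decompositions is one way a scheduler could rank schedules before execution.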
Schedule Generation
Complib
A computational biology application
Compare a library of unknown sequences against a database of “known” sequences using the FASTA scoring method
Parallelization
Master/Worker
Work size
Small unit size (Self-scheduling) - high overhead
Big unit size - load imbalance
AppLeS's Approach
Schedule Adaptation
MCell
A computational neuroscience application
Study biochemical interactions within living cells at the molecular level
Multiple independent tasks
Shared input
XSufferage
Based on Sufferage
Sufferage value = second-best completion time - best completion time
XSufferage also accounts for data replication time (zero when the data is already available locally)
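The idea can be sketched with a simplified greedy loop. This is an illustration under stated assumptions, not the authors' implementation: the function name and inputs are hypothetical, all tasks are assumed to share one input file, and the replication time is modeled as a per-host copy cost that drops to zero once a host holds the file.

```python
def sufferage_schedule(compute, transfer, has_data):
    """Greedy Sufferage-style assignment with a data replication term.

    compute[t][h] -- run time of task t on host h
    transfer[h]   -- time to copy the shared input to host h
    has_data      -- hosts that already hold the shared input
    Returns a list of (task, host, finish_time) in assignment order.
    """
    n_hosts = len(transfer)
    ready = [0.0] * n_hosts              # when each host becomes free
    local = set(has_data)                # hosts holding the input
    unassigned = set(range(len(compute)))
    schedule = []
    while unassigned:
        choice = None
        for t in sorted(unassigned):
            times = []
            for h in range(n_hosts):
                copy = 0.0 if h in local else transfer[h]
                times.append((ready[h] + copy + compute[t][h], h))
            times.sort()
            best_time, best_host = times[0]
            second = times[1][0] if len(times) > 1 else best_time
            sufferage = second - best_time
            # Keep the task that would suffer most if denied its best host.
            if choice is None or sufferage > choice[0]:
                choice = (sufferage, t, best_host, best_time)
        _, t, h, finish = choice
        ready[h] = finish
        local.add(h)                     # the input now resides on host h
        unassigned.remove(t)
        schedule.append((t, h, finish))
    return schedule


# Illustrative numbers: host 0 holds the input, copying to host 1 costs 5.
print(sufferage_schedule([[4, 4], [2, 8]], [0, 5], {0}))
```

In this example task 1 is placed first because its sufferage (11) exceeds that of task 0 (5), and both tasks end up on host 0 because the copy cost makes host 1 unattractive.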
Outcome
APST - AppLeS Parameter Sweep Template
AMWAT - AppLeS Master/Worker Application Template
SA - Supercomputer AppLeS
APST
Parameter Sweep Applications
Mostly independent
Provide
Transparent deployment
Automatic scheduling
Capabilities
Launching tasks
Moving and storing data
Discovering and monitoring resources
AMWAT
Master/Worker
Provide
APIs for
Discovering
Scheduling
Predicting
SS - Self-Scheduling
FSC - Fixed Size Chunking
GSS - Guided Self-Scheduling
TSS - Trapezoidal Self-Scheduling
FAC2 - Factoring
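The difference between these policies lies in how many work units the master hands out per request. The sketch below generates chunk sizes for four of them using their commonly cited definitions, which are assumptions here and are not taken from the AMWAT code: SS gives one unit, FSC a fixed number, GSS the remaining work divided by the number of workers, and FAC2 half of the remaining work split evenly across the workers in each round. The function name and parameters are hypothetical.

```python
import math


def chunk_sizes(policy, n_units, n_workers, fixed=1):
    """Return the sequence of chunk sizes a master would hand out."""
    remaining = n_units
    sizes = []
    while remaining > 0:
        if policy == "SS":        # one unit per request
            batch = [1]
        elif policy == "FSC":     # same size every time
            batch = [fixed]
        elif policy == "GSS":     # shrinking share of what is left
            batch = [math.ceil(remaining / n_workers)]
        elif policy == "FAC2":    # half of what is left, split evenly
            size = math.ceil(remaining / (2 * n_workers))
            batch = [size] * n_workers
        else:
            raise ValueError(f"unknown policy: {policy}")
        for size in batch:
            size = min(size, remaining)   # never exceed the work left
            if size == 0:
                break
            sizes.append(size)
            remaining -= size
    return sizes


for name in ("SS", "FSC", "GSS", "FAC2"):
    print(name, chunk_sizes(name, 20, 4, fixed=6))
```

Small chunks (SS) balance load well but cost one master request per unit, while large fixed chunks risk imbalance at the end of the run; GSS and FAC2 start large and shrink to get some of both benefits.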
SA
Space-shared
Moldable jobs
Reduce response times
Related Works
Environment
MARS and Dome - Run-time checkpointing environment
Structure
MARS - SPMD
VDCE and SEA - Task graph
IOS - Real-time, fine-grained, task graph
Dome and SPP - Abstract language
Dome - SPMD
SPP - Task graph
Performance model
Depend on program structure
Objective
Minimize execution time
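Related Works
Project   Environment   Structure   Perf. model   Approach
AppLeS    Any           Any         Provided      Adaptive
MARS      ChkPnt        SPMD        Statistics    Data Dist
Dome      ChkPnt        SPMD        Data Dist     Data Dist
VDCE      -             TG          Derived       List Sched
SPP       -             TG          Derived       -
SEA       -             TG          Data Flow     Expert Sys
IOS       -             TG          Derived       GA
GrADS     -             -           -             -
ChkPnt: checkpointing; TG: task graph; Data Dist: data distribution; List Sched: list scheduling; Expert Sys: expert system; GA: genetic algorithm; -: no entry on the slide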
Discussions
The performance of distributed applications depends on both application-specific and platform-specific information
Storage and service are usually separated
Communication must be accounted for in the model
Multi-application environments have not been addressed
Conclusions
AppLeS
An application-level scheduling framework
Provides adaptive, flexible, and reusable components
Being integrated into GrADS to build next-generation Grid applications
Each component has been shown to improve performance