LBG-SQUARE: Fault-Tolerant, Locality-Aware Co-Allocation in P2P Grids
In this paper, the deployment and execution of iterative stencil applications on a P2P grid middleware are investigated. Iterative stencil applications are composed of sets of heavily communicating, long-running tasks. They therefore require the co-allocation of multiple reliable resources for extended periods of time. P2P grids are fully decentralized and provide on-demand, transparent access to edge resources, e.g. Internet-connected, non-dedicated desktop computers. A P2P grid has the potential to provide access to a large number of resources at a fraction of the cost of a dedicated cluster. However, edge resources are heterogeneous in performance and intrinsically unreliable: task execution failures are common due to resource preemption or resource failure. Furthermore, P2P grid schedulers usually target sets of independent computational tasks, i.e. so-called Bag of Tasks applications. It is therefore not trivial to deploy and run an iterative stencil application on a P2P grid. Checkpointing is a common fault-tolerance mechanism in high-performance distributed computing, but it is often based on a centralized architecture. Locality-aware co-allocation in P2P grids has recently been investigated. Checkpointing and locality-aware co-allocation have yet to be integrated in P2P grids. We propose to provide co-allocation through an existing middleware-level Bag of Tasks scheduling mechanism. We also introduce a layer of fault tolerance for iterative stencil applications that relies on a scalable, application-level, P2P checkpointing mechanism. Finally, LBG-SQUARE is described. This software results from the combination of a specific iterative stencil application (a computational fluid dynamics simulation software called LaBoGrid) with a P2P grid middleware (Lightweight Bartering Grid).
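To make the notions of an iterative stencil and of application-level checkpointing concrete, the following minimal sketch may help. It is not the LBG-SQUARE or LaBoGrid implementation: it runs in a single process, uses a toy one-dimensional averaging stencil, and keeps the checkpoint in memory. The function names (`stencil_step`, `run`) and the parameters (`checkpoint_every`, `fail_at`) are hypothetical and chosen only for illustration.

```python
def stencil_step(grid):
    """One sweep: each interior cell becomes the mean of its two neighbours.
    Boundary cells are held fixed (an assumption made for this toy example)."""
    new = grid[:]
    for i in range(1, len(grid) - 1):
        new[i] = 0.5 * (grid[i - 1] + grid[i + 1])
    return new


def run(grid, iterations, checkpoint_every, fail_at=None):
    """Iterate the stencil, checkpointing every `checkpoint_every` sweeps.

    `fail_at` simulates a single task failure just before that iteration;
    the computation then rolls back to the most recent checkpoint.
    Returns the final grid and the number of restarts.
    """
    checkpoint = (0, grid[:])   # (iteration number, saved state)
    state = grid[:]
    it = 0
    restarts = 0
    while it < iterations:
        if fail_at is not None and it == fail_at:
            fail_at = None                      # fail only once
            it, state = checkpoint[0], checkpoint[1][:]
            restarts += 1
            continue
        state = stencil_step(state)
        it += 1
        if it % checkpoint_every == 0:
            checkpoint = (it, state[:])         # application-level checkpoint
    return state, restarts


initial = [1.0] + [0.0] * 8 + [1.0]
clean, clean_restarts = run(initial, iterations=20, checkpoint_every=5)
recovered, failed_restarts = run(initial, iterations=20, checkpoint_every=5, fail_at=7)
```

Because the stencil is deterministic, the run that fails and restarts from its checkpoint should reach the same final state as the uninterrupted run, at the cost of redoing the sweeps performed since the last checkpoint. In a P2P grid, each task would hold only a subdomain, exchange boundary values with its neighbours at every iteration, and store checkpoints on other peers rather than in local memory.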