Implementing a Distributed Execution Service for a Grid Broker

Flavio V. D. de FigueiredoFrancisco V. Brasileiro Andrey E. M. BritoVII Workshop de Testes e Tolerância a Falhas - Curitiba, PR, Brasil - 2006

Grid middleware such as OurGrid offer solutions for executing parallel tasks on a grid system. In such systems, users submit their applications for executions through a client broker. MyGrid is the client broker used for the OurGrid system; it is in charge of managing task executions that a user has submitted.Although the broker is able to detect task failures and reschedule them, MyGrid itself constitutes a single point of failure from the user perspective. If it fails, all knowledge of task executions is lost. Moreover, MyGrid is also a bottleneck, since hundreds, or even thousands, of executions could potentially be spawned by an application and need to be managed at the same time by a single broker. In this paper we present the design and implementation of a fault-tolerant distributed execution service that allows for load balancing and improves MyGrid performance. A checkpointing mechanism is used to ease the implementation of the service and to further increase system reliability.

Caso o link acima esteja inválido, faça uma busca pelo texto completo na Web: Buscar na Web

Biblioteca Digital Brasileira de Computação - Contato:
     Mantida por: