Analysis of the impact of adaptive failure detectors on OurGrid

Abmar Grangeiro de Barros, Francisco Vilar Brasileiro

Failure detectors are fundamental building blocks for implementing distributed systems. In this context, the state of the art offers many mechanisms that provide scalability, adaptation, flexibility, and quality-of-service enforcement. Despite that, few production systems actually use these mechanisms. We believe that one of the main reasons for this state of affairs is that the benefits of using a sophisticated failure detection service are not clearly understood. This paper presents a preliminary evaluation of the impact of adaptive failure detectors on distributed systems, taking OurGrid, a grid computing middleware, as a use case. We analyzed, via simulation, the effect on task makespan of using three of the best-known adaptive mechanisms in the literature. Our results show that the failure detection mechanism currently implemented in OurGrid performs substantially worse than any of the adaptive detectors analyzed.

