A Model for Fault Tolerance in Distributed Systems with QoS

Sérgio Gorender, Raimundo José de Araújo Macêdo

Fault tolerance is a fundamental requirement for the correct functioning of new safety-critical Internet applications, such as e-commerce or environmental monitoring, where service interruption may result in great losses. Nevertheless, none of the existing fault-tolerant distributed system models considers the new architectures intended for providing services with QoS (Quality of Service), such as IntServ and DiffServ, proposed by the IETF [1,2]. This paper presents a novel approach to fault tolerance in such environments. Our model adapts to dynamic QoS conditions, which lets it exploit the best of the two extreme distributed system scenarios (synchronous and asynchronous). Furthermore, the model allows processes with distinct QoS views to continue their computations and to cooperate in a safe, fault-tolerant manner. To achieve this, we introduce the concept of timely and not timely failure detectors and present a consensus protocol that works correctly even if different processes have distinct views of the local failure detector quality. The consensus protocol is optimal: if all failure detectors are timely (synchronous system), it tolerates f = n-1 crash faults, and in the worst-case scenario (asynchronous system with ◊S failure detectors [3]), it tolerates n/2 - 1 faults, where n is the number of processes.
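The two resilience bounds stated in the abstract can be illustrated with a small sketch. The function name `max_tolerated_crashes` and its parameters are hypothetical and do not come from the paper. The sketch covers only the two extreme scenarios the abstract describes, and it assumes that "n/2 - 1" means the largest integer strictly below n/2 (the usual majority requirement), which is an interpretation rather than something the abstract states.

```python
def max_tolerated_crashes(n: int, all_detectors_timely: bool) -> int:
    """Return the number of crash faults tolerated in the two extreme scenarios.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if all_detectors_timely:
        # Synchronous extreme: every failure detector is timely, f = n - 1.
        return n - 1
    # Asynchronous extreme: abstract states n/2 - 1.
    # Assumption: read as the largest integer strictly below n/2,
    # which equals (n - 1) // 2 for every positive integer n.
    return (n - 1) // 2


if __name__ == "__main__":
    for n in (3, 4, 5, 10):
        print(n, max_tolerated_crashes(n, True), max_tolerated_crashes(n, False))
```

For even n the asynchronous value matches n/2 - 1 exactly; for odd n the result depends on the rounding assumption noted above.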

