Distributed systems are usually large and complex systems composed of various components. Systemcomponents are subject to various errors. These failures often require error recovery to be conducted at architectural-level. However, due to complexity of distributed systems, specifying fault tolerance mechanisms at architectural level is complex and error prone. In this paper, we propose a formal approach to specifying components and architectures of fault tolerant distributed and reactive systems. Our approach is based on refinement in the action system formalism – a framework for formal model-driven development of distributed systems. We demonstrate how to specify and refine fault tolerant components and complex distributed systems composed of them. The proposed approach provides designers with a systematic method for developing distributed fault tolerant systems.
Author: Elena Troubitsyna ; Dept. of IT, Abo Akademi Univ., Turku, Finland