|Title||Integrating formal reasoning into component-based approach to reconfigurable distributed systems|
Distributed computing has become ubiquitous in recent years in many areas, especially the
scientific and industrial ones, where processing power - even that of supercomputers - never
seems to be enough. Grid systems were born out of necessity, and had to grow quickly to
meet requirements which evolved over time, becoming today’s complex systems. Even the
simplest distributed system nowadays is expected to have some basic functionalities, such as
resources and execution management, security and optimization features, data control, etc. The
complexity of Grid applications is also accentuated by their distributed nature, making them
some of the most elaborate systems to date. These intricate systems can fail all too easily,
whether through a software bug or simple human error; and when such a failure occurs, the
system cannot always recover from it, which can mean hours of wasted computational power.
In this thesis, some of the problems at the core of the development and maintenance
of Grid software applications are addressed by introducing novel and solid approaches
to their solution. In particular, the thesis identifies the difficulty Grid systems have in
dealing with unforeseen circumstances resulting from dynamic reconfiguration. Such problems are often
related to the fact that Grid applications are large, distributed and prone to resource failures.
This research has produced a methodology for addressing this problem by analysing the
structure of distributed systems and their reliance on the environment on which they run, an
aspect often overlooked in these types of scenarios. It is concluded that the way Grid
applications interact with the infrastructure is not sufficiently addressed, and a novel approach
is developed in which formal verification methods are integrated with distributed applications
development and deployment in a way that includes the environment. This approach allows for
reconfiguration scenarios in distributed applications to proceed in a safe and controlled way, as
demonstrated by the development of a prototype application.