Interoperability of heterogeneous large-scale scientific workflows and data resources

Kukla, T. 2011. Interoperability of heterogeneous large-scale scientific workflows and data resources. PhD thesis. University of Westminster, School of Electronics and Computer Science. https://doi.org/10.34737/8zyyx

Title: Interoperability of heterogeneous large-scale scientific workflows and data resources
Type: PhD thesis
Authors: Kukla, T.
Abstract

Workflows allow e-Scientists to express their experimental processes in a structured way and provide the glue to integrate remote applications. Since the Grid provides an enormous amount of data and computational resources, executing workflows on the Grid results in a significant performance improvement. Several workflow management systems, widely used by different scientific communities, were developed for various purposes and therefore differ in several aspects.
This thesis outlines two major problems of existing workflow systems: workflow interoperability and data access. On the one hand, existing workflow systems are based on different technologies, so achieving interoperability between their workflows at any level is a challenging task. Although there is a clear demand for interoperable workflows (for example, to enable scientists to share workflows, to leverage the existing work of others, and to create multi-disciplinary workflows), only limited, ad hoc workflow interoperability solutions are currently available to scientists. Existing solutions realise workflow interoperability only between a small set of workflow systems and do not consider the performance issues that arise with large-scale (computation- and/or data-intensive) scientific workflows. Scientific workflows are typically computation- and/or data-intensive and are executed in a distributed environment to reduce their execution time, so their performance is a key issue. In most scenarios, existing interoperability solutions create a bottleneck in the communication between workflows, dramatically increasing execution time.
On the other hand, many scientific computational experiments are based on data that reside in data resources of different types and from different vendors. Many workflow systems support access to only limited subsets of such data resources, which prevents data-level workflow interoperation between different systems. There is therefore a demand for a general solution that provides access to a wide range of data resources of different types and vendors. If such a solution is general, in the sense that it can be adopted by several workflow systems, then it also enables workflows of different systems to access the same data resources and therefore to interoperate at the data level. Note that data semantics are outside the scope of this work. For the same reasons as described above, the performance characteristics of such a solution are important. Although some existing solutions offer the functionality that workflow systems would need for this purpose, they perform poorly, and for that reason they have not gained wide acceptance in the scientific workflow community.
To address these issues, a set of architectures is proposed to realise heterogeneous data access and heterogeneous workflow execution solutions. The primary goal was to investigate how such solutions can be implemented and integrated with workflow systems. The secondary aim was to analyse how such solutions can be implemented and utilised by single applications.

Year: 2011
Publisher: University of Westminster
Publication dates
Published: 2011
Digital Object Identifier (DOI): https://doi.org/10.34737/8zyyx

Related outputs

Enabling Scientific Workflow Sharing through Coarse-Grained Interoperability
Terstyanszky, G., Kukla, T., Kiss, T., Kacsuk, P., Balasko, A. and Farkas, Z. 2014. Enabling Scientific Workflow Sharing through Coarse-Grained Interoperability. Future Generation Computing Systems: The International Journal of Grid Computing and eScience. 37, pp. 46-59. https://doi.org/10.1016/j.future.2014.02.016

Exploring workflow interoperability for neuroimage analysis on the SHIWA platform
Korkhov, V., Krefting, D., Kukla, T., Terstyanszky, G., Caan, M.W.A. and Olabarriaga, S.D. 2013. Exploring workflow interoperability for neuroimage analysis on the SHIWA platform. Journal of Grid Computing. 11 (3), pp. 505-522. https://doi.org/10.1007/s10723-013-9262-7

Application repository and science gateway for running molecular docking and dynamics simulations
Terstyanszky, G., Kiss, T., Kukla, T., Lichtenberger, Z., Winter, S., Greenwell, P., McEldowney, S. and Heindl, H. 2012. Application repository and science gateway for running molecular docking and dynamics simulations. in: Gesing, S., Glatard, T., Kruger, J., Delgado Olabarriaga, S., Solomonides, T., Silverstein, J.C., Montagnat, J., Gaignard, A. and Krefting, D. (eds.) Healthgrid applications and technologies meet science gateways for life sciences. IOS Press.

Integrating Open Grid Services Architecture Data Access and Integration with computational Grid workflows
Kukla, T., Kiss, T., Kacsuk, P. and Terstyanszky, G. 2009. Integrating Open Grid Services Architecture Data Access and Integration with computational Grid workflows. Philosophical Transactions of the Royal Society A: Mathematical, Physical & Engineering Sciences. 367 (1897), pp. 2521-2532. https://doi.org/10.1098/rsta.2009.0040

Achieving interoperation of grid data resources via workflow level integration
Kiss, T. and Kukla, T. 2009. Achieving interoperation of grid data resources via workflow level integration. Journal of Grid Computing. 7 (3), pp. 355-374. https://doi.org/10.1007/s10723-009-9136-1

A general and scalable solution for heterogeneous workflow invocation and nesting
Kukla, T., Kiss, T., Terstyanszky, G. and Kacsuk, P. 2008. A general and scalable solution for heterogeneous workflow invocation and nesting. in: Proceedings of the 3rd Workshop on Workflows in Support of Large-Scale Science, in conjunction with SC 2008, Austin, TX, USA, November 17 2008. IEEE. pp. 1-8.

Towards Grid data interoperation: OGSA-DAI data resources in computational Grid workflows
Kiss, T., Kukla, T., Terstyanszky, G., Kacsuk, P. and Sipos, G. 2008. Towards Grid data interoperation: OGSA-DAI data resources in computational Grid workflows. in: Proc. of the CoreGRID Workshop “Integrated Research in Grid Computing”, Heraklion-Crete, Greece, 2-4 April 2008 Crete University Press.

High-level user interface for accessing database resources on the Grid
Kiss, T. and Kukla, T. 2008. High-level user interface for accessing database resources on the Grid. in: Kacsuk, P., Lovas, R. and Nemeth, Z. (eds.) Distributed and parallel systems: in focus: desktop grid computing. Boston, MA: Springer. pp. 155-163.

Permalink - https://westminsterresearch.westminster.ac.uk/item/8zyyx/interoperability-of-heterogeneous-large-scale-scientific-workflows-and-data-resources
