Automated debugging mechanisms for orchestrated cloud infrastructures with active control and global evaluation

Kovacs, J., Ligetfalvi, B. and Lovas, R. 2024. Automated debugging mechanisms for orchestrated cloud infrastructures with active control and global evaluation. IEEE Access. 12, pp. 143193-143214. https://doi.org/10.1109/ACCESS.2024.3467228

Title: Automated debugging mechanisms for orchestrated cloud infrastructures with active control and global evaluation
Type: Journal article
Authors: Kovacs, J.; Ligetfalvi, B.; Lovas, R.
Abstract

Orchestration methods at the Infrastructure-as-a-Service (IaaS) level automate the deployment, scaling, and management of virtualized resources, typically across multiple hosts and data centres. While orchestration provides many advantages, it also introduces several challenges in the testing and debugging phases, particularly due to the distributed nature of the virtualized resources. Even the initial deployment of interdependent virtual machines (VMs) may fail with fatal errors, since unpredictable timing conditions may change the overall initialisation process and lead to abnormal behaviour: in complex, non-deterministic environments, the set of VM configurations can drift from their expected states (‘configuration drift’). The overall motivation of our research is to improve the reliability of cloud-based infrastructures with minimal user interaction and to automate a significant part of the time-consuming debugging process. This paper focuses on the examination and behaviour of cloud-based infrastructures during their deployment phase. We continued the adaptation of macrostep, a debugging technique based on replay and active control, to the field of cloud orchestration. To provide efficient support for developers troubleshooting major deployment-related errors, the fundamental macrostep mechanisms have been enriched and significantly extended with 1) the automated generation of collective breakpoint sets, 2) a parallel and robust traversal method for such consistent global states, and 3) the automated evaluation of global predicates in each global state of the VM set. Furthermore, the novel methods have been 4) generalized towards wider user scenarios by targeting the Terraform orchestration tool as well (besides the already supported Occopus).
The paper describes the significantly enhanced approach, our design choices, and the implementation of the experimental debugger tool, validated with a use case addressing the deployment of a SLURM (HPC) cluster.
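
The idea of evaluating a global predicate over collective breakpoint sets can be illustrated with a minimal sketch. It is not the debugger described in the paper: the VM names, breakpoint names, and the predicate are hypothetical, and the plain Cartesian product used here is an assumption that ignores any ordering constraints that would make some combinations unreachable in a real deployment.

```python
from itertools import product
from typing import Callable, Dict, Iterator, List

# Hypothetical per-VM breakpoints, listed in the order each VM reaches them.
BREAKPOINTS: Dict[str, List[str]] = {
    "head": ["boot", "slurmctld_started"],
    "worker": ["boot", "slurmd_started"],
}

# A global state maps each VM name to the breakpoint it is paused at.
GlobalState = Dict[str, str]


def collective_breakpoint_sets(bps: Dict[str, List[str]]) -> Iterator[GlobalState]:
    """Yield every combination of one breakpoint per VM (assumed all reachable)."""
    names = sorted(bps)
    for combo in product(*(bps[name] for name in names)):
        yield dict(zip(names, combo))


def worker_waits_for_head(state: GlobalState) -> bool:
    """Hypothetical predicate: the worker daemon must not start before the controller."""
    return not (state["worker"] == "slurmd_started" and state["head"] == "boot")


def evaluate(bps: Dict[str, List[str]],
             predicate: Callable[[GlobalState], bool]) -> List[GlobalState]:
    """Return the global states in which the predicate is violated."""
    return [s for s in collective_breakpoint_sets(bps) if not predicate(s)]


violations = evaluate(BREAKPOINTS, worker_waits_for_head)
print(violations)
```

In this toy setup, four global states are enumerated and the predicate flags the one in which the worker daemon has started while the head node is still booting.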

Keywords: cloud; iaas; debugging; orchestration; replay; active control; troubleshooting; macrostep; evaluation; global predicates
Journal: IEEE Access
Journal citation: 12, pp. 143193-143214
ISSN: 2169-3536
Year: 2024
Publisher: IEEE
Publisher's version
License: CC BY-NC-ND 4.0
File Access Level: Open (open metadata and files)
Digital Object Identifier (DOI): https://doi.org/10.1109/ACCESS.2024.3467228
Web address (URL): https://ieeexplore.ieee.org/document/10693454/
Publication dates
Published: 25 Sep 2024
Supplemental file
File Access Level: Open (open metadata and files)

Permalink - https://westminsterresearch.westminster.ac.uk/item/wxw9x/automated-debugging-mechanisms-for-orchestrated-cloud-infrastructures-with-active-control-and-global-evaluation

