Journal Article

LAIK: A Library for Fault Tolerant Distribution of Global Data for Parallel Applications

Document Type

Text/Journal Article

Date

2017

Publisher

Gesellschaft für Informatik e.V., Fachgruppe PARS

Abstract

HPC applications are usually not written to cope with dynamic changes in the execution environment, such as the removal or integration of nodes or node components. For greater flexibility in scheduling and fault tolerance strategies, an adequate application-integrated reaction would be worthwhile, but this is difficult to achieve with legacy MPI codes. In this paper, we present Lightweight Application-Integrated data distribution for parallel worKers (LAIK), a lightweight library for distributed index spaces and associated data containers for parallel programs that supports fault tolerance features. By giving LAIK control over data and its partitioning, the library can free compute nodes before they fail and perform replication for rollback schemes on demand. Applications thus become more adaptive to changes in available resources. We show a simple example that integrates the LAIK library and present first results from a prototype implementation.
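The idea of handing partitioning control to a library so that a node can be freed before it fails can be illustrated with a minimal sketch. This is not LAIK's API: the function names `block_partition` and `transfers`, the node names, and the choice of a simple contiguous block partitioning are all assumptions made for illustration. The sketch splits a one-dimensional index space across workers, repartitions it without the node to be drained, and lists which index ranges would have to move.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # half-open index range [start, end)


def block_partition(size: int, workers: List[str]) -> Dict[str, Range]:
    """Split the index space [0, size) into contiguous, near-equal blocks.

    Hypothetical helper, not part of LAIK.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    base, extra = divmod(size, len(workers))
    parts: Dict[str, Range] = {}
    start = 0
    for i, worker in enumerate(workers):
        # The first `extra` workers receive one additional index.
        length = base + (1 if i < extra else 0)
        parts[worker] = (start, start + length)
        start += length
    return parts


def transfers(old: Dict[str, Range],
              new: Dict[str, Range]) -> List[Tuple[str, str, Range]]:
    """List the (source, destination, range) moves that turn `old` into `new`."""
    moves = []
    for dst, (new_start, new_end) in new.items():
        for src, (old_start, old_end) in old.items():
            lo, hi = max(new_start, old_start), min(new_end, old_end)
            # Only non-empty overlaps owned by a different worker must move.
            if lo < hi and src != dst:
                moves.append((src, dst, (lo, hi)))
    return moves


workers = ["node0", "node1", "node2", "node3"]
old = block_partition(12, workers)

# Assumed scenario: node2 is predicted to fail, so it is drained in advance.
new = block_partition(12, [w for w in workers if w != "node2"])
moves = transfers(old, new)

for src, dst, (lo, hi) in moves:
    print(f"move indices [{lo}, {hi}) from {src} to {dst}")
```

Because the application only ever asks for "its" range of the current partitioning, a change in the worker set reduces to computing a new partitioning and executing the resulting moves; the application code itself need not track which node holds which indices.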

Description

Weidendorfer, Josef; Yang, Dai; Trinitis, Carsten (2017): LAIK: A Library for Fault Tolerant Distribution of Global Data for Parallel Applications. PARS-Mitteilungen: Vol. 34, No. 1. Berlin: Gesellschaft für Informatik e.V., Fachgruppe PARS. PISSN: 0177-0454. pp. 140-150.
