LAIK: A Library for Fault Tolerant Distribution of Global Data for Parallel Applications

dc.contributor.author: Weidendorfer, Josef
dc.contributor.author: Yang, Dai
dc.contributor.author: Trinitis, Carsten
dc.date.accessioned: 2020-03-11T00:06:20Z
dc.date.available: 2020-03-11T00:06:20Z
dc.date.issued: 2017
dc.description.abstract: HPC applications usually are not written in a way that they can cope with dynamic changes in the execution environment, such as removing or integrating new nodes or node components. However, for higher flexibility with regard to scheduling and fault tolerance strategies, adequate application-integrated reaction would be worthwhile. However, with legacy MPI codes, this is difficult to achieve. In this paper, we present Lightweight Application-Integrated data distribution for parallel worKers (LAIK), a lightweight library for distributed index spaces and associated data containers for parallel programs supporting fault tolerance features. By giving LAIK control over data and its partitioning, the library can free compute nodes before they fail and do replication for rollback schemes on demand. Applications become more adaptive to changes of available resources. We show a simple example which integrates our LAIK library and present first results on a prototype implementation. (en)
dc.identifier.pissn: 0177-0454
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/31937
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V., Fachgruppe PARS
dc.relation.ispartof: PARS-Mitteilungen: Vol. 34, Nr. 1
dc.title: LAIK: A Library for Fault Tolerant Distribution of Global Data for Parallel Applications (en)
dc.type: Text/Journal Article
gi.citation.endPage: 150
gi.citation.publisherPlace: Berlin
gi.citation.startPage: 140

Files

Original bundle
Name: PARS-2017_paper_12.pdf
Size: 236.23 KB
Format: Adobe Portable Document Format