Preserving Recomputability of Results from Big Data Transformation Workflows

dc.contributor.authorKricke, Matthias
dc.contributor.authorGrimmer, Martin
dc.contributor.authorSchmeißer, Michael
dc.date.accessioned2018-01-08T08:08:01Z
dc.date.available2018-01-08T08:08:01Z
dc.date.issued2017
dc.description.abstractThe ability to recompute results from raw data at any time is important for data-driven companies: it ensures data stability and allows new data to be selectively incorporated into an already delivered data product. However, data transformation processes are heterogeneous, and manual work by domain experts may be part of creating a deliverable data product. Since domain experts and their work are expensive and time-consuming, a recomputation process needs the ability to automatically reapply former human interactions. This becomes even more challenging when external systems are used or data changes over time. In this paper, we propose a system architecture that ensures the recomputability of results from big data transformation workflows on internal and external systems by using distributed key-value data stores. Furthermore, the system architecture incorporates human interactions from former data transformation processes. We describe how our approach significantly relieves external systems while at the same time increasing the performance of the big data transformation workflows.
dc.identifier.pissn1610-1995
dc.identifier.urihttps://dl.gi.de/handle/20.500.12116/11021
dc.publisherSpringer
dc.relation.ispartofDatenbank-Spektrum: Vol. 17, No. 3
dc.relation.ispartofseriesDatenbank-Spektrum
dc.subjectBigData
dc.subjectBitemporality
dc.subjectRecomputability
dc.subjectSystem architecture
dc.subjectTime-to-consistency
dc.titlePreserving Recomputability of Results from Big Data Transformation Workflows
dc.typeText/Journal Article
gi.citation.endPage253
gi.citation.startPage245