Conference paper

Sampling with incremental MapReduce

Document type

Text/Conference Paper

Date

2015

Publisher

Gesellschaft für Informatik e.V.

Abstract

The goal of this paper is to speed up MapReduce jobs by trading away some accuracy of the result. Often, timely processing is more important than the precision of the result. Hadoop has no built-in functionality for such approximation, so users have to implement sampling techniques manually. We introduce an automatic system for computing arithmetic approximations. The sampling is based on techniques from statistics, and the extrapolation is done generically. The system is further extended by an incremental component that reuses already computed results to enlarge the sample. This can be applied iteratively to increase the sample size further and, with it, the precision of the approximation. We present a transparent incremental sampling approach, so the developed components can be integrated into the Hadoop framework in a non-invasive manner.
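The idea in the abstract (sample part of the input, extrapolate the aggregate, then reuse the partial result when the sample grows) can be illustrated with a small sketch. This is not the authors' system and does not use any Hadoop API: the class `IncrementalSampleSum`, its methods, and the choice of sampling whole partitions are assumptions made purely for illustration, and the extrapolation assumes partitions of roughly equal size.

```python
import random


class IncrementalSampleSum:
    """Hypothetical helper (not from the paper): approximates a total
    by summing a growing random sample of input partitions."""

    def __init__(self, partitions, seed=0):
        self.partitions = partitions
        # Fix one random order up front so that each enlargement simply
        # takes the next unprocessed partitions from that order.
        self.order = list(range(len(partitions)))
        random.Random(seed).shuffle(self.order)
        self.processed = 0    # number of partitions aggregated so far
        self.partial_sum = 0  # intermediate result reused across enlargements

    def enlarge(self, fraction):
        """Grow the sample to `fraction` of all partitions.

        Only partitions that were not aggregated before are read."""
        n = len(self.partitions)
        target = min(n, max(1, round(fraction * n)))
        for idx in self.order[self.processed:target]:
            self.partial_sum += sum(self.partitions[idx])
        self.processed = max(self.processed, target)
        return self.estimate()

    def estimate(self):
        """Extrapolate the sampled sum to the full input
        (assumes roughly equal-sized partitions)."""
        return self.partial_sum * len(self.partitions) / self.processed


if __name__ == "__main__":
    data = [[1, 2], [3, 4], [5, 6], [7, 8]]
    job = IncrementalSampleSum(data, seed=1)
    print("estimate from 50% of partitions:", job.enlarge(0.5))
    print("estimate from 100% of partitions:", job.enlarge(1.0))
```

Each call to `enlarge` only touches partitions that have not been aggregated yet, which mirrors the reuse of already computed results described above. Once the sample covers all partitions, the estimate equals the exact sum.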

Description

Schäfer, Marc; Schildgen, Johannes; Deßloch, Stefan (2015): Sampling with incremental MapReduce. Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-636-7. pp. 121-130. Hamburg. March 2-3, 2015
