Efficient Data-Parallel Cumulative Aggregates for Large-Scale Machine Learning

dc.contributor.author: Boehm, Matthias
dc.contributor.author: Evfimievski, Alexandre
dc.contributor.author: Reinwald, Berthold
dc.contributor.editor: Grust, Torsten
dc.contributor.editor: Naumann, Felix
dc.contributor.editor: Böhm, Alexander
dc.contributor.editor: Lehner, Wolfgang
dc.contributor.editor: Härder, Theo
dc.contributor.editor: Rahm, Erhard
dc.contributor.editor: Heuer, Andreas
dc.contributor.editor: Klettke, Meike
dc.contributor.editor: Meyer, Holger
dc.date.accessioned: 2019-04-11T07:21:20Z
dc.date.available: 2019-04-11T07:21:20Z
dc.date.issued: 2019
dc.description.abstract (en): Cumulative aggregates are often overlooked yet important operations in large-scale machine learning (ML) systems. Examples are prefix sums and more complex aggregates, but also preprocessing techniques such as the removal of empty rows or columns. These operations are challenging to parallelize over distributed, blocked matrices—as commonly used in ML systems—due to recursive data dependencies. However, computing prefix sums is a classic example of a presumably sequential operation that can be efficiently parallelized via aggregation trees. In this paper, we describe an efficient framework for data-parallel cumulative aggregates over distributed, blocked matrices. The basic idea is a self-similar operator composed of a forward cascade that reduces the data size by orders of magnitude per iteration until the data fits in local memory, a local cumulative aggregate over the partial aggregates, and a backward cascade to produce the final result. We also generalize this framework for complex cumulative aggregates of sum-product expressions, and characterize the class of supported operations. Finally, we describe the end-to-end compiler and runtime integration into SystemML, and the use of cumulative aggregates in other operations. Our experiments show that this framework achieves both high performance for moderate data sizes and good scalability.
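The abstract's three-phase idea—a forward cascade of per-block aggregates, a local cumulative aggregate over the small partial results, and a backward cascade that propagates offsets—can be sketched for the prefix-sum case as follows. This is a minimal single-level illustration over a plain list-of-blocks representation; the function name and block layout are hypothetical and not SystemML's actual API.

```python
def blockwise_cumsum(blocks):
    """Global prefix sum over a sequence of blocks, in three phases.

    Forward cascade:  reduce each block to its total (one value per block;
                      parallelizable across blocks).
    Local aggregate:  exclusive prefix sum over the per-block totals,
                      yielding each block's starting offset.
    Backward cascade: local prefix sum within each block plus its offset
                      (again parallelizable across blocks).
    """
    # Forward cascade: per-block partial aggregates
    block_sums = [sum(b) for b in blocks]

    # Local cumulative aggregate over the (now small) partial aggregates
    offsets, running = [], 0
    for s in block_sums:
        offsets.append(running)
        running += s

    # Backward cascade: finalize each block using its offset
    result = []
    for b, off in zip(blocks, offsets):
        acc = off
        for x in b:
            acc += x
            result.append(acc)
    return result
```

For example, `blockwise_cumsum([[1, 2], [3, 4], [5]])` yields `[1, 3, 6, 10, 15]`. The paper's framework applies this scheme recursively (the "self-similar operator"), so that the intermediate aggregate shrinks by orders of magnitude per iteration until it fits in local memory.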
dc.identifier.doi: 10.18420/btw2019-17
dc.identifier.isbn: 978-3-88579-683-1
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/21701
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: BTW 2019
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) – Proceedings, Volume P-289
dc.subject: Cumulative Aggregates
dc.subject: ML Systems
dc.subject: Large-Scale Machine Learning
dc.subject: Data-Parallel Computation
dc.subject: Apache SystemML
dc.title (en): Efficient Data-Parallel Cumulative Aggregates for Large-Scale Machine Learning
gi.citation.startPage: 267
gi.citation.endPage: 286
gi.conference.date: March 4–8, 2019
gi.conference.location: Rostock
gi.conference.sessiontitle: Wissenschaftliche Beiträge (Scientific Contributions)

Files

Name: B6-2.pdf
Size: 839.21 KB
Format: Adobe Portable Document Format