Aggregate-based Training Phase for ML-based Cardinality Estimation

Woltmann, Lucas; Hartmann, Claudio; Habich, Dirk; Lehner, Wolfgang

dc.contributor.author	Woltmann, Lucas
dc.contributor.author	Hartmann, Claudio
dc.contributor.author	Habich, Dirk
dc.contributor.author	Lehner, Wolfgang
dc.contributor.editor	Kai-Uwe Sattler
dc.contributor.editor	Melanie Herschel
dc.contributor.editor	Wolfgang Lehner
dc.date.accessioned	2021-03-16T07:57:13Z
dc.date.available	2021-03-16T07:57:13Z
dc.date.issued	2021
dc.description.abstract	Cardinality estimation is a fundamental task in database query processing and optimization. As shown in recent papers, machine learning (ML)-based approaches may deliver more accurate cardinality estimations than traditional approaches. However, learning a data-dependent ML model requires executing a large number of training queries during the model training phase, which makes training very time-consuming. Many of those training or example queries use the same base data, have the same query structure, and differ only in their selective predicates. To speed up the model training phase, our core idea is to determine a predicate-independent pre-aggregation of the base data and to execute the example queries over this pre-aggregated data. Based on this idea, we present a specific aggregate-based training phase for ML-based cardinality estimation approaches in this paper. As our evaluation with different workloads shows, our aggregate-based training phase achieves an average speedup of 63 and thus outperforms indexes.	en
dc.identifier.doi	10.18420/btw2021-07
dc.identifier.isbn	978-3-88579-705-0
dc.identifier.pissn	1617-5468
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/35812
dc.language.iso	en
dc.publisher	Gesellschaft für Informatik, Bonn
dc.relation.ispartof	BTW 2021
dc.relation.ispartofseries	Lecture Notes in Informatics (LNI) - Proceedings, Volume P-311
dc.subject	cardinality estimation
dc.subject	machine learning
dc.subject	database support
dc.subject	pre-aggregation
dc.title	Aggregate-based Training Phase for ML-based Cardinality Estimation	en
gi.citation.endPage	154
gi.citation.startPage	135
gi.conference.date	13.-17. September 2021
gi.conference.location	Dresden
gi.conference.sessiontitle	ML & Data Science
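
The core idea stated in the abstract, answering many structurally identical training queries from one predicate-independent pre-aggregation instead of from the base data, can be illustrated with a minimal sketch. This is not the authors' implementation: the table layout, the column names `year` and `rating`, and all function names are hypothetical, and a real system would build the aggregate inside the DBMS rather than in Python.

```python
from collections import Counter
import random


def pre_aggregate(rows, columns):
    """Group the base data by the predicate columns and count each combination.

    The result does not depend on any concrete predicate values.
    """
    return Counter(tuple(row[c] for c in columns) for row in rows)


def matches(values, columns, predicates):
    """Check a conjunction of inclusive range predicates {column: (low, high)}."""
    return all(
        predicates[c][0] <= v <= predicates[c][1]
        for c, v in zip(columns, values)
        if c in predicates
    )


def cardinality_scan(rows, columns, predicates):
    """Baseline: compute the true cardinality by scanning the base data."""
    return sum(
        1 for row in rows
        if matches(tuple(row[c] for c in columns), columns, predicates)
    )


def cardinality_aggregate(aggregate, columns, predicates):
    """Compute the same cardinality by summing counts of matching groups."""
    return sum(
        count for values, count in aggregate.items()
        if matches(values, columns, predicates)
    )


# Hypothetical base data: 10,000 rows with two predicate columns.
random.seed(0)
columns = ("year", "rating")
rows = [
    {"year": random.randint(1990, 2020), "rating": random.randint(1, 10)}
    for _ in range(10_000)
]

# Built once, then reused for every training query.
aggregate = pre_aggregate(rows, columns)

# Training queries share the structure and differ only in their predicates.
queries = [
    {"year": (2000, 2010)},
    {"rating": (7, 10)},
    {"year": (1995, 2005), "rating": (3, 6)},
]
labels = [cardinality_aggregate(aggregate, columns, q) for q in queries]
```

In this toy setup each query touches at most 310 aggregated groups instead of 10,000 base rows, which is where a saving would come from; the speedup reported in the abstract refers to the authors' own evaluation, not to this sketch.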

Files

Original bundle

Name: A2-1.pdf
Size: 1.63 MB
Format: Adobe Portable Document Format

Collections

P311 - BTW2021- Datenbanksysteme für Business, Technologie und Web