Digital Library Gesellschaft für Informatik e.V.

Optimized Theta-Join Processing

Authors:
Weise, Julian;
Schmidl, Sebastian;
Papenbrock, Thorsten
Abstract
The theta-join is a powerful operation for connecting tuples of different relational tables based on arbitrary conditions. It is a fundamental requirement for many data-driven use cases, such as data cleaning, consistency checking, and hypothesis testing. However, processing theta-joins without equality predicates is expensive, because virtually all database management systems (DBMSs) translate theta-joins into a Cartesian product followed by a post-filter that removes non-matching tuple pairs. This seems to be necessary because most join optimization techniques, such as indexing, hashing, Bloom filters, or sorting, do not work for theta-joins with combinations of inequality predicates based on <, ≤, ≠, ≥, >. In this paper, we therefore study and evaluate optimization approaches for the efficient execution of theta-joins. More specifically, we propose a theta-join algorithm that exploits the high selectivity of theta-joins to prune most join candidates early; the algorithm also parallelizes and distributes the processing (over CPU cores and compute nodes, respectively) for scalable query processing. The algorithm is built into our distributed in-memory database system prototype A2DB. Our evaluation on various real-world and synthetic datasets shows that A2DB significantly outperforms existing single-machine DBMSs, including PostgreSQL, and distributed data processing systems, such as Apache SparkSQL, in processing highly selective theta-join queries.
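The baseline that the abstract describes (a Cartesian product followed by a post-filter) can be sketched in a few lines. The following is only an illustrative toy, not the algorithm proposed in the paper and not A2DB code: the relations `R` and `S`, the tuple layout, and both function names are assumptions made up for this example. The second function shows, for a single `<` predicate only, how sorting one side lets a join skip non-matching candidates; any additional predicates are still applied as a post-filter here.

```python
from bisect import bisect_right
from itertools import product
from operator import lt, ne

# Hypothetical toy relations of (id, value) tuples; not data from the paper.
R = [(1, 10), (2, 25), (3, 40)]
S = [(1, 5), (2, 25), (3, 30)]


def naive_theta_join(left, right, predicates):
    """Cartesian product followed by a post-filter.

    `predicates` is a list of (left_column, operator, right_column) triples
    that must all hold for a pair to be emitted.
    """
    return [
        (l, r)
        for l, r in product(left, right)
        if all(op(l[i], r[j]) for i, op, j in predicates)
    ]


def sorted_less_than_join(left, right, i, j):
    """Join on left[i] < right[j] by sorting `right` once and binary-searching.

    Only candidates that can satisfy the single `<` predicate are visited.
    """
    ordered = sorted(right, key=lambda r: r[j])
    keys = [r[j] for r in ordered]
    out = []
    for l in left:
        start = bisect_right(keys, l[i])  # first position whose key is > l[i]
        out.extend((l, r) for r in ordered[start:])
    return out


# Query: R.value < S.value AND R.id != S.id
preds = [(1, lt, 1), (0, ne, 0)]
baseline = naive_theta_join(R, S, preds)

# Prune on the `<` predicate first, then post-filter the remaining predicate.
pruned = [(l, r) for l, r in sorted_less_than_join(R, S, 1, 1) if l[0] != r[0]]
```

Both variants return the same pairs; the difference is that the naive version evaluates every one of the |R| x |S| combinations, while the sorted variant only touches pairs that already satisfy the `<` predicate.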
Citation
Weise, J., Schmidl, S. & Papenbrock, T. (2021). Optimized Theta-Join Processing. In: Sattler, K.-U., Herschel, M. & Lehner, W. (Eds.), BTW 2021. Gesellschaft für Informatik, Bonn. (pp. 59-78). DOI: 10.18420/btw2021-03
BibTeX
@inproceedings{mci/Weise2021,
  author    = {Weise, Julian AND Schmidl, Sebastian AND Papenbrock, Thorsten},
  title     = {Optimized Theta-Join Processing},
  booktitle = {BTW 2021},
  year      = {2021},
  editor    = {Kai-Uwe Sattler AND Melanie Herschel AND Wolfgang Lehner},
  pages     = {59-78},
  doi       = {10.18420/btw2021-03},
  publisher = {Gesellschaft für Informatik, Bonn},
  address   = {}
}
Files
A1-3.pdf (size: 1.129 Mb, format: PDF)

If no full text (PDF) is linked here, it may be available only in another digital library for various reasons (e.g., licenses or copyright). In that case, try accessing it via the linked DOI: 10.18420/btw2021-03

More Info

DOI: 10.18420/btw2021-03
ISBN: 978-3-88579-705-0
ISSN: 1617-5468
Date: 2021
Language: en

Keywords

  • theta-join
  • query optimization
  • distributed computing
  • actor programming
Collections
  • P311 - BTW2021- Datenbanksysteme für Business, Technologie und Web [23]

Gesellschaft für Informatik e.V. (GI), contact: GI head office
This digital library is based on DSpace.