Logo des Repositoriums
 

RMG Sort: Radix-Partitioning-Based Multi-GPU Sorting

dc.contributor.authorIlic, Ivan
dc.contributor.authorTolovski, Ilin
dc.contributor.authorRabl, Tilmann
dc.contributor.editorKönig-Ries, Birgitta
dc.contributor.editorScherzinger, Stefanie
dc.contributor.editorLehner, Wolfgang
dc.contributor.editorVossen, Gottfried
dc.date.accessioned2023-02-23T13:59:47Z
dc.date.available2023-02-23T13:59:47Z
dc.date.issued2023
dc.description.abstractIn recent years, graphics processing units (GPUs) emerged as database accelerators due to their massive parallelism and high-bandwidth memory. Sorting is a core database operation with many applications, such as output ordering, index creation, grouping, and sort-merge joins. Many single-GPU sorting algorithms have been shown to outperform highly parallel CPU algorithms. Today's systems include multiple GPUs with direct high-bandwidth peer-to-peer (P2P) interconnects. However, previous multi-GPU sorting algorithms do not efficiently harness the P2P transfer capability of modern interconnects, such as NVLink and NVSwitch.In this paper, we propose RMG sort, a novel radix partitioning-based multi-GPU sorting algorithm. We present a most-significant-bit partitioning strategy that efficiently utilizes high-speed P2P interconnects while reducing inter-GPU communication. Independent of the number of GPUs, we exchange radix partitions between the GPUs in one all-to-all P2P swap. We evaluate RMG sort on two modern multi-GPU systems. Our experiments show that RMG sort scales well with the number of GPUs and outperforms a parallel CPU-based sort by up to 20x. Compared to two state-of-the-art merge-based multi-GPU sorting algorithms, we achieve speedups of up to 1.3x and 1.8x across both systems.en
dc.identifier.doi10.18420/BTW2023-15
dc.identifier.isbn978-3-88579-725-8
dc.identifier.urihttps://dl.gi.de/handle/20.500.12116/40319
dc.language.isoen
dc.publisherGesellschaft für Informatik e.V.
dc.relation.ispartofBTW 2023
dc.relation.ispartofseriesLecture Notes in Informatics (LNI) - Proceedings, Volume P-331
dc.subjectMulti-GPU sorting
dc.subjectradix partitioning
dc.subjecthigh-speed interconnects
dc.subjectdatabase acceleration
dc.titleRMG Sort: Radix-Partitioning-Based Multi-GPU Sortingen
dc.typeText/Conference Paper
gi.citation.endPage328
gi.citation.publisherPlaceBonn
gi.citation.startPage305
gi.conference.date06.-10. März 2023
gi.conference.locationDresden, Germany

Dateien

Originalbündel
1 - 1 von 1
Vorschaubild nicht verfügbar
Name:
B3-4.pdf
Größe:
632.85 KB
Format:
Adobe Portable Document Format