Journal Article
Evaluation of GPU-Compression Algorithms for CUDA-Aware MPI
Document type
Text/Journal Article
Date
2024
Publisher
Gesellschaft für Informatik e.V., Fachgruppe PARS
Abstract
This study evaluates compression algorithms for use with CUDA-aware MPI, aiming to reduce the latency of large GPU message transfers. We examine the performance of various compression algorithms on distinct datasets. Ndzip emerges as the best-suited compression algorithm for our needs. Our findings show that the latency of large messages can decrease, depending on the dataset. For non-compressible data, however, latency may increase because of the compression overhead. With well-compressible data, the performance of the Cannon algorithm for dense matrix-matrix multiplication improves by up to 30%. For data that is not highly compressible, there is only a minor performance penalty, as the compression overhead remains relatively small.
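The trade-off described in the abstract, where compression helps for compressible data and costs a little for non-compressible data, can be illustrated with a simple latency model. The sketch below is not the authors' method: it assumes compression, transfer, and decompression run one after another without overlap, and every function name, bandwidth, and ratio in it is a hypothetical value chosen for illustration only.

```python
def transfer_latency(size_bytes, link_bw):
    """Seconds to send size_bytes uncompressed over a link of link_bw bytes/s."""
    return size_bytes / link_bw


def compressed_latency(size_bytes, link_bw, ratio, comp_bw, decomp_bw):
    """Seconds to compress, send, and decompress a message.

    Assumption: the three stages are strictly sequential (no pipelining).
    ratio is uncompressed size / compressed size; comp_bw and decomp_bw
    are throughputs in bytes/s measured on the uncompressed size.
    """
    compress = size_bytes / comp_bw
    send = (size_bytes / ratio) / link_bw
    decompress = size_bytes / decomp_bw
    return compress + send + decompress


def break_even_ratio(link_bw, comp_bw, decomp_bw):
    """Smallest compression ratio at which compression starts to pay off.

    Derived from compressed_latency < transfer_latency. Returns None when
    the (de)compression overhead alone already exceeds the plain transfer.
    """
    overhead = link_bw / comp_bw + link_bw / decomp_bw
    if overhead >= 1.0:
        return None
    return 1.0 / (1.0 - overhead)


if __name__ == "__main__":
    size = 64 * 1024 * 1024   # hypothetical 64 MiB message
    link = 10e9               # hypothetical link bandwidth, bytes/s
    comp = decomp = 100e9     # hypothetical GPU (de)compression throughput
    print("plain      :", transfer_latency(size, link))
    print("ratio 1.0  :", compressed_latency(size, link, 1.0, comp, decomp))
    print("ratio 2.0  :", compressed_latency(size, link, 2.0, comp, decomp))
    print("break-even :", break_even_ratio(link, comp, decomp))
```

Under this model, the penalty for non-compressible data (ratio 1.0) is exactly the compression plus decompression time, which stays small whenever the compressor is much faster than the link.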