Show simple item record

dc.contributor.author: Sobe, Peter
dc.contributor.author: Pazak, Denny
dc.contributor.author: Stiehr, Martin
dc.date.accessioned: 2017-06-29T14:30:50Z
dc.date.available: 2017-06-29T14:30:50Z
dc.date.issued: 2015
dc.identifier.issn: 0177-0454
dc.description.abstract: Data deduplication is a technique for detecting and eliminating duplicate data blocks in storage systems. It creates a set of unique data blocks and places references accordingly, which allows the original data to be accessed from a reduced number of data blocks. For deduplication, hashes of data blocks are calculated and compared in order to detect and remove duplicates. It can be seen as an alternative to data compression that saves storage capacity in large storage systems. This saving in storage capacity comes at the cost of additional computational effort that arises when data blocks are written and updated. This computational effort increases with the size of the storage system. On a single-processor system, deduplication degrades performance; in particular, the write and update rates drop. Exploiting parallelism is a worthwhile way to compensate for this performance drop, particularly for hash value calculations and hash comparisons. In this paper we explain which parts of a deduplication system are worth parallelizing and how. As examples, we show the performance results of two deduplication algorithms and their parallel implementations, based on multithreading and on parallel GPU computations. [en]
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V., Fachgruppe PARS
dc.relation.ispartof: PARS-Mitteilungen: Vol. 32, Nr. 1
dc.title: Parallel Processing for Data Deduplication [en]
dc.type: Text/Journal Article
dc.pubPlace: Berlin

