
Parallel sorted neighborhood blocking with MapReduce

dc.contributor.author: Kolb, Lars
dc.contributor.author: Thor, Andreas
dc.contributor.author: Rahm, Erhard
dc.contributor.editor: Härder, Theo
dc.contributor.editor: Lehner, Wolfgang
dc.contributor.editor: Mitschang, Bernhard
dc.contributor.editor: Schöning, Harald
dc.contributor.editor: Schwarz, Holger
dc.date.accessioned: 2019-01-17T10:36:48Z
dc.date.available: 2019-01-17T10:36:48Z
dc.date.issued: 2011
dc.description.abstract: Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as entity resolution on large datasets. We investigate challenges and possible solutions of using the MapReduce programming model for parallel entity resolution. In particular, we propose and evaluate two MapReduce-based implementations for Sorted Neighborhood blocking that either use multiple MapReduce jobs or apply a tailored data replication.
dc.identifier.isbn: 978-3-88579-274-1
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/19619
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Datenbanksysteme für Business, Technologie und Web (BTW)
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-180
dc.title: Parallel sorted neighborhood blocking with MapReduce
dc.type: Text/Conference Paper
gi.citation.endPage: 64
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 45
gi.conference.date: 02.-04.03.2011
gi.conference.location: Kaiserslautern
gi.conference.sessiontitle: Regular Research Papers

Files

Original bundle
Name: 45.pdf
Size: 353.17 KB
Format: Adobe Portable Document Format