Browsing by author "Sehili, Ziad"
Results 1 - 5 of 5
- Journal article: Iterative Computation of Connected Graph Components with MapReduce (Datenbank-Spektrum: Vol. 14, No. 2, 2014). Kolb, Lars; Sehili, Ziad; Rahm, Erhard. The use of the MapReduce framework for iterative graph algorithms is challenging. To achieve high performance, it is critical to limit both the amount of intermediate results and the number of necessary iterations. We address these issues for the important problem of finding connected components in large graphs. We analyze an existing MapReduce algorithm, CC-MR, and present techniques to improve its performance, including a memory-based connection of subgraphs in the map phase. Our evaluation with several large graph datasets shows that the improvements can reduce the amount of generated data by up to a factor of 8.8 and the runtime by up to a factor of 3.5.
- Text document: Multi-Party Privacy Preserving Record Linkage in Dynamic Metric Space (BTW 2021, 2021). Sehili, Ziad; Rohde, Florens; Franke, Martin; Rahm, Erhard. We propose and evaluate several approaches for multi-party privacy-preserving record linkage (MP-PPRL) across multiple data sources. To reduce the number of comparisons and thereby improve scalability, we propose a new pivot-based metric space approach that dynamically adapts the selection of pivots as sources are added and the data volume grows. We investigate so-called early and late clustering schemes, which cluster matching records either per additional source or holistically for all sources. A comprehensive evaluation on different datasets confirms the high effectiveness and efficiency of the proposed methods.
- Conference paper: Privacy preserving record linkage with ppjoin (Datenbanksysteme für Business, Technologie und Web (BTW 2015), 2015). Sehili, Ziad; Kolb, Lars; Borgs, Christian; Schnell, Rainer; Rahm, Erhard. Privacy-preserving record linkage (PPRL) is becoming increasingly important for matching and integrating records with sensitive data. PPRL not only has to preserve the anonymity of the persons or entities involved but should also be highly efficient and scalable to large datasets. We therefore investigate how to adapt PPJoin, one of the fastest approaches for regular record linkage, to PPRL, resulting in a new approach called P4Join. The use of bit vectors for PPRL also allows us to devise a parallel execution of P4Join on GPUs. We evaluate the new approaches and compare their efficiency with a PPRL approach based on multibit trees.
- Journal: ScaDS Research on Scalable Privacy-preserving Record Linkage (Datenbank-Spektrum: Vol. 19, No. 1, 2019). Franke, Martin; Gladbach, Marcel; Sehili, Ziad; Rohde, Florens; Rahm, Erhard.
- Journal article: Speeding up Privacy Preserving Record Linkage for Metric Space Similarity Measures (Datenbank-Spektrum: Vol. 16, No. 3, 2016). Sehili, Ziad; Rahm, Erhard. The analysis of person-related data in Big Data applications faces the tradeoff between finding useful results and preserving a high degree of privacy. This is especially challenging when person-related data from multiple sources need to be integrated and analyzed. Privacy-preserving record linkage (PPRL) addresses this problem by encoding sensitive attribute values such that the identification of persons is prevented but records can still be matched. In this paper we study how to improve the efficiency and scalability of PPRL by restricting the search space for matching encoded records. We focus on similarity measures for metric spaces and investigate the use of M-trees as well as pivot-based solutions. Our evaluation shows that the new schemes outperform previous filter approaches by an order of magnitude.