Logo des Repositoriums
 
Zeitschriftenartikel

Mining Entity Rankings

Vorschaubild nicht verfügbar

Volltext URI

Dokumententyp

Text/Journal Article

Zusatzinformation

Datum

2016

Zeitschriftentitel

ISSN der Zeitschrift

Bandtitel

Verlag

Springer

Zusammenfassung

In this paper, we propose models, algorithms, and implementation details of an approach that extract the most relevant entity rankings from large datasets. This is done in a fully automated way, as with large amounts of structured data, beyond well understood databases (schemas), manual solutions do not scale. The core task of our approach is to decide which categorical constraints, ranking order (descending or ascending), and length form together an interesting ranking. We make use of a model based on information entropy to find interesting/relevant categorical constraints and devise pruning conditions to avoid generating too many irrelevant rankings. We further investigate the skewness of the value distributions of ranking criteria to find suitable ranking dimensions and ranking order, and present an overall scoring model to assess the meaningfulness of a ranking. For each individual step of our approach, we discuss iterative MapReduce-based algorithms. Finally, the experimental evaluation on real-world data is reported where the users manually evaluate our approach of generating most relevant rankings.

Beschreibung

Pal, K.; Reinartz, F.; Michel, S. (2016): Mining Entity Rankings. Datenbank-Spektrum: Vol. 16, No. 1. Springer. PISSN: 1610-1995. pp. 27-38

Schlagwörter

Zitierform

DOI

Tags