Mining Entity Rankings

Pal, K.; Reinartz, F.; Michel, S.

Journal Article

Document type

Text/Journal Article

Date

2016

Authors

Pal, K.

Reinartz, F.

Michel, S.

Source

Datenbank-Spektrum: Vol. 16, No. 1

Publisher

Springer

Abstract

In this paper, we present models, algorithms, and implementation details for an approach that extracts the most relevant entity rankings from large datasets. This is done in a fully automated way because, with large amounts of structured data that go beyond well-understood databases (schemas), manual solutions do not scale. The core task of our approach is to decide which categorical constraints, ranking order (descending or ascending), and length together form an interesting ranking. We use a model based on information entropy to find interesting and relevant categorical constraints, and devise pruning conditions to avoid generating too many irrelevant rankings. We further investigate the skewness of the value distributions of ranking criteria to find suitable ranking dimensions and ranking order, and present an overall scoring model to assess the meaningfulness of a ranking. For each individual step of our approach, we discuss iterative MapReduce-based algorithms. Finally, we report an experimental evaluation on real-world data in which users manually assess the rankings our approach generates.

Pal, K.; Reinartz, F.; Michel, S. (2016): Mining Entity Rankings. Datenbank-Spektrum: Vol. 16, No. 1. Springer. PISSN: 1610-1995. pp. 27-38

Collections

Datenbank Spektrum 16(1) - March 2016
