Mining Entity Rankings

Pal, K.; Reinartz, F.; Michel, S.

dc.contributor.author	Pal, K.
dc.contributor.author	Reinartz, F.
dc.contributor.author	Michel, S.
dc.date.accessioned	2018-01-10T13:20:29Z
dc.date.available	2018-01-10T13:20:29Z
dc.date.issued	2016
dc.description.abstract	In this paper, we propose models, algorithms, and implementation details of an approach that extracts the most relevant entity rankings from large datasets. The approach is fully automated, because manual solutions do not scale to large amounts of structured data that go beyond well-understood database schemas. The core task is to decide which categorical constraints, ranking order (descending or ascending), and ranking length together form an interesting ranking. We use a model based on information entropy to find relevant categorical constraints and devise pruning conditions to avoid generating too many irrelevant rankings. We further investigate the skewness of the value distributions of ranking criteria to find suitable ranking dimensions and ranking order, and present an overall scoring model to assess the meaningfulness of a ranking. For each step of our approach, we discuss iterative MapReduce-based algorithms. Finally, we report an experimental evaluation on real-world data in which users manually assess the rankings generated by our approach.
dc.identifier.pissn	1610-1995
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/11770
dc.publisher	Springer
dc.relation.ispartof	Datenbank-Spektrum: Vol. 16, No. 1
dc.relation.ispartofseries	Datenbank-Spektrum
dc.title	Mining Entity Rankings
dc.type	Text/Journal Article
gi.citation.endPage	38
gi.citation.startPage	27

Collections

Datenbank Spektrum 16(1) - March 2016
