Mining Entity Rankings

dc.contributor.author: Pal, K.
dc.contributor.author: Reinartz, F.
dc.contributor.author: Michel, S.
dc.date.accessioned: 2018-01-10T13:20:29Z
dc.date.available: 2018-01-10T13:20:29Z
dc.date.issued: 2016
dc.description.abstract: In this paper, we propose models, algorithms, and implementation details of an approach that extracts the most relevant entity rankings from large datasets. This is done in a fully automated way because, with large amounts of structured data beyond well-understood databases (schemas), manual solutions do not scale. The core task of our approach is to decide which categorical constraints, ranking order (descending or ascending), and length together form an interesting ranking. We use a model based on information entropy to find interesting and relevant categorical constraints, and we devise pruning conditions to avoid generating too many irrelevant rankings. We further investigate the skewness of the value distributions of ranking criteria to find suitable ranking dimensions and ranking order, and present an overall scoring model to assess the meaningfulness of a ranking. For each individual step of our approach, we discuss iterative MapReduce-based algorithms. Finally, we report an experimental evaluation on real-world data in which users manually evaluate how well our approach generates the most relevant rankings.
dc.identifier.pissn: 1610-1995
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/11770
dc.publisher: Springer
dc.relation.ispartof: Datenbank-Spektrum: Vol. 16, No. 1
dc.relation.ispartofseries: Datenbank-Spektrum
dc.title: Mining Entity Rankings
dc.type: Text/Journal Article
gi.citation.endPage: 38
gi.citation.startPage: 27

Files