Conference paper

Topic detection based on the PageRank's clustering property

Document type

Text/Conference Paper

Date

2011

Publisher

Gesellschaft für Informatik e.V.

Abstract

This paper introduces a method for clustering graphs of semantically related terms extracted from texts using PageRank calculations. The method is intended for text mining, e.g. to automatically discover the different topics in a text corpus. It is evaluated empirically by applying it to real text corpora. The results show that this application of the PageRank formula yields suitable clusters, in which the mean similarity between terms reaches a high level. A special state transition in the mean term similarity that occurs when analysing texts with stopwords is also discussed.
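
The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that terms are related when they co-occur in a sentence, that PageRank is computed by plain power iteration on the weighted term graph, and that each term joins the cluster of the local PageRank maximum reached by repeatedly moving to its highest-ranked neighbour. The function names (`cooccurrence_graph`, `pagerank`, `cluster_by_rank`), the damping factor and the iteration count are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Undirected weighted term graph (assumption: edge weight is the
    number of sentences in which two terms occur together)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        for a, b in combinations(sorted(set(sentence)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration. Every node in the graph
    has at least one edge, so dangling nodes need no special handling."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    strength = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        rank = {
            v: (1 - damping) / n
            + damping * sum(rank[u] * w / strength[u]
                            for u, w in graph[v].items())
            for v in nodes
        }
    return rank


def cluster_by_rank(graph, rank):
    """Hypothetical clustering rule: follow the highest-ranked neighbour
    until no neighbour ranks higher; ties are broken by term name."""
    def climb(v):
        while True:
            best = max(graph[v], key=lambda u: (rank[u], u))
            if (rank[best], best) <= (rank[v], v):
                return v
            v = best

    clusters = defaultdict(set)
    for v in graph:
        clusters[climb(v)].add(v)
    return dict(clusters)


# Toy corpus with two unrelated topics, already split into terms.
sentences = [
    ["apple", "banana", "cherry"],
    ["apple", "banana"],
    ["router", "switch", "cable"],
    ["router", "switch"],
]
graph = cooccurrence_graph(sentences)
rank = pagerank(graph)
clusters = cluster_by_rank(graph, rank)
for centre, members in sorted(clusters.items()):
    print(centre, sorted(members))
```

On this toy corpus the two topics share no terms, so the sketch should place the fruit terms and the networking terms in separate clusters. A real corpus would need tokenisation and a decision about stopwords, which the abstract identifies as affecting the mean term similarity.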

Description

Kubek, Mario; Unger, Herwig (2011): Topic detection based on the PageRank's clustering property. 11th International Conference on Innovative Internet Community Systems (I2CS 2011). Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-280-2. pp. 139-148. Regular Research Papers. Berlin. June 15-17, 2011
