
Selection of representative documents for clusters in a document collection

dc.contributor.author: Gelbukh, Alexander
dc.contributor.author: Alexandrov, Mikhail
dc.contributor.author: Bourek, Ales
dc.contributor.author: Makagonov, Pavel
dc.contributor.editor: Düsterhöft, Antje
dc.contributor.editor: Thalheim, Bernhard
dc.date.accessioned: 2019-11-14T11:21:40Z
dc.date.available: 2019-11-14T11:21:40Z
dc.date.issued: 2003
dc.description.abstract: An efficient way to explore a large document collection (e.g., the search results returned by a search engine) is to subdivide it into clusters of relatively similar documents, to get a general view of the collection and select its parts of particular interest. A way of presenting the clusters to the user is selection of a document in each cluster. For different purposes this can be done in different ways. We consider three cases: selection of the average, the "most typical," and the "least typical" document. The algorithms are given, which rely on a dictionary of keywords reflecting the topic of the user's interest. After clustering, we select a document in each cluster basing on its closeness to the other ones. Different distance measures are discussed; preliminary experimental results are presented. Our approach was implemented in the new version of Document Classifier system. [en]
dc.identifier.isbn: 3-88579-358-X
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/29880
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Natural language processing and information systems
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-29
dc.title: Selection of representative documents for clusters in a document collection [en]
dc.type: Text/Conference Paper
gi.citation.endPage: 126
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 120
gi.conference.date: June 2003
gi.conference.location: Burg (Spreewald)
gi.conference.sessiontitle: Regular Research Papers
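
The abstract describes selecting one document per cluster by its closeness to the other documents in that cluster. The following is only an illustrative sketch of that idea, not the paper's algorithm: it assumes each document is a vector of keyword counts over a fixed dictionary, uses cosine distance as one possible distance measure, and interprets "average" as the document nearest the cluster centroid, "most typical" as the one with the smallest total distance to the others, and "least typical" as the one with the largest. The names `cosine_distance` and `select_representatives` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two keyword-count vectors (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def select_representatives(cluster, distance=cosine_distance):
    """Return indices of three representative documents in one cluster.

    The three definitions below are assumptions made for illustration;
    the abstract names the cases but does not define them.
    """
    n = len(cluster)
    dims = len(cluster[0])
    # Component-wise mean of all document vectors in the cluster.
    centroid = [sum(doc[k] for doc in cluster) / n for k in range(dims)]
    # Total distance from each document to every document in the cluster.
    totals = [sum(distance(doc, other) for other in cluster) for doc in cluster]
    to_centroid = [distance(doc, centroid) for doc in cluster]
    return {
        "average": min(range(n), key=to_centroid.__getitem__),
        "most_typical": min(range(n), key=totals.__getitem__),
        "least_typical": max(range(n), key=totals.__getitem__),
    }


# Toy cluster: rows are documents, columns are counts of three keywords.
cluster = [
    [3, 0, 1],
    [2, 0, 1],
    [3, 1, 1],
    [0, 4, 0],
]
reps = select_representatives(cluster)
print(reps)
```

Passing a different function as `distance` (for example a Euclidean one) is a simple way to compare how the choice of measure changes the selected documents.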

Files

Original bundle
1 - 1 of 1
Name: GI-Proceedings.29-10.pdf
Size: 126.27 KB
Format: Adobe Portable Document Format