Keyword extraction for text characterization

dc.contributor.author: Renz, Ingrid
dc.contributor.author: Ficzay, Andrea
dc.contributor.author: Hitzler, Holger
dc.contributor.editor: Düsterhöft, Antje
dc.contributor.editor: Thalheim, Bernhard
dc.date.accessioned: 2019-11-14T11:21:42Z
dc.date.available: 2019-11-14T11:21:42Z
dc.date.issued: 2003
dc.description.abstract: Keywords are valuable means for characterizing texts. In order to extract keywords we propose an efficient and robust, language- and domain-independent approach which is based on small word parts (quadgrams). The basic algorithm can be improved by re-examining and re-ranking keywords using edit distance (i.e. Levenshtein distance) and an algorithm based on the relativistic addition of velocities (here: weights). For the purpose of evaluation, we compare our approach to frequency-based keyword extraction (exemplary text collection: 45000 intranet documents in German and English). [en]
dc.identifier.isbn: 3-88579-358-X
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/29889
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Natural language processing and information systems
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-29
dc.title: Keyword extraction for text characterization [en]
dc.type: Text/Conference Paper
gi.citation.endPage: 234
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 228
gi.conference.date: June 2003
gi.conference.location: Burg (Spreewald)
gi.conference.sessiontitle: Regular Research Papers
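
The abstract names three building blocks: quadgrams, Levenshtein distance, and the relativistic addition of velocities applied to weights. This record does not include the paper's details, so the following is only a minimal sketch of each block in isolation. It assumes that quadgrams are overlapping four-character substrings and that weights are normalized to the range [0, 1] (with the speed of light set to 1). The function names are hypothetical, and how the paper actually combines these steps is not stated here.

```python
def quadgrams(word):
    """Return the overlapping 4-character substrings of a word.

    Words shorter than four characters yield an empty list.
    """
    w = word.lower()
    return [w[i:i + 4] for i in range(len(w) - 3)]


def levenshtein(a, b):
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def combine_weights(u, v):
    """Relativistic velocity addition with c = 1 (assumption).

    For u, v in [0, 1] the result also stays in [0, 1].
    """
    return (u + v) / (1 + u * v)
```

A property that may explain the appeal of the velocity formula for weights: combining two weights never exceeds 1, so repeated evidence raises a keyword's weight without letting it grow without bound. For example, combining 0.5 with 0.5 gives 0.8 rather than 1.0.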

Files

Original bundle
Name: GI-Proceedings.29-19.pdf
Size: 81.65 KB
Format: Adobe Portable Document Format