Conference paper

Keyword extraction for text characterization

Full-text URI

Document type

Text/Conference Paper

Additional information

Date

2003

Journal title

Journal ISSN

Volume title

Publisher

Gesellschaft für Informatik e.V.

Abstract

Keywords are a valuable means of characterizing texts. To extract keywords, we propose an efficient and robust, language- and domain-independent approach based on small word parts (quadgrams). The basic algorithm can be improved by re-examining and re-ranking keywords using edit distance (i.e. Levenshtein distance) and an algorithm based on the relativistic addition of velocities (here: weights). For the purpose of evaluation, we compare our approach to frequency-based keyword extraction (example text collection: 45,000 intranet documents in German and English).
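
The three building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how quadgrams are scored, how edit distance feeds into re-ranking, or how weights are scaled. The function names `quadgrams`, `levenshtein`, and `combine_weights` are hypothetical, quadgrams are assumed to be overlapping 4-character substrings within a word, and weights are assumed to lie in [0, 1] so that the velocity-addition formula (with c = 1) keeps the combined weight in that range.

```python
from collections import Counter


def quadgrams(text: str) -> Counter:
    """Count overlapping 4-character substrings inside each word.

    Assumption: words are whitespace-separated and lowercased; the
    paper's actual tokenization is not given in the abstract.
    """
    counts = Counter()
    for word in text.lower().split():
        for i in range(len(word) - 3):
            counts[word[i:i + 4]] += 1
    return counts


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def combine_weights(w1: float, w2: float) -> float:
    """Relativistic velocity addition with c = 1, applied to weights.

    Assumption: both weights are in [0, 1]; the result then also stays
    in [0, 1] instead of growing without bound as a plain sum would.
    """
    return (w1 + w2) / (1.0 + w1 * w2)
```

One plausible use, again an assumption rather than the paper's procedure, is to treat candidate keywords with a small edit distance (for example "keyword" and "keywords") as variants of one another and merge their weights with `combine_weights`, so that the merged weight never exceeds 1.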

Description

Renz, Ingrid; Ficzay, Andrea; Hitzler, Holger (2003): Keyword extraction for text characterization. Natural language processing and information systems. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 3-88579-358-X. pp. 228-234. Regular Research Papers. Burg (Spreewald). June 2003

Keywords

Citation

DOI

Tags