Journal Article

Internet Corpora: A Challenge for Linguistic Processing

Full-text URI

Document type

Text/Journal Article

Additional information

Date

2015

Journal title

Journal ISSN

Volume title

Publisher

Springer

Abstract

Natural language processing tools are mostly developed for and optimized on newspaper texts, and often show a substantial performance drop when applied to other types of texts such as Twitter feeds, chat data or Internet forum posts. We explore a range of easy-to-implement methods of adapting existing part-of-speech taggers to improve their performance on Internet texts. Our results show that these methods can improve tagger performance substantially.

Description

Horbach, Andrea; Thater, Stefan; Steffen, Diana; Fischer, Peter M.; Witt, Andreas; Pinkal, Manfred (2015): Internet Corpora: A Challenge for Linguistic Processing. Datenbank-Spektrum: Vol. 15, No. 1. Springer. PISSN: 1610-1995. pp. 41-47

Keywords

Computer-mediated communication, Natural language processing, Part-of-speech tagging

Citation format

DOI

Tags