
Internet Corpora: A Challenge for Linguistic Processing

dc.contributor.author: Horbach, Andrea
dc.contributor.author: Thater, Stefan
dc.contributor.author: Steffen, Diana
dc.contributor.author: Fischer, Peter M.
dc.contributor.author: Witt, Andreas
dc.contributor.author: Pinkal, Manfred
dc.date.accessioned: 2018-01-10T13:19:59Z
dc.date.available: 2018-01-10T13:19:59Z
dc.date.issued: 2015
dc.description.abstract: Natural language processing tools are mostly developed for and optimized on newspaper texts, and often show a substantial performance drop when applied to other types of texts such as Twitter feeds, chat data or Internet forum posts. We explore a range of easy-to-implement methods of adapting existing part-of-speech taggers to improve their performance on Internet texts. Our results show that these methods can improve tagger performance substantially.
dc.identifier.pissn: 1610-1995
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/11731
dc.publisher: Springer
dc.relation.ispartof: Datenbank-Spektrum: Vol. 15, No. 1
dc.relation.ispartofseries: Datenbank-Spektrum
dc.subject: Computer-mediated communication
dc.subject: Natural language processing
dc.subject: Part-of-speech tagging
dc.title: Internet Corpora: A Challenge for Linguistic Processing
dc.type: Text/Journal Article
gi.citation.endPage: 47
gi.citation.startPage: 41
