Internet Corpora: A Challenge for Linguistic Processing
dc.contributor.author | Horbach, Andrea
dc.contributor.author | Thater, Stefan
dc.contributor.author | Steffen, Diana
dc.contributor.author | Fischer, Peter M.
dc.contributor.author | Witt, Andreas
dc.contributor.author | Pinkal, Manfred
dc.date.accessioned | 2018-01-10T13:19:59Z
dc.date.available | 2018-01-10T13:19:59Z
dc.date.issued | 2015
dc.description.abstract | Natural language processing tools are mostly developed for and optimized on newspaper texts, and often show a substantial performance drop when applied to other types of texts such as Twitter feeds, chat data or Internet forum posts. We explore a range of easy-to-implement methods of adapting existing part-of-speech taggers to improve their performance on Internet texts. Our results show that these methods can improve tagger performance substantially.
dc.identifier.pissn | 1610-1995
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/11731
dc.publisher | Springer
dc.relation.ispartof | Datenbank-Spektrum: Vol. 15, No. 1
dc.relation.ispartofseries | Datenbank-Spektrum
dc.subject | Computer-mediated communication
dc.subject | Natural language processing
dc.subject | Part-of-speech tagging
dc.title | Internet Corpora: A Challenge for Linguistic Processing
dc.type | Text/Journal Article
gi.citation.endPage | 47
gi.citation.startPage | 41 |