Berger, Helmut
Köhle, Monika
Merkl, Dieter
Kaschek, Roland
Mayr, Heinrich C.
Liddle, Stephen
2019-10-11
2019-10-11
2005
3-88579-392-X
https://dl.gi.de/handle/20.500.12116/28349
This paper provides an analysis of multi-class e-mail categorization performance. To investigate this issue, the quality of various classification algorithms is compared across two distinct document representation formalisms: a standard word-based document representation and a character n-gram document representation. The latter is regarded as highly noise-tolerant and was originally proposed for automatic language identification and as a convenient means for producing compact document indices. Furthermore, the impact of using available e-mail-specific meta-information on classification performance is explored, and the findings are presented.
en
On the impact of document representation on classifier performance in e-mail categorization
Text/Conference Paper
1617-5468
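
The two document representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenization rule, the n-gram length of 3, the cosine comparison, and the sample strings are all assumptions chosen for illustration. It shows why character n-grams are often regarded as noise-tolerant: misspelled words still share most of their n-grams with the correct spelling, while a word-based representation treats them as unrelated tokens.

```python
import math
import re
from collections import Counter


def word_features(text: str) -> Counter:
    """Word-based representation: counts of lowercased alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def char_ngram_features(text: str, n: int = 3) -> Counter:
    """Character n-gram representation over lowercased, whitespace-normalised text."""
    normalised = " ".join(text.lower().split())
    return Counter(normalised[i:i + n] for i in range(len(normalised) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical e-mail subject lines: one clean, one with typing errors.
clean = "meeting schedule"
noisy = "meetting schedual"

word_similarity = cosine(word_features(clean), word_features(noisy))
ngram_similarity = cosine(char_ngram_features(clean), char_ngram_features(noisy))

print(f"word-based similarity:   {word_similarity:.2f}")
print(f"char 3-gram similarity:  {ngram_similarity:.2f}")
```

In this toy case the two strings share no whole words, so the word-based similarity is zero, whereas the trigram vectors still overlap substantially. Either feature vector could then be passed to a classifier of choice.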