Title: Inter-Rater Agreement and Usability: A Comparative Evaluation of Annotation Tools for Sentiment Annotation

Authors: Schmidt, Thomas; Winterl, Brigitte; Maul, Milena; Schark, Alina; Vlad, Andrea; Wolff, Christian

Editors: Draude, Claude; Lange, Martin; Sick, Bernhard

Date: 2019 (deposited 2019-08-27)
Type: Text/Conference Paper
Language: en
ISBN: 978-3-88579-689-3
ISSN: 1617-5468
DOI: 10.18420/inf2019_ws12
URL: https://dl.gi.de/handle/20.500.12116/25044

Abstract: We present the results of a comparative evaluation study of five annotation tools with 50 participants in the context of sentiment and emotion annotation of literary texts. Ten participants per tool annotated 50 speeches of the play Emilia Galotti by G. E. Lessing. We evaluate the tools via standard usability and user experience questionnaires, by measuring the time needed for the annotation, and via semi-structured interviews. Based on the results, we formulate a recommendation. In addition, we discuss and compare the usability metrics and methods to develop best practices for tool selection in similar contexts. Furthermore, we also highlight the relationship between inter-rater agreement and usability metrics as well as the effect of the chosen tool on annotation behavior.

Keywords: Sentiment Annotation; Usability Engineering; Usability; Inter-rater agreement; Annotation; Sentiment Analysis; Emotion Analysis; Annotation Tools