Luhmann, Jan; Burghardt, Manuel; Tiepmar, Jochen; Reussner, Ralf H.; Koziolek, Anne; Heinrich, Robert
2021-01-27
2021-01-27
2021
978-3-88579-701-2
https://dl.gi.de/handle/20.500.12116/34708

Movie recommendation systems have become an important technology for streaming websites, media shopping platforms and movie databases. It is mostly hybrid methods, which combine collaborative and content-based filtering on the basis of user ratings and user-generated content, that have proven effective. However, these methods can lead to popularity-biased results that under-represent movies for which little user-generated data exists. In this paper, we discuss the possibility of generating movie recommendations that are based not on user-generated data or metadata, but solely on the content of the movies themselves, confining ourselves to movie dialog. We extract low-level features from movie subtitles using methods from Information Retrieval, Natural Language Processing and Stylometry, and examine whether the similarity of these features correlates with overall movie similarity. In addition, we present a novel web application called SubRosa (http://ch01.informatik.uni-leipzig.de:5001/), which can be used to interactively compare the results of different feature combinations.

en
Movie Similarity; Subtitles Processing; Information Retrieval; Stylometry; Natural Language Processing
SubRosa: Determining Movie Similarities based on Subtitles
10.18420/inf2020_119
1617-5468