SubRosa: Determining Movie Similarities based on Subtitles
dc.contributor.author | Luhmann, Jan | |
dc.contributor.author | Burghardt, Manuel | |
dc.contributor.author | Tiepmar, Jochen | |
dc.contributor.editor | Reussner, Ralf H. | |
dc.contributor.editor | Koziolek, Anne | |
dc.contributor.editor | Heinrich, Robert | |
dc.date.accessioned | 2021-01-27T13:33:18Z | |
dc.date.available | 2021-01-27T13:33:18Z | |
dc.date.issued | 2021 | |
dc.description.abstract | For streaming websites, media shopping platforms and movie databases, movie recommendation systems have become an important technology, where mostly hybrid methods of collaborative and content-based filtering on the basis of user ratings and user-generated content have proven to be effective. However, these methods can lead to popularity-biased results that under-represent movies for which little user-generated data exists. In this paper, we discuss the possibility of generating movie recommendations that are not based on user-generated data or metadata, but solely on the content of the movies themselves, confining ourselves to movie dialog. We extract low-level features from movie subtitles by using methods from Information Retrieval, Natural Language Processing and Stylometry, and examine a possible correlation of these features' similarity with the overall movie similarity. In addition, we present a novel web application called SubRosa (http://ch01.informatik.uni-leipzig.de:5001/), which can be used to interactively compare the results of different feature combinations. | en |
dc.identifier.doi | 10.18420/inf2020_119 | |
dc.identifier.isbn | 978-3-88579-701-2 | |
dc.identifier.pissn | 1617-5468 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/34708 | |
dc.language.iso | en | |
dc.publisher | Gesellschaft für Informatik, Bonn | |
dc.relation.ispartof | INFORMATIK 2020 | |
dc.relation.ispartofseries | Lecture Notes in Informatics (LNI) - Proceedings, Volume P-307 | |
dc.subject | Movie Similarity | |
dc.subject | Subtitles Processing | |
dc.subject | Information Retrieval | |
dc.subject | Stylometry | |
dc.subject | Natural Language Processing | |
dc.title | SubRosa: Determining Movie Similarities based on Subtitles | en |
gi.citation.endPage | 1280 | |
gi.citation.startPage | 1271 | |
gi.conference.date | September 28 - October 2, 2020 | |
gi.conference.location | Karlsruhe | |
gi.conference.sessiontitle | Methoden und Anwendungen der Computational Humanities |