SubRosa: Determining Movie Similarities based on Subtitles

dc.contributor.author: Luhmann, Jan
dc.contributor.author: Burghardt, Manuel
dc.contributor.author: Tiepmar, Jochen
dc.contributor.editor: Reussner, Ralf H.
dc.contributor.editor: Koziolek, Anne
dc.contributor.editor: Heinrich, Robert
dc.date.accessioned: 2021-01-27T13:33:18Z
dc.date.available: 2021-01-27T13:33:18Z
dc.date.issued: 2021
dc.description.abstract: For streaming websites, media shopping platforms and movie databases, movie recommendation systems have become an important technology, where mostly hybrid methods of collaborative and content-based filtering on the basis of user ratings and user-generated content have proven to be effective. However, these methods can lead to popularity-biased results that show an under-representation of those movies for which only little user-generated data exists. In this paper we will discuss the possibility of generating movie recommendations that are not based on user-generated data or metadata, but solely on the content of the movies themselves, confining ourselves to movie dialog. We extract low-level features from movie subtitles by using methods from Information Retrieval, Natural Language Processing and Stylometry, and examine a possible correlation of these features' similarity with the overall movie similarity. In addition, we present a novel web application called SubRosa (http://ch01.informatik.uni-leipzig.de:5001/), which can be used to interactively compare the results of different feature combinations.
dc.identifier.doi: 10.18420/inf2020_119
dc.identifier.isbn: 978-3-88579-701-2
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/34708
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik, Bonn
dc.relation.ispartof: INFORMATIK 2020
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-307
dc.subject: Movie Similarity
dc.subject: Subtitles Processing
dc.subject: Information Retrieval
dc.subject: Stylometry
dc.subject: Natural Language Processing
dc.title: SubRosa: Determining Movie Similarities based on Subtitles
gi.citation.endPage: 1280
gi.citation.startPage: 1271
gi.conference.date: 28 September - 2 October 2020
gi.conference.location: Karlsruhe
gi.conference.sessiontitle: Methoden und Anwendungen der Computational Humanities

Files

Original bundle
Name: C26-1.pdf
Size: 1.13 MB
Format: Adobe Portable Document Format