
Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis

dc.contributor.author: Felser, Jenny
dc.contributor.author: Spranger, Michael
dc.contributor.editor: Klein, Maike
dc.contributor.editor: Krupka, Daniel
dc.contributor.editor: Winter, Cornelia
dc.contributor.editor: Gergeleit, Martin
dc.contributor.editor: Martin, Ludger
dc.date.accessioned: 2024-10-21T18:24:24Z
dc.date.available: 2024-10-21T18:24:24Z
dc.date.issued: 2024
dc.description.abstract (en): Mobile communication data has become a crucial source of evidence in digital forensics. Nevertheless, the high volume of chat messages presents a challenge for investigators, mainly because only a tiny fraction is relevant to the case. Therefore, a method is needed to summarise the messages in a way that separates the forensically relevant parts from the irrelevant ones. Topic modelling can be beneficial as it automatically extracts the main ideas of the chats. However, traditional unsupervised topic modelling alone is not sufficient for forensic data analysis, which is inherently hypothesis-driven. This research incorporates case-specific knowledge into topic modelling to extract topics that align with the investigator’s expectations. Two semi-supervised topic modelling algorithms and proposed extensions were compared using real-case data. The results of the user study suggested that extending algorithms based on word embeddings could help find evidence for suspected topics. Furthermore, the study examined the correlation between these findings and topic coherence, a standard automatic evaluation measure, which did not reflect actual interpretability.
dc.identifier.doi: 10.18420/inf2024_22
dc.identifier.eissn: 2944-7682
dc.identifier.isbn: 978-3-88579-746-3
dc.identifier.issn: 2944-7682
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/45180
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: INFORMATIK 2024
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-352
dc.subject: forensic text analysis
dc.subject: topic modelling
dc.subject: hypothesis-driven analysis
dc.subject: semi-supervised
dc.title (en): Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis
dc.type: Text/Conference Paper
gi.citation.endPage: 330
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 321
gi.conference.date: 24-26 September 2024
gi.conference.location: Wiesbaden
gi.conference.sessiontitle: 4th International Workshop on Digital Forensics (IWDF4)

Files

Original bundle
Name: Felser_Spranger_Semi_supervised_topic_modelling.pdf
Size: 376.58 KB
Format: Adobe Portable Document Format