Conference paper

Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis

Full text URI

Document type

Text/Conference Paper

Additional information

Date

2024

Journal title

Journal ISSN

Volume title

Publisher

Gesellschaft für Informatik e.V.

Abstract

Mobile communication data has become a crucial source of evidence in digital forensics. However, the high volume of chat messages presents a challenge for investigators, mainly because only a tiny fraction is relevant to the case. A method is therefore needed that summarises the messages in a way that separates the forensically relevant parts from the irrelevant ones. Topic modelling can help, as it automatically extracts the main ideas of the chats. Traditional unsupervised topic modelling alone, however, is not sufficient for forensic data analysis, which is inherently hypothesis-driven. This research incorporates case-specific knowledge into topic modelling to extract topics that align with the investigator’s expectations. Two semi-supervised topic modelling algorithms and proposed extensions were compared using real-case data. The results of the user study suggest that extensions based on word embeddings could help find evidence for suspected topics. Furthermore, the study examined the correlation between these findings and topic coherence, a standard automatic evaluation measure, which did not reflect actual interpretability.

Description

Felser, Jenny; Spranger, Michael (2024): Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis. INFORMATIK 2024. DOI: 10.18420/inf2024_22. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-746-3. pp. 321-330. 4th International Workshop on Digital Forensics (IWDF4). Wiesbaden. 24-26 September 2024

Citation

Tags