Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis
dc.contributor.author | Felser, Jenny | |
dc.contributor.author | Spranger, Michael | |
dc.contributor.editor | Klein, Maike | |
dc.contributor.editor | Krupka, Daniel | |
dc.contributor.editor | Winter, Cornelia | |
dc.contributor.editor | Gergeleit, Martin | |
dc.contributor.editor | Martin, Ludger | |
dc.date.accessioned | 2024-10-21T18:24:24Z | |
dc.date.available | 2024-10-21T18:24:24Z | |
dc.date.issued | 2024 | |
dc.description.abstract | Mobile communication data has become a crucial source of evidence in digital forensics. However, the high volume of chat messages presents a challenge for investigators, mainly because only a tiny fraction is relevant to the case. A method is therefore needed to summarise the messages in a way that separates the forensically relevant parts from the irrelevant ones. Topic modelling can be beneficial as it automatically extracts the main ideas of the chats. However, traditional unsupervised topic modelling alone is not sufficient for forensic data analysis, which is inherently hypothesis-driven. This research incorporates case-specific knowledge into topic modelling to extract topics that align with the investigator’s expectations. Two semi-supervised topic modelling algorithms and proposed extensions were compared using real-case data. The user study results suggested that extending algorithms based on word embeddings could help find evidence for suspected topics. Furthermore, the study examined the correlation between these findings and topic coherence, a standard automatic evaluation measure, which did not reflect actual interpretability. | en |
dc.identifier.doi | 10.18420/inf2024_22 | |
dc.identifier.eissn | 2944-7682 | |
dc.identifier.isbn | 978-3-88579-746-3 | |
dc.identifier.issn | 2944-7682 | |
dc.identifier.pissn | 1617-5468 | |
dc.identifier.uri | https://dl.gi.de/handle/20.500.12116/45180 | |
dc.language.iso | en | |
dc.publisher | Gesellschaft für Informatik e.V. | |
dc.relation.ispartof | INFORMATIK 2024 | |
dc.relation.ispartofseries | Lecture Notes in Informatics (LNI) - Proceedings, Volume P-352 | |
dc.subject | forensic text analysis | |
dc.subject | topic modelling | |
dc.subject | hypothesis-driven analysis | |
dc.subject | semi-supervised | |
dc.title | Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis | en |
dc.type | Text/Conference Paper | |
gi.citation.endPage | 330 | |
gi.citation.publisherPlace | Bonn | |
gi.citation.startPage | 321 | |
gi.conference.date | 24-26 September 2024 | |
gi.conference.location | Wiesbaden | |
gi.conference.sessiontitle | 4th International Workshop on Digital Forensics (IWDF4) |
Files
Original bundle
- Name:
- Felser_Spranger_Semi_supervised_topic_modelling.pdf
- Size:
- 376.58 KB
- Format:
- Adobe Portable Document Format