Browsing by keyword "forensic text analysis"
- Text document: Recommendation of Query Terms for Colloquial Texts in Forensic Text Analysis (INFORMATIK 2022, 2022). Felser, Jenny; Xi, Jian; Demus, Christoph; Labudde, Dirk; Spranger, Michael. Textual social media content and short messages have gained in importance as evidence in criminal investigations. Yet the large volume of textual data poses a great challenge for investigators. Even though text retrieval systems can assist in finding evidential messages, the success of the search still depends on entering appropriate search terms. For colloquial texts, however, these are difficult to determine because one cannot be sure which terms are used in the texts and which might be of interest. Therefore, the aim is to develop a method that recommends keywords and search phrases based on the underlying data. A particular challenge here is that the appropriate search terms are often non-obvious words that are not expected to be found in the data but are particularly relevant. In total, four methods were evaluated for extracting and suggesting the most relevant terms and phrases using a real-life dataset. The best results were obtained with topic modeling considering syntagmatic relations.
- Conference paper: Semi-supervised topic modelling as a tool for hypothesis-driven forensic communication analysis (INFORMATIK 2024, 2024). Felser, Jenny; Spranger, Michael. Mobile communication data has become a crucial source of evidence in digital forensics. Nevertheless, the high volume of chat messages presents a challenge for investigators, mainly because only a tiny fraction is relevant to the case. Therefore, a method is needed to summarise the messages in a way that separates the forensically relevant parts from the irrelevant ones. Topic modelling can be beneficial as it automatically extracts the main ideas of the chats. However, traditional unsupervised topic modelling alone is not sufficient for forensic data analysis, which is inherently hypothesis-driven. This research incorporates case-specific knowledge into topic modelling to extract topics that align with the investigator’s expectations. Two semi-supervised topic modelling algorithms and proposed extensions were compared using real-case data. The user study results suggested that extending algorithms based on word embeddings could help find evidence for suspected topics. Furthermore, the study examined the correlation between these findings and topic coherence, a standard automatic evaluation measure, which did not reflect actual interpretability.