Text document
SCALA-Speech: An Interactive System for Finding and Analyzing Speech Content in Audio Data
Date
2022
Publisher
Gesellschaft für Informatik, Bonn
Abstract
Audio data contains less static information than images and text, so analyzing it inherently takes more time. In monitoring applications, large portions of the captured audio files are likely to contain no meaningful information, yet without prior knowledge investigators must listen to every file in full. This work presents a system for automatically finding and analyzing speech content in audio data. The system provides several speech processing algorithms as well as a graphical interface (SCALA) that assists investigators in audio analysis. It consists of four components: speech detection, language recognition, speaker diarization/recognition, and keyword spotting. SCALA-Speech structures audio data by recognizing speech regions, the languages spoken, and speaker changes, which lets investigators listen to audio data more efficiently. In addition, specific speakers and keywords can be annotated and searched for. The use of SCALA-Speech is demonstrated on the audio tracks of videos linked in Twitter posts related to an exemplary topic.
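The structuring idea described in the abstract can be illustrated with a minimal sketch. It is not the SCALA-Speech implementation: the `Segment` record, the per-frame speech flags, and the helper names `frames_to_regions`, `search`, and `speech_seconds` are all hypothetical, and the detection, language, speaker, and keyword results are assumed to come from upstream components that are not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class Segment:
    """One speech region with annotations (hypothetical record layout)."""
    start: float                      # region start in seconds
    end: float                        # region end in seconds
    language: Optional[str] = None    # assumed output of language recognition
    speaker: Optional[str] = None     # assumed output of diarization/recognition
    keywords: List[str] = field(default_factory=list)  # assumed spotted keywords


def frames_to_regions(flags: Sequence[bool], frame_s: float) -> List[Tuple[float, float]]:
    """Merge consecutive speech-flagged frames into (start, end) regions in seconds."""
    regions: List[Tuple[float, float]] = []
    start: Optional[int] = None
    for i, is_speech in enumerate(flags):
        if is_speech and start is None:
            start = i                                   # a speech region opens
        elif not is_speech and start is not None:
            regions.append((start * frame_s, i * frame_s))  # the region closes
            start = None
    if start is not None:                               # speech runs to the end
        regions.append((start * frame_s, len(flags) * frame_s))
    return regions


def search(segments: Sequence[Segment],
           speaker: Optional[str] = None,
           keyword: Optional[str] = None) -> List[Segment]:
    """Return the segments matching the given speaker and/or keyword."""
    return [s for s in segments
            if (speaker is None or s.speaker == speaker)
            and (keyword is None or keyword in s.keywords)]


def speech_seconds(segments: Sequence[Segment]) -> float:
    """Total duration an investigator would need to listen to."""
    return sum(s.end - s.start for s in segments)
```

Under these assumptions, an investigator-facing tool would only need to play back the regions returned by `search`, and `speech_seconds` gives a rough measure of how much listening time remains compared with the full recording.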