Automatic speech/music discrimination for broadcast signals

Automatic speech/music discrimination is the task of automatically detecting speech and music within an audio recording. It is useful for many tasks in both research and industry. In particular, it can be applied to broadcast signals (e.g. from TV or radio stations) to determine the amount of music played. The results can then be used for reporting purposes (e.g. for royalty collection societies such as the German GEMA). Speech/music discrimination is commonly performed with machine learning: models are first trained on manually annotated data and can then be used to classify previously unseen audio data. In this paper, we give an overview of the applications and the state of the art of speech/music discrimination. Afterwards, we present our approaches based on a set of audio features, Gaussian Mixture Models, and Deep Learning. Finally, we give suggestions for the direction of new research on this topic.

Kruspe, Anna; Zapf, Dominik; Lukashevich, Hanna (2017): Automatic speech/music discrimination for broadcast signals. INFORMATIK 2017. DOI: 10.18420/in2017_10. Gesellschaft für Informatik, Bonn. PISSN: 1617-5468. ISBN: 978-3-88579-669-5. pp. 151-162. Musik trifft Informatik. Chemnitz. 25-29 September 2017

Keywords

speech/music discrimination, music/speech classification, music detection, music analysis

DOI

10.18420/in2017_10

Collections

P275 - INFORMATIK 2017
