Enhanced low-latency speaker spotting using selective cluster enrichment

Patino, Jose; Delgado, Héctor; Evans, Nicholas

dc.contributor.author	Patino, Jose
dc.contributor.author	Delgado, Héctor
dc.contributor.author	Evans, Nicholas
dc.contributor.editor	Brömme, Arslan
dc.contributor.editor	Busch, Christoph
dc.contributor.editor	Dantcheva, Antitza
dc.contributor.editor	Rathgeb, Christian
dc.contributor.editor	Uhl, Andreas
dc.date.accessioned	2019-06-17T10:00:17Z
dc.date.available	2019-06-17T10:00:17Z
dc.date.issued	2018
dc.description.abstract	Low-latency speaker spotting (LLSS) calls for the rapid detection of known speakers within multi-speaker audio streams. While previous work showed the potential to develop efficient LLSS solutions by combining speaker diarization and speaker detection within an online processing framework, it failed to move significantly beyond the traditional definition of diarization. This paper shows that the latter needs rethinking and that a diarization sub-system tailored to the end application, rather than to the minimisation of the diarization error rate, can improve LLSS performance. The proposed selective cluster enrichment algorithm is used to guide the diarization system to better model segments within a multi-speaker audio stream and hence detect more reliably a given target speaker. The LLSS solution reported in this paper shows that target speakers can be detected with a 16% equal error rate after having been active in multi-speaker audio streams for only 15 seconds.	en
dc.identifier.isbn	978-3-88579-676-4
dc.identifier.pissn	1617-5468
dc.identifier.uri	https://dl.gi.de/handle/20.500.12116/23786
dc.language.iso	en
dc.publisher	Köllen Druck+Verlag GmbH
dc.relation.ispartof	BIOSIG 2018 - Proceedings of the 17th International Conference of the Biometrics Special Interest Group
dc.relation.ispartofseries	Lecture Notes in Informatics (LNI) - Proceedings, Volume P-283
dc.subject	low-latency speaker spotting
dc.subject	speaker detection
dc.subject	speaker diarization
dc.title	Enhanced low-latency speaker spotting using selective cluster enrichment	en
dc.type	Text/Conference Paper
gi.citation.publisherPlace	Bonn
gi.conference.date	26-28 September 2018
gi.conference.location	Darmstadt

Files

Original bundle

Name: BIOSIG_2018_paper_61.pdf
Size: 167.65 KB
Format: Adobe Portable Document Format

Collections

P282 - BIOSIG 2018 - Proceedings of the 17th International Conference of the Biometrics Special Interest Group