dc.contributor.author | Patino, Jose | |
dc.contributor.author | Delgado, Héctor | |
dc.contributor.author | Evans, Nicholas | |
dc.contributor.editor | Brömme, Arslan | |
dc.contributor.editor | Busch, Christoph | |
dc.contributor.editor | Dantcheva, Antitza | |
dc.contributor.editor | Rathgeb, Christian | |
dc.contributor.editor | Uhl, Andreas | |
dc.date.accessioned | 2019-06-17T10:00:17Z | |
dc.date.available | 2019-06-17T10:00:17Z | |
dc.date.issued | 2018 | |
dc.identifier.isbn | 978-3-88579-676-4 | |
dc.identifier.issn | 1617-5468 | |
dc.identifier.uri | http://dl.gi.de/handle/20.500.12116/23786 | |
dc.description.abstract | Low-latency speaker spotting (LLSS) calls for the rapid detection of known speakers within multi-speaker audio streams. While previous work showed the potential to develop efficient LLSS solutions by combining speaker diarization and speaker detection within an online processing framework, it failed to move significantly beyond the traditional definition of diarization. This paper shows that the latter needs rethinking and that a diarization sub-system tailored to the end application, rather than to the minimisation of the diarization error rate, can improve LLSS performance. The proposed selective cluster enrichment algorithm is used to guide the diarization system to better model segments within a multi-speaker audio stream and hence detect more reliably a given target speaker. The LLSS solution reported in this paper shows that target speakers can be detected with a 16% equal error rate after having been active in multi-speaker audio streams for only 15 seconds. | en |
dc.language.iso | en | |
dc.publisher | Köllen Druck+Verlag GmbH | |
dc.relation.ispartof | BIOSIG 2018 - Proceedings of the 17th International Conference of the Biometrics Special Interest Group | |
dc.relation.ispartofseries | Lecture Notes in Informatics (LNI) - Proceedings, Volume P-283 | |
dc.subject | low-latency speaker spotting | |
dc.subject | speaker detection | |
dc.subject | speaker diarization | |
dc.title | Enhanced low-latency speaker spotting using selective cluster enrichment | en |
dc.type | Text/Conference Paper | |
dc.pubPlace | Bonn | |
mci.conference.location | Darmstadt | |
mci.conference.date | 26-28 September 2018 | |