Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space

dc.contributor.author: Chowdhury, Labib
dc.contributor.author: Kamal, Mustafa
dc.contributor.author: Tasnim, Najia
dc.contributor.author: Mohammed, Nabeel
dc.contributor.editor: Brömme, Arslan
dc.contributor.editor: Busch, Christoph
dc.contributor.editor: Damer, Naser
dc.contributor.editor: Dantcheva, Antitza
dc.contributor.editor: Gomez-Barrero, Marta
dc.contributor.editor: Raja, Kiran
dc.contributor.editor: Rathgeb, Christian
dc.contributor.editor: Sequeira, Ana
dc.contributor.editor: Uhl, Andreas
dc.date.accessioned: 2021-10-04T08:43:51Z
dc.date.available: 2021-10-04T08:43:51Z
dc.date.issued: 2021
dc.description.abstract: Deep learning models have become an increasingly preferred option for biometric recognition systems, such as speaker recognition. SincNet, a deep neural network architecture, gained popularity in speaker recognition tasks due to its use of parameterized sinc functions, which allow it to work directly on the speech signal. The original SincNet architecture uses the softmax loss, which may not be the most suitable choice for recognition-based tasks, as it neither imposes inter-class margins nor differentiates between easy and hard training samples. Curriculum learning, particularly approaches leveraging angular margin-based losses, has proven to be very successful in other biometric applications such as face recognition. The advantage of such curriculum learning-based techniques is that they impose inter-class margins and also take easy and hard samples into account. In this paper, we propose Curricular SincNet (CL-SincNet), an improved SincNet model in which the SincNet architecture is trained with a curricular loss function. The proposed model is evaluated on multiple datasets using intra-dataset and inter-dataset evaluation protocols. In both settings, the model performs competitively with previously published work, and in the case of inter-dataset testing, it achieves the best overall results, with a 4% reduction in error rate compared to SincNet and other published work. [en]
dc.identifier.isbn: 978-3-88579-709-8
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/37471
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: BIOSIG 2021 - Proceedings of the 20th International Conference of the Biometrics Special Interest Group
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-315
dc.subject: Biometric Authentication
dc.subject: Speaker Recognition
dc.subject: Curriculum Learning
dc.title: Curricular SincNet: Towards Robust Deep Speaker Recognition by Emphasizing Hard Samples in Latent Space [en]
dc.type: Text/Conference Paper
gi.citation.endPage: 50
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 43
gi.conference.date: 15-17 September 2021
gi.conference.location: International Digital Conference
gi.conference.sessiontitle: Regular Research Papers
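
Illustrative note: the abstract does not spell out the curricular loss, so the following is only a minimal NumPy sketch of one common formulation, assuming it resembles the angular-margin curricular loss used in face recognition. The function names, the default margin, scale, and momentum values, and the moving-average update of `t` are all assumptions for illustration and are not taken from this record. The sketch shows the two properties the abstract highlights: an additive angular margin on the target class, and extra weight on negatives that are harder than the margin-penalised target.

```python
import numpy as np

def curricular_logits(cos_theta, labels, t, margin=0.5, scale=64.0, momentum=0.01):
    """Hypothetical helper: return scaled logits and the updated parameter t.

    cos_theta: (N, C) cosine similarities between embeddings and class weights.
    labels:    (N,) integer class indices.
    t:         running difficulty parameter (assumed to start near 0).
    """
    cos_theta = np.clip(cos_theta, -1.0, 1.0)
    rows = np.arange(cos_theta.shape[0])
    target_cos = cos_theta[rows, labels]                 # cos(theta_y)
    sin_target = np.sqrt(1.0 - target_cos ** 2)
    # cos(theta_y + m) via the angle-addition identity.
    # Note: no special handling for theta_y + m > pi in this sketch.
    target_margin = target_cos * np.cos(margin) - sin_target * np.sin(margin)
    # Assumed update: exponential moving average of the mean target cosine.
    t = (1.0 - momentum) * t + momentum * target_cos.mean()
    # A negative is "hard" if its cosine exceeds the margin-penalised target.
    hard = cos_theta > target_margin[:, None]
    out = np.where(hard, cos_theta * (t + cos_theta), cos_theta)
    out[rows, labels] = target_margin                    # overwrite target column
    return scale * out, t

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy, computed in a numerically stable way."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy usage with random unit-norm embeddings and class weights.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
weights = rng.normal(size=(3, 8))
weights /= np.linalg.norm(weights, axis=1, keepdims=True)
labels = np.array([0, 1, 2, 0])

logits, t = curricular_logits(emb @ weights.T, labels, t=0.0)
loss = cross_entropy(logits, labels)
```

Under these assumptions, `t` is small early in training, so hard negatives are down-weighted, and as the mean target cosine grows, hard negatives receive progressively more emphasis.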

Files

Original bundle
Name: biosig2021_proceedings_05.pdf
Size: 107.03 KB
Format: Adobe Portable Document Format