Conference paper
Speech recognition as a retrieval problem
Document type
Text/Conference Paper
Date
2013
Publisher
Gesellschaft für Informatik e.V.
Abstract
Common approaches to automatic speech recognition (ASR) are based on training statistical models for the acoustics of speech. In our work, a retrieval-based ASR system is developed that does not rely on training and can therefore be applied more flexibly. It relies on a set of known reference utterances for every word that may occur in a test string. A test word string is recognized by finding, for each word, the most similar reference using an approach based on dynamic time warping (DTW). The DTW variant suited to recognizing strings of connected words is level-building DTW, proposed by Myers and Rabiner in 1981. It uses a level-by-level iteration to match each word in the test utterance with the most similar reference. In our work, an ASR system for connected digit recognition based on level-building DTW is developed, evaluated and compared with a state-of-the-art HMM recognizer.
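
To illustrate the retrieval idea, the following is a minimal sketch of nearest-reference matching with plain DTW for a single isolated word. It is not the level-building variant used in the paper and not the authors' implementation. The function names `dtw_distance` and `recognize_word`, the Euclidean frame distance, and the one-dimensional toy features are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Plain DTW cost between two sequences of feature vectors.

    Assumption: frames are compared with Euclidean distance and the
    allowed steps are insertion, deletion and match (no slope limits).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # skip a frame of b
                cost[i][j - 1],      # skip a frame of a
                cost[i - 1][j - 1],  # align both frames
            )
    return cost[n][m]


def recognize_word(test, references):
    """Return the label of the reference closest to `test` under DTW.

    `references` is a list of (label, sequence) pairs (hypothetical layout).
    """
    best_label, _ = min(references, key=lambda ref: dtw_distance(test, ref[1]))
    return best_label


# Toy one-dimensional "features"; real systems would use e.g. spectral frames.
references = [
    ("one", [[0.0], [1.0], [2.0]]),
    ("two", [[2.0], [1.0], [0.0]]),
]
test = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # time-stretched version of "one"
print(recognize_word(test, references))
```

No model is trained here: adding a new word only requires adding a reference utterance to the list, which is the flexibility the abstract refers to. Recognizing a connected string would additionally require searching over word boundaries, which is what level-building DTW addresses.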