Improving ASR for continuous Thai words using ANN/HMM

Baseline automatic speech recognition (ASR) systems normally use Mel-Frequency Cepstral Coefficients (MFCC) as feature vectors. However, for a tonal language such as Thai, tone information is an important feature that can be used to improve recognition accuracy. This paper proposes a method for building an acoustic model for Thai ASR that uses a combination of MFCC and tone information as the input feature vector. In addition, we apply Artificial Neural Network (ANN) multilayer perceptrons to estimate the posterior probabilities of a class model given a sequence of input observations. The performance of the ANN approach is compared with that of the Gaussian Mixture Model (GMM) used in the Hidden Markov Model Toolkit (HTK). The experiments were carried out with 2-gram and 3-gram language models. The training and test data sets were prepared from read speech of ten Aesop's stories, recorded by 5 male and 5 female speakers. The results showed that the proposed method can improve the performance of Thai ASR in terms of reducing the word error rate.
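The abstract reports results as word error rate (WER) but does not spell out the metric. As a minimal sketch of the standard definition (substitutions, deletions, and insertions divided by the number of reference words), the function below computes WER with a word-level edit distance. The function name `word_error_rate` is hypothetical, and the sketch assumes transcripts are already segmented into space-separated words, which Thai text normally is not; it is not the paper's scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both strings are pre-segmented into space-separated words.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

With the placeholder transcripts `"a b c d"` (reference) and `"a x c"` (hypothesis), there is one substitution and one deletion against four reference words, so the function returns 0.5.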

Sodanil, Maleerat; Nitsuwat, Supot; Haruechaiyasak, Choochart (2010): Improving ASR for continuous Thai words using ANN/HMM. 10th International Conference on Innovative Internet Community Systems (I2CS) – Jubilee Edition 2010 –. Bonn: Gesellschaft für Informatik e.V. PISSN: 1617-5468. ISBN: 978-3-88579-259-8. pp. 247-256. Regular Research Papers. Bangkok, Thailand. June 3-5, 2010

Collections

P165 - I2CS: 10th International Conference on Innovative Internet Community Systems - Jubilee Edition 2010 -
