Improving ASR for continuous Thai words using ANN/HMM

Sodanil, Maleerat; Nitsuwat, Supot; Haruechaiyasak, Choochart; Eichler, Gerald; Kropf, Peter; Lechner, Ulrike; Meesad, Phayung; Unger, Herwig
2019-01-11; 2019-01-11; 2010
ISBN 978-3-88579-259-8
https://dl.gi.de/handle/20.500.12116/19020

The baseline system for automatic speech recognition (ASR) normally uses Mel-Frequency Cepstral Coefficients (MFCC) as feature vectors. However, for a tonal language like Thai, tone information is an important feature that can be used to improve recognition accuracy. This paper proposes a method for building an acoustic model for Thai ASR that uses a combination of MFCC and tone information as the input feature vector. In addition, we apply Artificial Neural Network (ANN) multilayer perceptrons to estimate the posterior probabilities of a class model given a sequence of input observations. The performance of the ANN approach is compared with that of the Gaussian Mixture Model (GMM) used in the Hidden Markov Model Toolkit (HTK). The experiments were carried out with 2-gram and 3-gram language models. The training and test data sets were prepared from read speech of ten Aesop's stories recorded by 5 male and 5 female speakers. The results show that the proposed method improves the performance of Thai ASR by reducing the word error rate.

en
Text/Conference Paper
ISSN 1617-5468
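
The two ideas in the abstract, appending tone information to each MFCC frame and using ANN posteriors inside an HMM, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: tone is represented by a single per-frame F0 value, the MFCC vector has 13 coefficients, and the posteriors are converted to scaled likelihoods by dividing by class priors, which is the usual convention in hybrid ANN/HMM systems. The function names are hypothetical.

```python
import numpy as np


def combine_features(mfcc, tone):
    """Concatenate per-frame MFCC and tone features into one vector per frame.

    mfcc: array of shape (n_frames, n_mfcc)
    tone: array of shape (n_frames,) or (n_frames, n_tone)
    """
    mfcc = np.asarray(mfcc, dtype=float)
    tone = np.asarray(tone, dtype=float)
    if tone.ndim == 1:
        tone = tone[:, None]  # treat a 1-D tone track as one feature per frame
    if mfcc.shape[0] != tone.shape[0]:
        raise ValueError("MFCC and tone streams must have the same number of frames")
    return np.concatenate([mfcc, tone], axis=1)


def scaled_log_likelihoods(posteriors, priors, eps=1e-10):
    """Turn ANN class posteriors P(class | x) into scaled log-likelihoods.

    Assumed hybrid convention: log P(class | x) - log P(class), which is
    proportional to log p(x | class) and can stand in for HMM emission scores.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    priors = np.asarray(priors, dtype=float)
    return np.log(posteriors + eps) - np.log(priors + eps)


# Toy data: 4 frames, 13 MFCCs (assumed size), one F0 value per frame (assumed tone feature).
mfcc = np.zeros((4, 13))
f0 = np.array([120.0, 125.0, 0.0, 130.0])
features = combine_features(mfcc, f0)  # one 14-dimensional vector per frame
```

In a full system, `features` would be the input to the multilayer perceptron, and the output of `scaled_log_likelihoods` would replace the GMM emission scores during HMM decoding.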