ISCA Archive - Two level continuous speech recognition using demisyllable-based HMM word spotting

Two level continuous speech recognition using demisyllable-based HMM word spotting

Eduardo Lleida, Jose B. Marino, Climent Nadeu, Albert Oliveras

This paper describes a two-level Spanish continuous speech recognition system based on demisyllable HMM modelling, word spotting, and finite-state lexical and syntactic knowledge. The first level, the word level, is based on a spotting algorithm that takes as input the unknown utterance, the HMMs of the reference demisyllables, and the lexical knowledge expressed as a finite-state network. The output of the word level is a lattice of word hypotheses [1]. The second level, the phrase level, uses a time-synchronous procedure to search for the best sentence ending at each time instant. It takes as input the word lattice and the syntactic knowledge expressed as a finite-state network, and outputs the best legal sentence. The proposed two-level system was tested on recognizing the integers from 0 to 1000 in a speaker-independent setting. It achieved a word accuracy of 93.2% and a sentence accuracy of 84.5%.

Keywords: Speech Recognition, Hidden Markov Model, Fuzzy Training, Demisyllable, Word-spotting, Multiple Hypothesis, Finite State Networks.
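The phrase level described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a simple dynamic program over a word lattice constrained by a finite-state grammar. The names `WordHyp`, `best_sentence`, and `transitions`, the toy grammar, and all scores are hypothetical and invented for illustration. The sketch also assumes that word hypotheses must abut exactly (no gaps or overlaps) and that scores are log-likelihoods to be summed.

```python
from collections import defaultdict, namedtuple

# A word hypothesis covering frames [start, end) with a log-likelihood score.
WordHyp = namedtuple("WordHyp", "word start end score")

def best_sentence(lattice, transitions, start_state, final_states, n_frames):
    """Return (score, words) for the best grammar-legal sentence that spans
    frames 0..n_frames, or None if no legal sentence exists."""
    # Index well-formed hypotheses by the frame at which they end.
    by_end = defaultdict(list)
    for hyp in lattice:
        if 0 <= hyp.start < hyp.end <= n_frames:
            by_end[hyp.end].append(hyp)

    # best[t][state] = (score, words) of the best partial sentence that ends
    # at frame t and leaves the grammar in `state`.
    best = [dict() for _ in range(n_frames + 1)]
    best[0][start_state] = (0.0, [])

    # Time-synchronous pass: handle every hypothesis ending at frame t.
    for t in range(1, n_frames + 1):
        for hyp in by_end[t]:
            for state, (score, words) in best[hyp.start].items():
                nxt = transitions.get((state, hyp.word))
                if nxt is None:
                    continue  # the grammar does not allow this word here
                cand = score + hyp.score
                if nxt not in best[t] or cand > best[t][nxt][0]:
                    best[t][nxt] = (cand, words + [hyp.word])

    finals = [v for s, v in best[n_frames].items() if s in final_states]
    return max(finals, key=lambda v: v[0]) if finals else None

# Toy grammar (hypothetical): 0 = start, 1 = after tens, 2 = after "y",
# 3 = after units.
transitions = {
    (0, "treinta"): 1, (1, "y"): 2, (2, "dos"): 3,
    (0, "dos"): 3, (0, "tres"): 3,
}
# Toy lattice with made-up scores. "tres dos" scores better acoustically
# but is not a legal sentence under the grammar.
lattice = [
    WordHyp("treinta", 0, 5, -10.0),
    WordHyp("y", 5, 6, -2.0),
    WordHyp("dos", 6, 10, -8.0),
    WordHyp("tres", 0, 4, -3.0),
    WordHyp("dos", 4, 10, -5.0),
]
result = best_sentence(lattice, transitions, 0, {1, 3}, 10)
print(result)
```

In this toy case the grammar rejects the acoustically stronger "tres dos" path, so the search returns "treinta y dos". Because the table keeps the best partial sentence for every frame and grammar state, it also holds the best sentence ending at each time instant, which is the quantity the abstract describes.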