Dictation Machine Based on Japanese Character Source Modeling

International Journal of Pattern Recognition and Artificial Intelligence

Abstract

This paper describes a phonetic typewriter and a dictation machine that exploit the underlying statistical structure of phoneme or character sequences. Syllable or character trigrams are applied to language source modeling. The language source models are obtained by calculating trigram probabilities from a large text database. These models are combined with the HMM-LR continuous speech recognition system [3, 6]. The phonetic typewriter is tested using 274 phrases uttered by one male speaker. The syllable source model achieves a 94.9% phoneme recognition rate with a test-set phoneme perplexity of 3.9. Without the syllable source model, the phoneme recognition rate is only 73.2%. A trigram model based on characters is also evaluated. This character source model reduces the syllable perplexity significantly, to 7.7, compared with 10.5 for the syllable source model. The character source model achieves a 78.5% character transcription rate for the 274 phrase utterances...
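
As a rough illustration of the trigram source modeling and test-set perplexity mentioned in the abstract, the following minimal Python sketch estimates character trigram probabilities from a text collection and computes a per-character perplexity. It is not the authors' implementation: the add-one smoothing, the padding symbols "^" and "$", and the function names train_trigrams, trigram_prob and perplexity are all assumptions made for this example, and the abstract does not say how the paper handles unseen trigrams. The sketch also assumes that every test character appears in the training text.

```python
import math
from collections import Counter

START, END = "^", "$"  # hypothetical padding symbols, not from the paper


def train_trigrams(texts):
    """Count character trigrams and their two-character contexts."""
    tri, ctx, vocab = Counter(), Counter(), set()
    for text in texts:
        padded = START * 2 + text + END
        vocab.update(padded[2:])  # predicted symbols: characters plus END
        for i in range(2, len(padded)):
            tri[padded[i - 2:i + 1]] += 1
            ctx[padded[i - 2:i]] += 1
    return tri, ctx, vocab


def trigram_prob(model, context, ch):
    """P(ch | context) with add-one smoothing (an assumption of this sketch)."""
    tri, ctx, vocab = model
    return (tri[context + ch] + 1) / (ctx[context] + len(vocab))


def perplexity(model, texts):
    """Per-character perplexity: 2 ** (average negative log2 probability)."""
    log_sum, n = 0.0, 0
    for text in texts:
        padded = START * 2 + text + END
        for i in range(2, len(padded)):
            log_sum += math.log2(trigram_prob(model, padded[i - 2:i], padded[i]))
            n += 1
    return 2 ** (-log_sum / n)


if __name__ == "__main__":
    model = train_trigrams(["abab", "abba"])
    print(perplexity(model, ["abab"]))
```

A lower perplexity means the model leaves fewer equally likely continuations per symbol on average, which is why the abstract reports perplexity alongside the recognition rates.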

Kiyohiro Shikano hasn't uploaded this paper.
