ISCA Archive - Phoneme recognition using visual features on speech spectrograms

Phoneme recognition using visual features on speech spectrograms

Shigeru Katagiri, Manami Yokota

In order to apply speech spectrogram reading heuristics to an automatic speech recognition system, a more accurate expression of the heuristics must be developed. In particular, the transformation between acoustic feature measurements and phoneme candidates must be developed in a quantitative manner.

In this paper, a visual acoustic-feature label and a phoneme identification approach that uses this label are proposed. The visual acoustic-feature label is a polygon on a speech spectrogram; its geometric characteristics represent some aspects of an acoustic feature. Preliminary experimental results show that phoneme identification using the visual acoustic-feature label is a feasible way to realize quantitative transformation rules between acoustic feature measurements and phoneme candidates.
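As a rough illustration of how a polygon on a spectrogram could be turned into numbers, the sketch below computes a few geometric quantities from polygon vertices given as (time, frequency) pairs. The abstract does not say which geometric characteristics the authors use, so the choice of area, centroid, duration and bandwidth here is an assumption, and the function name `polygon_features` is hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Tuple

# A vertex of the label polygon: (time in seconds, frequency in Hz).
# This representation is an assumption made for illustration.
Point = Tuple[float, float]


def polygon_features(vertices: List[Point]) -> Dict[str, float]:
    """Return simple geometric descriptors of a non-self-intersecting polygon."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")

    twice_area = 0.0  # signed, accumulated with the shoelace formula
    cx_sum = 0.0
    cy_sum = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx_sum += (x0 + x1) * cross
        cy_sum += (y0 + y1) * cross

    if twice_area == 0.0:
        raise ValueError("degenerate polygon with zero area")

    times = [t for t, _ in vertices]
    freqs = [f for _, f in vertices]
    return {
        "area": abs(twice_area) / 2.0,
        # Centroid formula: C = sum((p_i + p_{i+1}) * cross_i) / (6 * signed area)
        "centroid_time": cx_sum / (3.0 * twice_area),
        "centroid_freq": cy_sum / (3.0 * twice_area),
        "duration": max(times) - min(times),
        "bandwidth": max(freqs) - min(freqs),
    }


if __name__ == "__main__":
    # Hypothetical label: a rectangular region spanning 0.10-0.20 s and 500-1500 Hz.
    label = [(0.10, 500.0), (0.20, 500.0), (0.20, 1500.0), (0.10, 1500.0)]
    print(polygon_features(label))
```

Descriptors of this kind would give one possible quantitative input to rules that map acoustic feature measurements to phoneme candidates; the mapping itself is not specified in the abstract and is not sketched here.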