ISCA Archive
Parametric representations for speech synthesis
R. D. Wright, S. J. Elliott
A comparison has been made of transition behavior for six types of speech synthesizer parameters: parallel resonance, serial resonance, prediction coefficients, reflection coefficients, area functions, and a simple set of articulatory parameters. The six synthesizers can be made to produce identical steady-state sounds (targets), but the interpolation paths between targets will differ. Each synthesizer was tested on nonsense words spanning a wide range of parameter variation. Listening tests were also performed, using the FAAF (Four Alternative Auditory Feature) wordlist. Interpolation path differences were clearly visible in the graphical results, and the numerical averages showed that path differences generally exceeded the JND (Just Noticeable Difference) for formant frequencies and bandwidths. There were also small but statistically significant differences in intelligibility, with the highest scores produced by the serial resonance synthesizer. Finally, linear interpolation was found to be as good as any other interpolation method tested.
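The point that identical targets can still give different interpolation paths can be illustrated with a minimal sketch. It is not the authors' test procedure. It assumes a second-order all-pole filter A(z) = 1 + a_1 z^-1 + a_2 z^-2, the standard step-up recursion under one common sign convention, and made-up target values; the names `reflection_to_prediction`, `lerp`, `k_start`, and `k_end` are hypothetical.

```python
def reflection_to_prediction(ks):
    """Step-up recursion: reflection coefficients -> prediction coefficients.

    Returns [a_1, ..., a_p] for A(z) = 1 + sum_i a_i z^-i.
    The sign convention is an assumption; others negate the coefficients.
    """
    a = []
    for m, k in enumerate(ks, start=1):
        # a_i(m) = a_i(m-1) + k_m * a_{m-i}(m-1) for i < m, and a_m(m) = k_m
        a = [a[i] + k * a[m - 2 - i] for i in range(m - 1)] + [k]
    return a


def lerp(x, y, t):
    """Linear interpolation between two parameter vectors, 0 <= t <= 1."""
    return [(1 - t) * xi + t * yi for xi, yi in zip(x, y)]


# Two illustrative targets given as reflection coefficients (made-up values).
k_start = [0.5, -0.3]
k_end = [-0.4, 0.6]

# The same two targets expressed as prediction coefficients.
a_start = reflection_to_prediction(k_start)
a_end = reflection_to_prediction(k_end)

# Midpoint of the transition, interpolated in each parameter domain
# and then expressed as prediction coefficients for comparison.
mid_via_reflection = reflection_to_prediction(lerp(k_start, k_end, 0.5))
mid_via_prediction = lerp(a_start, a_end, 0.5)

print("midpoint, interpolating reflection coefficients:", mid_via_reflection)
print("midpoint, interpolating prediction coefficients:", mid_via_prediction)
```

Both paths start and end at the same filters, but the midpoint filters differ, so the formant trajectories between the targets would differ as well. Whether such a difference is audible is the question the listening tests address.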