OF SPEAKER IDENTITY IN HMM BASED SPEECH SYNTHESIZERS Speech Signal Processing

2017

Abstract

The purpose of a speech synthesizer is to convert input text to speech. The most common synthesizers are unit selection synthesis (USS) and HMM-based (HTS) synthesizers. Speech synthesized by an HMM-based system is found to be more intelligible than that synthesized by a USS system because sonic glitches are eliminated, and the memory requirement of an HMM-based system is lower, around 5 MB as against 500 MB for a USS system. Hence the HMM-based synthesizer is efficient and economical. However, HMM-synthesized speech exhibits buzziness, which prevents preservation of the speaker’s identity and decreases intelligibility and pleasantness. Preservation of the speaker’s identity depends on two parameters, namely the speech rate and the number of states used for modelling. When the input speech rate is slow, the generated speech seems noisy because the number of formants per window is not sufficient. Hence, the importance of choosing an appropriate speech rate in text-to-speech synthesis systems is analyzed. Two 3-hour speech corpora – o...

D Gayathri hasn't uploaded this paper.
