Reducing spectral mismatches in concatenative speech synthesis via systematic database enrichment

Abstract

This paper presents work carried out for the Time-Domain TTS system, which is being developed at the ILSP for the Greek language. It focuses on improving synthetic speech quality by reducing the spectral mismatches between concatenated segments. To that end, a study was performed to determine which distance measure best predicts when a spectral mismatch is audible. Several spectral distances were tested, and the best-performing one was used to systematically enrich the segment database, which initially contained only one instance per segment. The results indicate a substantial improvement in synthetic speech quality.
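
The general idea of scoring a join with a spectral distance can be illustrated with a minimal sketch. The abstract does not say which distance performed best, so the Euclidean distance between log-magnitude spectra used here is only a stand-in, and every name in the block (`spectral_distance`, `needs_enrichment`, `best_instance`, the `threshold` parameter) is hypothetical rather than taken from the paper.

```python
import cmath
import math


def log_magnitude_spectrum(frame, floor=1e-10):
    """Log-magnitude DFT of one frame (naive O(n^2) DFT, non-negative bins only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        # Floor the magnitude so that log() is defined for empty bins.
        spectrum.append(math.log(max(abs(acc), floor)))
    return spectrum


def spectral_distance(left_frame, right_frame):
    """Euclidean distance between the log-magnitude spectra of two frames.

    Assumption: this is a placeholder metric, not the one selected in the paper.
    """
    a = log_magnitude_spectrum(left_frame)
    b = log_magnitude_spectrum(right_frame)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def needs_enrichment(left_frame, right_frame, threshold):
    """Flag a join whose distance exceeds a hypothetical audibility threshold."""
    return spectral_distance(left_frame, right_frame) > threshold


def best_instance(left_frame, candidate_frames):
    """Index of the candidate frame spectrally closest to left_frame."""
    distances = [spectral_distance(left_frame, c) for c in candidate_frames]
    return min(range(len(distances)), key=distances.__getitem__)
```

In this sketch, a join flagged by `needs_enrichment` would be a candidate for recording additional instances of the segment, and `best_instance` would then pick among the stored instances at synthesis time. Both the threshold value and the choice of boundary frames are left open, since the abstract does not specify them.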

George Tambouratzis hasn't uploaded this paper.
