ProPause: A Syntactico-Prosodic System Designed to Assign Pauses

Placing pauses in read spoken Spanish: a model and an algorithm

Language Design, 2002

The purpose of this work is to describe the occurrence and location of typographically unmarked pauses in any Spanish text to be read aloud. An experiment is designed to derive pause location from natural speech: the results show that Intonation Group length constraints govern whether a pause appears, while syntactic information determines where it is placed. A rule-based algorithm is then developed to place pauses automatically, and its performance is assessed by means of qualitative tests. The evaluation shows that the system places pauses adequately in read texts: it predicts 81% of orthographically unmarked pauses, and when pauses associated with punctuation marks are included, the rate of correct prediction rises to 92%.
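The interplay described in this abstract, where group length decides whether a pause is needed and syntax decides where it goes, can be illustrated with a minimal sketch. This is not the algorithm from the paper: the function name `place_pauses`, the threshold `max_len`, and the per-word boolean flag marking a syntactic boundary are all assumptions introduced for the example.

```python
from typing import List, Optional, Tuple

PAUSE = "<pause>"

def place_pauses(words: List[Tuple[str, bool]], max_len: int = 8) -> List[str]:
    """Insert pause markers into a word sequence.

    Each item is (word, boundary_after); boundary_after is True when a
    syntactic boundary follows the word. The flags and max_len are
    hypothetical inputs, not values taken from the paper.
    """
    out: List[str] = []
    group_len = 0                         # words in the current intonation group
    last_boundary: Optional[int] = None   # position in `out` right after the latest boundary
    for word, boundary_after in words:
        out.append(word)
        group_len += 1
        if boundary_after:
            last_boundary = len(out)
        # Length constraint triggers the pause; syntax chooses its position.
        if group_len >= max_len and last_boundary is not None:
            out.insert(last_boundary, PAUSE)
            group_len = len(out) - last_boundary - 1  # words already in the new group
            last_boundary = None
    # A pause after the final word carries no information in this sketch.
    if out and out[-1] == PAUSE:
        out.pop()
    return out

sentence = [("El", False), ("niño", False), ("que", False), ("vino", False),
            ("ayer", True), ("compró", False), ("un", False),
            ("libro", False), ("nuevo", False)]
print(" ".join(place_pauses(sentence, max_len=6)))
```

With no boundary flags set, the function leaves the sequence unchanged however long it is, which mirrors the idea that length alone does not license a pause at an arbitrary position.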

Automatic Determination of Phrase Breaks for Argentine Spanish

This work evaluates how well different word classes (parts of speech) and normalized versus non-normalized counts of syllables and words predict non-orthographic breaks in an Argentine Spanish database designed for developing the prosody component of a text-to-speech system. Regression trees were trained and tested on a set of 741 sentences, using two different proportions of data. The results show an error rate ranging from 8% to 15%; the minimum is obtained with a reduced set of morphological categories and normalized counts of syllables and words.
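The abstract does not say how the counts are normalized. One plausible reading, assumed here purely for illustration, is that the number of words and syllables preceding each word juncture is divided by the sentence totals. The sketch below computes both variants; the function name `juncture_features`, the feature names, and the per-word syllable counts given as input are hypothetical.

```python
from typing import Dict, List

def juncture_features(syllables_per_word: List[int]) -> List[Dict[str, float]]:
    """Return raw and normalized position features for each word juncture.

    A juncture is the gap after word i (the last word has none).
    Normalizing by sentence totals is an assumption of this sketch.
    """
    total_words = len(syllables_per_word)
    total_syll = sum(syllables_per_word)
    feats: List[Dict[str, float]] = []
    syll_so_far = 0
    for i, n_syll in enumerate(syllables_per_word[:-1]):
        syll_so_far += n_syll
        words_so_far = i + 1
        feats.append({
            "words_raw": words_so_far,                  # non-normalized word count
            "syll_raw": syll_so_far,                    # non-normalized syllable count
            "words_norm": words_so_far / total_words,   # relative position in words
            "syll_norm": syll_so_far / total_syll,      # relative position in syllables
        })
    return feats

# A four-word sentence with 2, 1, 3 and 2 syllables.
for row in juncture_features([2, 1, 3, 2]):
    print(row)
```

Feature rows of this kind, together with part-of-speech labels, could then serve as inputs to a tree learner; the normalized variants make sentences of different lengths comparable.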

Prediction of Pauses in TTS ‐ Tamil

Proceedings of Tamil Internet 2010, 2010

Text-to-speech (TTS) synthesis converts text in electronic form into a speech signal. At MILE lab, we are building a TTS system for Tamil and Kannada. This paper examines the contribution of syntactic information, such as part-of-speech (POS) tags, to the quality of a TTS system for Tamil. The quality of a TTS system is measured by the intelligibility and naturalness of the synthesized speech. The NLP module of the TTS system (for example, text normalization) contributes not only to intelligibility but also to naturalness, by improving the prosody. Stress and pause modeling can be improved using POS and other syntactic information. For the produced speech to sound natural, the places in a sentence where a pause should and should not occur need to be identified: a sentence with no pauses, or with identical pause intervals between words, sounds robotic, and a pause in the wrong place makes the sentence unnatural and may even change its meaning. For example, take the following sentence:

avarukku inRu [pause] mAlai kitaittatu.
avarukku inRu mAlai [pause] kitaittatu.

Here, [pause] indicates that there is a pause. Pauses in different places give different meanings. Syntactic information such as parts of speech can be used to identify rules for pauses in a sentence. A rule-based POS tagger is developed for this purpose without using a root-word dictionary. Currently, manual evaluation shows an accuracy of approximately 74% using only the lexical rules; performance is expected to improve once context-sensitive rules are applied. Rules are then written to predict where a pause should be inserted. Manual evaluation of pause insertion shows a significant improvement in the naturalness of the produced sentences.
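The abstract does not list the pause rules themselves. As a rough illustration of how rules over POS tags might drive pause insertion, the sketch below inserts a marker between two adjacent words whenever their tag pair appears in a rule set. The tag names, the contents of `PAUSE_RULES`, and the function `insert_pauses` are hypothetical placeholders, not the rules or tag set of the Tamil system.

```python
from typing import List, Set, Tuple

# Hypothetical rule set: a pause goes between two adjacent words whose
# (left_tag, right_tag) pair is listed here. The tags are placeholders.
PAUSE_RULES: Set[Tuple[str, str]] = {("ADV", "NOUN"), ("CONJ", "NOUN")}

def insert_pauses(tagged: List[Tuple[str, str]],
                  rules: Set[Tuple[str, str]] = PAUSE_RULES,
                  marker: str = "<pause>") -> List[str]:
    """Return the words of a POS-tagged sentence with pause markers added."""
    out: List[str] = []
    for i, (word, tag) in enumerate(tagged):
        out.append(word)
        # Look at the tag of the next word, if there is one.
        if i + 1 < len(tagged) and (tag, tagged[i + 1][1]) in rules:
            out.append(marker)
    return out

# Placeholder words with placeholder tags.
example = [("w1", "NOUN"), ("w2", "ADV"), ("w3", "NOUN"), ("w4", "VERB")]
print(insert_pauses(example))
```

Keeping the rules in a plain set of tag pairs makes it easy to swap in a different rule inventory once context-sensitive tagging is available.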