ISCA Archive - Using text analysis to predict intonational boundaries

Using text analysis to predict intonational boundaries

Julia Hirschberg

Relating the intonational characteristics of an utterance to features inferable from its orthographic transcription is important both for speech recognition and for speech synthesis. Results are presented for predicting the location of intonational phrase boundaries in a corpus of spontaneous (elicited) speech from syntactic, temporal and other features inferred from simple text analysis of its transcription. Classification and Regression Tree (CART) techniques are employed to model the relationship between hand-labeled boundary phenomena and textual features. Results from an additional experiment using these prediction trees to distinguish correct strings from those incorrectly recognized by a speech recognizer are also reported.
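To make the CART step concrete, the following is a minimal sketch of a classification tree grown with Gini-impurity splits. It is not the paper's implementation: the three features (words since the last boundary, whether the juncture follows punctuation, whether the next word is a function word), the toy rows, and all function names are hypothetical and chosen only for illustration. Each row stands for one word juncture in a transcription, and the label is 1 if a hand-labeled intonational boundary occurs there.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send data to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree; internal nodes are (feature, threshold, low, high) tuples."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority label
    f, t = split
    low = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    high = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in low], [y for _, y in low], depth + 1, max_depth),
        build_tree([r for r, _ in high], [y for _, y in high], depth + 1, max_depth),
    )

def predict(tree, row):
    """Walk from the root to a leaf and return the predicted label."""
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

# Hypothetical toy data, NOT taken from the paper's corpus.
# Features: (words_since_last_boundary, follows_punctuation, next_is_function_word)
ROWS = [
    (1, 0, 0), (2, 0, 0), (3, 0, 1), (5, 0, 1),
    (6, 1, 0), (2, 1, 1), (7, 0, 1), (1, 0, 1),
]
LABELS = [0, 0, 0, 1, 1, 1, 1, 0]  # 1 = boundary at this juncture

tree = build_tree(ROWS, LABELS)
predictions = [predict(tree, r) for r in ROWS]
print(tree)
print(predictions)
```

A real experiment would evaluate the tree on held-out junctures rather than on its own training rows, and would normally prune the tree to avoid overfitting. The sketch omits both steps for brevity.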