Context modeling and the generation of spoken discourse

Automatic referent resolution of deictic and anaphoric expressions

Computational Linguistics, 1995

Deictic and anaphoric expressions frequently cause problems for natural language analysis. In this paper we present a single model that accounts for referent resolution of deictic and anaphoric expressions in a research prototype of a multimodal user interface called EDWARD. The linguistic expressions are keyed in by a user and may be accompanied by pointing gestures. The proposed model for reference resolution elaborates on notions of context factors and salience and integrates both linguistic and perceptual context effects. The model is contrasted with two alternative referent resolution models, namely a simplistic one and a more sophisticated one from the literature. On empirical and analytical grounds, we conclude that the model we propose is preferable from a computational and engineering point of view.
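The idea of combining context factors into a salience score can be illustrated with a minimal sketch. It is not the EDWARD implementation: the class `ContextFactor`, the functions `salience` and `resolve`, the factor names, the weights, and the linear decay are all assumptions chosen for illustration. Each factor (a recent mention, a pointing gesture, visibility on screen) lends weight to the entities it covers, and an expression is resolved to the candidate with the highest total.

```python
from dataclasses import dataclass, field


@dataclass
class ContextFactor:
    """One source of salience. Names, weights, and decay are hypothetical."""
    name: str
    weight: float                      # current significance of the factor
    decay: float                       # amount subtracted after each utterance
    entities: set = field(default_factory=set)

    def age(self) -> None:
        # Linear decay, floored at zero (an assumption of this sketch).
        self.weight = max(0.0, self.weight - self.decay)


def salience(entity: str, factors: list) -> float:
    """Sum the weights of all factors that currently cover the entity."""
    return sum(f.weight for f in factors if entity in f.entities)


def resolve(candidates: list, factors: list):
    """Return the most salient candidate, or None if nothing is salient."""
    scored = [(salience(c, factors), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return best if best_score > 0 else None


# Linguistic and perceptual factors side by side (toy values).
factors = [
    ContextFactor("mentioned_in_last_utterance", 3.0, 1.0, {"report.txt"}),
    ContextFactor("pointed_at", 5.0, 5.0, {"folder_A"}),
    ContextFactor("visible_on_screen", 1.0, 0.0,
                  {"report.txt", "folder_A", "memo.txt"}),
]
files = ["report.txt", "folder_A", "memo.txt"]

first = resolve(files, factors)      # the pointing gesture dominates
for f in factors:
    f.age()                          # one utterance later
second = resolve(files, factors)     # the gesture has faded; the mention wins
print(first, second)
```

In this toy setup a pointing gesture outweighs a textual mention but fades after one utterance, so the same expression can resolve to different referents over time.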

Natural Language Generation

Oxford Handbooks Online, 2005

Communication via a natural language requires two fundamental skills: producing 'text' (written or spoken) and understanding it. This chapter introduces newcomers to the field of computational approaches to the former, natural language generation (henceforth NLG), showing some of the theoretical and practical problems that linguists, computer scientists, and psychologists have encountered when trying to explain how language works in machines or in our minds.

Issues in the choice of a source for natural language generation

Computational Linguistics, 1993

The most vexing question in natural language generation is 'what is the source': what do speakers start from when they begin to compose an utterance? Theories of generation in the literature differ markedly in their assumptions. A few start with an unanalyzed body of numerical data. Most start with the structured objects that are used by a particular reasoning system or simulator and are cast in that system's representational formalism (e.g. Hovy 1990). A growing number of systems, largely focused on problems in machine translation or grammatical theory, take their input to be logical formulae based on lexical predicates.

The importance of discourse context for statistical natural language generation

Proceedings of the 5th SIGDial Workshop …, 2004

Surface realization in statistical natural language generation is based on the idea that when there are many ways to say the same thing, the most frequent option based on corpus counts is the best. Based on data from English and Finnish, we argue instead that not all options are equivalent, and that the most frequent one can be incoherent in some contexts. A statistical NLG system in which word order choice is based only on frequency counts of forms cannot capture the contextually appropriate use of word order. We describe an alternative method for word order selection and show how it outperforms a frequency-only approach.
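The contrast between frequency-only selection and a context-sensitive choice can be shown with a small sketch. It does not reproduce the method described in the paper: the counts are invented, and the single context feature (whether the object is already given in the discourse) is an assumption made only to show how conditioning on context can change the selected order.

```python
from collections import Counter

# Invented toy counts: (word_order, object_is_discourse_old) -> frequency.
COUNTS = Counter({
    ("SVO", False): 80,
    ("SVO", True): 5,
    ("OSV", False): 2,
    ("OSV", True): 13,
})


def frequency_only() -> str:
    """Pick the word order with the highest overall corpus count."""
    totals = Counter()
    for (order, _), n in COUNTS.items():
        totals[order] += n
    return totals.most_common(1)[0][0]


def context_conditioned(object_is_old: bool) -> str:
    """Pick the most frequent word order among cases sharing the context."""
    totals = Counter({
        order: n
        for (order, old), n in COUNTS.items()
        if old == object_is_old
    })
    return totals.most_common(1)[0][0]


baseline = frequency_only()                 # ignores context entirely
given_object = context_conditioned(True)    # object already in the discourse
new_object = context_conditioned(False)     # object newly introduced
print(baseline, given_object, new_object)
```

With these toy numbers the frequency-only baseline always returns the globally dominant order, while the conditioned version switches order when the object is discourse-old.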

Context and Cognition. A Corpus-Driven Approach to Parenthetical Uses of Mental Predicates

dsglynn.eu

nition to meaning and phenomenology is explained in Krawczak (2007, 2010a, 2010b). Speech is not just a combination of lexis, syntax, morphology and prosody. Speech draws on all kinds of linguistic form to negotiate a complex social milieu, subtly judging the interaction of an immense range of factors from gender, age, and situational context to world knowledge, but also how the speaker negotiates all of this. Psycholinguistics elicits language production and examines this production. Such language is always artificially simple and without context. Corpus linguistics examines traces of language use and is, therefore, more indirect than experimental methods, but it can look at actual language in natural context. This study examines a corpus of natural interactive language to see how speakers judge, engage in, and profile epistemic stance in interaction. Careful manual analysis of a large number of speech events reveals patterns in how speakers engage with each other through language. This gives us indirect insights into how speakers profile their subjective view of the world. Multivariate statistics can then help identify the patterns of usage from a multidimensional perspective. Moreover, multivariate modelling captures this complexity in a cognitively plausible way. Although the statistical modelling itself is not argued to represent how an individual processes language, it does permit us to weigh up the relative factors and their interaction that speakers handle in communication. Arguably, this is similar to what speakers do pre-consciously when they coordinate the various dimensions of intersubjective communication. The use of corpora to describe conceptual structure is basic to usage-based research in Cognitive Semantics (Gries and Stefanowitsch 2006, Stefanowitsch and Gries 2006, Glynn and Fischer 2010, Glynn and Robinson in press). The Corpus to Conception principle, which states that patterns of language use represent patterns of conceptual structure, is specifically explored in Glynn (2009, 2010, in press).

Subsequent reference: Syntactic and rhetorical constraints

Proceedings of the 1978 workshop on Theoretical …, 1978

Once an object is introduced into a discourse, the form of subsequent references to it is strongly governed by convention. This paper discusses how those conventions can be represented for use by a generation facility. A multistage representation is used, allowing decisions to be made when and where the information is available. It is suggested that a specification of the rhetorical structure of the intended message should be included with the present syntactic one, and the conventions eventually reformulated in terms of it.