Building Working Models of Full Natural-Language Understanding in Limited Pragmatic Domains

James A. Mason - 2010 May 17, 20, 26; June 8; 2012 Aug 16; 2012 Sep 1
http://www.yorku.ca/jmason/asdindex.htm

Keywords: English language understanding, natural-language processing, NLP, NLU, computational linguistics, dialog system, Java, Augmented Syntax Diagram, ASD, playing card

My long-term research project now is to build working models that understand English as an English-speaker does, in realistic pragmatic domains. Of course, for the foreseeable future, such models will require pragmatic domains that are restricted to ones that can be modeled completely on a computer. Nevertheless, limited pragmatic domains permit us to explore and model thoroughly many detailed syntactic and semantic structures of English, most of which should generalize well to less-limited pragmatic domains. In particular, we should be able to model most of the syntax and semantics of the so-called "function words" of English -- articles and other determiners in noun phrases, conjunctions, and prepositions -- as contrasted with the "content words" -- nouns, adjectives, verbs and adverbs. Function words are also sometimes referred to as "closed class" words, because they belong to syntactic classes to which new words are almost never added. Content words are sometimes referred to as "open class" words, because they belong to syntactic classes to which new words are frequently added.

I have chosen ordinary playing cards as the basis for a first pragmatic domain for which to build models of English-language understanding. That domain is simple enough to be modeled fairly easily in computer software, yet it is rich enough to allow exploration of many syntactic and semantic features of English. I am building a succession of models of English-language understanding for that domain, which I call CardWorld. The first two implementations are CardWorld1 and CardWorld2, which are available from this web site in both compiled and open-source form. The latest model can also be run from this link as a Java Web Start applet created by Roxanne Parent. The CardWorld models accept several kinds of input: pointing by stylus or touch screen, English typed at the keyboard, and spoken English, using a program like Dragon NaturallySpeaking as a front end.

Documentation for the first two versions of CardWorld is provided in <CardWorld1Documentation.html> and <CardWorld2Documentation.html>. Note that, for setting and getting values of semantic feature variables, CardWorld1 and CardWorld2 use only the basic tools provided by ASDParser and ASDDecider. They do not require the SemanticValue class hierarchy.

CardWorld models can be extended in many directions, including these:

Pragmatically

Permit other operations on playing cards and collections of cards:

accepting pointing gestures to more than one location per input utterance
moving cards into various kinds of collections -- e.g. hands, drawing piles, discard piles
finding specific cards by extrinsic description -- e.g. by position in a collection
counting cards [with specific descriptions] in given collections
sorting cards according to specific descriptions
selecting cards at random from a collection
rotating card images
following rules of various card games

Introduce additional agents into a card world, give them different views of that world, and allow various kinds of communication among them.
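
Several of the card operations listed above (moving cards between collections, finding a card by position, counting, sorting, and random selection) could be modeled along the following lines. This is only a minimal sketch under stated assumptions: the names CardPileSketch, Card, Suit, Pile, and all of their methods are hypothetical and are not taken from the CardWorld source, and the sketch does not use ASDParser or any other ASD class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names come from the CardWorld source.
class CardPileSketch {
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    // Assumption: rank runs from 1 (ace) to 13 (king).
    static final class Card {
        final int rank;
        final Suit suit;
        Card(int rank, Suit suit) { this.rank = rank; this.suit = suit; }
        @Override public String toString() { return rank + " of " + suit; }
    }

    // An ordered collection such as a hand, drawing pile, or discard pile.
    // Index 0 is the bottom of the pile; the last index is the top.
    static final class Pile {
        final List<Card> cards = new ArrayList<>();

        void addToTop(Card c) { cards.add(c); }

        // Extrinsic description: fromTop(1) is "the top card",
        // fromTop(2) is "the second card from the top", and so on.
        Card fromTop(int ordinal) { return cards.get(cards.size() - ordinal); }

        // Count the cards that match an intrinsic description.
        long count(Predicate<Card> description) {
            return cards.stream().filter(description).count();
        }

        // Sort the pile according to a given order.
        void sort(Comparator<Card> order) { cards.sort(order); }

        // Select one card at random (the pile must not be empty).
        Card pickAtRandom(Random rng) { return cards.get(rng.nextInt(cards.size())); }

        // Move a card from this pile to the top of another pile.
        void moveTo(Card c, Pile target) {
            if (cards.remove(c)) target.addToTop(c);
        }
    }

    public static void main(String[] args) {
        Pile hand = new Pile();
        Pile discard = new Pile();
        hand.addToTop(new Card(12, Suit.HEARTS));
        hand.addToTop(new Card(3, Suit.CLUBS));
        hand.addToTop(new Card(7, Suit.HEARTS));

        System.out.println("Top card: " + hand.fromTop(1));
        System.out.println("Hearts in hand: " + hand.count(c -> c.suit == Suit.HEARTS));

        // Sort by suit, then by rank within each suit.
        hand.sort(Comparator.comparing((Card c) -> c.suit).thenComparingInt(c -> c.rank));
        System.out.println("Sorted hand: " + hand.cards);

        hand.moveTo(hand.fromTop(1), discard);
        System.out.println("Discard pile: " + discard.cards);
        System.out.println("Random card from hand: " + hand.pickAtRandom(new Random()));
    }
}
```

Representing an intrinsic description as a Predicate and a sorting order as a Comparator keeps each English description separate from the collection it is applied to, so a phrase such as "the hearts" could in principle be reused for counting, finding, or moving cards.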

Semantically

Add semantic structures required for the extended pragmatics, including

structures to represent extrinsic and intrinsic description properties
structures to represent various kinds of collections
structures to represent various sorting orders for cards
structures to represent exact and vague quantities
structures to represent random selection
structures to represent angles of rotation of cards, in addition to position
structures to represent viewpoints of cards and piles from various agents
structures to represent actions in games

Extensions like those may need the SemanticValue class hierarchy and more.
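
As one illustration of the semantic structures listed above, exact and vague quantities could both be represented as an inclusive range of acceptable counts. The following is a rough sketch only: QuantitySketch, Quantity, and forPhrase are hypothetical names, the sketch is independent of the SemanticValue class hierarchy, and the numeric bounds chosen for "a few" and "some" are arbitrary assumptions rather than claims about English usage.

```java
// Hypothetical sketch; not part of CardWorld or the SemanticValue hierarchy.
class QuantitySketch {
    // A quantity is modeled as an inclusive range of acceptable counts.
    static final class Quantity {
        final int min;
        final int max;

        private Quantity(int min, int max) { this.min = min; this.max = max; }

        static Quantity exactly(int n) { return new Quantity(n, n); }

        static Quantity between(int lo, int hi) { return new Quantity(lo, hi); }

        // An exact quantity is a range containing a single count.
        boolean isExact() { return min == max; }

        // Does a given number of cards satisfy this quantity?
        boolean accepts(int count) { return min <= count && count <= max; }
    }

    // A tiny illustrative lexicon. The bounds for the vague phrases
    // are assumptions made only for this example.
    static Quantity forPhrase(String phrase) {
        switch (phrase) {
            case "five":  return Quantity.exactly(5);
            case "a few": return Quantity.between(2, 4);
            case "some":  return Quantity.between(1, Integer.MAX_VALUE);
            default: throw new IllegalArgumentException("unknown phrase: " + phrase);
        }
    }

    public static void main(String[] args) {
        for (String phrase : new String[] { "five", "a few", "some" }) {
            Quantity q = forPhrase(phrase);
            System.out.println("\"" + phrase + "\" exact? " + q.isExact()
                    + ", accepts 3? " + q.accepts(3));
        }
    }
}
```

A range treats exact and vague quantities uniformly, but it cannot express graded acceptability (for example, that four cards are a better fit for "a few" than two); a fuller model would need something richer.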

Syntactically

Add vocabulary and grammar structures required for the extended pragmatics and semantics, including

the words "rank", "suit", "deck", "hand", "draw[ing]", "discard", "deal[er]" and others required for games
the quantifier "some", and vocabulary for exact (e.g. "five") and vague (e.g. "a few") quantities
words "top", "bottom", and ordinals "first", "second", etc.
words for asking questions -- "how many", "where", "which", etc.
syntactic structures for prepositional phrases, relative clauses, conjunctions of noun phrases, and conjunctions of more than two clauses

All such syntactic extensions can be accomplished with ASDEditor and ASDParser.

FurtherNotesAboutCardWorld