Vinay Chaudhri - Profile on Academia.edu

Papers by Vinay Chaudhri

Our work is driven by the hypothesis that, for a program to answer questions, explain the answers, and engage in a dialog just as a human does, it must have an explicit representation of knowledge. Such explicit representations naturally occur in many situations, such as in designs created by engineers, software requirements created in a unified modeling language, or process flow diagrams created for a manufacturing process. Automated approaches based on natural language processing have progressed on tasks such as named entity recognition, fact extraction, and relation learning, but they cannot generate expressive representations with high accuracy. In this paper, we report on our effort to systematically curate a knowledge base for a substantial fraction of a biology textbook. Although this experience and the process inherently offer insights, three aspects are especially instructive for the future development of knowledge bases both by manual and by automatic methods: (1) Consider imposing a simplifying abstract structure on natural language sentences so that the surface form is closer to the target logical form to be extracted; (2) Adopt an upper ontology that is strongly motivated and influenced by natural language; (3) Develop a set of syntactic and semantic guidelines that captures how the conceptual distinctions in the ontology may be realized in natural language. Because this representation has effectively enabled reasoning, explanation, and dialog, it gives a concrete target for what should be learned by automated methods.

Cognitive simulation of analogical processing can be used to answer comparison questions such as: What are the similarities and/or differences between A and B, for concepts A and B in a knowledge base (KB)? Previous attempts to use a general-purpose analogical reasoner to answer such questions revealed three major problems: (a) the system presented too much information in the answer, and the salient similarity or difference was not highlighted; (b) analogical inference found some incorrect differences; and (c) some expected similarities were not found. The cause of these problems was primarily the lack of a well-curated KB and, secondarily, algorithmic deficiencies. In this paper, relying on a well-curated biology KB, we present a specific implementation of comparison questions inspired by a general model of analogical reasoning. We present numerous examples of answers produced by the system and empirical data on answer quality to illustrate that we have addressed many of the problems of the previous system.

Learning a scientific discipline such as biology is a daunting challenge. In a typical advanced high school or introductory college biology course, a student is expected to learn about 5000 concepts and several hundred thousand new relationships among them. Science textbooks are difficult to read and yet there are few alternative resources for study. Despite the great need for science graduates, too few students are willing to study science and many drop out without completing their degrees. New approaches are needed to provide students with a more usable and useful resource and to accelerate their learning.

The long-term goal of Project Halo is to build an application called Digital Aristotle that can answer questions on a wide variety of science topics and provide user- and domain-appropriate explanations. As a near-term goal, we are focusing on enabling subject matter experts (SMEs) to construct declarative knowledge bases (KBs) from 50 pages of a science textbook in the domains of Physics, Chemistry, and Biology in a way that the system can answer questions similar to those in an Advanced Placement (AP) exam in the respective discipline. The textbook knowledge is a mixture of textual information, mathematical equations, tables, diagrams, and domain-specific representations such as chemical reactions. In this paper, we explore the following question: Can we build a knowledge capture system to enable SMEs to construct KBs from the knowledge found in science textbooks and use the resulting KB for deductive question answering? We answer this question in the context of a system called AURA that supports knowledge capture from science textbooks.

Project Halo is a long-range research effort sponsored by Vulcan Inc., pursuing the vision of the "Digital Aristotle": an application containing large volumes of scientific knowledge and capable of applying sophisticated problem-solving methods to answer novel questions. As this capability develops, the project focuses on two primary applications: a tutor capable of instructing and assessing students and a research assistant with the broad, interdisciplinary skills needed to help scientists in their work. Clearly, this goal is an ambitious, long-term vision, with Digital Aristotle serving as a distant target for steering the project's near-term research and development. Making the full range of scientific knowledge accessible and intelligible to a user might involve anything from the simple retrieval of facts to answering a complex set of interdependent questions and providing user-appropriate justifications for those answers. Retrieval of simple facts might be achieved by information extraction systems searching and extracting information from a large corpus of text. But going beyond this, to systems that are capable of generating answers and explanations that are not explicitly written in the texts, requires the computer to acquire, represent, and reason with knowledge of the domain (that is, to have genuine, internal "understanding" of the domain).
In the winter 2004 issue of AI Magazine, we reported Vulcan Inc.'s first step toward creating a question-answering system called Digital Aristotle. The goal of that first step was to assess the state of the art in applied knowledge representation and reasoning (KRR) by asking AI experts to represent 70 pages from the advanced placement (AP) chemistry syllabus and to deliver knowledge-based systems capable of answering questions from that syllabus. This article reports the next step toward realizing a Digital Aristotle: we present the design and evaluation results for a system called AURA, which enables domain experts in physics, chemistry, and biology to author a knowledge base and then allows a different set of users to ask novel questions against that knowledge base. These results represent a substantial advance over what we reported in 2004, both in the breadth of covered subjects and in the provision of sophisticated technologies in knowledge representation and reasoning, natural language processing, and question answering to domain experts and novice users.

In this paper, I summarize the results of a decade-plus of research and development driven by the vision that human knowledge can be grounded in a small number of prototypical components that can be extended through composition and analogy. This vision has been embodied in a system called AURA, which has been used to engineer an expressive knowledge base for an intelligent biology textbook. The focus of the current paper is to abstract away from the specifics and instead describe the core ideas in such a manner that they can be transferred and applied in different contexts, and to relate those ideas to ongoing research by others.

There has been a rich history of research on semantic parsing across a wide range of problems, including understanding database queries. In this paper, we propose the task of understanding business rules as a new challenge for semantic parsing. Business rules can usually be expressed in a restricted subset of English, and one would hope that semantic parsing would do well on converting such English into logic. We show that an off-the-shelf semantic parser achieves an accuracy of approximately 25% on this task, suggesting that this task is a challenge for semantic parsing. We identify several problems that prevent higher performance on this task and outline an agenda for future research on semantic parsing.