Enabling Robots to Understand Indirect Speech Acts in Task-Based Interactions

Going Beyond Literal Command-Based Instructions: Extending Robotic Natural Language Interaction Capabilities

Proceedings of the AAAI Conference on Artificial Intelligence

The ultimate goal of human natural language interaction is to communicate intentions. However, these intentions are often not directly derivable from the semantics of an utterance (e.g., when linguistic modulations are employed to convey politeness, respect, and social standing). Robotic architectures with simple command-based natural language capabilities are thus not equipped to handle more liberal, yet natural uses of linguistic communicative exchanges. In this paper, we propose novel mechanisms for inferring intentions from utterances and generating clarification requests that will allow robots to cope with a much wider range of task-based natural language interactions. We demonstrate the potential of these inference algorithms for natural human-robot interactions by running them as part of an integrated cognitive robotic architecture on a mobile robot in a dialogue-based instruction task.

Service Robots Dealing with Indirect Speech Acts

2006

Successful interaction between a service robot and its human users depends on the robot's ability to understand not only direct commands, but also more indirect ways for a human to express what she would like the robot to do. Such indirect ways are pervasive in human-human interaction; enabling the robot to understand them can make human-robot interaction more human-friendly. This paper presents a model for a robot that pursues its serving duties by trying to interpret indirect ways of expressing requests to execute certain actions. In case of uncertainty about the proper interpretation the robot can ask for clarification and adapt its interpretation for future interactions.
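The interpret, clarify, and adapt loop described in this abstract can be illustrated with a minimal sketch. It is not the model from the paper: the class `IndirectRequestInterpreter`, the confidence weights, the threshold of 0.7, and the update rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IndirectRequestInterpreter:
    """Hypothetical interpreter mapping indirect utterances to robot actions."""

    # Assumed learned weights: (utterance, action) -> confidence in [0, 1].
    weights: dict = field(default_factory=dict)
    # Assumed threshold below which the robot asks for clarification.
    threshold: float = 0.7

    def interpret(self, utterance, candidates):
        """Return (best_action, needs_clarification) for an utterance."""
        scored = [(self.weights.get((utterance, a), 0.0), a) for a in candidates]
        best_score, best_action = max(scored)
        return best_action, best_score < self.threshold

    def feedback(self, utterance, action, confirmed, rate=0.5):
        """Move the confidence toward 1 if the user confirmed, else toward 0."""
        key = (utterance, action)
        old = self.weights.get(key, 0.0)
        target = 1.0 if confirmed else 0.0
        self.weights[key] = old + rate * (target - old)
```

In this sketch, an utterance such as "I am thirsty" initially triggers a clarification request. After the user has confirmed the same action twice, the confidence exceeds the threshold and the robot acts without asking.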

Strategies and mechanisms to enable dialogue agents to respond appropriately to indirect speech acts

2017 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2017

Humans often use indirect speech acts (ISAs) when issuing directives. Much of the work in handling ISAs in computational dialogue architectures has focused on correctly identifying and handling the underlying non-literal meaning. There has been less attention devoted to how linguistic responses to ISAs might differ from those given to literal directives and how to enable different response forms in these computational dialogue systems. In this paper, we present ongoing work toward developing dialogue mechanisms within a cognitive, robotic architecture that enables a richer set of response strategies to non-literal directives.

How We Talk with Robots: Eliciting Minimally-Constrained Speech to Build Natural Language Interfaces and Capabilities

Proceedings of the Human Factors and Ergonomics Society Annual Meeting

Industry, military, and academia are showing increasing interest in collaborative human-robot teaming in a variety of task contexts. Designing effective user interfaces for human-robot interaction is an ongoing challenge, and a variety of single- and multiple-modality interfaces have been explored. Our work is to develop a bi-directional natural language interface for remote human-robot collaboration in physically situated tasks. When combined with a visual interface and audio cueing, we intend for the natural language interface to provide a naturalistic user experience that requires little training. Building the language portion of this interface requires first understanding how potential users would speak to the robot. In this paper, we describe our elicitation of minimally-constrained robot-directed language, observations about the users' language behavior, and future directions for constructing an automated robotic system that can accommodate these language needs.

A Hybrid Architectural Approach to Understanding and Appropriately Generating Indirect Speech Acts

Proceedings of the AAAI Conference on Artificial Intelligence

Current approaches to handling indirect speech acts (ISAs) do not account for their sociolinguistic underpinnings (i.e., politeness strategies). Deeper understanding and appropriate generation of indirect acts will require mechanisms that integrate natural language (NL) understanding and generation with social information about agent roles and obligations, which we introduce in this paper. Additionally, we tackle the problem of understanding and handling indirect answers that take the form of either speech acts or physical actions, which requires an inferential, plan-reasoning approach. In order to enable artificial agents to handle an even wider variety of ISAs, we present a hybrid approach, utilizing both the idiomatic and inferential strategies. We then demonstrate our system successfully generating indirect requests and handling indirect answers, and discuss avenues of future research.

Combining Goal Inference and Natural-Language Dialogue for Human-Robot Joint Action

We demonstrate how combining the reasoning components from two existing systems designed for human-robot joint action produces an integrated system with greater capabilities than either of the individual systems. One of the systems supports primarily non-verbal interaction and uses dynamic neural fields to infer the user's goals and to suggest appropriate system responses; the other emphasises natural-language interaction and uses a dialogue manager to process user input and select appropriate system responses. Combining these two methods of reasoning results in a robot that is able to coordinate its actions with those of the user while employing a wide range of verbal and non-verbal communicative actions.

Enabling Human-Like Task Identification From Natural Conversation

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Robots as coworkers or cohabitants are becoming more mainstream by the day with the development of low-cost sophisticated hardware. However, an accompanying software stack that can aid the usability of the robotic hardware remains the bottleneck of the process, especially if the robot is not dedicated to a single job. Programming a multipurpose robot requires an on-the-fly mission scheduling capability that involves task identification and plan generation. The problem dimension increases if the robot accepts tasks from a human in natural language. Though recent advances in NLP and planner development can solve a variety of complex problems, their amalgamation for a dynamic robotic task handler is used in a limited scope. Specifically, the problem of formulating a planning problem from natural language instructions is not studied in detail. In this work, we provide a non-trivial method to combine an NLP engine and a planner such that a robot can successfully identify tasks and all the relevant parameters and generate an accurate plan for the task. Additionally, some mechanism is required to resolve ambiguity or missing pieces of information in a natural language instruction. Thus, we also develop a dialogue strategy that aims to gather additional information with minimal question-answer iterations and only when it is necessary. This work makes a significant stride towards enabling a human-like task understanding capability in a robot.
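The idea of asking questions only when information is missing can be sketched as a simple slot-filling loop. This is an illustrative sketch and not the method from the paper: the task schemas, the function names, and the `ask` callback are hypothetical.

```python
# Hypothetical task schemas: each task lists the parameters a planner would need.
TASK_SCHEMAS = {
    "fetch": ["object", "source"],
    "move": ["object", "destination"],
}


def missing_parameters(task, known):
    """Return the required parameters not yet supplied by the instruction."""
    return [p for p in TASK_SCHEMAS[task] if p not in known]


def clarify(task, known, ask):
    """Ask one question per missing parameter and none if nothing is missing.

    `ask` is an assumed callback that poses a question and returns the answer.
    Returns the completed parameters and the number of questions asked.
    """
    filled = dict(known)
    questions = 0
    for param in missing_parameters(task, filled):
        filled[param] = ask(f"Which {param} should I use for '{task}'?")
        questions += 1
    return filled, questions
```

With `known = {"object": "cup"}` and the task `"fetch"`, the loop asks a single question about the source. With both parameters supplied, it asks none.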

Towards Goal Inference for Human-Robot Collaboration

2020

Natural language instructions often leave a speaker’s intent underspecified or unclear. We propose a goal inference procedure that extracts user intent using natural language processing techniques. This procedure uses semantic role labeling and synonym generation to extract utterance semantics, and then analyzes a task domain to infer the user’s underlying goal. This procedure is designed as an extension to the MIDCA cognitive architecture that enables human-robot collaboration. In this work, we describe a conceptual model of this procedure, lay out the steps a robot follows to make a goal inference, give an example use case, and describe the procedure’s implementation in a simulated environment. We close with a discussion of the benefits and limitations of this approach. We expect this procedure to improve user satisfaction with agent behavior when compared to plan-based dialogue systems.
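A toy sketch can make the pipeline in this abstract concrete: semantic roles are normalized through synonyms and then matched against a task domain. It does not reproduce the paper's procedure or any MIDCA API. The synonym table, the goal templates, and the hand-written role dictionary (standing in for the output of a semantic role labeler) are all assumptions.

```python
# Hypothetical synonym table mapping surface verbs to domain action names.
SYNONYMS = {
    "grab": "pickup",
    "take": "pickup",
    "get": "pickup",
    "put": "place",
    "set": "place",
}

# Hypothetical task domain: each action maps to a goal predicate template.
DOMAIN_GOALS = {
    "pickup": "holding({patient})",
    "place": "on({patient}, {location})",
}


def infer_goal(roles):
    """Map semantic roles (assumed already extracted) to a goal predicate.

    Returns None when the verb is unknown to the domain or a required
    role is missing, which is where a real system might ask for clarification.
    """
    verb = roles["verb"].lower()
    action = SYNONYMS.get(verb, verb)
    template = DOMAIN_GOALS.get(action)
    if template is None:
        return None
    try:
        return template.format(**roles)
    except KeyError:
        return None
```

For example, the roles `{"verb": "grab", "patient": "block_a"}` yield the goal `holding(block_a)`, while `{"verb": "put", "patient": "block_a"}` yields `None` because the location role is missing.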

Toward a Cognitive Approach to Human-Robot Dialogue

A theory of language sufficient for building conversationally-adequate human-robot dialogue systems must account for the communicative act as a whole, from the inferential mechanism of intersubjective joint attention-sharing up through the conceptualization processes that respond to those inferences. However, practitioners of AI have in the past tended to adopt linguistic theories that either emphasize or tacitly assume the modularity of linguistic mental processes, that is, their isolation from the pressures and influences of other cognitive processes. These assumptions have precluded satisfactory modeling of human language use. An adequate theory of language will account naturally and holistically (without ad hoc computational machinery) for discourse structure, referential flexibility, lexical non-compositionality, deixis, pragmatic effects, gesture, and intonation. This paper makes the argument that certain theories in the field of cognitive linguistics already exhibit these desiderata, and briefly describes work to implement one.

What am I? - Complementing a robot's task-solving capabilities with a mental model using a cognitive architecture

8th Workshop on Behavior Adaptation and Learning for Assistive Robotics, 2024

One way to improve Human-Robot Interaction (HRI) and increase trust, acceptance and mutual understanding is to make the behavior of a social robot more comprehensible and understandable for humans. This is particularly important if humans and machines are to work together as partners. To be able to do this, both must have the same basic understanding of the task and the current situation. We created a model within a cognitive architecture connected to the robot. The cognitive model processed relevant conversational data during a dialog with a human to create a mental model of the situation. The dialog parts of the robot were generated with a Large Language Model (LLM) from OpenAI using suitable prompts. An ACT-R model evaluated the data received by the robot according to predefined criteria (in our example application, hierarchical relationships were established and remembered) and provided feedback to the LLM via the application for prompt augmentation with the purpose of adapting or fine-tuning the request. Initial tests indicated that this approach may have advantages for dialogic tasks and can compensate for weaknesses in terms of a deeper understanding or "blind spots" on the part of the LLM.