Towards Goal Inference for Human-Robot Collaboration

Going Beyond Literal Command-Based Instructions: Extending Robotic Natural Language Interaction Capabilities

Proceedings of the AAAI Conference on Artificial Intelligence

The ultimate goal of human natural language interaction is to communicate intentions. However, these intentions are often not directly derivable from the semantics of an utterance (e.g., when linguistic modulations are employed to convey politeness, respect, and social standing). Robotic architectures with simple command-based natural language capabilities are thus not equipped to handle more liberal, yet natural uses of linguistic communicative exchanges. In this paper, we propose novel mechanisms for inferring intentions from utterances and generating clarification requests that will allow robots to cope with a much wider range of task-based natural language interactions. We demonstrate the potential of these inference algorithms for natural human-robot interactions by running them as part of an integrated cognitive robotic architecture on a mobile robot in a dialogue-based instruction task.

Combining Goal Inference and Natural-Language Dialogue for Human-Robot Joint Action

We demonstrate how combining the reasoning components from two existing systems designed for human-robot joint action produces an integrated system with greater capabilities than either of the individual systems. One of the systems supports primarily non-verbal interaction and uses dynamic neural fields to infer the user's goals and to suggest appropriate system responses; the other emphasises natural-language interaction and uses a dialogue manager to process user input and select appropriate system responses. Combining these two methods of reasoning results in a robot that is able to coordinate its actions with those of the user while employing a wide range of verbal and non-verbal communicative actions.

Enabling Human-Like Task Identification From Natural Conversation

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Robots as coworkers or cohabitants are becoming more mainstream by the day with the development of low-cost, sophisticated hardware. However, an accompanying software stack that can aid the usability of the robotic hardware remains the bottleneck of the process, especially if the robot is not dedicated to a single job. Programming a multipurpose robot requires an on-the-fly mission scheduling capability that involves task identification and plan generation. The problem dimension increases if the robot accepts tasks from a human in natural language. Though recent advances in NLP and planner development can solve a variety of complex problems, their amalgamation into a dynamic robotic task handler has been used only in a limited scope. Specifically, the problem of formulating a planning problem from natural language instructions has not been studied in detail. In this work, we provide a non-trivial method to combine an NLP engine and a planner such that a robot can successfully identify tasks and all the relevant parameters and generate an accurate plan for the task. Additionally, some mechanism is required to resolve ambiguity or missing pieces of information in natural language instructions. Thus, we also develop a dialogue strategy that aims to gather additional information with minimal question-answer iterations and only when it is necessary. This work makes a significant stride towards enabling a human-like task understanding capability in a robot.
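The idea of asking a clarifying question only when a task parameter is missing can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the task templates, the `TaskFrame` class, and the `next_question` helper are hypothetical and are not taken from the paper, which does not describe its implementation in this abstract.

```python
from dataclasses import dataclass, field

# Hypothetical task templates: task name -> parameters a planner would need.
TASK_TEMPLATES = {
    "bring": ["object", "destination"],
    "clean": ["location"],
}


@dataclass
class TaskFrame:
    """An identified task plus the parameters extracted so far."""
    task: str
    params: dict = field(default_factory=dict)

    def missing(self):
        """Required parameters that have not been filled yet."""
        return [p for p in TASK_TEMPLATES[self.task] if p not in self.params]


def next_question(frame):
    """Return one clarification question, or None when nothing is missing."""
    gaps = frame.missing()
    if not gaps:
        return None  # no dialogue turn is spent when the frame is complete
    return f"Which {gaps[0]} do you mean for the '{frame.task}' task?"


# "Bring the cup" names the object but not where to take it.
frame = TaskFrame("bring", {"object": "cup"})
question = next_question(frame)           # asks about the destination
frame.params["destination"] = "kitchen"   # the user's answer fills the gap
```

Once `frame.missing()` is empty, the frame could be handed to a planner; how the planning problem is actually formulated is outside the scope of this sketch.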

Goal tracking in a natural language interface: towards achieving adjustable autonomy

Proceedings 1999 IEEE International Symposium on Computational Intelligence in Robotics and Automation. CIRA'99 (Cat. No.99EX375), 1999

Intelligent mobile robots that interact with humans must exhibit adjustable autonomy; that is, the ability to dynamically adjust the level of self-sufficiency of an agent depending on the situation. When intelligent robots require close interactions with humans, they will require modes of communication that enhance the ability for humans to communicate naturally and that allow greater interaction, as well as adapt as a team member or sole agent in achieving various goals. Our previous work examined the use of multiple modes of communication, specifically natural language and gestures, to disambiguate the communication between a human and a robot. In this paper, we propose using context predicates to keep track of various goals during human-robot interactions. These context predicates allow the robot to maintain multiple goals, each with possibly different levels of required autonomy. They permit direct human interruption of the robot, while allowing the robot to smoothly return to a high level of autonomy.
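One way to picture tracking several goals with different autonomy levels, including interruption and resumption, is a simple goal stack. This is only a sketch under stated assumptions: the `Goal` and `GoalTracker` names, the "high"/"low" autonomy labels, and the stack discipline are hypothetical and do not reproduce the paper's context predicates.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    name: str
    autonomy: str           # hypothetical levels: "high" or "low"
    achieved: bool = False  # stands in for a context predicate on the goal


class GoalTracker:
    """Keeps pending goals on a stack; the most recent one is active."""

    def __init__(self):
        self._stack = []

    def push(self, goal):
        """Add a goal; an interrupting goal goes on top of the current one."""
        self._stack.append(goal)

    def current(self):
        """Return the active goal, discarding goals already achieved."""
        while self._stack and self._stack[-1].achieved:
            self._stack.pop()
        return self._stack[-1] if self._stack else None

    def complete_current(self):
        """Mark the active goal as achieved."""
        goal = self.current()
        if goal is not None:
            goal.achieved = True


tracker = GoalTracker()
tracker.push(Goal("patrol hallway", autonomy="high"))
tracker.push(Goal("stop and wait", autonomy="low"))  # human interruption
interrupting = tracker.current().name
tracker.complete_current()
resumed = tracker.current()  # the robot returns to its earlier goal
```

After the interrupting goal is completed, the earlier high-autonomy goal becomes active again, which mirrors the smooth return to autonomy described in the abstract.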

Enabling Robots to Understand Indirect Speech Acts in Task-Based Interactions

Journal of Human-Robot Interaction, 2017

An important open problem for enabling truly taskable robots is the lack of task-general natural language mechanisms within cognitive robot architectures that enable robots to understand typical forms of human directives and generate appropriate responses. In this paper, we first provide experimental evidence that humans tend to phrase their directives to robots indirectly, especially in socially conventionalized contexts. We then introduce pragmatic and dialogue-based mechanisms to infer intended meanings from such indirect speech acts and demonstrate that these mechanisms can handle all indirect speech acts found in our experiment as well as other common forms of requests.

Textual Inference and Meaning Representation in Human Robot Interaction

This paper provides a first investigation of existing textual inference paradigms in order to propose a generic framework able to capture major semantic aspects in Human Robot Interaction (HRI). We investigate the use of general semantic paradigms used in Natural Language Understanding (NLU) tasks, such as Semantic Role Labeling, over typical robot commands. The semantic information obtained is then represented in the Abstract Meaning Representation (AMR). AMR is a general representation language useful for expressing different levels of semantic information without a strong dependence on the syntactic structure of the underlying sentence. The final aim of this work is to find an effective synergy between HRI and NLU.

From Commands to Goal-Based Dialogs: A Roadmap to Achieve Natural Language Interaction in RoboCup@Home

RoboCup 2018: Robot World Cup XXII

On the one hand, speech is a key aspect of people's communication. On the other, it is widely acknowledged that language proficiency is related to intelligence. Therefore, intelligent robots should be able to understand, at least, people's orders within their application domain. These insights are not new in RoboCup@Home, but we lack a long-term plan to evaluate this approach. In this paper we conduct a brief review of the achievements in automated speech recognition and natural language understanding in RoboCup@Home. Furthermore, we discuss the main challenges to tackle in spoken human-robot interaction within the scope of this competition. Finally, we contribute by presenting a pipelined road map to engender research in the area of natural language understanding applied to domestic service robotics.

Effective and Robust Natural Language Understanding for Human-Robot Interaction

Robots are slowly becoming part of everyday life, as they are being marketed for commercial applications (viz. telepresence, cleaning or entertainment). Thus, the ability to interact with non-expert users is becoming a key requirement. Even if user utterances can be efficiently recognized and transcribed by Automatic Speech Recognition systems, several issues arise in translating them into suitable robotic actions. In this paper, we discuss two existing Natural Language Understanding workflows for Human Robot Interaction. First, we discuss a grammar-based approach, which relies on grammars and thus recognizes a restricted set of commands. Then, a data-driven approach, based on a free-form speech recognizer and a statistical semantic parser, is discussed. The main advantages of both approaches are discussed, also from an engineering perspective, i.e. considering the effort of realizing HRI systems, as well as their reusability and robustness. An empirical evaluation of the proposed approaches is carried out on several datasets, in order to understand performance and identify possible improvements towards the design of NLP components in HRI.
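The trade-off of a grammar-based approach (precise on a restricted command set, silent on everything else) can be seen in a small sketch. The two-rule grammar, the `GOTO`/`TAKE` action labels, and the `parse_command` function are hypothetical illustrations built on regular expressions; they are not the workflow evaluated in the paper.

```python
import re

# Hypothetical grammar: each rule maps a fixed phrasing to a robot action.
GRAMMAR = [
    (re.compile(r"^(?:go|move) to the (?P<location>\w+)$"), "GOTO"),
    (re.compile(r"^(?:take|grab|pick up) the (?P<object>\w+)$"), "TAKE"),
]


def parse_command(utterance):
    """Return an action frame, or None if the utterance is out of grammar."""
    text = utterance.lower().strip().rstrip(".!")
    for pattern, action in GRAMMAR:
        match = pattern.match(text)
        if match:
            return {"action": action, **match.groupdict()}
    return None


in_grammar = parse_command("Go to the kitchen.")
out_of_grammar = parse_command("Could you fetch me a drink?")
```

The first utterance yields a `GOTO` frame with its location filled in, while the second returns `None`. A data-driven parser is meant to cover that kind of free-form phrasing, at the cost of needing training data.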

Natural Language For Human Robot Interaction

2015

Natural Language Understanding (NLU) was one of the main original goals of artificial intelligence and cognitive science. This has proven to be extremely challenging and was nearly abandoned for decades. We describe an implemented system that supports full NLU for tasks of moderate complexity. The natural language interface is based on Embodied Construction Grammar and simulation semantics. The system described here supports human dialog with an agent controlling a simulated robot, but is flexible with respect to both input language and output task.

Natural Language Interpretation for an Interactive Service Robot in Domestic Domains

Communications in Computer and Information Science, 2013

In this paper, we propose a flexible system for robust natural language interpretation of spoken commands on a mobile robot in domestic service robotics applications. Existing language processing for instructing a mobile robot is often restricted by using a simple grammar where precisely pre-defined utterances are directly mapped to system calls. These approaches do not regard fallibility of human users and they only allow for binary processing of an utterance; either a command is part of the grammar and hence understood correctly, or it is not part of the grammar and gets rejected. We model the language processing as an interpretation process where the utterance needs to be mapped to the robot's capabilities. We do so by casting the processing as a (decision-theoretic) planning problem on interpretation actions. This allows for a flexible system that can resolve ambiguities and which is also capable of initiating steps to achieve clarification. We show how we evaluated several versions of the system with multiple utterances of different complexity as well as with incomplete and erroneous requests.