A communicative robot to learn about us and the world
Related papers
Leolani: A Robot That Communicates and Learns about the Shared World
2019
People and robots make mistakes and should therefore recognize and communicate about their “imperfectness” when they collaborate. In previous work [3, 2], we described a female robot model, Leolani (L), that supports open-domain learning through natural language communication, with a drive to learn new information and build social relationships. The absorbed knowledge consists of everything people tell her and the situations and objects she perceives. For this demo, we focus on the symbolic representation of the resulting knowledge. We describe how L can query and reason over her knowledge and experiences as well as access the Semantic Web. As such, we envision L becoming a semantic agent with which people can interact naturally.
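As a minimal sketch of what querying such a symbolic store could look like, assuming an RDF triple store with SPARQL (the standard Semantic Web stack the abstract alludes to): the namespace, predicates, and triples below are illustrative placeholders, not Leolani's actual schema.

```python
# Hypothetical sketch: storing what a speaker told the robot as RDF
# triples and querying them with SPARQL, using the rdflib library.
# The namespace and predicate names are illustrative only.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/robot-kb/")

g = Graph()
g.bind("ex", EX)

# "Selene told me that she likes pizza" -> a triple plus its provenance.
g.add((EX.selene, EX.likes, EX.pizza))
g.add((EX.selene_likes_pizza, EX.statedBy, EX.selene))

# Query everything the robot knows about Selene.
results = g.query(
    "SELECT ?p ?o WHERE { ex:selene ?p ?o }",
    initNs={"ex": EX},
)
for predicate, obj in results:
    print(predicate, obj)
```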
Robot, tell me what you know about...?: Expressing robot's knowledge through interaction
Explicitly showing the robot's knowledge about the states of the world, and the agents' capabilities in those states, is essential in human-robot interaction. This way, the human partner can better understand the robot's intentions and beliefs and provide missing information that may eventually improve the interaction. We present our current approach to modeling the robot's knowledge from a symbolic point of view, based on an ontology. This knowledge is fed by two sources: direct interaction with the human, and geometric reasoning. We present an interactive task scenario in which we exploit the robot's knowledge to interact with the human while exposing its internal geometric reasoning when possible.
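A minimal sketch of the two-source idea, assuming a flat fact store where each entry records whether it came from dialogue or from geometric reasoning; all names here are illustrative, not the paper's ontology.

```python
# Hypothetical sketch of a fact store whose entries record which of the
# two sources produced them: dialogue with the human or geometric
# reasoning over perception.
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    value: str
    source: str  # "dialogue" or "geometric_reasoning"

kb: set[Fact] = set()

def assert_fact(fact: Fact) -> None:
    kb.add(fact)

def explain(subject: str) -> list[str]:
    """Verbalize what the robot believes about `subject` and why."""
    return [
        f"{f.subject} {f.predicate} {f.value} (known from {f.source})"
        for f in kb
        if f.subject == subject
    ]

assert_fact(Fact("mug", "isOn", "table", "geometric_reasoning"))
assert_fact(Fact("mug", "belongsTo", "Anna", "dialogue"))
print("\n".join(explain("mug")))
```

Keeping the source on each fact is what lets the robot "show its internal geometric reasoning when possible" instead of presenting all knowledge as equally grounded.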
Robot Learning Theory of Mind through Self-Observation: Exploiting the Intentions-Beliefs Synergy
arXiv (Cornell University), 2022
In complex environments, where the human sensory system reaches its limits, our behaviour is strongly driven by our beliefs about the state of the world around us. Access to others' beliefs, intentions, or mental states in general could thus allow for more effective social interactions in natural contexts, yet these variables are not directly observable. Theory of Mind (ToM), the ability to attribute beliefs, intentions, or mental states to other agents, is a crucial feature of human social interaction and has become of interest to the robotics community. Recently, new models that are able to learn ToM have been introduced. In this paper, we show the synergy between learning to predict low-level mental states, such as intentions and goals, and attributing high-level ones, such as beliefs. Assuming that learning of beliefs can take place by observing one's own decision and belief-estimation processes in partially observable environments, and using a simple feedforward deep learning model, we show that when learning to predict others' intentions and actions, faster and more accurate predictions are obtained if belief attribution is learned simultaneously with action and intention prediction. We show that learning performance improves even when observing agents with a different decision process, and is higher when observing belief-driven chunks of behaviour. We propose that our architectural approach is relevant for the design of future adaptive social robots that should be able to autonomously understand and assist human partners in novel natural environments and tasks.
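A minimal sketch of the multi-task setup the abstract describes, assuming a feedforward network with a shared trunk and two heads trained jointly; layer sizes, losses, and the random data are illustrative placeholders, not the paper's actual architecture.

```python
# Hypothetical sketch of the intentions-beliefs synergy: one shared
# trunk, one head for low-level intentions/actions, one head for
# high-level attributed beliefs, trained with a joint loss.
import torch
import torch.nn as nn

class ToMNet(nn.Module):
    def __init__(self, obs_dim=32, n_actions=5, n_beliefs=4):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        self.action_head = nn.Linear(64, n_actions)  # intentions/actions
        self.belief_head = nn.Linear(64, n_beliefs)  # attributed beliefs

    def forward(self, obs):
        h = self.trunk(obs)
        return self.action_head(h), self.belief_head(h)

model = ToMNet()
ce = nn.CrossEntropyLoss()
obs = torch.randn(8, 32)                  # a batch of observations
action_target = torch.randint(0, 5, (8,))
belief_target = torch.randint(0, 4, (8,))

action_logits, belief_logits = model(obs)
# Joint loss: belief attribution is learned alongside action prediction,
# which is the synergy the paper reports as speeding up learning.
loss = ce(action_logits, action_target) + ce(belief_logits, belief_target)
loss.backward()
```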
Leolani: A Reference Machine with a Theory of Mind for Social Communication
Text, Speech, and Dialogue
Our state of mind is based on our experiences and on what other people tell us. This may result in conflicting information, uncertainty, and alternative facts. We present a robot that models the relativity of knowledge and perception within social interaction, following principles of the theory of mind. We used the vision and speech capabilities of a Pepper robot to build an interaction model that stores interpretations of perceptions and conversations together with provenance on their sources. The robot learns directly from what people tell it, possibly in relation to its perception. We demonstrate how the robot's communication is driven by a hunger to acquire more knowledge from and about people and objects, to resolve uncertainties and conflicts, and to share awareness of the perceived environment. Likewise, the robot can refer to the world, to its knowledge about the world, and to the encounters with people through which this knowledge was acquired.
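A minimal sketch of what "relativity of knowledge" with provenance could mean in practice, assuming claims are stored with their source and conflicts are detected rather than silently overwritten; the claim structure and data are illustrative, not Leolani's model.

```python
# Hypothetical sketch: each claim keeps provenance, and conflicting
# claims from different sources coexist until the robot can ask about
# them or resolve them.
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Claim:
    subject: str
    predicate: str
    value: str
    source: str  # who said it, or "perception"

claims = [
    Claim("cat", "color", "black", source="John"),
    Claim("cat", "color", "white", source="perception"),
]

def conflicts(claims):
    """Group claims by (subject, predicate) and report disagreements."""
    by_key = defaultdict(set)
    for c in claims:
        by_key[(c.subject, c.predicate)].add((c.value, c.source))
    return {k: v for k, v in by_key.items()
            if len({val for val, _ in v}) > 1}

for (subj, pred), views in conflicts(claims).items():
    print(f"Conflict about {subj} {pred}: {sorted(views)}")
```

A conflict like this is exactly what can drive the robot's next question, turning uncertainty into a communicative goal.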
2020
Social robots and artificial agents should be able to interact with the user in the most natural way possible. This work describes the basic principles of a conversation system designed for social robots and artificial agents that relies on knowledge encoded in the form of an Ontology. Given the knowledge-driven approach, the ability to expand the Ontology at run-time, during verbal interaction with users, is of the utmost importance: this paper therefore also describes the implementation of a system for run-time expansion of the knowledge base through a crowdsourcing approach.
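A minimal sketch of run-time expansion, assuming the knowledge base is reduced to a plain dictionary and the "crowd" is the single user in front of the robot; both simplifications are mine, not the paper's.

```python
# Hypothetical sketch of run-time knowledge expansion: when the
# conversation system cannot answer, it asks the user and stores the
# reply for future dialogues.
kb: dict[str, str] = {"a dog": "an animal"}

def answer(topic: str) -> str:
    if topic not in kb:
        # Knowledge gap: acquire the missing fact from the user at run-time.
        reply = input(f"I don't know what {topic} is. Can you tell me? ")
        kb[topic] = reply.strip()
    return f"{topic} is {kb[topic]}."

print(answer("a dog"))       # answered from the existing knowledge base
print(answer("a theremin"))  # triggers run-time expansion
```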
RoboBrain: Large-Scale Knowledge Engine for Robots
arXiv (Cornell University), 2014
In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of dealing with multiple data modalities including symbols, natural language, haptic senses, robot trajectories, visual features and many others. The knowledge stored in the engine comes from multiple sources including physical interactions that robots have while performing tasks (perception, planning and control), knowledge bases from the Internet and learned representations from several robotics research groups. We discuss various technical aspects and associated challenges such as modeling the correctness of knowledge, inferring latent information and formulating different robotic tasks as queries to the knowledge engine. We describe the system architecture and how it supports different mechanisms for users and robots to interact with the engine. Finally, we demonstrate its use in three important research areas: grounding natural language, perception, and planning, which are the key building blocks for many robotic tasks. This knowledge engine is a collaborative effort and we call it RoboBrain.
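A minimal sketch of the idea of "formulating robotic tasks as queries to the knowledge engine", assuming the engine is reduced to an in-memory set of triples with a wildcard matcher; the graph content and the query helper are illustrative, not RoboBrain's API.

```python
# Hypothetical sketch: a robotic task ("what can I grasp?") posed as a
# pattern query against a small multi-source knowledge graph.
knowledge_graph = {
    ("mug", "has_affordance", "graspable"),
    ("mug", "stored_in", "cupboard"),
    ("knife", "has_affordance", "graspable"),
    ("knife", "has_affordance", "cutting"),
}

def query(pattern):
    """Match (s, p, o) triples; None acts as a wildcard."""
    s, p, o = pattern
    return [
        t for t in knowledge_graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# "Which objects can the robot grasp?" as a knowledge-engine query.
print([s for s, _, _ in query((None, "has_affordance", "graspable"))])
```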
2018
This paper describes an integrated architecture for representing, reasoning with, and interactively learning domain knowledge in the context of human-robot collaboration. Specifically, Answer Set Prolog, a declarative language, is used to represent and reason with incomplete commonsense knowledge about the domain. Non-monotonic logical reasoning identifies knowledge gaps and guides the interactive learning of relations that represent actions, and of axioms that encode affordances and action preconditions and effects. Learning uses probabilistic models of uncertainty, and observations from active exploration, reactive action execution, and human (verbal) descriptions. The learned actions and axioms are used for subsequent reasoning. The architecture is evaluated on a simulated robot assisting humans in an indoor domain.
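A minimal sketch of the non-monotonic flavor of such reasoning, written in Answer Set Programming and run through the clingo Python bindings (an assumption; the paper's own system may use a different solver and encoding). The domain rules, a default with an exception, are illustrative.

```python
# Hypothetical sketch of non-monotonic commonsense reasoning in ASP,
# assuming the clingo Python bindings are installed (pip install clingo).
# A default ("books are usually in the library") is overridden by an
# exception through negation as failure.
import clingo

program = """
book(b1). book(b2).
damaged(b2).
% Default with an exception, encoded via negation as failure:
in_library(B) :- book(B), not in_workshop(B).
in_workshop(B) :- book(B), damaged(B).
#show in_library/1. #show in_workshop/1.
"""

ctl = clingo.Control()
ctl.add("base", [], program)
ctl.ground([("base", [])])
ctl.solve(on_model=lambda model: print(model))
# Expected answer set: in_library(b1) in_workshop(b2)
```

Adding the single fact `damaged(b2)` retracts a previous conclusion, which is the non-monotonic behavior that lets such architectures reason with incomplete knowledge and revise it as observations arrive.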
Theory of Mind for a Humanoid Robot
Autonomous Robots, 2002
If we are to build human-like robots that can interact naturally with people, our robots must know not only about the properties of objects but also the properties of animate agents in the world. One of the fundamental social skills for humans is the attribution of beliefs, goals, and desires to other people. This set of skills has often been called a "theory of mind." This paper presents the theories of Leslie and Baron-Cohen [2] on the development of theory of mind in human children and discusses the potential application of both of these theories to building robots with similar capabilities. Initial implementation details and basic skills (such as finding faces and eyes and distinguishing animate from inanimate stimuli) are introduced. I further speculate on the usefulness of a robotic implementation in evaluating and comparing these two models.
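One of the basic skills the paper mentions, distinguishing animate from inanimate stimuli, is often approximated by a self-propelled-motion cue. Below is a minimal sketch of that heuristic under my own simplifying assumptions (2D trajectories, a hand-picked acceleration threshold); it is not the paper's implementation.

```python
# Hypothetical sketch: flag a trajectory as animate if it shows large
# spontaneous accelerations (speed/direction changes the object appears
# to initiate itself). Threshold and trajectories are illustrative.
import numpy as np

def looks_animate(positions: np.ndarray, dt: float = 0.1,
                  thresh: float = 0.5) -> bool:
    velocity = np.diff(positions, axis=0) / dt
    acceleration = np.diff(velocity, axis=0) / dt
    return bool(np.max(np.linalg.norm(acceleration, axis=1)) > thresh)

t = np.linspace(0, 1, 11)
ball = np.c_[t, np.zeros_like(t)]        # constant velocity: inanimate
walker = np.c_[t, 0.3 * np.sin(12 * t)]  # self-initiated swerves: animate
print(looks_animate(ball), looks_animate(walker))
```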
2013
The community-based generation of content has been tremendously successful in the World Wide Web—people help each other by providing information that could be useful to others. We are trying to transfer this approach to robotics in order to help robots acquire the vast amounts of knowledge needed to competently perform everyday tasks. RoboEarth is intended to be a web community by robots for robots to autonomously share descriptions of tasks they have learned, object models they have created, and environments they have explored. In this paper, we report on the formal language we developed for encoding this information and present our approaches to solve the inference problems related to finding information, to determining if information is usable by a robot, and to grounding it on the robot platform.
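A minimal sketch of one inference problem the abstract names, deciding whether shared information is usable by a particular robot, reduced here to matching a recipe's required components against the robot's self-model; the recipe format and capability names are illustrative, not RoboEarth's formal language.

```python
# Hypothetical sketch: check whether a downloaded action recipe is
# executable on this platform by comparing required and available
# components.
recipe = {
    "name": "serve_drink",
    "requires": {"arm", "gripper", "rgb_camera", "navigation"},
}

robot_capabilities = {"arm", "gripper", "rgb_camera", "lidar"}

missing = recipe["requires"] - robot_capabilities
if missing:
    print(f"Cannot execute {recipe['name']}: missing {sorted(missing)}")
else:
    print(f"{recipe['name']} is executable on this platform.")
```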
How? Why? What? Where? When? Who? Grounding Ontology in the Actions of a Situated Social Agent
Robotic agents are spreading, incarnated as embodied entities exploring the tangible world and interacting with us, or as virtual agents crawling the web, parsing and generating data. In both cases, they require: (i) processes to acquire information; (ii) structures to model and store information as usable knowledge; (iii) reasoning systems to interpret the information; and (iv) ways to express their interpretations. The H5W (How, Why, What, Where, When, Who) framework is a conceptualization of the problems faced by any agent situated in a social environment, and it has shaped several robotic studies. We introduce the H5W framework through a description of its underlying neuroscience and the psychological considerations it embodies, and then demonstrate a specific implementation of the framework, focusing on the motivation for and implications of the pragmatic decisions we have taken. We report the numerous studies that have relied on this technical implementation as evidence of its robustness and versatility; moreover, we conduct an additional validation of its applicability to the natural-language domain by designing an information-exchange task as a benchmark. A sketch of what an H5W record might look like as a data structure follows below.
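As referenced above, a minimal sketch of an event frame slotted into the six H5W questions; the field values and the verbalization are illustrative, not the paper's implementation.

```python
# Hypothetical sketch of an H5W event frame: one record answering the
# six questions the framework poses about an observed event.
from dataclasses import dataclass

@dataclass
class H5WFrame:
    who: str    # the agent involved
    what: str   # the action or event
    where: str  # spatial context
    when: str   # temporal context
    why: str    # inferred goal or motive
    how: str    # manner or means

event = H5WFrame(
    who="Anna",
    what="hands over the cup",
    where="kitchen table",
    when="2024-05-01T10:32",
    why="robot asked for the cup",
    how="with her right hand",
)

# Such a frame can be filled from perception and then verbalized in
# answer to any of the six questions, e.g.:
print(f"Who? {event.who}  Why? {event.why}")
```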