Linguistic Primitives: A New Model for Language Development in Robotics
Related papers
The Science and Engineering of Linguistic Behavior: The Robot Baby
Our hard-science linguistics has been focusing on the high-level structure of an assemblage of communicating people and their surroundings. We have been developing a high-level understanding of the relationships between the members of a group as well as their physical environment, discovering the significant properties and constituents and exploring their representation in terms of linkages. This high level is the level we think at, and it reflects the way we characterize the situations we find ourselves in. However, there is a considerable distance from the recognition of objects and people, their relationships and their roles, to the basic perceptual input and motor output which is our interface with the world.
Integration of action and language knowledge: A roadmap for developmental robotics
Autonomous Mental …, 2010
Abstract: This position paper proposes that the study of embodied cognitive agents, such as humanoid robots, can advance our understanding of the cognitive development of complex sensorimotor, linguistic and social learning skills. This in turn will benefit the design of cognitive robots capable of learning to handle and manipulate objects and tools autonomously, to cooperate and communicate with other robots and humans, and to adapt their abilities to changing internal, environmental, and social conditions. Four key areas of research challenges are discussed, specifically for the issues related to the understanding of: (i) how agents learn and represent compositional actions; (ii) how agents learn and represent compositional lexicons; (iii) the dynamics of social interaction and learning; and (iv) how compositional action and language representations are integrated to bootstrap the cognitive system. The review of specific issues and progress in these areas is then translated into a practical roadmap based on a series of milestones. These milestones provide a possible set of cognitive robotics goals and test-scenarios, thus acting as a research roadmap for future work on cognitive developmental robotics.
Interactive Language Learning by Robots: The Transition from Babbling to Word Forms
PLoS ONE, 2012
The advent of humanoid robots has enabled a new approach to investigating the acquisition of language, and we report on the development of robots able to acquire rudimentary linguistic skills. Our work focuses on early stages analogous to some characteristics of a human child of about 6 to 14 months, the transition from babbling to first word forms. We investigate one mechanism among many that may contribute to this process, a key factor being the sensitivity of learners to the statistical distribution of linguistic elements. As well as being necessary for learning word meanings, the acquisition of anchor word forms facilitates the segmentation of an acoustic stream through other mechanisms. In our experiments some salient one-syllable word forms are learnt by a humanoid robot in real-time interactions with naive participants. Words emerge from random syllabic babble through a learning process based on a dialogue between the robot and the human participant, whose speech is perceived by the robot as a stream of phonemes. Numerous ways of representing the speech as syllabic segments are possible. Furthermore, the pronunciation of many words in spontaneous speech is variable. However, in line with research elsewhere, we observe that salient content words are more likely than function words to have consistent canonical representations; thus their relative frequency increases, as does their influence on the learner. Variable pronunciation may contribute to early word form acquisition. The importance of contingent interaction in real-time between teacher and learner is reflected by a reinforcement process, with variable success. The examination of individual cases may be more informative than group results. Nevertheless, word forms are usually produced by the robot after a few minutes of dialogue, employing a simple, real-time, frequency dependent mechanism. 
This work shows the potential of human-robot interaction systems in studies of the dynamics of early language acquisition.
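The "simple, real-time, frequency dependent mechanism" described in the abstract above can be illustrated with a minimal sketch: the learner counts the syllables it hears from the teacher and gradually biases its babble toward the most frequent form. The syllable inventory, the `bias_strength` parameter, and the class name are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter

class FrequencyLearner:
    """Toy sketch of frequency-dependent word-form emergence from babble."""

    # Assumed toy syllable inventory for random babble.
    SYLLABLES = ["ba", "da", "go", "ki", "mu", "re"]

    def __init__(self, bias_strength=0.8):
        self.counts = Counter()          # how often each syllable was heard
        self.bias_strength = bias_strength

    def hear(self, utterance_syllables):
        # Reinforce every syllable the teacher produced.
        self.counts.update(utterance_syllables)

    def babble(self):
        # With probability bias_strength, produce the most frequently
        # heard syllable; otherwise babble at random.
        if self.counts and random.random() < self.bias_strength:
            return self.counts.most_common(1)[0][0]
        return random.choice(self.SYLLABLES)

learner = FrequencyLearner()
for _ in range(5):
    learner.hear(["ba", "ba", "da"])     # teacher repeats a salient form
print(learner.babble())                  # increasingly likely to be "ba"
```

Because salient content words recur with consistent canonical forms while function words vary, their counts grow faster, which is exactly the asymmetry the paper exploits.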
From babbling towards first words: The emergence of speech in a robot in real-time interaction
2011 IEEE Symposium on Artificial Life (ALIFE), 2011
This paper describes a system that simulates word acquisition without meaning as a preliminary stage in language acquisition, through interaction with a human teacher. The system extracts phonemes from the teacher's speech and supplies them to a linguistically enabled synthetic agent. The agent babbles initially, but gradually words begin to emerge as the agent biases its babble towards the speech of the teacher. Experiments are conducted in real time with a human teacher interacting with the agent embodied in the iCub humanoid robot.
It is thought that meaning may be grounded in early childhood language learning via the physical and social interaction of the infant with those around him or her, and that the capacity to use words, phrases and their meaning are acquired through shared referential 'inference' in pragmatic interactions. In this paper we report on experimental proposals which would allow a robot to carry out language learning in a manner analogous to that in early child development. In order to create appropriate conditions for language learning it would therefore be necessary to expose the robot to similar physical and social contexts. However in the early stages of language learning it is estimated that a 2-year-old child can be exposed to as many as 7,000 utterances per day in varied contextual situations. In this paper we present a series of experiments which we hope will 'short cut' holophrase learning in physical robots. This is achieved by moving from, firstly, simulated babbling through mechanisms which will yield basic word or holophrase structures, to, secondly, an interaction environment between a human and a robot where shared 'intentional' referencing and the associations between physical, visual and speech modalities can be experienced by the robot. The output of these experiments, combined to yield word or holophrase structures grounded in the robot's own actions and modalities, would provide scaffolding for further proto-grammatical usage-based learning via interaction with the physical and social environment involving human feedback to bootstrap developing linguistic competencies. These structures would then form the basis for further studies on language acquisition, including the emergence of negation and more complex grammar. This paper describes progress on these proposals to date and planned future studies.
Paladyn, Journal of Behavioral Robotics, 2011
While we are capable of modeling the shape (e.g. face, arms) of humanoid robots in a nearly natural or human-like way, it is much more difficult to generate human-like facial or body movements and human-like behavior such as speaking and co-speech gesturing. In this paper, a developmental robotics approach to learning to speak is argued for. On the basis of the current literature, a blueprint of a brain model for this kind of robot is outlined and preliminary scenarios for knowledge acquisition are described. Furthermore, it is illustrated that natural speech acquisition mainly results from learning during face-to-face communication, and it is argued that learning to speak should be based on human-robot face-to-face communication, in which the human acts like a caretaker or teacher and the robot acts like a speech-acquiring toddler. This is a fruitful basic scenario not only for learning to speak, but also for learning to communicate in general, including producing co-verbal manual gestures and co-verbal facial expressions.
Building a talking baby robot: A contribution to the study of speech acquisition and evolution
Interaction Studies, 2005
Speech is a perceptuo-motor system. A natural computational modeling framework is provided by cognitive robotics, or more precisely speech robotics, which is also based on embodiment, multimodality, development, and interaction. This paper describes the bases of a virtual baby robot which consists of an articulatory model that integrates the non-uniform growth of the vocal tract, a set of sensors, and a learning model. The articulatory model delivers the sagittal contour, lip shape and acoustic formants from seven input parameters that characterize the configurations of the jaw, the tongue, the lips and the larynx. To simulate the growth of the vocal tract from birth to adulthood, a process modifies the longitudinal dimension of the vocal tract shape as a function of age. The auditory system of the robot comprises a "phasic" system for event detection over time, and a "tonic" system to track formants. The model of visual perception specifies the basic lip characteristics: height, width, area and protrusion. The orosensorial channel, which provides the tactile sensation on the lips, the tongue and the palate, is elaborated as a model for the prediction of tongue-palatal contacts from articulatory commands. Learning involves Bayesian programming, in which there are two phases: (i) specification of the variables, decomposition of the joint distribution and identification of the free parameters through exploration of a learning set, and (ii) utilization, which relies on questions about the joint distribution.
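The two Bayesian-programming phases described above can be sketched in miniature: phase (i) estimates P(sensation | motor command) from a learning set, and phase (ii) answers questions against the joint distribution, here by Bayes' rule with a uniform prior over motor commands. The variable names and toy data are assumptions for illustration, not the paper's articulatory model.

```python
from collections import defaultdict

def identify(learning_set):
    """Phase (i): count co-occurrences to estimate P(sensation | motor)."""
    counts = defaultdict(lambda: defaultdict(int))
    for motor, sensation in learning_set:
        counts[motor][sensation] += 1
    likelihood = {}
    for motor, sens_counts in counts.items():
        total = sum(sens_counts.values())
        likelihood[motor] = {s: c / total for s, c in sens_counts.items()}
    return likelihood

def utilize(likelihood, sensation):
    """Phase (ii): assuming a uniform prior over motor commands,
    return P(motor | sensation) by Bayes' rule."""
    posterior = {m: probs.get(sensation, 0.0) for m, probs in likelihood.items()}
    total = sum(posterior.values())
    return {m: p / total for m, p in posterior.items()} if total else posterior

data = [("open_jaw", "low_formant"), ("open_jaw", "low_formant"),
        ("close_jaw", "high_formant")]
model = identify(data)
print(utilize(model, "low_formant"))  # → {'open_jaw': 1.0, 'close_jaw': 0.0}
```

The same learned joint distribution supports both directions of questioning, which is what makes the Bayesian-programming formulation attractive for a sensorimotor system.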
2009
A robot that can communicate with humans using natural language will have to acquire a grammatical framework. This paper analyses some crucial underlying mechanisms that are needed in the construction of such a framework. The work is inspired by language acquisition in infants, but it also draws on the emergence of language in evolutionary time and in ontogenic (developmental) time. It focuses on issues arising from the use of real language with all its evolutionary baggage, in contrast to an artificial communication system, and describes approaches to addressing these issues. We can deconstruct grammar to derive underlying primitive mechanisms, including serial processing, segmentation, categorization, compositionality, and forward planning. Implementing these mechanisms are necessary preparatory steps to reconstruct a working syntactic/semantic/pragmatic processor which can handle real language. An overview is given of our own initial experiments in which a robot acquires some basic linguistic capacity via interacting with a human.
From motor babbling to hierarchical learning by imitation: a robot developmental pathway
How does an individual use the knowledge acquired through self exploration as a manipulable model through which to understand others and benefit from their knowledge? How can developmental and social learning be combined for their mutual benefit? In this paper we review a hierarchical architecture (HAMMER) which allows a principled way for combining knowledge through exploration and knowledge from others, through the creation and use of multiple inverse and forward models. We describe how Bayesian Belief Networks can be used to learn the association between a robot's motor commands and sensory consequences (forward models), and how the inverse association can be used for imitation. Inverse models created through self exploration, as well as those from observing others can coexist and compete in a principled unified framework, that utilises the simulation theory of mind approach to mentally rehearse and understand the actions of others.
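The core idea above, paired forward/inverse models that coexist and compete to explain an observed action, can be sketched as follows. Each model predicts the sensory consequence of a command (forward model) and accumulates confidence when its prediction matches the observation; the winner is taken as the recognized action. The toy linear dynamics and the confidence update are assumptions for illustration, not the published HAMMER implementation.

```python
class ForwardInverseModel:
    def __init__(self, name, gain):
        self.name = name
        self.gain = gain          # toy "motor program": scales a command
        self.confidence = 0.0

    def forward(self, command):
        # Forward model: predict the sensory consequence of a command.
        return self.gain * command

    def update_confidence(self, command, observed):
        # Reward models whose predictions match the observation.
        error = abs(self.forward(command) - observed)
        self.confidence += 1.0 / (1.0 + error)

def recognise(models, command, observed):
    """Mentally rehearse each model and return the best explanation."""
    for m in models:
        m.update_confidence(command, observed)
    return max(models, key=lambda m: m.confidence).name

models = [ForwardInverseModel("reach", 2.0), ForwardInverseModel("push", 5.0)]
print(recognise(models, command=1.0, observed=2.1))  # → reach
```

Models acquired through self-exploration and models built from observing others can be placed in the same list and compete on equal terms, which is the unifying point of the architecture.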
Integration of Action and Language Knowledge: Key Challenges for Developmental Robotics Research
Recent theoretical and experimental research on action and language processing in humans and animals clearly demonstrates the close interaction and codependence between language and action (among others Cappa & Perani, 2003; Glenberg & Kaschak, 2002; Pulvermuller et al., 2003; Rizzolatti & Arbib, 1998). For example, neuroscientific studies of the mirror neuron system (Fadiga, Fogassi, Gallese, & Rizzolatti, 2000; Gallese, Fadiga, Fogassi, & Rizzolatti, 1996) and brain imaging studies on language processing provide an ...