Jochen Triesch - Academia.edu

Papers by Jochen Triesch

Coordination in behavior and cognition: From Neurons to Mind

Time to augment self-supervised visual representation learning

arXiv (Cornell University), Jul 27, 2022

Biological vision systems are unparalleled in their ability to learn visual representations without supervision. In machine learning, self-supervised learning (SSL) has led to major advances in forming object representations in an unsupervised fashion. Such systems learn representations invariant to augmentation operations over images, like cropping or flipping. In contrast, biological vision systems exploit the temporal structure of the visual experience during natural interactions with objects. This gives access to "augmentations" not commonly used in SSL, like watching the same object from multiple viewpoints or against different backgrounds. Here, we systematically investigate and compare the potential benefits of such time-based augmentations during natural interactions for learning object categories. Our results show that time-based augmentations achieve large performance gains over state-of-the-art image augmentations. Specifically, our analyses reveal that: 1) 3-D object manipulations drastically improve the learning of object categories; 2) viewing objects against changing backgrounds is important for learning to discard background-related information from the latent representation. Overall, we conclude that time-based augmentations during natural interactions with objects can substantially improve self-supervised learning, narrowing the gap between artificial and biological vision systems.
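As a minimal sketch of the core idea (our own illustration, not the paper's code): in a SimCLR-style objective, the positive pair for each object is formed from two temporally adjacent views rather than two synthetic crops of one image. The NT-Xent loss below is standard; the toy embeddings stand in for a real encoder.

```python
import numpy as np

def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss; z1[i] and z2[i] are two views of item i.
    Here the two views are temporally adjacent frames of the same object
    (a "time-based augmentation"), not two crops of a single image."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)          # exclude self-similarity
    n = len(z1)
    targets = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), targets].mean()

rng = np.random.default_rng(0)
z_t = rng.normal(size=(8, 32))               # embeddings of frames at time t
z_t1 = z_t + 0.1 * rng.normal(size=(8, 32))  # embeddings one time step later
print(nt_xent(z_t, z_t1))
```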

Self-organization of complex cortex-like wiring in a spiking neural network model

BMC Neuroscience, Dec 1, 2015

Figure 1. Panel A: Mature synaptic weight distribution (in logarithmic space) for a simulation run with associated Gaussian (lognormal in linear space) fit. Panel B: Topological graph of mature network for a simulation run. Panel C: Triangular motif count (relative to random and corrected for an overrepresentation of bidirectional connections, similar to [1]) for topological graph of mature network. Motif key to right.

Using Eye Direction Cues for Gaze Following - A Developmental Model

We present a reinforcement learning model of gaze following that learns to incorporate eye cues in a developmental trajectory similar to that found in infants: older infants follow gaze more frequently when both the caregiver's eyes and head are turned than when only her head is turned. Similarly, infants learn to follow gaze less when the caregiver's eyes are closed than when her eyes are open. The model works through the maximization of visual reward, and not by representing an estimate of the other person's attentional state, as would be expected by adherents of a mentalist interpretation of gaze following. Within the debate about the age of onset in the use of eye cues for gaze following, we hypothesize that eye cues have an effect as soon as the infant learns to follow gaze, but at first it might be small and therefore difficult to measure. Finally, we propose this learning approach for modeling other aspects of theory of mind.
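A toy illustration of the reward-maximization account (all states, probabilities, and rewards below are invented for the sketch): a tabular learner discovers that shifting gaze pays off more when the caregiver's head and eyes both point somewhere, without ever representing her attentional state.

```python
import numpy as np

rng = np.random.default_rng(1)

# (head_turned, eyes_open): invented probabilities that an interesting
# object actually lies in the cued direction.
P_TARGET = {(1, 1): 0.9, (1, 0): 0.5, (0, 1): 0.6, (0, 0): 0.2}
states = list(P_TARGET)
Q = {(s, a): 0.0 for s in states for a in (0, 1)}  # 0 = stay, 1 = follow
alpha, eps = 0.1, 0.1

for _ in range(20_000):
    s = states[rng.integers(len(states))]
    a = int(rng.integers(2)) if rng.random() < eps else \
        max((0, 1), key=lambda act: Q[(s, act)])
    # Visual reward: following pays off only when a target is there;
    # staying on the caregiver's face yields a small constant reward.
    r = float(rng.random() < P_TARGET[s]) if a == 1 else 0.3
    Q[(s, a)] += alpha * (r - Q[(s, a)])

for s in states:
    print(s, "follow" if Q[(s, 1)] > Q[(s, 0)] else "stay")
```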

Precise Synaptic Efficacy Alignment Suggests Potentiation Dominated Learning

Frontiers in Neural Circuits, Jan 13, 2016

Recent evidence suggests that parallel synapses from the same axonal branch onto the same dendritic branch have almost identical strength. It has been proposed that this alignment is only possible through learning rules that integrate activity over long time spans. However, learning mechanisms such as spike-timing-dependent plasticity (STDP) are commonly assumed to be temporally local. Here, we propose that the combination of temporally local STDP and a multiplicative synaptic normalization mechanism is sufficient to explain the alignment of parallel synapses. To address this issue, we introduce three increasingly complex models: First, we model the idealized interaction of STDP and synaptic normalization in a single neuron as a simple stochastic process and derive analytically that the alignment effect can be described by a so-called Kesten process. From this we derive that synaptic efficacy alignment requires potentiation-dominated learning regimes. We verify these conditions in a single-neuron model with independent spiking activities but more realistic synapses. As expected, we only observe synaptic efficacy alignment for long-term potentiation-biased STDP. Finally, we explore how well the findings transfer to recurrent neural networks, where the learning mechanisms interact with the correlated activity of the network. We find that, due to the self-reinforcing correlations in recurrent circuits under STDP, alignment occurs for both long-term potentiation- and depression-biased STDP, because learning is potentiation dominated in both cases owing to the potentiating events induced by correlated activity. This is in line with recent results demonstrating a dominance of potentiation over depression during waking and normalization during sleep. This leads us to predict that individual spine pairs will be more similar after sleep than after sleep deprivation. In conclusion, we show that synaptic normalization in conjunction with coordinated potentiation (in this case, from STDP in the presence of correlated pre- and post-synaptic activity) naturally leads to an alignment of parallel synapses.
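The proposed mechanism can be caricatured in a few lines (a toy under our own simplified assumptions, not the paper's derivation): parallel synapses share their potentiation events, multiplicative normalization rescales all weights together, and the weight difference of the pair then follows a Kesten-style process that contracts when potentiation dominates.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
w = rng.uniform(0.5, 1.5, size=n + 1)   # synapses 0 and 1 are "parallel"

for _ in range(50_000):
    dw = np.zeros_like(w)
    dw[0] = dw[1] = 0.01 * (rng.random() < 0.1)   # shared potentiation
    dw[2:] = 0.01 * (rng.random(n - 1) < 0.1)     # independent potentiation
    dw -= 0.002 * (rng.random(n + 1) < 0.1)       # independent depression
    w = np.clip(w + dw, 0.0, None)
    w *= w.size / w.sum()                # multiplicative synaptic normalization

print("parallel pair |w0 - w1|:", abs(w[0] - w[1]))
print("control  pair |w2 - w3|:", abs(w[2] - w[3]))
```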

Fast Temporal Dynamics of Visual Cue Integration

Perception, Apr 1, 2002

Introduction: A fundamental question in neuroscience is how the brain integrates information derived from different sensors, cues, and modalities into coherent percepts. Many researchers have looked at this question from behavioral, computational, and neurophysiological viewpoints. Unfortunately, different integration strategies have been observed in different experiments. Examples are weighted averaging (von Holst 1950; Bruno and Cutting 1988) or extensions thereof (Landy et al 1995), multiplicative interactions (Stein and Meredith 1993), Boolean logic (Newman and Hartline 1982), fuzzy logic (Massaro and Friedman 1990), and linear and nonlinear Bayesian inference (Yuille and Bülthoff 1996). It seems that the way in which different cues are integrated depends on the cues involved, the nature of the task, characteristics of the sensory environment, and prior experience and knowledge of the observers. We believe that an important reason for investigators being unable to identify "the cue-integration strategy" is that observers do not use a single, immutable strategy. Rather, they use a collection of context-sensitive strategies that are adaptable in an experience-dependent manner. Evidence of the adaptability of cue-integration strategies was noted quite early (von Holst 1950), but has recently begun to accumulate. Jacobs and Fine (1999) used a cue-conflict experimental paradigm to show that observers' cue-combination strategies for visual depth are adaptable as a function of training; subjects adjusted their cue-combination rules to use a visual cue more heavily after training in which that cue was informative than after training in which the cue was irrelevant. Moreover, these researchers showed that observers can learn multiple cue-combination rules, and can learn to apply each rule in its appropriate context. Ernst et al (2000) studied the adaptability of observers' cue-integration strategies using a virtual-reality environment that allowed subjects to interact with viewed objects by touching them. They showed
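For reference, the weighted-averaging scheme (Landy et al 1995) in its standard inverse-variance form, with made-up numbers:

```python
import numpy as np

def combine_cues(estimates, sigmas):
    """Inverse-variance weighted average: the maximum-likelihood fusion
    of independent Gaussian cue estimates."""
    var = np.asarray(sigmas, dtype=float) ** 2
    w = (1.0 / var) / (1.0 / var).sum()
    fused = float(w @ np.asarray(estimates, dtype=float))
    fused_sd = float((1.0 / (1.0 / var).sum()) ** 0.5)
    return fused, fused_sd

# Depth in cm from stereo (reliable) and texture (less reliable).
depth, sd = combine_cues([50.0, 58.0], [1.0, 3.0])
print(f"fused depth = {depth:.1f} cm, sd = {sd:.2f} cm")
```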

Autonomous Development of Active Binocular and Motion Vision Through Active Efficient Coding

Frontiers in Neurorobotics, Jul 16, 2019

We present a model for the autonomous and simultaneous learning of active binocular and motion vision. The model is based on the Active Efficient Coding (AEC) framework, a recent generalization of classic efficient coding theories to active perception. The model learns how to efficiently encode the incoming visual signals generated by an object moving in 3-D through sparse coding. Simultaneously, it learns how to produce eye movements that further improve the efficiency of the sensory coding. This learning is driven by an intrinsic motivation to maximize the system's coding efficiency. We test our approach on the humanoid robot iCub using simulations. The model demonstrates self-calibration of accurate object fixation and tracking of moving objects. Our results show that the model keeps improving until it hits physical constraints such as camera or motor resolution, or limits on its internal coding capacity. Furthermore, we show that the emerging sensory tuning properties are in line with results on disparity, motion, and motion-in-depth tuning in the visual cortex of mammals. The model suggests that vergence and tracking eye movements can be viewed as fundamentally having the same objective of maximizing the coding efficiency of the visual system and that they can be learned and calibrated jointly through AEC.
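The perception-action loop described here has a simple skeleton (a structural sketch with invented components; the random input stands in for the simulated retinal patches): sparse coding compresses the input, the residual defines an intrinsic reward, and a bandit-style learner adjusts the eye-movement policy.

```python
import numpy as np

rng = np.random.default_rng(3)

def sparse_encode(x, D, k=3):
    """Greedy matching pursuit with k atoms of dictionary D."""
    a = np.zeros(D.shape[1])
    r = x.copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ r)))
        a[j] += D[:, j] @ r
        r = x - D @ a
    return a, r

D = rng.normal(size=(64, 128))
D /= np.linalg.norm(D, axis=0)
velocities = np.linspace(-2.0, 2.0, 9)     # candidate eye velocities
q = np.zeros(len(velocities))              # value estimate per action

for t in range(1000):
    a_idx = int(np.argmax(q + 0.1 * rng.normal(size=q.shape)))  # explore
    # In the full model this patch would be rendered by the simulator
    # for the chosen velocity; random input keeps the sketch self-contained.
    x = rng.normal(size=64)
    _, residual = sparse_encode(x, D)
    reward = -float(residual @ residual)   # coding efficiency as reward
    q[a_idx] += 0.05 * (reward - q[a_idx])
```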

Bridging structure and function: A model of sequence learning and prediction in primary visual cortex

PLOS Computational Biology, Jun 5, 2018

An explanation of the familiarity-to-novelty-shift in infant habituation

Frontiers in Computational Neuroscience, 2009

MIMo: A Multi-Modal Infant Model for Studying Cognitive Development in Humans and AIs

An Active Efficient Coding Model of Binocular Vision Development Under Normal and Abnormal Rearing Conditions

Lecture Notes in Computer Science, 2018

The development of binocular vision encompasses the formation of binocular receptive fields tuned to different disparities and the calibration of accurate vergence eye movements. Experiments have shown that this development is impaired when the animal is exposed to certain abnormal rearing conditions such as growing up in an environment that is deprived of horizontal or vertical edges. Here we test the effect of abnormal rearing conditions on a recently proposed computational model of binocular development. The model is formulated in the Active Efficient Coding framework, a generalization of classic efficient coding ideas to active perception. We show that abnormal rearing conditions lead to differences in the model's development that qualitatively match those seen in animal experiments. Furthermore, the model predicts systematic changes in vergence accuracy due to abnormal rearing. We discuss implications of the model for the treatment of developmental disorders of binocular vision such as amblyopia and strabismus.
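The deprived-environment manipulation can be approximated by orientation-filtering input images in the Fourier domain; a rough sketch of our own (note that the orientation of a spatial frequency is perpendicular to the edges it produces):

```python
import numpy as np

def orientation_filter(img, band_deg=(-20, 20)):
    """Keep only spatial frequencies whose orientation falls in band_deg,
    mimicking rearing in an environment stripped of other orientations."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    fy, fx = np.meshgrid(np.arange(h) - h // 2, np.arange(w) - w // 2,
                         indexing="ij")
    theta = np.degrees(np.arctan2(fy, fx))
    theta = (theta + 90) % 180 - 90          # fold into [-90, 90)
    mask = (theta >= band_deg[0]) & (theta < band_deg[1])
    mask[h // 2, w // 2] = True              # keep the mean luminance
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

img = np.random.default_rng(4).random((64, 64))
deprived_input = orientation_filter(img)     # single-orientation "world"
```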

The development of active binocular vision under normal and alternate rearing conditions

eLife, Aug 17, 2021

The development of binocular vision is an active learning process comprising the development of disparity tuned neurons in visual cortex and the establishment of precise vergence control of the eyes. We present a computational model for the learning and self-calibration of active binocular vision based on the Active Efficient Coding framework, an extension of classic efficient coding ideas to active perception. Under normal rearing conditions with naturalistic input, the model develops disparity tuned neurons and precise vergence control, allowing it to correctly interpret random dot stereograms. Under altered rearing conditions modeled after neurophysiological experiments, the model qualitatively reproduces key experimental findings on changes in binocularity and disparity tuning. Furthermore, the model makes testable predictions regarding how altered rearing conditions impede the learning of precise vergence control. Finally, the model predicts a surprising new effect: impaired vergence control affects the statistics of orientation tuning in visual cortical neurons.
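Random dot stereograms, used above to probe the trained model, carry depth only in the binocular disparity between the eyes; a standard minimal generator (not code from the paper):

```python
import numpy as np

def random_dot_stereogram(size=128, disparity=4, seed=5):
    """Left/right pair where a central square is shifted horizontally in
    the right image; viewed monocularly, both images are pure noise."""
    rng = np.random.default_rng(seed)
    left = (rng.random((size, size)) < 0.5).astype(float)
    right = left.copy()
    s, e = size // 4, 3 * size // 4
    right[s:e, s + disparity:e + disparity] = left[s:e, s:e]
    # Refill the strip uncovered by the shift with fresh random dots.
    right[s:e, s:s + disparity] = rng.random((e - s, disparity)) < 0.5
    return left, right

L_img, R_img = random_dot_stereogram()
```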

Global and local statistical regularities control visual attention to object sequences

Many previous studies have shown that both infants and adults are skilled statistical learners. Because statistical learning is affected by attention, learners' ability to manage their attention can play a large role in what they learn. However, it is still unclear how learners allocate their attention in order to gain information in a visual environment containing multiple objects, especially how prior visual experience (i.e., familiarity of objects) influences where people look. To answer these questions, we collected eye movement data from adults exploring multiple novel objects while manipulating object familiarity with global (frequencies) and local (repetitions) regularities. We found that participants are sensitive to both global and local statistics embedded in their visual environment, and that they dynamically shift their attention to prioritize some objects over others as they gain knowledge of the objects and their distributions within the task.
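The global/local manipulation can be sketched as a sequence generator (frequencies and repetition probability invented for illustration): global regularity comes from unequal object frequencies, local regularity from inserted immediate repetitions.

```python
import numpy as np

rng = np.random.default_rng(6)
objects = ["A", "B", "C", "D"]
global_freq = [0.4, 0.3, 0.2, 0.1]   # global regularity: unequal frequencies
p_repeat = 0.3                       # local regularity: immediate repetitions

seq = [rng.choice(objects, p=global_freq)]
for _ in range(199):
    if rng.random() < p_repeat:
        seq.append(seq[-1])          # local repetition
    else:
        seq.append(rng.choice(objects, p=global_freq))

vals, counts = np.unique(seq, return_counts=True)
print(dict(zip(vals, counts / len(seq))))          # tracks global_freq
print("repeat rate:", float(np.mean([a == b for a, b in zip(seq, seq[1:])])))
```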

An active-efficient-coding model of optokinetic nystagmus

Journal of Vision, Nov 10, 2016

Optokinetic nystagmus (OKN) is an involuntary eye movement responsible for stabilizing retinal images in the presence of relative motion between an observer and the environment. Fully understanding the development of optokinetic nystagmus requires a neurally plausible computational model that accounts for the neural development and the behavior. To date, work in this area has been limited. We propose a neurally plausible framework for the joint development of disparity and motion tuning in the visual cortex and of optokinetic and vergence eye movement behavior. To our knowledge, this framework is the first developmental model to describe the emergence of OKN in a behaving organism. Unlike past models, which were based on scalar models of overall activity in different neural areas, our framework models the development of the detailed connectivity both from the retinal input to the visual cortex and from the visual cortex to the motor neurons. This framework accounts for the importance of the development of normal vergence control and binocular vision in achieving normal monocular OKN (mOKN) behaviors. Because the model includes behavior, we can simulate the same perturbations as performed in past experiments, such as artificially induced strabismus. The proposed model agrees both qualitatively and quantitatively with a number of findings from the literature on both binocular vision and the optokinetic reflex. Finally, our model also makes quantitative predictions about OKN behavior using the same methods used to characterize OKN in the experimental literature.

Emerging Bayesian Priors in a Self-Organizing Recurrent Network

Springer eBooks, 2011

We explore the role of local plasticity rules in learning statistical priors in a self-organizing recurrent neural network (SORN). The network receives input sequences composed of different symbols and learns the structure embedded in these sequences via a simple spike-timing-dependent plasticity rule, while synaptic normalization and intrinsic plasticity maintain a low level of activity. After learning, the network exhibits spontaneous activity that matches the stimulus-evoked activity during training and thus can be interpreted as samples from the network's prior probability distribution over evoked activity states. Further, we show how learning the frequency and spatio-temporal characteristics of the input sequences influences network performance in several classification tasks. These results suggest a novel connection between low level learning mechanisms and high level concepts of statistical inference.
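One iteration of a SORN-style network combines the three rules named above; a simplified excitatory-only sketch (the chapter's actual model separates excitatory and inhibitory populations):

```python
import numpy as np

rng = np.random.default_rng(7)
N, h_target = 100, 0.1                     # network size, target firing rate
W = rng.random((N, N)) * (rng.random((N, N)) < 0.1)   # sparse random weights
np.fill_diagonal(W, 0.0)
T = rng.random(N)                          # per-neuron firing thresholds
x = (rng.random(N) < h_target).astype(float)

eta_stdp, eta_ip = 0.001, 0.01
for step in range(1000):
    x_new = (W @ x - T > 0).astype(float)
    # STDP: strengthen j->i if j fired before i, weaken the reverse order.
    W += eta_stdp * (np.outer(x_new, x) - np.outer(x, x_new))
    W = np.clip(W, 0.0, None)
    # Synaptic normalization: hold each neuron's summed input constant.
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    # Intrinsic plasticity: nudge thresholds toward the target rate.
    T += eta_ip * (x_new - h_target)
    x = x_new
```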

Autonomous learning of smooth pursuit and vergence through active efficient coding

We present a model for the autonomous and simultaneous learning of smooth pursuit and vergence eye movements based on principles of efficient coding. The model accounts for the joint development of visual encoding and eye movement control. Sparse coding models encode the incoming data and capture the statistics of the input in spatio-temporal basis functions while a reinforcement learner generates eye movements to optimise the efficiency of the encoding. We consider the embodiment of the approach in the iCub simulator and demonstrate the emergence of a self-calibrating smooth pursuit and vergence behaviour. Unlike standard computer vision approaches, this behaviour is driven by the interaction between sensory encoding and eye movements. Interestingly, our analysis shows that the emerging representations learned by this model are in line with results on velocity and disparity tuning properties of neurons in visual cortex.

Intrinsically motivated learning of visual motion perception and smooth pursuit

We extend the framework of efficient coding, which has been used to model the development of sensory processing in isolation, to model the development of the perception/action cycle. Our extension combines sparse coding and reinforcement learning so that sensory processing and behavior co-develop to optimize a shared intrinsic motivational signal: the fidelity of the neural encoding of the sensory input under resource constraints. Applying this framework to a model system consisting of an active eye behaving in a time varying environment, we find that this generic principle leads to the simultaneous development of both smooth pursuit behavior and model neurons whose properties are similar to those of primary visual cortical neurons selective for different directions of visual motion. We suggest that this general principle may form the basis for a unified and integrated explanation of many perception/action loops.
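The intrinsic signal, encoding fidelity under a resource constraint, can be made concrete with a toy cost (our own proxy, not the paper's encoder): when pursuit nulls retinal slip, the two-frame input is more redundant, so a fixed coefficient budget typically leaves a smaller residual.

```python
import numpy as np

def coding_cost(frame_t, frame_t1, n_coeffs=8):
    """Energy left unexplained when only n_coeffs Fourier coefficients
    may be spent on the concatenated two-frame input."""
    x = np.concatenate([frame_t, frame_t1])
    c = np.fft.rfft(x)
    keep = np.argsort(np.abs(c))[-n_coeffs:]
    mask = np.zeros_like(c)
    mask[keep] = 1.0
    return float(np.sum((x - np.fft.irfft(c * mask, n=len(x))) ** 2))

scene = np.random.default_rng(8).normal(size=64)
tracked = coding_cost(scene[:32], scene[:32])    # pursuit nulls the slip
slipped = coding_cost(scene[:32], scene[4:36])   # 4-pixel retinal slip
print(f"cost tracking: {tracked:.1f}  cost with slip: {slipped:.1f}")
```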

OpenEyeSim: A biomechanical model for simulation of closed-loop visual perception

Journal of Vision, Dec 22, 2016

We introduce OpenEyeSim, a detailed three-dimensional biomechanical model of the human extraocular eye muscles including a visualization of a virtual environment. The main purpose of OpenEyeSim is to serve as a platform for developing models of the joint learning of visual representations and eye-movement control in the perception-action cycle. The architecture and dynamic muscle properties are based on measurements of the human oculomotor system. We show that our model can reproduce different types of eye movements. Additionally, our model is able to calculate metabolic costs of eye movements. It is also able to simulate different eye disorders, such as different forms of strabismus. We propose OpenEyeSim as a platform for studying many of the complexities of oculomotor control and learning during normal and abnormal visual development.

Active efficient coding explains the development of binocular vision and its failure in amblyopia

Proceedings of the National Academy of Sciences of the United States of America, Mar 2, 2020

Self-calibrating smooth pursuit through active efficient coding

Robotics and Autonomous Systems, Sep 1, 2015

This paper presents a model for the autonomous learning of smooth pursuit eye movements based on an efficient coding criterion for active perception. This model accounts for the joint development of visual encoding and eye control. Sparse coding models encode the incoming data at two different spatial resolutions and capture the statistics of the input in spatio-temporal basis functions. A reinforcement learner controls eye velocity so as to maximize a reward signal based on the efficiency of the encoding. We consider the embodiment of the approach in the iCub simulator and real robot. Motion perception and smooth pursuit control are not explicitly expressed as tasks for the robot to achieve but emerge as the result of the system's active attempt to efficiently encode its sensory inputs. Experiments demonstrate that the proposed approach is self-calibrating and robust to strong perturbations of the perception-action link.
