Constantin Rothkopf | Technische Universität Darmstadt
Papers by Constantin Rothkopf
2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
arXiv (Cornell University), Feb 17, 2019
Nature Machine Intelligence, Mar 23, 2022
Journal of Vision, Sep 1, 2018
Eye movements in extended sequential behavior are known to reflect task demands much more than low-level feature saliency. However, the more naturalistic the task, the more difficult it becomes to establish which cognitive processes it elicits moment by moment. Here we ask which sequential model is required to capture gaze sequences so that the ongoing task can be inferred reliably. Specifically, we consider eye movements of human subjects navigating a walkway while avoiding obstacles and approaching targets in a virtual environment. We show that Hidden Markov Models, which have been used extensively in modeling human sequential behavior, can be augmented with a few state variables describing the egocentric position of subjects relative to objects in the environment to dramatically increase successful classification of the ongoing task and to generate gaze sequences that are very close to those observed in human subjects.
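The paper itself does not include code; as a rough illustration of the approach described in the abstract, the hedged Python sketch below fits one Gaussian hidden Markov model per task on observation vectors that concatenate gaze features with a few hypothetical egocentric state variables (distance and bearing to the nearest obstacle or target are assumed features here, not the authors' exact choice) and classifies a held-out sequence by which task model assigns it the highest likelihood. It relies on the hmmlearn package.

```python
# Minimal sketch (not the authors' code): task inference from gaze sequences
# using one Gaussian HMM per task, with observation vectors that concatenate
# gaze features and hypothetical egocentric state variables (e.g. distance and
# bearing to the nearest obstacle/target). Requires `pip install hmmlearn`.
import numpy as np
from hmmlearn import hmm

def fit_task_models(sequences_by_task, n_states=4, seed=0):
    """Fit one HMM per task on observation sequences of shape (T, d)."""
    models = {}
    for task, seqs in sequences_by_task.items():
        X = np.vstack(seqs)                      # stack all sequences
        lengths = [len(s) for s in seqs]         # per-sequence lengths
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=100, random_state=seed)
        m.fit(X, lengths)
        models[task] = m
    return models

def classify_sequence(models, seq):
    """Return the task whose HMM assigns the highest log-likelihood."""
    return max(models, key=lambda task: models[task].score(seq))

# Toy example with synthetic data: gaze (x, y) plus two egocentric features.
rng = np.random.default_rng(0)
data = {
    "avoid_obstacles": [rng.normal(0.0, 1.0, size=(200, 4)) for _ in range(5)],
    "approach_targets": [rng.normal(0.5, 1.0, size=(200, 4)) for _ in range(5)],
}
models = fit_task_models(data)
print(classify_sequence(models, data["avoid_obstacles"][0]))
```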
What is the link between eye movements and sensory learning? Although some theories have argued for an automatic interaction between what we know and where we look that continuously modulates human information-gathering behavior during both implicit and explicit learning, there is limited experimental evidence supporting such an ongoing interplay. To address this issue, we used a visual statistical learning paradigm combined with gaze-contingent stimulus presentation and manipulated the explicitness of the task to explore how learning and eye movements interact. During both implicit exploration and explicit visual learning of unknown composite visual scenes, spatial eye movement patterns systematically and gradually changed in accordance with the underlying statistical structure of the scenes. Moreover, the degree of change was directly correlated with the amount and type of knowledge the observers acquired. This suggests that eye movements are potential indicators of active learning, a process where long-term knowledge, current visual stimuli, and an inherent tendency to reduce uncertainty about the visual environment jointly determine where we look.
arXiv (Cornell University), Nov 14, 2022
Frontiers in artificial intelligence, May 20, 2020
Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? In this study, we show that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct. We create a template list of prompts and responses, such as "Should I [action]?" and "Is it okay to [action]?", with corresponding answers of "Yes/no, I should (not)." and "Yes/no, it is (not)." The model's bias score is the difference between the model's score of the positive response ("Yes, I should") and that of the negative response ("No, I should not"). For a given choice, the model's overall bias score is the mean of the bias scores of all question/answer templates paired with that choice. Specifically, the resulting model, called the Moral Choice Machine (MCM), calculates the bias score on a sentence level using embeddings of the Universal Sentence Encoder, since the moral value of an action to be taken depends on its context: it is objectionable to kill living beings, but it is fine to kill time; it is essential to eat, yet one might not eat dirt; it is important to spread information, yet one should not spread misinformation. Our results indicate that text corpora contain recoverable and accurate imprints of our social, ethical, and moral choices, even with context information. Training the Moral Choice Machine on news and book corpora from different periods between 1510 and 2008/2009 demonstrates the evolution of moral and ethical choices over time for both atomic actions and actions with context information. By training it on different cultural sources such as the Bible and the constitutions of different countries, the dynamics of moral choices in culture, including technology, are revealed. In short, moral biases can be extracted, quantified, tracked, and compared across cultures and over time.
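As a rough, hedged sketch of the bias-score computation described above: questions and template answers are embedded (the paper names the Universal Sentence Encoder; any sentence embedder with the same interface could be plugged in, and the toy embedding below is only a stand-in so the code runs), and the score of an answer is assumed here to be its cosine similarity to the question.

```python
# Hedged sketch of the question/answer bias score; the cosine-similarity
# scoring and the two templates shown are illustrative assumptions.
import numpy as np

QA_TEMPLATES = [
    ("Should I {a}?",      "Yes, I should.", "No, I should not."),
    ("Is it okay to {a}?", "Yes, it is.",    "No, it is not."),
]

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def bias_score(action, embed):
    """Mean over templates of score(positive answer) - score(negative answer).

    `embed` maps a string to a vector, e.g. a Universal Sentence Encoder call.
    """
    scores = []
    for question, pos, neg in QA_TEMPLATES:
        q = embed(question.format(a=action))
        scores.append(cosine(q, embed(pos)) - cosine(q, embed(neg)))
    return float(np.mean(scores))

# Toy embedding stand-in (semantically meaningless) so the sketch runs
# without downloading a sentence-encoder model.
def toy_embed(text, dim=64):
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=dim)

print(bias_score("kill time", toy_embed))
print(bias_score("kill people", toy_embed))
```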
Journal of Vision, Oct 20, 2020
bioRxiv (Cold Spring Harbor Laboratory), Dec 23, 2021
PLOS Computational Biology, Oct 19, 2020
Wiley Interdisciplinary Reviews: Cognitive Science, Sep 22, 2010
Historically, the study of visual perception has followed a reductionist strategy, with the goal of understanding complex visually guided behavior by separate analysis of its elemental components. Recent developments in monitoring behavior, such as measurement of eye movements in unconstrained observers, have allowed investigation of the use of vision in the natural world. This has led to a variety of insights that would be difficult to achieve in more constrained experimental contexts. In general, it shifts the focus of vision away from the properties of the stimulus toward a consideration of the behavioral goals of the observer. It appears that behavioral goals are a critical factor in controlling the acquisition of visual information from the world. This insight has been accompanied by a growing understanding of the importance of reward in modulating the underlying neural mechanisms and by theoretical developments using reinforcement learning models of complex behavior. These developments provide us with the tools to understand how tasks are represented in the brain and how they control the acquisition of information through the use of gaze.
arXiv (Cornell University), Oct 5, 2021
arXiv (Cornell University), Apr 16, 2021
Journal of Vision, Dec 5, 2022
Dynamic Coordination in the Brain, Aug 2, 2010
Effective perception requires the integration of many noisy and ambiguous sensory signals across different modalities (e.g., vision, audition) into stable percepts. This chapter discusses some of the core questions related to sensory integration: Why does the brain integrate sensory signals, and how does it do so? How does it learn this ability? How does it know when to integrate signals and when to treat them separately? How dynamic is the process of sensory integration?
th.physik.uni-frankfurt.de
Computational and Robotic Models of the Hierarchical Organization of Behavior, 2013
Computational modeling, largely based on advances in artificial intelligence and machine learning, has helped further our understanding of some of the principles and mechanisms of multisensory object perception. Furthermore, this theoretical work has led to the development of new experimental paradigms and to important new questions. The last 20 years have seen an increasing emphasis on models that explicitly compute with uncertainties, a crucial aspect of the relation between sensory signals and states of the world. Bayesian models allow for the formulation of such relationships and also of explicit optimality criteria against which human performance can be compared. They therefore allow answering the question of how close human performance comes to a specific formulation of best performance. Perhaps even more importantly, Bayesian methods allow quantitative comparison of different models by how well they account for observed data. The success of such techniques in explaining perceptual phenomena has also led to a large number of new open questions, especially about how the brain is able to perform computations that are consistent with these functional models and about the origin of the algorithms in the brain. We briefly review some key empirical evidence of crossmodal perception and proceed to give an overview of the computational principles evident from this work. The presentation of current modeling approaches to multisensory perception considers Bayesian models, models at an intermediate level, and neural models implementing multimodal computations. Finally, this chapter specifically emphasizes current open questions in theoretical models of multisensory object perception.
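The reliability-weighted cue combination that such Bayesian models predict can be stated in a few lines; the following sketch is a generic illustration of that standard result, not code from the chapter.

```python
# Minimal illustration of Bayesian cue integration: with independent Gaussian
# cues, the optimal estimate is a reliability-weighted average and its variance
# is lower than that of either cue alone.
def integrate_cues(x1, var1, x2, var2):
    """Combine two noisy estimates of the same quantity (e.g. visual and
    auditory location) assuming independent Gaussian noise."""
    w1 = (1 / var1) / (1 / var1 + 1 / var2)   # reliability weight of cue 1
    w2 = 1 - w1
    fused = w1 * x1 + w2 * x2
    fused_var = 1 / (1 / var1 + 1 / var2)     # always <= min(var1, var2)
    return fused, fused_var

# Vision says 10.0 deg (precise), audition says 14.0 deg (noisy):
print(integrate_cues(10.0, 1.0, 14.0, 4.0))   # -> (10.8, 0.8)
```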
Journal of vision, 2015
While several recent studies have established the optimality of spatial targeting of gaze in humans, it is still unknown whether this optimality extends to the timing of gaze in time-varying environments. Moreover, it is unclear to what extent visual attention is guided by learning processes, which facilitate adaptation to changes in the world. We present empirical evidence for significant changes in attentive visual behavior due to an observer's experience and learning. Crucially, we present a hierarchical Bayesian model that not only fits our behavioral data but also explains dynamic changes of gaze patterns in terms of learning. We devised a controlled experiment to investigate how humans divide their attentional resources over time among multiple targets in order to achieve task-specific goals. Eye movement data were collected from the participants. During each trial, three stimuli were presented on a computer screen arranged in a triangle. Each stimulus consisted of a small d...
PLOS Computational Biology
Although a standard reinforcement learning model can capture many aspects of reward-seeking behaviors, it may not be practical for modeling human natural behaviors because of the richness of dynamic environments and limitations in cognitive resources. We propose a modular reinforcement learning model that addresses these factors. Based on this model, a modular inverse reinforcement learning algorithm is developed to estimate both the rewards and the discount factors from human behavioral data, which allows predictions of human navigation behaviors in virtual reality with high accuracy across different subjects and with different tasks. Complex human navigation trajectories in novel environments can be reproduced by an artificial agent that is based on the modular model. This model provides a strategy for estimating the subjective value of actions and how they influence sensory-motor decisions in natural behavior.
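A hedged sketch of the forward (action-selection) side of such a modular agent is given below, under the common modular-RL assumption that each module keeps its own value estimate with its own reward function and discount factor and that action values combine additively across modules; the inverse step of fitting these quantities to human trajectories is not shown.

```python
# Hedged sketch of a modular RL agent: per-module Q-learning with
# module-specific rewards and discount factors, combined additively.
import numpy as np

class Module:
    def __init__(self, n_states, n_actions, reward_fn, gamma, alpha=0.1):
        self.Q = np.zeros((n_states, n_actions))
        self.reward_fn = reward_fn   # module-specific reward function
        self.gamma = gamma           # module-specific discount factor
        self.alpha = alpha

    def update(self, s, a, s_next):
        r = self.reward_fn(s, a, s_next)
        td = r + self.gamma * self.Q[s_next].max() - self.Q[s, a]
        self.Q[s, a] += self.alpha * td

def combined_action(modules, s):
    """Greedy action with respect to the sum of the modules' Q-values."""
    total = sum(m.Q[s] for m in modules)
    return int(np.argmax(total))

# Toy usage: two hypothetical modules over 5 states / 3 actions.
mods = [Module(5, 3, lambda s, a, sn: -1.0 * (sn == 2), gamma=0.9),
        Module(5, 3, lambda s, a, sn: 1.0 * (sn == 4), gamma=0.6)]
print(combined_action(mods, 0))
```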
2019 Conference on Cognitive Computational Neuroscience
Journal of …, Jan 1, 2005
The deployment of visual attention is commonly framed as being determined by the properties of the visual scene. Top-down factors have been acknowledged, but they have been described as modulating bottom-up saliency. Alternative models have proposed to understand visual attention in terms of the requirements of goal-directed tasks. In such a setting, the observed fixation patterns reflect the underlying task structure.
Journal of Vision, Jan 1, 2005
Visual cortex must calibrate the receptive fields of billions of neurons in a hierarchy of maps. Modeling this process is daunting, but a promising direction is minimum description length (MDL) theory. In MDL, the cortex builds a theory of itself and does this by trading off the bits to ...
Journal of Vision, Jan 1, 2009
To investigate the role of higher-order statistics and task behavior, we obtain image sequences by simulating an agent navigating through simulated wooded environments. Unsupervised learning algorithms are used to learn a sparse code of the images but, contrary to previous ...
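As a hedged illustration of the sparse-coding step mentioned above (not the study's code, and using ordinary random patches rather than the simulated wooded-environment renderings), a dictionary of basis functions can be learned with scikit-learn:

```python
# Sketch: learn a sparse code (dictionary of basis functions) from image
# patches with scikit-learn's dictionary learning.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_sparse_code(images, patch=8, n_atoms=64, n_patches=5000, seed=0):
    """Sample square patches from grayscale images and learn a sparse dictionary."""
    rng = np.random.default_rng(seed)
    patches = []
    for _ in range(n_patches):
        img = images[rng.integers(len(images))]
        y = rng.integers(img.shape[0] - patch)
        x = rng.integers(img.shape[1] - patch)
        patches.append(img[y:y + patch, x:x + patch].ravel())
    X = np.asarray(patches)
    X -= X.mean(axis=1, keepdims=True)            # remove the patch mean
    dl = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=1.0,
                                     batch_size=256, random_state=seed)
    dl.fit(X)
    return dl.components_                         # learned basis functions

# Toy example with synthetic "images":
imgs = [np.random.default_rng(i).normal(size=(64, 64)) for i in range(10)]
basis = learn_sparse_code(imgs)
print(basis.shape)    # (n_atoms, patch * patch)
```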
Journal of Vision, Jan 1, 2006
Subjects were immersed in a virtual scene and hit virtual balls directed towards them with a table tennis paddle, receiving vibrotactile feedback when hitting and audible feedback from the bounce. This setup allows us to control the trajectories, the physical properties of the ball ...
Journal of Vision, Jan 1, 2008
Neurophysiological and psychophysical studies in primates and humans have shown the pervasive role of reward in the learning of visuomotor activities. Eye movements and hand movements to targets are executed so as to maximize reward. Moreover, reinforcement learning algorithms ...
Psychophysical studies in humans have demonstrated that visuomotor behavior such as rapid hand movements to targets is sensitive to the reward structure of the goal and is executed so as to maximize reward [1]. Similarly, animal studies have shown that learning of such behaviors is driven by reward. Evidence suggests that the neurotransmitter dopamine is involved in the process of learning visuomotor behaviors [2].
Evidence suggests that the neurotransmitter dopamine is involved in the process of learning visuomotor behaviors [1]. Moreover, reinforcement learning (RL) algorithms have been formulated that characterize well the signals of dopaminergic neurons in response to the occurrence of reward-associated stimuli and the delivery of the rewards across learning [2]. Such algorithms have been successful in modeling those responses in cases where only a single variable describes the current state of the world.
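The reinforcement learning account referred to here is typically the temporal-difference (TD) prediction error; the following minimal sketch shows a TD(0) update in which the error term delta is the quantity commonly compared to phasic dopaminergic responses.

```python
# Minimal TD(0) value-learning sketch: delta is the reward prediction error.
import numpy as np

def td_update(V, s, r, s_next, alpha=0.1, gamma=0.95):
    """One TD(0) update of the state-value table V; returns the prediction error."""
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta

V = np.zeros(3)                      # states: 0 = cue, 1 = delay, 2 = post-reward
for _ in range(200):                 # repeated cue -> delay -> reward trials
    td_update(V, 0, 0.0, 1)          # no reward after the cue
    td_update(V, 1, 1.0, 2)          # reward delivered after the delay
print(V)   # value propagates back from the rewarded transition to the cue
```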
Theories of optimal coding propose to understand early sensory processing of stimuli as being adapted to the statistics of the signals naturally occurring when interacting with the environment [1, 2, 3]. Relating this approach to vision, the regularities of natural image sequences have been investigated, e.g. [4, 5]. But, given that perception is active, the stimuli at the retina depend on oculomotor control and therefore on the executed task [6]. How do different tasks affect the statistics of image features at the fixation location?