Constantin Rothkopf - Profile on Academia.edu
Papers by Constantin Rothkopf
Learning Individualized Automatic Content Magnification in Gaze-based Interaction
Optimal feedback control under uncertainty explains errors, variability and behavioral strategies in human navigation
What Can I Help You With: Towards Task-Independent Detection of Intentions for Interaction in a Human-Robot Environment
2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
arXiv (Cornell University), Feb 17, 2019
The Federal Government of Germany aims to boost research in the field of Artificial Intelligence (AI). For instance, 100 new professorships are to be established. However, the government's white paper does not answer what an AI professorship actually is. To give colleagues, politicians, and citizens an idea, we present a view that is often followed when appointing professors for AI at German and international universities. We hope that it will help to establish a guideline with internationally accepted measures and thus make the public debate more informed. ...the recognized need to involve experts in the quantification and formalization of human thought processes, what today counts as cognitive science. Already at this workshop, the topics named included computers, natural language processing, neural networks, theory of computation, abstraction, and creativity, all of which remain relevant today.
Thus, from the very beginning, the term artificial intelligence also referred directly to human intelligence, although it has turned out that a clear definition of what human or natural intelligence actually is proves quite difficult, and one has to live with this uncertainty, just as in other accepted scientific disciplines such as psychology, biology, or economics. Accordingly, with the birth of artificial intelligence another science saw the light of day, namely cognitive science, which is concerned with understanding what constitutes human or natural intelligence in the first place. The two fields of artificial intelligence and cognitive science have thus repeatedly inspired and influenced one another. But what, then, is AI? In the course of the Dartmouth workshop, John McCarthy introduced the term "artificial intelligence" (AI) and defined it as follows: AI is the (engineering) science of intelligent machines, in particular intelligent computer programs. These are algorithms, unambiguous procedures for solving a problem or a class of problems that transform a given input into a given output, for problem solving, reasoning, and learning. AI research is therefore also strongly anchored in computer science.
Nature Machine Intelligence, Mar 23, 2022
Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, its variants, GPT-2/3, and others. Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended the state of the art for many NLP tasks and shown that they capture not only linguistic knowledge but also retain general knowledge implicitly present in the data. Unfortunately, LMs trained on unfiltered text corpora suffer from degenerated and biased behaviour. While this is well established, we show that recent LMs also contain human-like biases of what is right and wrong to do, some form of ethical and moral norms of society: they bring a "moral direction" to the surface. That is, we show that these norms can be captured geometrically by a direction, which can be computed, e.g., by a PCA, in the embedding space, reflecting well the agreement of phrases with social norms implicitly expressed in the training texts and providing a path for attenuating or even preventing toxic degeneration in LMs. Being able to rate the (non-)normativity of arbitrary phrases without explicitly training the LM for this task, we demonstrate the capabilities of the "moral direction" for guiding (even other) LMs towards producing normative text and showcase it on the RealToxicityPrompts testbed, preventing the neural toxic degeneration in GPT-2. Large-scale, transformer-based language models (LMs) such as BERT [1], its variants [2, 3], GPT-2/3 [4], and others have shown improvements on various NLP tasks. By now, they are so good at generating human-like text that articles and social media often describe them as the "world's most impressive AI" and "terrifyingly good". Several studies revealed improved syntactic and semantic abilities of large-scale transformer-based LMs [6, 10] compared to previous models such as RNNs.
Furthermore, Talmor et al. [11] demonstrated that LMs exhibit reasoning abilities, although not in an abstract manner, and Roberts et al. showed that LMs' capability to store and retrieve knowledge scales with model size. Petroni et al. [13] demonstrated that, besides learning linguistic knowledge, recent transformer-based LMs even retain general knowledge implicitly present in the training data. While these successes are very exciting, there are also risks associated with developing them, as has also been discussed in the literature. Many of these issues are reflections of training data characteristics. Language itself already contains recoverable and accurate imprints of our historical biases, and machine learning algorithms such as LMs may capture these regularities, as e.g. Caliskan et al. [20] have demonstrated. Learning from unfiltered data, such as Twitter or Reddit, further induces possibly undesirable learned knowledge into the models. LMs used for downstream tasks such as credit risk prediction propagate this implicit knowledge to the classifier, and LMs with generative capabilities suffer from toxic degeneration [15], i.e. they are prone to generating non-normative text.
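The "moral direction" described in the abstract can be sketched in a few lines: PCA over pooled embeddings of normative and non-normative phrases yields a direction onto which new phrases can be projected. The random vectors below are made-up stand-ins for sentence embeddings (in the paper these come from a transformer sentence encoder); dimensions and data are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for sentence embeddings of normative/non-normative phrases
# (hypothetical data; the paper uses a transformer sentence encoder).
dim = 16
positive = rng.normal(loc=+1.0, scale=0.5, size=(20, dim))   # e.g. "help people"
negative = rng.normal(loc=-1.0, scale=0.5, size=(20, dim))   # e.g. "harm people"

# PCA on the pooled embeddings: the first principal component serves as
# the "moral direction" separating the two groups of phrases.
X = np.vstack([positive, negative])
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
moral_direction = vt[0]

def moral_score(embedding):
    """Project a phrase embedding onto the moral direction."""
    return float((embedding - mean) @ moral_direction)

# Phrases at opposite ends of the direction receive scores of opposite sign.
print(moral_score(positive.mean(axis=0)), moral_score(negative.mean(axis=0)))
```

The sign of the first principal component is arbitrary, so only the relative ordering of scores is meaningful; in practice the sign would be fixed against a labeled anchor phrase.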
Although humans are prone to perceptual illusions and decision biases, they perform very well in everyday tasks of varying difficulty and complexity. It has been shown that humans learn to adapt to the statistical regularities of the environment. However, whether humans have correct physical intuitions about these ordinary processes and reflect the related dynamics in an appropriate internal model has been disputed. Recent studies have shown that human behavior in diverse physical judgment tasks can indeed be explained with probabilistic models based on realistic, Newtonian functions while considering sensory uncertainties. Here, we examined whether humans use physical models of their environment in a control task that involves non-linear dynamics. Participants were asked to shoot a puck into a target area affected by realistic friction. By deploying Bayesian models we show that humans are able to adapt to these physical relationships and have appropriate internal beliefs about the relevant quantities.
Perceptual explaining away in depth judgements
Journal of Vision, Sep 1, 2018
In many situations encountered in our daily lives where we have several options to choose from, we need to balance how far we plan into the future against the number of alternatives we want to consider to achieve our long-term goals. A popular way to study these planning problems in controlled environments is maze-solving tasks, since they can be precisely defined and controlled in terms of their topology. In our study, participants solved mazes that differed systematically in the topological properties regulating the number of alternatives and the depth of paths. Replicating previous results, we show the influence of these spatial features on performance and stopping times. Longer and more branched solution paths lead to more planning effort and longer solution times. Additionally, we measured subjects' eye movements to investigate their planning horizon. Our results suggest that people decrease their planning depth as the number of alternatives increases.
Minimal sequential gaze models for inferring walkers' tasks
Eye movements in extended sequential behavior are known to reflect task demands much more than low-level feature saliency. However, the more naturalistic the task, the more difficult it becomes to establish what cognitive processes a particular task elicits moment by moment. Here we ask which sequential model is required to capture gaze sequences so that the ongoing task can be inferred reliably. Specifically, we consider eye movements of human subjects navigating a walkway while avoiding obstacles and approaching targets in a virtual environment. We show that Hidden Markov Models, which have been used extensively in modeling human sequential behavior, can be augmented with a few state variables describing the egocentric position of subjects relative to objects in the environment to dramatically increase successful classification of the ongoing task and to generate gaze sequences that are very close to those observed in human subjects.
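The HMM-based task classification can be illustrated with the standard forward algorithm: given one HMM per task over discretized gaze targets, the ongoing task is inferred as the model assigning the higher sequence likelihood. The state spaces, probabilities, and gaze coding below are invented for illustration and are far simpler than the models in the paper.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # predict states, weight by emission
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()            # rescale to avoid underflow
    return loglik

# Hypothetical two-state HMMs for two walking tasks; observations are
# discretized gaze targets (0 = obstacle, 1 = target, 2 = path).
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B_avoid    = np.array([[0.7, 0.1, 0.2], [0.2, 0.1, 0.7]])  # gaze mostly on obstacles
B_approach = np.array([[0.1, 0.7, 0.2], [0.1, 0.2, 0.7]])  # gaze mostly on targets

gaze_sequence = [0, 0, 2, 0, 0, 2, 0]   # obstacle-heavy gaze
task = ("avoid" if forward_loglik(gaze_sequence, pi, A, B_avoid)
                 > forward_loglik(gaze_sequence, pi, A, B_approach)
        else "approach")
print(task)  # prints "avoid"
```

The paper's augmentation corresponds to enriching the hidden state with egocentric position variables, which sharpens exactly this likelihood comparison.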
Eye movements reflect active statistical learning
What is the link between eye movements and sensory learning? Although some theories have argued for an automatic interaction between what we know and where we look that continuously modulates human information-gathering behavior during both implicit and explicit learning, there is limited experimental evidence supporting such an ongoing interplay. To address this issue, we used a visual statistical learning paradigm combined with a gaze-contingent stimulus presentation and manipulated the explicitness of the task to explore how learning and eye movements interact. During both implicit exploration and explicit visual learning of unknown composite visual scenes, spatial eye movement patterns systematically and gradually changed in accordance with the underlying statistical structure of the scenes. Moreover, the degree of change was directly correlated with the amount and type of knowledge the observers acquired. This suggests that eye movements are potential indicators of active learning, a process where long-term knowledge, current visual stimuli, and an inherent tendency to reduce uncertainty about the visual environment jointly determine where we look.
arXiv (Cornell University), Nov 14, 2022
Frontiers in artificial intelligence, May 20, 2020
Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? In this study, we show that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct. We create a template list of prompts and responses, such as "Should I [action]?", "Is it okay to [action]?", etc., with corresponding answers of "Yes/no, I should (not)." and "Yes/no, it is (not)." The model's bias score is the difference between the model's score of the positive response ("Yes, I should") and that of the negative response ("No, I should not"). For a given choice, the model's overall bias score is the mean of the bias scores of all question/answer templates paired with that choice. Specifically, the resulting model, called the Moral Choice Machine (MCM), calculates the bias score on a sentence level using embeddings of the Universal Sentence Encoder, since the moral value of an action depends on its context. It is objectionable to kill living beings, but it is fine to kill time. It is essential to eat, yet one might not eat dirt. It is important to spread information, yet one should not spread misinformation. Our results indicate that text corpora contain recoverable and accurate imprints of our social, ethical, and moral choices, even with context information. Indeed, training the Moral Choice Machine on news and book corpora from different periods between 1510 and 2008/2009 demonstrates the evolution of moral and ethical choices over time, for both atomic actions and actions with context information. By training it on different cultural sources such as the Bible and the constitutions of different countries, the dynamics of moral choices across cultures, including attitudes toward technology, are revealed.
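The bias-score computation described above can be sketched as follows. The embeddings are made-up stand-ins for the Universal Sentence Encoder vectors the paper uses, cosine similarity stands in for the model's score, and only one question/answer template is shown (the paper averages over a list of templates).

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical stand-in embeddings; in the paper whole question/answer
# sentences are embedded with the Universal Sentence Encoder.
emb = {
    "Should I kill people?": np.array([0.9, -0.8, 0.1]),
    "Yes, I should.":        np.array([0.2,  0.9, 0.3]),
    "No, I should not.":     np.array([0.8, -0.7, 0.2]),
}

def bias_score(question, pos="Yes, I should.", neg="No, I should not."):
    """Score of the positive answer minus score of the negative answer;
    positive values mean the model leans toward the affirmative."""
    return cosine(emb[question], emb[pos]) - cosine(emb[question], emb[neg])

print(bias_score("Should I kill people?"))  # negative: leans toward "No"
```

The overall bias score for a choice would then be the mean of such per-template scores, exactly as the abstract describes.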
That is, moral biases can be extracted, quantified, tracked, and compared across cultures and over time.
Author response: Putting perception into action with inverse optimal control for continuous psychophysics
Quantifying orientation biases across the visual field in humans and cats
Journal of Vision, Oct 20, 2020
bioRxiv (Cold Spring Harbor Laboratory), Dec 23, 2021
Psychophysical methods are a cornerstone of psychology, cognitive science, and neuroscience, where they have been used to quantify behavior and its neural correlates for a vast range of mental phenomena. Their power derives from the combination of controlled experiments and rigorous analysis through signal detection theory. Unfortunately, they require many tedious trials and preferably highly trained participants. A recently developed approach, continuous psychophysics, promises to transform the field by abandoning the rigid trial structure involving binary responses and replacing it with continuous behavioral adjustments to dynamic stimuli. However, what has precluded wide adoption of this approach is that current analysis methods recover perceptual thresholds that are an order of magnitude larger than those from equivalent traditional psychophysical experiments. Here we introduce a computational analysis framework for continuous psychophysics based on Bayesian inverse optimal control. We show via simulations and on previously published data that this not only recovers the perceptual thresholds but additionally estimates subjects' action variability, internal behavioral costs, and subjective beliefs about the experimental stimulus dynamics. Taken together, we provide further evidence for the importance of including action uncertainties, subjective beliefs, and, crucially, the intrinsic costs of behavior, even in experiments seemingly investigating only perception.
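A minimal sketch of the kind of forward model the inverse-optimal-control analysis inverts may help: a random-walk target, a noisy percept, and a Kalman-filter observer whose estimate drives the continuous response. All parameter values are arbitrary illustration values, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Forward model of one continuous tracking trial (illustrative parameters).
T = 500          # time steps
q = 0.5          # target random-walk drift variance
r = 2.0          # observation (perceptual) noise variance

target = np.cumsum(rng.normal(0, np.sqrt(q), T))   # random-walk target position
obs = target + rng.normal(0, np.sqrt(r), T)        # observer's noisy percept

# Kalman filter: the observer's optimal running estimate of target position.
x_hat, P = 0.0, 1.0
est = np.empty(T)
for t in range(T):
    P += q                          # predict: uncertainty grows with drift
    K = P / (P + r)                 # Kalman gain
    x_hat += K * (obs[t] - x_hat)   # update with the current noisy percept
    P *= (1 - K)
    est[t] = x_hat

# The filtered estimate tracks the target better than the raw percept.
print(np.std(est - target) < np.std(obs - target))  # prints True
```

The inverse problem of the paper runs this logic backwards: given observed tracking responses, infer the perceptual noise `r` together with action variability, behavioral costs, and the subject's assumed dynamics.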
PLOS Computational Biology, Oct 19, 2020
While interacting with objects during everyday activities, e.g. when sliding a glass on a countertop, people obtain constant feedback about whether they are acting in accordance with physical laws. However, classical research on intuitive physics has revealed that people's judgements systematically deviate from the predictions of Newtonian physics. Recent research has explained at least some of these deviations not as a consequence of misconceptions about physics but instead as a consequence of the probabilistic interaction between inevitable perceptual uncertainties and prior beliefs. How intuitive physical reasoning relates to visuomotor actions is much less well known. Here, we present an experiment in which participants had to slide pucks under the influence of naturalistic friction in a simulated virtual environment. The puck was controlled by the duration of a button press, which needed to scale linearly with the puck's mass and with the square root of the initial distance to reach a target. Over four phases of the experiment, uncertainties were manipulated by altering the availability of sensory feedback and providing different degrees of knowledge about the physical properties of the pucks. A hierarchical Bayesian model of the visuomotor interaction task incorporating perceptual uncertainty and press-time variability found substantial evidence that subjects adjusted their button presses so that the sliding was in accordance with Newtonian physics. After observing collisions between pucks, which were analyzed with a hierarchical Bayesian model of the perceptual observation task, subjects transferred the relative masses inferred perceptually to adjust subsequent sliding actions. Crucial in the modeling was the inclusion of a cost function, which quantitatively captures participants' implicit sensitivity to errors due to their motor variability.
Taken together, in the present experiment we find evidence that our participants transferred their intuitive physical reasoning to a subsequent visuomotor control task consistent with Newtonian physics and weighed potential outcomes with a cost function based on their knowledge of their own variability.
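The stated scaling law (press time linear in mass, square root in distance) follows directly from sliding friction, as a small sketch shows. The constant `k`, mapping press duration to impulse, is a hypothetical illustration parameter, not a quantity from the paper.

```python
import math

def press_time(mass, distance, mu=0.2, g=9.81, k=1.0):
    """Button-press duration needed to slide a puck a given distance
    under sliding friction (illustrative constants).

    Assume the press delivers impulse J = k * t, so the launch speed is
    v0 = J / mass. Friction stops the puck after d = v0**2 / (2*mu*g),
    hence t = mass * sqrt(2*mu*g*d) / k: linear in mass,
    square-root in distance."""
    return mass * math.sqrt(2 * mu * g * distance) / k

# Doubling the mass doubles the required press time;
# quadrupling the distance only doubles it.
t0 = press_time(1.0, 1.0)
print(press_time(2.0, 1.0) / t0, press_time(1.0, 4.0) / t0)  # prints 2.0 2.0
```

This is exactly the relationship participants' button presses had to track in the sliding task.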
Wiley Interdisciplinary Reviews: Cognitive Science, Sep 22, 2010
Historically, the study of visual perception has followed a reductionist strategy, with the goal of understanding complex visually guided behavior through separate analysis of its elemental components. Recent developments in monitoring behavior, such as the measurement of eye movements in unconstrained observers, have allowed investigation of the use of vision in the natural world. This has led to a variety of insights that would be difficult to achieve in more constrained experimental contexts. In general, it shifts the focus of vision away from the properties of the stimulus toward a consideration of the behavioral goals of the observer. It appears that behavioral goals are a critical factor in controlling the acquisition of visual information from the world. This insight has been accompanied by a growing understanding of the importance of reward in modulating the underlying neural mechanisms and by theoretical developments using reinforcement learning models of complex behavior. These developments provide us with the tools to understand how tasks are represented in the brain, and how they control the acquisition of information through the use of gaze.
arXiv (Cornell University), Oct 5, 2021
Human mental processes allow for qualitative reasoning about causality in terms of mechanistic relations between the variables of interest, which we argue are naturally described by structural causal models (SCMs). Since interpretations are derived from mental models, the same applies to SCMs. By defining a metric space on SCMs, we provide a theoretical perspective on the comparison of mental models and thereby conclude that interpretations can be used to guide a learning system towards true causality. To this effect, we present a theoretical analysis from first principles that results in a human-readable interpretation scheme consistent with the provided causality, which we name structural causal interpretations (SCIs). Going further, we prove that any existing neural induction method (NIM) is in fact interpretable. Our first experiment (E1) assesses the quality of such NIM-based SCIs. In (E2) we observe evidence for our conjecture of improved sample efficiency for SCI-based learning. After conducting a small user study, in (E3) we observe superiority of human-based over NIM-based SCIs, in support of our initial hypothesis.
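One simple way to make the space of causal graphs a metric space, useful for comparing mental models as the abstract proposes, is the structural Hamming distance between adjacency matrices. This is an illustrative choice, not necessarily the metric used in the paper.

```python
import numpy as np

def structural_hamming_distance(A, B):
    """Number of edge insertions, deletions, or reversals needed to turn
    one causal graph into the other (adjacency matrices over the same
    ordered variable set; A[i][j] == 1 means an edge i -> j)."""
    A, B = np.asarray(A), np.asarray(B)
    diff = (A != B)
    # A reversed edge differs in two matrix entries but counts as one edit.
    reversals = (A == 1) & (B == 0) & (A.T == 0) & (B.T == 1)
    return int(diff.sum() - reversals.sum())

# Two three-variable causal graphs over (X, Y, Z):
# G1: X -> Y -> Z        G2: X -> Y <- Z
G1 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
G2 = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]
print(structural_hamming_distance(G1, G2))  # the Y-Z edge is reversed: prints 1
```

Under such a metric, "closeness" of a learner's induced SCM to a human's interpreted SCM becomes a quantity that can drive learning, which is the role the abstract assigns to interpretations.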
arXiv (Cornell University), Apr 16, 2021
We discuss a bivariate beta distribution that can model arbitrary beta-distributed marginals with a positive correlation. The distribution is constructed from six independent gamma-distributed random variates. We show how the parameters of the distribution can be fit to data using moment matching. Previous work used an approximate and sometimes inaccurate method to compute the covariance. Here, we derive all product moments and the exact covariance, which can easily be computed numerically. The bivariate case can be generalized to a multivariate distribution with arbitrary beta-distributed marginals. Furthermore, we generalize the distribution from two marginal beta distributions to two marginal Dirichlet distributions. The resulting correlated Dirichlet distribution makes it possible to model two correlated Dirichlet-distributed random vectors.
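The idea of building correlated beta variates from independent gammas can be illustrated with the classic three-gamma scheme of Olkin and Liu, in which the two beta marginals share one shape parameter; the paper's six-gamma construction generalizes this to arbitrary beta marginals. The shape parameters and sample size below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

def olkin_liu_bivariate_beta(a1, a2, a3, size):
    """Classic three-gamma construction of a positively correlated
    bivariate beta distribution (Olkin & Liu). The paper's six-gamma
    construction removes the shared-parameter restriction."""
    g1 = rng.gamma(a1, size=size)
    g2 = rng.gamma(a2, size=size)
    g3 = rng.gamma(a3, size=size)
    x = g1 / (g1 + g3)   # Beta(a1, a3) marginal
    y = g2 / (g2 + g3)   # Beta(a2, a3) marginal, correlated via shared g3
    return x, y

x, y = olkin_liu_bivariate_beta(2.0, 3.0, 4.0, 100_000)
# Sample means approach the Beta means 2/6 and 3/7; correlation is positive.
print(x.mean(), y.mean(), np.corrcoef(x, y)[0, 1])
```

Moment matching, as described in the abstract, would run this construction in reverse: choose the gamma shape parameters so that the sample means, variances, and covariance of the data are reproduced.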
Trade-off between uncertainty reduction and reward collection reveals intrinsic cost of gaze switches
Journal of Vision, Dec 5, 2022
Learning Individualized Automatic Content Magnification in Gaze-based Interaction
Optimal feedback control under uncertainty explains errors, variability and behavioral strategies in human navigation
What Can I Help You With: Towards Task-Independent Detection of Intentions for Interaction in a Human-Robot Environment
2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)
arXiv (Cornell University), Feb 17, 2019
The Federal Government of Germany aims to boost the research in the field of Artificial Intellige... more The Federal Government of Germany aims to boost the research in the field of Artificial Intelligence (AI). For instance, 100 new professorships are said to be established. However, the white paper of the government does not answer what an AI professorship is at all. In order to give colleagues, politicians, and citizens an idea, we present a view that is often followed when appointing professors for AI at German and international universities. We hope that it will help to establish a guideline with internationally accepted measures and thus make the public debate more informed. Zusammenfassung: Die deutsche Bundesregierung will die Forschung in der Künstlichen Intelligenz in Deutschland deutlich mehr fördern als bisher. Es sollen z.B. 100 neue Professuren entstehen. Allerdings beantwortet das Strategiepapier nicht, was eine Professur für Künstliche Intelligenz überhaupt ist. Um Kollegen, Politikern und Bürgern eine Idee zu geben, stellen wir eine Einordnung vor, wie sie in Berufungsverfahren an deutschen und internationalen Universitäten üblich ist. Wir hoffen, dass eine solche Einordnung einen Beitrag zu einem Leitfaden mit Messpunkten liefert und so die öffentliche Debatte aufgeklärter gestaltet. erkannte Notwendigkeit, Experten in der Quantifizierung und Formalisierung menschlicher Denkvorgänge, was heute als Kognitionswissenschaft gilt, einzubinden. Als thematische Liste wurde schon bei diesem Workshop Computer, Verarbeitung natürlicher Sprache, Neuronale Netzwerke, Theorie der Computation, Abstraktion und Kreativität genannt, welche noch heute relevant sind. 
Somit bezog sich der Begriff Künstliche Intelligenz von Anfang an unmittelbar auch auf menschliche Intelligenz, wobei es sich herausgestellt hat, dass eine klare Definition, was menschliche oder natürlich Intelligenz überhaupt sei, recht schwierig ist, und man mit dieser Unsicherheit leben muss, so wie in anderen akzeptierten Wissenschaftsdisziplinen wie z.B. der Psychologie, der Biologie oder der Ökonomie auch. Dementsprechend hat mit der Geburt der Künstliche Intelligenz auch eine weitere Wissenschaft das Licht der Welt erblickt, nämlich die Cognitive Science, die sich damit beschäftigt zu verstehen, was menschliche oder natürliche Intelligenz überhaupt ausmacht. So haben sich die beiden Felder der Künstlichen Intelligenz und der Cognitive Science immer wieder gegenseitig inspiriert und beeinflusst. Aber was ist denn nun KI? Im Zuge des Dartmouth Workshops führte John McCarthy den Begriff "Künstliche Intelligenz" (KI, engl. artificial intelligence) ein und definierte 3 ihn wie folgt: KI ist also die (Ingenieur)Wissenschaft intelligenter Maschinen, insbesondere intelligenter Computerprogramme. Das sind Algorithmen 4 -eindeutige Handlungsvorschriften zur Lösung eines Problems oder einer Klasse von Problemen und bei einer bestimmten Eingabe eine bestimmte Ausgabe überführen -des Problemlösens, Denkens und Lernens. Daher ist die KI-Forschung auch stark in der Informatik zu verankern.
Nature Machine Intelligence, Mar 23, 2022
Artificial writing is permeating our lives due to recent advances in large-scale, transformer-bas... more Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, its variants, GPT-2/3, and others. Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended state of the art for many NLP tasks and shown that they capture not only linguistic knowledge but also retain general knowledge implicitly present in the data. Unfortunately, LMs trained on unfiltered text corpora suffer from degenerated and biased behaviour. While this is well established, we show that recent LMs also contain human-like biases of what is right and wrong to do, some form of ethical and moral norms of the society -they bring a "moral direction" to surface. That is, we show that these norms can be captured geometrically by a direction, which can be computed, e.g., by a PCA, in the embedding space, reflecting well the agreement of phrases to social norms implicitly expressed in the training texts and providing a path for attenuating or even preventing toxic degeneration in LMs. Being able to rate the (non-)normativity of arbitrary phrases without explicitly training the LM for this task, we demonstrate the capabilities of the "moral direction" for guiding (even other) LMs towards producing normative text and showcase it on RealToxicityPrompts testbed, preventing the neural toxic degeneration in GPT-2. Large-scale, transformer-based language models (LMs) such as BERT [1], its variants [2, 3], GPT-2/3 [4], and others have shown improvements on various NLP tasks. By now, they are so good at generating human-like text that articles and social media often describe it as the "world's most impressive AI" and "terrifyingly good" . Several studies revealed improved syntactic and semantic abilities of large-scale transform-based LMs [6,, 10] compared to previous models such as RNNs. 
Furthermore, Talmor et al. [11] demonstrated that LMs exhibit reasoning abilities, although not in an abstract manner, and Roberts et al. showed that LMs' capability to store and retrieve knowledge scales with model size. Petroni et al. [13] demonstrated that, besides learning linguistic knowledge, recent transformer-based LMs even retain general knowledge implicitly present in the training data. While these successes are very exciting, there are also risks associated with developing them as also discussed in . Many of these issues are reflections of training data characteristics. Already language itself contains recoverable and accurate imprints of our historical biases, and Machine Learning algorithms such as LMs may capture these regularities, as e.g. Caliskan et al. [20] have demonstrated. Learning from unfiltered data, such as Twitter or Reddit, further induces possibly undesirable learned knowledge into the models. LMs used for downstream tasks such as credit risk prediction are propagating this implicit knowledge to the classifier, and LMs with generative capabilities are suffering from toxic degeneration [15], i.e. they are prone to generating non-normative
Although humans are prone to perceptual illusions and decision biases, they perform very well in ... more Although humans are prone to perceptual illusions and decision biases, they perform very well in every-day tasks with varying difficulties and complexities. It has been shown that humans learn to adopt to the statistical regularities of the environment. However, whether humans have correct physical intuitions about these ordinary processes and reflect related dynamics in an appropriate internal model has been disputed. Recent studies have shown that human behavior in diverse physical judgment tasks can indeed be explained with probabilistic models based on realistic, Newtonian functions while considering sensory uncertainties. Here, we examined whether humans use physical models of their environment in a control task, which involves non-linearities in the involved dynamics. Participants were asked to shoot a puck into a target area affected by realistic friction. By deploying Bayesian models we can show that humans are capable to adopt to these physical relationships and have appropriate internal beliefs about relevant quantities.
Perceptual explaining away in depth judgements
Journal of Vision, Sep 1, 2018
In many situations encountered in our daily lives where we have several options to choose from, w... more In many situations encountered in our daily lives where we have several options to choose from, we need to balance the amount of planning into the future with the number of alternatives we want to consider to achieve our long-term goals. A popular way to study these planning problems in controlled environments is maze-solving tasks, since they can be precisely defined and controlled in terms of their topology. In our study, participants solved mazes that differed systematically in topological properties regulating the number of alternatives and depth of paths. Replicating previous results, we show the influence of these spatial features on performance and stopping times. Longer and more branched solution paths lead to more planning effort and longer solution times. Additionally, we measured subjects' eye movements to investigate their planning horizon. Our results suggest that people decrease their planning depth with increasing number of alternatives.
Minimal sequential gaze models for inferring walkers' tasks
Eye movements in extended sequential behavior are known to reflect task demands much more than lo... more Eye movements in extended sequential behavior are known to reflect task demands much more than low-level feature saliency. However, the more naturalistic the task is the more difficult it becomes to establish what cognitive processes a particular task elicits moment by moment. Here we ask the question, which sequential model is required to capture gaze sequences so that the ongoing task can be inferred reliably. Specifically, we consider eye movements of human subjects navigating a walkway while avoiding obstacles and approaching targets in a virtual environment. We show that Hidden-Markov Models, which have been used extensively in modeling human sequential behavior, can be augmented with few state variables describing the egocentric position of subjects relative to objects in the environment to dramatically increase successful classification of the ongoing task and to generate gaze sequences, that are very close to those observed in human subjects.
Eye movements reflect active statistical learning
What is the link between eye movements and sensory learning? Although some theories have argued for an automatic interaction between what we know and where we look that continuously modulates human information gathering behavior during both implicit and explicit learning, there exists limited experimental evidence supporting such an ongoing interplay. To address this issue, we used a visual statistical learning paradigm combined with a gaze contingent stimulus presentation and manipulated the explicitness of the task to explore how learning and eye movements interact. During both implicit exploration and explicit visual learning of unknown composite visual scenes, spatial eye movement patterns systematically and gradually changed in accordance with the underlying statistical structure of the scenes. Moreover, the degree of change was directly correlated with the amount and type of knowledge the observers acquired. This suggests that eye movements are potential indicators of active learning, a process where long-term knowledge, current visual stimuli and an inherent tendency to reduce uncertainty about the visual environment jointly determine where we look.
arXiv (Cornell University), Nov 14, 2022
Frontiers in artificial intelligence, May 20, 2020
Allowing machines to choose whether to kill humans would be devastating for world peace and security. But how do we equip machines with the ability to learn ethical or even moral choices? In this study, we show that applying machine learning to human texts can extract deontological ethical reasoning about "right" and "wrong" conduct. We create a template list of prompts and responses, such as "Should I [action]?", "Is it okay to [action]?", etc., with corresponding answers of "Yes/no, I should (not)." and "Yes/no, it is (not)." The model's bias score is the difference between the model's score of the positive response ("Yes, I should") and that of the negative response ("No, I should not"). For a given choice, the model's overall bias score is the mean of the bias scores of all question/answer templates paired with that choice. Specifically, the resulting model, called the Moral Choice Machine (MCM), calculates the bias score on the sentence level using embeddings of the Universal Sentence Encoder, since the moral value of an action to be taken depends on its context. It is objectionable to kill living beings, but it is fine to kill time. It is essential to eat, yet one might not eat dirt. It is important to spread information, yet one should not spread misinformation. Our results indicate that text corpora contain recoverable and accurate imprints of our social, ethical and moral choices, even with context information. Indeed, training the Moral Choice Machine on temporal news and book corpora from 1510 to 2008/2009 demonstrates the evolution of moral and ethical choices over different time periods for both atomic actions and actions with context information. Training it on different cultural sources, such as the Bible and the constitutions of different countries, reveals the dynamics of moral choices in culture, including technology. That is, moral biases can be extracted, quantified, tracked, and compared across cultures and over time.
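The template-based bias score defined in this abstract (mean over templates of the positive-minus-negative response score) can be sketched as follows; the `score` function is a hypothetical stand-in for the model's embedding-based sentence scorer, and the two templates are illustrative, not the paper's full list:

```python
# Hypothetical sketch of the MCM bias score described above.
TEMPLATES = [
    ("Should I {a}?",      "Yes, I should.",    "No, I should not."),
    ("Is it okay to {a}?", "Yes, it is.",       "No, it is not."),
]

def bias_score(action, score):
    """Mean over templates of score(positive) - score(negative).
    `score(prompt, response)` stands in for the model's similarity
    score between a prompt and a candidate response."""
    diffs = []
    for question, pos, neg in TEMPLATES:
        prompt = question.format(a=action)
        diffs.append(score(prompt, pos) - score(prompt, neg))
    return sum(diffs) / len(diffs)
```

A positive bias score means the model leans toward the affirmative response for that action; a negative score toward the negative response.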
Author response: Putting perception into action with inverse optimal control for continuous psychophysics
Quantifying orientation biases across the visual field in humans and cats
Journal of Vision, Oct 20, 2020
bioRxiv (Cold Spring Harbor Laboratory), Dec 23, 2021
Psychophysical methods are a cornerstone of psychology, cognitive science, and neuroscience, where they have been used to quantify behavior and its neural correlates for a vast range of mental phenomena. Their power derives from the combination of controlled experiments and rigorous analysis through signal detection theory. Unfortunately, they require many tedious trials and preferably highly trained participants. A recently developed approach, continuous psychophysics, promises to transform the field by abandoning the rigid trial structure involving binary responses and replacing it with continuous behavioral adjustments to dynamic stimuli. However, what has precluded wide adoption of this approach is that current analysis methods recover perceptual thresholds that are an order of magnitude larger than those from equivalent traditional psychophysical experiments. Here we introduce a computational analysis framework for continuous psychophysics based on Bayesian inverse optimal control. We show via simulations and on previously published data that this not only recovers the perceptual thresholds but additionally estimates subjects' action variability, internal behavioral costs, and subjective beliefs about the experimental stimulus dynamics. Taken together, we provide further evidence for the importance of including action variability, subjective beliefs, and, crucially, the intrinsic costs of behavior, even in experiments seemingly investigating only perception.
PLOS Computational Biology, Oct 19, 2020
While interacting with objects during everyday activities, e.g. when sliding a glass on a countertop, people obtain constant feedback on whether they are acting in accordance with physical laws. However, classical research on intuitive physics has revealed that people's judgements systematically deviate from predictions of Newtonian physics. Recent research has explained at least some of these deviations not as a consequence of misconceptions about physics but instead as the consequence of the probabilistic interaction between inevitable perceptual uncertainties and prior beliefs. How intuitive physical reasoning relates to visuomotor actions is much less well known. Here, we present an experiment in which participants had to slide pucks under the influence of naturalistic friction in a simulated virtual environment. The puck was controlled by the duration of a button press, which needed to be scaled linearly with the puck's mass and with the square-root of initial distance to reach a target. Over four phases of the experiment, uncertainties were manipulated by altering the availability of sensory feedback and providing different degrees of knowledge about the physical properties of pucks. A hierarchical Bayesian model of the visuomotor interaction task incorporating perceptual uncertainty and press-time variability found substantial evidence that subjects adjusted their button presses so that the sliding was in accordance with Newtonian physics. After observing collisions between pucks, which were analyzed with a hierarchical Bayesian model of the perceptual observation task, subjects transferred the relative masses inferred perceptually to adjust subsequent sliding actions. Crucial in the modeling was the inclusion of a cost function, which quantitatively captures participants' implicit sensitivity to errors due to their motor variability.
Taken together, in the present experiment we find evidence that our participants transferred their intuitive physical reasoning to a subsequent visuomotor control task consistent with Newtonian physics and weighted potential outcomes with a cost function based on their knowledge of their own variability.
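The press-time scaling stated in this abstract (linear in mass, square-root in distance) follows from elementary friction physics: a puck launched at speed v under kinetic friction mu slides d = v^2 / (2*mu*g), so reaching distance d requires v = sqrt(2*mu*g*d); if a press of duration t (hypothetically) delivers impulse k*t = m*v, then t = (m/k)*sqrt(2*mu*g*d). A minimal sketch, with all parameter values invented:

```python
import math

def press_time(mass, distance, mu=0.3, g=9.81, k=1.0):
    """Press duration needed for a puck of the given mass to slide
    exactly `distance` under kinetic friction `mu`, assuming the
    (hypothetical) control mapping impulse = k * press_time.
    Stopping distance d = v^2 / (2*mu*g)  =>  v = sqrt(2*mu*g*d)."""
    v = math.sqrt(2 * mu * g * distance)  # required launch speed
    return mass * v / k                   # t = m*v / k
```

Doubling the mass doubles the required press time, while quadrupling the distance only doubles it, which is exactly the scaling participants had to learn.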
Wiley Interdisciplinary Reviews: Cognitive Science, Sep 22, 2010
Historically, the study of visual perception has followed a reductionist strategy, with the goal of understanding complex visually guided behavior by separate analysis of its elemental components. Recent developments in monitoring behavior, such as measurement of eye movements in unconstrained observers, have allowed investigation of the use of vision in the natural world. This has led to a variety of insights that would be difficult to achieve in more constrained experimental contexts. In general, it shifts the focus of vision away from the properties of the stimulus toward a consideration of the behavioral goals of the observer. It appears that behavioral goals are a critical factor in controlling the acquisition of visual information from the world. This insight has been accompanied by a growing understanding of the importance of reward in modulating the underlying neural mechanisms and by theoretical developments using reinforcement learning models of complex behavior. These developments provide us with the tools to understand how tasks are represented in the brain, and how they control acquisition of information through the use of gaze.
arXiv (Cornell University), Oct 5, 2021
Human mental processes allow for qualitative reasoning about causality in terms of mechanistic relations among the variables of interest, which we argue are naturally described by structural causal models (SCMs). Since interpretations are derived from mental models, the same applies to SCMs. By defining a metric space on SCMs, we provide a theoretical perspective on the comparison of mental models and thereby conclude that interpretations can be used to guide a learning system towards true causality. To this effect, we present a theoretical analysis from first principles that results in a human-readable interpretation scheme consistent with the provided causality, which we name structural causal interpretations (SCIs). Going further, we prove that any existing neural induction method (NIM) is in fact interpretable. Our first experiment (E1) assesses the quality of such NIM-based SCIs. In (E2) we observe evidence for our conjecture on improved sample efficiency for SCI-based learning. After conducting a small user study, in (E3) we observe the superiority of human-based over NIM-based SCIs, in support of our initial hypothesis.
arXiv (Cornell University), Apr 16, 2021
We discuss a bivariate beta distribution that can model arbitrary beta-distributed marginals with a positive correlation. The distribution is constructed from six independent gamma-distributed random variates. We show how the parameters of the distribution can be fit to data using moment matching. Previous work used an approximate and sometimes inaccurate method to compute the covariance. Here, we derive all product moments and the exact covariance, which can easily be computed numerically. The bivariate case can be generalized to a multivariate distribution with arbitrary beta-distributed marginals. Furthermore, we generalize the distribution from two marginal beta to two marginal Dirichlet distributions. The resulting correlated Dirichlet distribution makes it possible to model two correlated Dirichlet-distributed random vectors.
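The shared-component idea behind such constructions can be sketched as follows, using the fact that Beta(p, q) = G_p / (G_p + G_q) for independent gamma variates. This is one standard way to obtain positively correlated beta marginals from six gammas; the paper's exact parameterization may differ:

```python
import random

def correlated_betas(a1, b1, a2, b2, c, d, rng):
    """Draw one (X, Y) pair from six independent unit-scale gamma
    variates. Marginals: X ~ Beta(a1+c, b1+d), Y ~ Beta(a2+c, b2+d);
    the shared variates g_c and g_d induce a positive correlation.
    (One plausible shared-component construction, not necessarily
    the paper's.)"""
    ga1, gb1, ga2, gb2, gc, gd = (
        rng.gammavariate(shape, 1.0) for shape in (a1, b1, a2, b2, c, d))
    x = (ga1 + gc) / (ga1 + gc + gb1 + gd)
    y = (ga2 + gc) / (ga2 + gc + gb2 + gd)
    return x, y
```

Because sums of independent gammas are again gamma-distributed, each marginal remains exactly beta, while the covariance grows with the weight of the shared components c and d.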
Trade-off between uncertainty reduction and reward collection reveals intrinsic cost of gaze switches
Journal of Vision, Dec 5, 2022
Dynamic Coordination in the Brain, Aug 2, 2010
Effective perception requires the integration of many noisy and ambiguous sensory signals across different modalities (e.g., vision, audition) into stable percepts. This chapter discusses some of the core questions related to sensory integration: Why does the brain integrate sensory signals, and how does it do so? How does it learn this ability? How does it know when to integrate signals and when to treat them separately? How dynamic is the process of sensory integration?
'Computational Modeling of Multisensory Object Perception'
th.physik.uni-frankfurt.de
Abstract: Computational modeling largely based on advances in artificial intelligence and machine learning has helped further the understanding of some of the principles and mechanisms of multisensory object perception. Furthermore, this theoretical work has led to the ...
Generalization and Interference in Human Motor Control
Computational and Robotic Models of the Hierarchical Organization of Behavior, 2013
th.physik.uni-frankfurt.de
Computational modeling largely based on advances in artificial intelligence and machine learning has helped further the understanding of some of the principles and mechanisms of multisensory object perception. Furthermore, this theoretical work has led to the development of new experimental paradigms and to important new questions. The last 20 years have seen an increasing emphasis on models that explicitly compute with uncertainties, a crucial aspect of the relation between sensory signals and states of the world. Bayesian models allow for the formulation of such relationships and also of explicit optimality criteria against which human performance can be compared. They therefore allow answering the question of how close human performance comes to a specific formulation of best performance. Maybe even more importantly, Bayesian methods allow quantitatively comparing different models by how well they account for observed data. The success of such techniques in explaining perceptual phenomena has also led to a large number of new open questions, especially about how the brain is able to perform computations that are consistent with these functional models and about the origin of the algorithms in the brain. We briefly review some key empirical evidence of crossmodal perception and proceed to give an overview of the computational principles evident from this work. The presentation of current modeling approaches to multisensory perception considers Bayesian models, models at an intermediate level, and neural models implementing multimodal computations. Finally, this chapter specifically emphasizes current open questions in theoretical models of multisensory object perception.
Learning and Coordinating Repertoires of Behaviors with Common Reward: Credit Assignment and Module Activation
Computational and Robotic Models of the Hierarchical Organization of Behavior, 2013
Modelling the dynamics of visual attention under uncertainty
Journal of vision, 2015
While several recent studies have established the optimality of spatial targeting of gaze in humans, it is still unknown whether this optimality extends to the timing of gaze in time-varying environments. Moreover, it is unclear to what extent visual attention is guided by learning processes, which facilitate adaptation to changes in the world. We present empirical evidence for significant changes in attentive visual behavior due to an observer's experience and learning. Crucially, we present a hierarchical Bayesian model that not only fits our behavioral data but also explains dynamic changes of gaze patterns in terms of learning. We devised a controlled experiment to investigate how humans divide their attentional resources over time among multiple targets in order to achieve task-specific goals. Eye movement data was collected from the participants. During each trial, three stimuli were presented on a computer screen arranged in a triangle. Each stimulus consisted of a small d...
Learning when to blink: Environmental statistics guide blinking behavior
Journal of Vision
PLOS Computational Biology
Although a standard reinforcement learning model can capture many aspects of reward-seeking behaviors, it may not be practical for modeling human natural behaviors because of the richness of dynamic environments and limitations in cognitive resources. We propose a modular reinforcement learning model that addresses these factors. Based on this model, a modular inverse reinforcement learning algorithm is developed to estimate both the rewards and discount factors from human behavioral data, which allows predictions of human navigation behaviors in virtual reality with high accuracy across different subjects and with different tasks. Complex human navigation trajectories in novel environments can be reproduced by an artificial agent that is based on the modular model. This model provides a strategy for estimating the subjective value of actions and how they influence sensory-motor decisions in natural behavior.
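The general modular idea, each module maintaining its own Q-values, reward signal, and discount factor, with actions chosen by summing module values, can be sketched as follows. This is a toy illustration of module-wise Q-learning with summed action values, not the paper's fitted algorithm, and all numbers are invented:

```python
class Module:
    """One behavioral module with its own Q-table, reward channel,
    and module-specific discount factor."""
    def __init__(self, n_states, n_actions, gamma, alpha=0.1):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.gamma = gamma   # module-specific discount factor
        self.alpha = alpha   # learning rate

    def update(self, s, a, reward, s_next):
        # Standard Q-learning update driven by this module's reward.
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

def choose_action(modules, states):
    """Greedy action maximizing the summed Q-values of all modules,
    each module seeing its own (typically much smaller) state."""
    n_actions = len(modules[0].q[0])
    totals = [sum(m.q[s][a] for m, s in zip(modules, states))
              for a in range(n_actions)]
    return max(range(n_actions), key=totals.__getitem__)
```

Because each module only tracks the state variables relevant to its own sub-goal (e.g. the nearest obstacle or target), the combined model stays tractable where a single monolithic state space would not.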
Adaptation to environmental statistics in an action control task
2019 Conference on Cognitive Computational Neuroscience
Steering a car to intercept a moving target: Can people learn a better interception solution?
Journal of Vision
Perceptual explaining away in depth judgements
Journal of Vision
Bayesian modeling of task dependent visual attention strategy in a virtual reality environment
Journal of …, Jan 1, 2005
Abstract: The deployment of visual attention is commonly framed as being determined by the properties of the visual scene. Top-down factors have been acknowledged, but they have been described as modulating bottom-up saliency. Alternative models have instead proposed to understand visual attention in terms of the requirements of goal-directed tasks. In such a setting, the underlying task structure, rather than stimulus saliency, explains the observed fixation patterns.
Learning visual representations with projection pursuit
Journal of Vision, Jan 1, 2005
Visual cortex must calibrate the receptive fields of billions of neurons in a hierarchy of maps. Modeling this process is daunting, but a promising direction is minimum description length (MDL) theory. In MDL, the cortex builds a theory of itself and does this by trading off the bits to ...
A walk through the woods explains the space variant oblique effect
Journal of Vision, Jan 1, 2009
To investigate the role of higher-order statistics and task behavior, we obtain image sequences by simulating an agent navigating through simulated wooded environments. Unsupervised learning algorithms are used to learn a sparse code of the images, but contrary to previous ...
Predictive eye movements in physically possible and impossible worlds: Evidence for internal models
Journal of Vision, Jan 1, 2006
Subjects were immersed in a virtual scene and hit virtual balls directed towards them with a table tennis paddle, receiving vibrotactile feedback when hitting and audible feedback from the bounce. This setup allows controlling the trajectories, the physical properties of the ball ...
Human eye movements correlate with intrinsic reward structure in natural visuomotor tasks
Journal of Vision, Jan 1, 2008
Neurophysiological and psychophysical studies in primates and humans have shown the pervasive role of reward in the learning of visuomotor activities. Eye movements and hand movements to targets are executed so as to maximize reward. Moreover, reinforcement learning algorithms ...
Psychophysical studies in humans have demonstrated that visuomotor behavior such as rapid hand movements to targets is sensitive to the reward structure of the goal and is executed so as to maximize reward [1]. Similarly, animal studies have shown that learning of such behaviors is driven by reward. Evidence suggests that the neurotransmitter dopamine is involved in the process of learning visuomotor behaviors [2].
Evidence suggests that the neurotransmitter dopamine is involved in the process of learning visuomotor behaviors [1]. Moreover, reinforcement learning (RL) algorithms have been formulated that characterize well the neuronal signals of dopaminergic neurons in response to the occurrence of stimuli associated with rewards and the delivery of those rewards across learning [2]. Such algorithms have been successful in modeling those responses in cases where only a single variable describes the current state of the world.
Theories of optimal coding propose to understand early sensory processing of stimuli as being adapted to the statistics of the signals naturally occurring when interacting with the environment [1, 2, 3]. Relating this approach to vision, the regularities of natural image sequences have been investigated, e.g. [4, 5]. But given that perception is active, the stimuli at the retina depend on oculomotor control and therefore on the executed task [6]. How do different tasks affect the statistics of image features at the fixation location?