
Surprise Acts as a Reducer of Outcome Value in Human Reinforcement Learning

Surprise arises from a difference between the actual outcome of a decision and its predicted outcome (the prediction error), regardless of whether the error is positive or negative. It has recently been postulated that surprise affects the reward value of the action outcome; studies have indicated that increasing surprise, measured as the absolute value of the prediction error, decreases the value of the outcome. However, how surprise affects the value of the outcome and subsequent decision making is unclear. We suggest that, on the assumption that surprise decreases the outcome value, agents will make more risk-averse choices when an outcome is often surprising. Here, we propose the surprise-sensitive utility model, a reinforcement learning model in which surprise decreases the outcome value, to explain how surprise affects subsequent decision making. To investigate the properties of the proposed model, we compare it in simulations with previous reinforcement learning models on two probabilistic learning tasks. The simulations show that the proposed model, like the previous models, explains risk-averse choices, and that risk-averse choices increase as the surprise-based modulation parameter of outcome value increases. We also performed statistical model selection using two experimental datasets with different tasks. The proposed model fits these datasets better than the other models with the same number of free parameters, indicating that the model can better capture the trial-by-trial dynamics of choice behavior.
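The abstract does not state the model's equations, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes that the utility of an outcome is the reward minus d times the absolute prediction error, that values are updated with a standard delta rule, and that choices follow a softmax rule. The names `alpha`, `beta`, `d`, and the two-option task (a sure reward of 0.5 versus a 50/50 gamble between 0 and 1) are hypothetical choices made for this example.

```python
import math
import random


def surprise_sensitive_update(q, reward, alpha, d):
    """One value update in which surprise (|prediction error|) lowers outcome utility."""
    delta = reward - q                  # prediction error
    utility = reward - d * abs(delta)   # ASSUMED form: surprise subtracts from outcome value
    return q + alpha * (utility - q)


def softmax_choice(q_values, beta, rng):
    """Pick an option index with probability proportional to exp(beta * Q)."""
    weights = [math.exp(beta * q) for q in q_values]
    threshold = rng.random() * sum(weights)
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1


def simulate(d, n_trials=2000, alpha=0.2, beta=3.0, seed=0):
    """Return the fraction of safe choices for a surprise parameter d (assumed 0 <= d < 1)."""
    rng = random.Random(seed)
    q = [0.0, 0.0]  # q[0]: safe option, q[1]: risky option
    safe_choices = 0
    for _ in range(n_trials):
        choice = softmax_choice(q, beta, rng)
        if choice == 0:
            reward = 0.5  # sure outcome
            safe_choices += 1
        else:
            reward = 1.0 if rng.random() < 0.5 else 0.0  # same mean, higher variance
        q[choice] = surprise_sensitive_update(q[choice], reward, alpha, d)
    return safe_choices / n_trials


if __name__ == "__main__":
    for d in (0.0, 0.25, 0.5):
        print(f"d = {d:.2f}: fraction of safe choices = {simulate(d):.3f}")
```

Under these assumptions, the risky option keeps producing nonzero prediction errors, so its learned value settles below its mean reward whenever d > 0, while the value of the sure option still converges to 0.5. This is one way to see why a larger surprise parameter would push choices toward the safe option.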

Behavioral reactions reflecting differential reward expectations in monkeys

Experimental Brain Research, 2001

Learning theory emphasizes the importance of expectations in the control of instrumental action. This study investigated the variation of behavioral reactions toward different rewards as an expression of differential expectations of outcomes in primates. We employed several versions of two basic behavioral paradigms, the spatial delayed response task and the delayed reaction task. These tasks are commonly used in neurobiological studies of working memory, movement preparation, and event expectation involving the frontal cortex and basal ganglia. An initial visual instruction stimulus indicated to the animal which one of several food or liquid rewards would be delivered after each correct behavioral response, or whether or not a reward could be obtained. We measured the reaction times of the operantly conditioned arm movement necessary for obtaining the reward, and the durations of anticipatory licking prior to liquid reward delivery as a Pavlovian conditioned response. The results showed that both measures varied depending on the reward predicted by the initial instruction. Arm movements were performed with significantly shorter reaction times for foods or liquids that were more preferred by the animal than for less preferred ones. Still larger differences were observed between rewarded and unrewarded trials. An interesting effect was found in unrewarded trials, in which reaction times were significantly shorter when a highly preferred reward was delivered in the alternative rewarded trials of the same trial block as compared to a less preferred reward. Anticipatory licks preceding the reward were significantly longer when highly preferred rather than less preferred rewards, or no rewards, were predicted. These results demonstrate that behavioral reactions preceding rewards may vary depending on the predicted future reward and suggest that monkeys differentially expect particular outcomes in the presently investigated tasks.

Corrugator activity confirms immediate negative affect in surprise

Frontiers in Psychology, 2015

The emotion of surprise entails a complex of immediate responses, such as cognitive interruption, attention allocation to, and more systematic processing of the surprising stimulus. All these processes serve the ultimate function to increase processing depth and thus cognitively master the surprising stimulus. The present account introduces phasic negative affect as the underlying mechanism responsible for this switch in operating mode. Surprising stimuli are schema-discrepant and thus entail cognitive disfluency, which elicits immediate negative affect. This affect in turn works like a phasic cognitive tuning switching the current processing mode from more automatic and heuristic to more systematic and reflective processing. Directly testing the initial elicitation of negative affect by surprising events, the present experiment presented high and low surprising neutral trivia statements to N = 28 participants while assessing their spontaneous facial expressions via facial electromy...

Task Learnability Modulates Surprise but Not Valence Processing for Reinforcement Learning in Probabilistic Choice Tasks

Journal of Cognitive Neuroscience, 2021

The goal of temporal difference (TD) reinforcement learning is to maximize outcomes and improve future decision-making. It does so by utilizing a prediction error (PE), which quantifies the difference between the expected and the obtained outcome. In gambling tasks, however, decision-making cannot be improved because of the lack of learnability. On the basis of the idea that TD utilizes two independent bits of information from the PE (valence and surprise), we asked which of these aspects is affected when a task is not learnable. We contrasted behavioral data and ERPs in a learning variant and a gambling variant of a simple two-armed bandit task, in which outcome sequences were matched across tasks. Participants were explicitly informed that feedback could be used to improve performance in the learning task but not in the gambling task, and we predicted a corresponding modulation of the aspects of the PE. We used a model-based analysis of ERP data to extract the neural footprints of...

The Cognitive-Evolutionary Model of Surprise: A Review of the Evidence

Topics in Cognitive Science

Research on surprise relevant to the cognitive-evolutionary model of surprise proposed by Meyer, Reisenzein, and Schützwohl (1997) is reviewed. The majority of the assumptions of the model are found empirically supported. Surprise is evoked by unexpected (schema-discrepant) events, whereas the novelty and the valence of the eliciting events probably do not have an independent effect. Unexpected events cause an automatic

Predicting Affective Responses to Unexpected Outcomes

Organizational Behavior and Human Decision Processes, 2001

In decisions under uncertainty, decision makers confront two uncertainties: the uncertain linkage between actions and outcomes and the uncertain linkage between these outcomes and their affective responses to them. The two studies reported here examine affective responses to expected and unexpected outcomes in various settings. In Study 1, a scenario-based laboratory experiment (N = 149), we examined subjects' predicted responses to a range of outcomes, as a function of how surprising the outcome was. Study 2, a field study (N = 127), involved the expectations of bowlers about their scores in an upcoming game and about their responses to various outcomes at, above, and below expectations. We also measured actual affective reactions after the bowlers had completed their games. Findings suggest that subjects both expect and experience a loss-averse, expectation-based value function broadly of the Prospect Theory type. They also anticipate, and experience, an amplifying effect of outcome surprise, though they underestimate its size. We argue that such underestimation, together with overtight prediction ranges, may expose subjects to much larger affective variation with outcome variability than they anticipate.

Surprise in Decision Making Under Uncertainty

1999

In four experiments we investigate over- and underweighting of probabilities in decisions under risk. To account for this phenomenon we propose a view of the probability weighting function as a composite of cognitive and emotional processes and suggest that there is no single weighting function but two separate weighting functions for each process. Data obtained from a rating as well as three choice experiments, using both between and within subjects designs, generally support the proposed view. Given this broader perspective, cognitive "biases" or "errors" may turn out as highly intelligent solutions to maximize utility. * Dr. Eduard Brandstätter, Dept. of Social and Economic Psychology, University of Linz, A-4040 Linz, Austria. Phone: 0043-732-2468-578, Fax: 0043-732-2468-9315. E-mail: e.brandstaetter@jk.uni-linz.ac.at ** Dr. Anton Kühberger, Dept. of Psychology, University of Salzburg, A-5020 Salzburg, Austria. Phone: 0043-662-8044-5112, Fax: 0043-662-8044-5126....

Intrinsic Emotional Relevance of Outcomes and Prediction Error

Journal of Psychophysiology, 2012

Infrequent events, such as the unexpected absence of outcomes (prediction errors), have a detrimental effect on performance in the subsequent trial in various cognitive tasks. In the present event-related potential study, we tested whether the influence of prediction error manifests itself in the early cortical processing of subsequent stimuli. Participants performed a reversal learning task in which they saw two alternating pairs of faces and indicated for each pair which one would have a declared target stimulus on its nose. The target switched to the other face after several consecutive trials with correct responses, thereby inducing a prediction error, with the switch being indicated by the appearance of a disk (unexpected neutral outcome) or a spider (unexpected unpleasant outcome), depending on the condition. Results showed that after both unexpected and expected unpleasant outcomes, the amplitude of P2 decreased, while after both unexpected neutral and unpleasant outcomes, the amplitu...

Surprise in Decision Making Under Uncertainty

1999

In four experiments we investigate over- and underweighting of probabilities in decisions under risk. To account for this phenomenon we propose a view of the probability weighting function as a composite of cognitive and emotional processes and suggest that there is no single weighting function but two separate weighting functions for each process. Data obtained from a rating as well as three choice experiments, using both between and within subjects designs, generally support the proposed view. Given this broader perspective, cognitive "biases" or "errors" may turn out as highly intelligent solutions to maximize utility.

Anticipated Emotions as Guides to Choice

Current Directions in Psychological Science, 2001

When making decisions, people often anticipate the emotions they might experience as a result of the outcomes of their choices. In the process, they simulate what life would be like with one outcome or another. We examine the anticipated and actual pleasure of outcomes and their relation to choices people make in laboratory studies and real-world studies. We offer a theory of anticipated pleasure that explains why the same outcome can lead to a wide range of emotional experiences. Finally, we show how anticipated pleasure relates to risky choice within the framework of subjective expected pleasure theory.