Behavioral reactions reflecting differential reward expectations in monkeys

Reward anticipation, cognition, and electrodermal activity in the conditioned monkey

Experimental Brain Research, 2003

In the present report, we examine electrodermal activity (skin conductance responses, SCRs) in monkeys trained to perform target-selection (TS) tests. In each test, the animal was presented in successive trials with the same two unequally rewarded targets on a touch screen. The probabilistic reward contingencies associated with each target made selecting the better target difficult. Our findings revealed SCRs time-locked to the arm movements toward the rewarded targets, occurring after the target touches. SCR parameters remained stable as the uncertainty of the choices and of the outcomes varied. The results support the hypothesis that the physiological processes indexed by the SCRs are a correlate of anticipatory appetitive behavior. In contrast, there is no evidence that the SCRs reflect cognitive processes associated with detection of the better target.

Learning what to expect and when to expect it involves dissociable neural systems

Neurobiology of Learning and Memory, 2018

Two experiments with Long-Evans rats examined the potential independence of learning about different features of food reward, namely, "what" reward is to be expected and "when" it will occur. This was examined by investigating the effects of selective reward devaluation upon responding in an instrumental peak timing task in Experiment 1 and by exploring the effects of pre-training lesions targeting the basolateral amygdala (BLA) upon the selective reward devaluation effect and interval timing in a Pavlovian peak timing task in Experiment 2. In both tasks, two stimuli, each 60 s long, signaled that qualitatively distinct rewards (different flavored food pellets) could occur after 20 s. Responding on non-rewarded probe trials displayed the characteristic peak timing function, with mean responding gradually increasing and peaking at approximately 20 s before more gradually declining thereafter. One of the rewards was then independently paired repeatedly with LiCl injections in order to devalue it, whereas the other reward was unpaired with these injections. In a final set of test sessions in which both stimuli were presented without rewards, responding was selectively reduced in the presence of the stimulus signaling the devalued reward compared to the stimulus signaling the still-valued reward. Moreover, the timing function was mostly unaltered by this devaluation manipulation. Experiment 2 showed that pre-training BLA lesions abolished this selective reward devaluation effect but had no impact on the peak timing functions shown by the two stimuli. It appears from these data that learning about the "what" and "when" features of reward may entail separate underlying neural systems.

Dopamine neuronal responses in monkeys performing visually cued reward schedules

European Journal of Neuroscience, 2006

Dopamine neurons are important for reward-related behaviours. They have been recorded during classical conditioning and operant tasks with stochastic reward delivery. However, daily behaviour, although frequently complex in the number of steps, is often very predictable. We studied the responses of 75 dopamine neurons during schedules of trials in which the events and related reward contingencies could be well-predicted, within and across trials. In this visually cued reward schedule task, a visual cue tells the monkeys exactly how many trials, 1, 2, 3, or 4, must be performed to obtain a reward. The number of errors became larger as the number of trials remaining before the reward increased. Dopamine neurons frequently responded to the cues at the beginning and end of the schedules. Approximately 75% of the first-cue responsive neurons did not distinguish among the schedules that were beginning even though the cues were different. Approximately half of the last-cue responsive neurons depended on which schedule was ending, even though the cue signalling the last trial was the same in all schedules. Thus, the responses were related to what the monkey knew about the relation between the cues and the schedules, not the identity of the cues. These neurons also frequently responded to the go signal and/or to the OK signal indicating the end of a correctly performed trial whether a reward was forthcoming or not, and to the reward itself. Thus, dopamine neurons seem to respond to behaviourally important, i.e. salient, events even when the events have been well-predicted.

Reward prediction in primate basal ganglia and frontal cortex

Neuropharmacology, 1998

Reward information is processed in a limited number of brain structures, including fronto-basal ganglia systems. Dopamine neurons respond phasically to primary rewards and reward-predicting stimuli depending on reward unpredictability but without discriminating between rewards. These responses reflect 'errors' in the prediction of rewards, in correspondence with learning theories, and thus may constitute teaching signals for appetitive learning. Neurons in the striatum (caudate, putamen, ventral striatum) code reward predictions in a different manner. They are activated during several seconds when animals expect predicted rewards. During learning, these activations occur initially in rewarded and unrewarded trials and become subsequently restricted to rewarded trials. This occurs in parallel with the adaptation of reward expectations by the animals, as inferred from their behavioral reactions. Neurons in orbitofrontal cortex respond differentially to stimuli predicting different liquid rewards, without coding spatial or visual features. Thus, different structures process reward information in different ways. Whereas dopamine neurons emit a reward teaching signal without indicating the specific reward, striatal neurons adapt expectation activity to new reward situations, and orbitofrontal neurons process the specific nature of rewards. These reward signals need to cooperate in order for reward information to be used for learning and maintaining approach behavior.
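The prediction-error account summarized above is commonly formalized with a Rescorla-Wagner-style update, in which the teaching signal is the difference between received and predicted reward. The following is an illustrative sketch of that standard formulation, not the authors' own model; the learning rate and trial structure are assumed for illustration:

```python
# Minimal Rescorla-Wagner-style prediction-error update.
# delta = r - V is the 'error' described in the abstract: large when the
# reward is unpredicted, shrinking toward zero as the prediction is learned.

def train(rewards, alpha=0.1):
    """Update a scalar reward prediction V across trials; return (V, errors)."""
    V = 0.0
    errors = []
    for r in rewards:
        delta = r - V          # prediction error (teaching signal)
        V += alpha * delta     # prediction moves toward the observed reward
        errors.append(delta)
    return V, errors

V, errors = train([1.0] * 50)  # a fully predictable reward on every trial
# Early trials produce large positive errors; late trials produce errors near
# zero, mirroring the loss of the phasic response to well-predicted rewards.
```

On this account, a predicted reward elicits no error (delta near 0), which is why established reward predictions no longer drive the phasic dopamine response.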

Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task

The Journal of Neuroscience, 1993

The present investigation had two aims: (1) to study responses of dopamine neurons to stimuli with attentional and motivational significance during several steps of learning a behavioral task, and (2) to study the activity of dopamine neurons during the performance of cognitive tasks known to be impaired after lesions of these neurons. Monkeys that had previously learned a simple reaction time task were trained to perform a spatial delayed response task via two intermediate tasks. During the learning of each new task, a total of 25% of 76 dopamine neurons showed phasic responses to the delivery of primary liquid reward, whereas only 9% of 163 neurons responded to this event once task performance was established. This produced an average population response during but not after learning of each task. Reward responses during learning were significantly more numerous and pronounced in area A10, as compared to areas A8 and A9. Dopamine neurons also showed phasic responses to the two con...

Effects of Expectations for Different Reward Magnitudes on Neuronal Activity in Primate Striatum

Journal of Neurophysiology, 2003


Reward Processing in Primate Orbitofrontal Cortex and Basal Ganglia

Cerebral Cortex, 2000

This article reviews and interprets neuronal activities related to the expectation and delivery of reward in the primate orbitofrontal cortex, in comparison with slowly discharging neurons in the striatum (caudate, putamen and ventral striatum, including nucleus accumbens) and midbrain dopamine neurons. Orbitofrontal neurons showed three principal forms of reward-related activity during the performance of delayed response tasks, namely responses to reward-predicting instructions, activations during the expectation period immediately preceding reward and responses following reward. These activations discriminated between different rewards, often on the basis of the animals' preferences. Neurons in the striatum were also activated in relation to the expectation and detection of reward but in addition showed activities related to the preparation, initiation and execution of movements which reflected the expected reward. Dopamine neurons responded to rewards and reward-predicting stimuli, and coded an error in the prediction of reward. Thus, the investigated cortical and basal ganglia structures showed multiple, heterogeneous, partly simultaneous activations which were related to specific aspects of rewards. These activations may represent the neuronal substrates of rewards during learning and established behavioral performance. The processing of reward expectations suggests an access to central representations of rewards which may be used for the neuronal control of goal-directed behavior.

Influence of Reward Expectation on Behavior-Related Neuronal Activity in Primate Striatum

Journal of Neurophysiology, 1998

Hollerman, Jeffrey R., Léon Tremblay, and Wolfram Schultz. Influence of reward expectation on behavior-related neuronal activity in primate striatum. J. Neurophysiol. 80: 947–963, 1998. Rewards constitute important goals for voluntary behavior. This study aimed to investigate how expected rewards influence behavior-related neuronal activity in the anterior striatum. In a delayed go-nogo task, monkeys executed or withheld a reaching movement and obtained liquid or sound as reinforcement. An initial instruction picture indicated the behavioral reaction to be performed and the reinforcer to be obtained after a subsequent trigger stimulus. Movements varied according to the reinforcers predicted by the instructions, suggesting that animals differentially expected the two outcomes. About 250 of nearly 1,500 neurons in anterior parts of caudate nucleus, putamen, and ventral striatum showed typical task-related activations that reflected the expectation of instructions and trigger, and the ...

Waiting by mistake: Symbolic representation of rewards modulates intertemporal choice in capuchin monkeys, preschool children and adult humans

Cognition, 2014

In the Delay choice task subjects choose between a smaller immediate option and a larger delayed option. This paradigm, also known as the intertemporal choice task, is frequently used to assess delay tolerance, interpreting a preference for the larger delayed option as willingness to wait. However, in the Delay choice task subjects face a dilemma between two preferred responses: "go for more" (i.e., selecting the larger, but delayed, option) vs. "go for sooner" (i.e., selecting the immediate, but smaller, option). When the options consist of visible food amounts, at least some of the choices of the larger delayed option might be due to a failure to inhibit a prepotent response towards the larger option rather than to a sustained delay tolerance. To disentangle this issue, we tested 10 capuchin monkeys, 101 preschool children, and 88 adult humans in a Delay choice task with food, low-symbolic tokens (objects that can be exchanged for food and have a one-to-one correspondence with food items), and high-symbolic tokens (objects that can be exchanged for food and have a one-to-many correspondence with food items). This allowed us to evaluate how different ways of representing rewards modulate the relative contribution of the "go for more" and "go for sooner" responses. Consistent with the idea that choices for the delayed option are sometimes due to a failure to inhibit the prepotent response for the larger quantity, we expected high-symbolic tokens to decrease the salience of the larger option, thus reducing "go for more" responses. In fact, previous findings have shown that inhibiting prepotent responses for quantity is easier when the problem is framed in a symbolic context. Overall, opting for the larger delayed option in the visible-food version of the Delay choice task seems to result partly from an impulsive preference for quantity rather than from a sustained delay tolerance.
In capuchins and children, high-symbolic stimuli decreased the individual's preference for the larger reward by distancing them from its appetitive features. Conversely, the sophisticated symbolic skills of adult humans prevented the distancing effect of high-symbolic stimuli in this population, although this result may be due to methodological differences between adult humans and the other two populations under study. Our data extend the knowledge concerning the influence of symbols on both human and non-human primate behavior and add a new element to the interpretation of the Delay choice task. Since high-symbolic stimuli decrease the individual's preference for the larger reward by eliminating those choices due to prepotent responses towards the larger quantity, they allow better discrimination of responses based on genuine delay aversion. Thus, these findings invite greater caution in interpreting the results obtained with the visible-food version of the Delay choice task, which may overestimate delay tolerance.
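The smaller-sooner vs. larger-later trade-off at the heart of the Delay choice task is often modeled with hyperbolic discounting, under which a reward's subjective value falls with delay as V = A / (1 + kD). This is a standard illustrative model, not the analysis used by these authors; the amounts, delays, and discount rates below are hypothetical:

```python
# Hyperbolic discounting sketch for a smaller-sooner vs. larger-later choice.
# V = A / (1 + k * D): A = reward amount, D = delay, k = individual discount rate.

def discounted_value(amount, delay, k):
    """Subjective value of a reward of the given amount after the given delay."""
    return amount / (1.0 + k * delay)

def prefers_larger_later(a_small, d_small, a_large, d_large, k):
    """True if the larger delayed option has the higher discounted value."""
    return discounted_value(a_large, d_large, k) > discounted_value(a_small, d_small, k)

# Hypothetical choice: 2 pellets now vs. 6 pellets after a 30 s delay.
patient = prefers_larger_later(2, 0, 6, 30, k=0.05)    # shallow discounter waits
impulsive = prefers_larger_later(2, 0, 6, 30, k=0.50)  # steep discounter does not
```

The paper's point can be read against this model: in the visible-food condition, apparent "waiting" can inflate the larger-later choice rate beyond what the subject's discount rate alone would predict, because some of those choices reflect an uninhibited pull toward the larger quantity rather than genuine delay tolerance.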