Ecological expected utility and the mythical neural code

Coding of Reward Probability and Risk by Single Neurons in Animals

Frontiers in Neuroscience, 2011

Probability and risk are important factors for value-based decision making and optimal foraging. In order to survive in an unpredictable world, organisms must be able to assess the probability and risk attached to future events and use this information to generate adaptive behavior. Recent studies in non-human primates and rats have shown that both probability and risk are processed in a distributed fashion throughout the brain at the level of single neurons. Reward probability has mainly been shown to be coded by phasic increases and decreases in firing rates in neurons in the basal ganglia, midbrain, parietal, and frontal cortex. Reward variance is represented in orbitofrontal and posterior cingulate cortex and through a sustained response of dopaminergic midbrain neurons.

Adaptive neural coding: from biological to behavioral decision-making

Current Opinion in Behavioral Sciences, 2015

Empirical decision-making in diverse species deviates from the predictions of normative choice theory, but why such suboptimal behavior occurs is unknown. Here, we propose that deviations from optimality arise from biological decision mechanisms that have evolved to maximize choice performance within intrinsic biophysical constraints. Sensory processing utilizes specific computations such as divisive normalization to maximize information coding in constrained neural circuits, and recent evidence suggests that analogous computations operate in decision-related brain areas. These adaptive computations implement a relative value code that may explain the characteristic context-dependent nature of behavioral violations of classical normative theory. Examining decision-making at the computational level thus provides a crucial link between the architecture of biological decision circuits and the form of empirical choice behavior.
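
As an illustration of the divisive normalization computation named above, the following minimal sketch (parameter values are arbitrary, not taken from the paper) shows how a relative value code arises when each option's response is divided by the summed value of the choice set:

    import numpy as np

    def divisive_normalization(values, sigma=1.0, weight=1.0):
        """Relative value code: each option's value is divided by a
        semi-saturation constant plus the weighted sum of all values."""
        values = np.asarray(values, dtype=float)
        return values / (sigma + weight * values.sum())

    # The same option (value 10) yields a smaller normalized response
    # when the context contains higher-valued alternatives.
    print(divisive_normalization([10, 2, 2]))    # ~[0.67, 0.13, 0.13]
    print(divisive_normalization([10, 20, 20]))  # ~[0.20, 0.39, 0.39]

This context dependence is the kind of relative coding the authors link to behavioral violations of context-independent normative theory.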

Neuronal Reward and Decision Signals: From Theories to Data

Physiological Reviews, 2015

Rewards are crucial objects that induce learning, approach behavior, choices, and emotions. Whereas emotions are difficult to investigate in animals, the learning function is mediated by neuronal reward prediction error signals which implement basic constructs of reinforcement learning theory. These signals are found in dopamine neurons, which emit a global reward signal to striatum and frontal cortex, and in specific neurons in striatum, amygdala, and frontal cortex projecting to select neuronal populations. The approach and choice functions involve subjective value, which is objectively assessed by behavioral choices eliciting internal, subjective reward preferences. Utility is the formal mathematical characterization of subjective value and a prime decision variable in economic choice theory. It is coded as utility prediction error by phasic dopamine responses. Utility can incorporate various influences, including risk, delay, effort, and social interaction. Appropriate for forma...
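
The reward prediction error construct referred to here has a standard textbook form; the sketch below is a minimal Rescorla-Wagner style update (not the specific model discussed in the review), in which the teaching signal is the difference between received and predicted reward:

    def update_value(value, reward, learning_rate=0.1):
        """One trial of prediction-error learning."""
        delta = reward - value           # reward prediction error
        return value + learning_rate * delta, delta

    # The value estimate converges toward the delivered reward,
    # and the prediction error shrinks as it does.
    v = 0.0
    for trial in range(5):
        v, delta = update_value(v, reward=1.0)
        print(f"trial {trial}: value={v:.3f}, prediction error={delta:.3f}")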

Efficient coding and the neural representation of value

2012

To survive in a dynamic environment, an organism must be able to effectively learn, store, and recall the expected benefits and costs of potential actions. The nature of the valuation and decision processes is thus of fundamental interest to researchers at the intersection of psychology, neuroscience, and economics. Although normative theories of choice have outlined the theoretical structure of these valuations, recent experiments have begun to reveal how value is instantiated in the activity of neurons and neural circuits.

Neural basis of utility estimation

Current Opinion in Neurobiology, 1997

The allocation of behavior among competing activities and goal objects depends on the payoffs they provide. Payoff is evaluated among multiple dimensions, including intensity, rate, delay, and kind. Recent findings suggest that by triggering a stream of action potentials in myelinated, medial forebrain bundle axons, rewarding electrical brain stimulation delivers a meaningful intensity signal to the process that computes payoff.
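
The abstract treats payoff as a quantity computed from several dimensions of reward (intensity, rate, delay, kind). Purely as a hypothetical illustration, and not the model advanced in the paper, one can picture a scalar payoff that scales subjective intensity by reward rate and discounts it by delay:

    def payoff(intensity, rate, delay, discount=0.5):
        """Illustrative scalar payoff: subjective intensity times reward
        rate, hyperbolically discounted by delay (parameters hypothetical)."""
        return intensity * rate / (1.0 + discount * delay)

    # A strong but delayed reward can yield a lower payoff than a
    # weaker reward delivered immediately.
    print(payoff(intensity=10.0, rate=1.0, delay=8.0))  # 2.0
    print(payoff(intensity=4.0, rate=1.0, delay=0.0))   # 4.0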

Explicit neural signals reflecting reward uncertainty

Philosophical Transactions of the Royal Society B: Biological Sciences, 2008

The acknowledged importance of uncertainty in economic decision making has stimulated the search for neural signals that could influence learning and inform decision mechanisms. Current views distinguish two forms of uncertainty, namely risk and ambiguity, depending on whether the probability distributions of outcomes are known or unknown. Behavioural and neurophysiological studies on dopamine neurons revealed a risk signal, which covaried with the standard deviation or variance of the magnitude of juice rewards and occurred separately from reward value coding. Human imaging studies identified similarly distinct risk signals for monetary rewards in the striatum and orbitofrontal cortex (OFC), thus fulfilling a requirement for the mean variance approach of economic decision theory. The orbitofrontal risk signal covaried with individual risk attitudes, possibly explaining individual differences in risk perception and risky decision making. Ambiguous gambles with incomplete probabilistic information induced stronger brain signals than risky gambles in OFC and amygdala, suggesting that the brain's reward system signals the partial lack of information. The brain can use the uncertainty signals to assess the uncertainty of rewards, influence learning, modulate the value of uncertain rewards and make appropriate behavioural choices between only partly known options.
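
The mean-variance approach mentioned here has a simple closed form; the sketch below (with an arbitrary risk-attitude coefficient, used only for illustration) shows how two gambles with the same expected value can differ in utility once variance and risk attitude enter the computation:

    import numpy as np

    def mean_variance_utility(outcomes, probabilities, risk_coeff):
        """U = expected value - risk_coeff * variance.
        risk_coeff > 0 models risk aversion, risk_coeff < 0 risk seeking."""
        outcomes = np.asarray(outcomes, dtype=float)
        probabilities = np.asarray(probabilities, dtype=float)
        ev = np.dot(probabilities, outcomes)
        var = np.dot(probabilities, (outcomes - ev) ** 2)
        return ev - risk_coeff * var

    # A safe and a risky gamble with identical expected value (5 units):
    print(mean_variance_utility([5, 5], [0.5, 0.5], risk_coeff=0.1))   # 5.0
    print(mean_variance_utility([0, 10], [0.5, 0.5], risk_coeff=0.1))  # 2.5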

Reading the Neural Code: What do Spikes Mean for Behavior?

Nature Precedings, 2007

The present study reveals an intrinsic spatial code within neuronal spikes that predicts behavior. As rats learned a T-maze procedural task, simultaneous changes in the temporal occurrence of spikes and in spike directivity emerged in "expert" neurons. While the number of spikes between tone delivery and the start of the turn phase decreased with learning, the spikes generated between these two events acquired behavioral meaning of high value for action selection.

Neuronal adaptation and optimal coding in economic decisions

2017

During economic decisions, neurons in orbitofrontal cortex (OFC) encode the values of offered goods. Importantly, their responses adapt to the range of values available in any given context. Prima facie, range adaptation seems to provide an efficient representation. However, uncorrected adaptation in the encoding of offer values would induce arbitrary choice biases. Thus a fundamental and open question is whether range adaptation is behaviorally advantageous. Here we present a theory of optimal coding for economic decisions. In a nutshell, the representation of offer values is optimal if it ensures maximal expected payoff. In this framework, we examine the activity of offer value cells in non-human primates. We show that their firing rates are quasi-linear functions of the offered values, even when optimal tuning functions would be highly non-linear. Most importantly, we demonstrate that for linear tuning functions range adaptation maximizes the expected payoff, even if the effects of...
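
Range adaptation as described here can be sketched as a linear rescaling of firing rate to the value range of the current context; the numbers below are illustrative, not the tuning functions fitted in the study:

    def adapted_rate(value, v_min, v_max, r_max=50.0):
        """Quasi-linear tuning: the firing rate spans the same dynamic range
        (0 to r_max spikes/s, an illustrative ceiling) whatever the range of
        values on offer in the current context."""
        return r_max * (value - v_min) / (v_max - v_min)

    # The same offer (value 5) drives a high rate in a low-value context
    # and a much lower rate in a high-value context.
    print(adapted_rate(5, v_min=0, v_max=5))    # 50.0
    print(adapted_rate(5, v_min=0, v_max=20))   # 12.5

The paper's argument concerns whether such rescaling, if left uncorrected downstream, would bias choices across contexts, or whether it in fact maximizes expected payoff.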

Distributed neural representation of expected value

Journal of Neuroscience, 2005

Anticipated reward magnitude and probability comprise dual components of expected value (EV), a cornerstone of economic and psychological theory. However, the neural mechanisms that compute EV have not been characterized. Using event-related functional magnetic resonance imaging, we examined neural activation as subjects anticipated monetary gains and losses that varied in magnitude and probability. Group analyses indicated that, although the subcortical nucleus accumbens (NAcc) activated proportional to anticipated gain magnitude, the cortical mesial prefrontal cortex (MPFC) additionally activated according to anticipated gain probability. Individual difference analyses indicated that, although NAcc activation correlated with self-reported positive arousal, MPFC activation correlated with probability estimates. These findings suggest that mesolimbic brain regions support the computation of EV in an ascending and distributed manner: whereas subcortical regions represent an affective component, cortical regions also represent a probabilistic component, and, furthermore, may integrate the two.
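
Expected value as used here is the product of anticipated magnitude and probability; the short sketch below (gamble levels are made up) illustrates the kind of factorial design that lets magnitude-related and probability-related signals be dissociated:

    # Expected value = anticipated magnitude x anticipated probability.
    # Crossing the two factors separates magnitude- and probability-
    # related variance (the specific levels here are illustrative).
    magnitudes = [1.0, 5.0]            # gain magnitude (e.g., dollars)
    probabilities = [0.25, 0.50, 0.75]

    for m in magnitudes:
        for p in probabilities:
            print(f"magnitude={m:.2f}  probability={p:.2f}  EV={m * p:.2f}")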

Neural Coding of Distinct Statistical Properties of Reward Information in Humans

Cerebral Cortex, 2005

Brain processing of reward information is essential for complex functions such as learning and motivation. Recent primate electrophysiological studies using concepts from information, economic and learning theories indicate that the midbrain may code two statistical parameters of reward information: a transient reward error prediction signal that varies linearly with reward probability and a sustained signal that varies highly non-linearly with reward probability and that is highest with maximal reward uncertainty (reward probability = 0.5). Here, using event-related functional magnetic resonance imaging, we disentangled these two signals in humans using a novel paradigm that systematically varied monetary reward probability, magnitude and expected reward value. The midbrain was activated both transiently with the error prediction signal and in a sustained fashion with reward uncertainty. Moreover, distinct activity dynamics were observed in postsynaptic midbrain projection sites: the prefrontal cortex responded to the transient error prediction signal while the ventral striatum covaried with the sustained reward uncertainty signal. These data suggest that the prefrontal cortex may generate the reward prediction while the ventral striatum may be involved in motivational processes that are useful when an organism needs to obtain more information about its environment. Our results indicate that distinct functional brain networks code different aspects of the statistical properties of reward information in humans.
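
For a binary reward of fixed magnitude, the two quantities disentangled here have simple forms: expected value grows linearly with reward probability, whereas uncertainty (variance, p(1 - p) times the squared magnitude) follows an inverted U that is maximal at p = 0.5. The sketch below only illustrates those functions; it is not a model of the imaging data:

    # For a reward of magnitude m delivered with probability p:
    #   expected value  = p * m                (linear in p)
    #   variance (risk) = p * (1 - p) * m**2   (maximal at p = 0.5)
    m = 1.0
    for p in [0.0, 0.25, 0.5, 0.75, 1.0]:
        ev = p * m
        uncertainty = p * (1 - p) * m ** 2
        print(f"p={p:.2f}  expected value={ev:.2f}  uncertainty={uncertainty:.4f}")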