Human substantia nigra neurons encode unexpected financial rewards

Kareem A Zaghloul et al. Science. 2009.

Abstract

The brain's sensitivity to unexpected outcomes plays a fundamental role in an organism's ability to adapt and learn new behaviors. Emerging research suggests that midbrain dopaminergic neurons encode these unexpected outcomes. We used microelectrode recordings during deep brain stimulation surgery to study neuronal activity in the human substantia nigra (SN) while patients with Parkinson's disease engaged in a probabilistic learning task motivated by virtual financial rewards. Based on a model of the participants' expected reward, we divided trial outcomes into expected and unexpected gains and losses. SN neurons exhibited significantly higher firing rates after unexpected gains than unexpected losses. No such differences were observed after expected gains and losses. This result provides critical support for the hypothesized role of the SN in human reinforcement learning.
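The split into expected and unexpected gains and losses can be illustrated with a small sketch. The paper's own model of expected reward (its Eq. 1) is not reproduced on this page, so the code below substitutes a simple delta-rule running estimate. The function name, the update rate `alpha`, the starting value `initial`, and the 0.5 cutoff are all assumptions made for illustration, not the authors' method.

```python
def classify_outcomes(choices, rewards, alpha=0.2, initial=0.5):
    """Label each trial as an expected/unexpected gain/loss.

    choices: deck index (0 or 1) chosen on each trial.
    rewards: 1 for positive feedback, 0 for negative feedback.
    alpha, initial: hypothetical update rate and starting expectation.
    """
    expected = [initial, initial]  # running expected reward for each deck
    labels = []
    for deck, reward in zip(choices, rewards):
        e = expected[deck]
        # Assumption: an expectation above 0.5 means a gain is anticipated.
        predicted_gain = e > 0.5
        if reward == 1:
            labels.append("expected gain" if predicted_gain else "unexpected gain")
        else:
            labels.append("unexpected loss" if predicted_gain else "expected loss")
        # Delta-rule update toward the observed outcome (stand-in for Eq. 1).
        expected[deck] = e + alpha * (reward - e)
    return labels


labels = classify_outcomes([0, 0, 0, 0], [1, 1, 0, 1])
```

With these assumed parameters, a loss that follows two gains on the same deck is labeled unexpected, because the running estimate has already risen above the cutoff.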

Figures

Fig. 1

(A) Intraoperative plan for DBS surgery with targeting of the STN. Microelectrodes are advanced along a tract through the anterior thalamic nuclei (Th), zona incerta (ZI), STN, and into the SN to record neural activity. Each anatomical region is identified by surgical navigation maps overlaid with a standard brain atlas (top) and by its unique firing pattern and microelectrode position (bottom). Depth measurements on the right of the screen begin 15 mm above the pre-operatively identified target, the inferior border of STN. In this example, the microelectrode tip lies 0.19 mm below the target. A, anterior; P, posterior. (B) Probability learning task. Participants are presented with two decks of cards on a computer screen. They are instructed to repeatedly draw cards from either deck to determine which deck yields the higher reward probability. Participants are given up to four seconds for each draw. After each draw, positive or negative feedback is presented for two seconds. Decks are then immediately presented on the screen for the next choice.

Fig. 2

(A) Learning rates are quantified by dividing the total number of trials (draws from the decks) into 10 equally sized blocks and determining how often participants correctly chose the (objectively) better deck during each block. Trace represents mean learning rate across all participants. Error bars represent SEM. (B) Expected reward associated with one deck in a single experiment. For each trial, we show the expected reward computed for the left deck, _El_[_n_] (blue line) (Eq. 1). The outcome of each trial when this deck was selected is shown as a circle. Circles having value 1 represent positive outcomes, whereas circles having value 0 represent negative outcomes. Black circles denote expected outcomes, and red circles denote unexpected outcomes. We base our analysis on unexpected outcomes. (C) Mean waveforms of three unique spike clusters from one participant are shown in black, with SD colored for each cluster. Scale bar, 10 mV and 0.5 msec. (D) For each identified cluster, we calculated the average time from the beginning of the spike waveform to its return to baseline (a) and the average time between the two positive peaks of the waveform (b). We restricted our analysis to those clusters that had average baseline widths greater than 2 msec and peak-to-peak widths greater than 0.8 msec. (E) Mean (n = 4703) waveform of spikes from a single cell from one participant is shown in black with SD in gray. (Inset) Example waveform. Inset scale bar, 1 mV and 1 msec.
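The block-wise learning rate described in panel (A) can be sketched as follows. This is a minimal illustration, assuming each trial is coded as 1 when the objectively better deck was chosen and 0 otherwise; the function name and the handling of trial counts that do not divide evenly into 10 are assumptions, not details taken from the paper.

```python
def blockwise_accuracy(chose_better, n_blocks=10):
    """Fraction of better-deck choices in each of n_blocks consecutive blocks.

    chose_better: sequence of 0/1 flags, one per trial, in trial order.
    """
    n = len(chose_better)
    if n < n_blocks:
        raise ValueError("need at least one trial per block")
    # Integer block edges; blocks differ by at most one trial when
    # n is not a multiple of n_blocks (an assumption of this sketch).
    edges = [i * n // n_blocks for i in range(n_blocks + 1)]
    return [
        sum(chose_better[a:b]) / (b - a)
        for a, b in zip(edges, edges[1:])
    ]


# Hypothetical session: 10 wrong choices followed by 10 correct ones.
rates = blockwise_accuracy([0] * 10 + [1] * 10)
```

Averaging these per-participant vectors block by block would give a trace like the one the caption describes.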

Fig. 3

(A) Spike raster for a single experiment from one participant. Individual spike activity recorded from SN for trials during positive (blue) and negative (black) feedback is shown for each trial as a function of time. Below each spike raster is the average _z_-scored continuous-time firing rate (continuous trace) and histogram (bars, 75-msec intervals). The red vertical line indicates feedback onset. (B) Individual spike activity, recorded from the same cell as shown in Fig. 3A, for trials in response to unexpected gains (blue) and losses (black) is shown for each trial as a function of time.
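A histogram of the kind described here, with 75-msec bins and _z_-scored counts, might be computed roughly as below. The caption does not say which interval supplies the mean and standard deviation for the _z_-score, so the use of a separate `baseline` list is an assumption, as are the function names and the choice to express times in milliseconds relative to feedback onset.

```python
from statistics import mean, pstdev


def binned_counts(spike_times_ms, t_start_ms, t_stop_ms, bin_width_ms=75):
    """Count spikes in consecutive bins covering [t_start_ms, t_stop_ms)."""
    n_bins = int(round((t_stop_ms - t_start_ms) / bin_width_ms))
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int((t - t_start_ms) // bin_width_ms)
        if 0 <= idx < n_bins:  # ignore spikes outside the window
            counts[idx] += 1
    return counts


def zscore(values, baseline):
    """Z-score values against the mean and SD of a baseline sample."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance")
    return [(v - mu) / sigma for v in values]


# Hypothetical spike times (ms, feedback onset at 0).
counts = binned_counts([-100, 10, 20, 80, 160], -150, 225)
# Assumption: the pre-feedback bins serve as the baseline.
z = zscore(counts, baseline=counts[:2])
```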

Fig. 4

(A) Average _z_-scored spike rate for unexpected gains (blue trace) compared with unexpected losses (black trace). The red line indicates feedback onset. The gray shaded region indicates the 225-msec interval between 150 and 375 msec after feedback onset. Traces represent average activity from 15 SN cells recorded from 10 participants. (B) Average _z_-scored spike histograms for unexpected gains (blue bars) compared with unexpected losses (black bars). The red vertical line indicates feedback onset. Histograms represent average _z_-scored spike counts from the same 15 SN cells. (C) Average _z_-scored spike rate for expected gains (blue trace) did not differ significantly from expected losses (black trace) for any interval. The red line indicates feedback onset. (D) For every participant, the median positive and negative trial-to-trial change in expected reward, as determined by Eq. 1, is used to classify prediction error into large and small positive and negative differences. Mean _z_-scored spike rate, captured between 150 and 375 msec after feedback onset for all cells, is shown for each level of prediction error. Error bars represent SEM.
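The median split in panel (D) can be sketched as follows. This is an illustration under stated assumptions: `prediction_errors` stands for the trial-to-trial changes in expected reward, `responses` for each trial's response measured in the 150 to 375 msec window, and all function names and the treatment of ties at the median are hypothetical choices rather than the authors' procedure.

```python
from statistics import mean, median


def window_count(spike_times_ms, start_ms=150, stop_ms=375):
    """Number of spikes between start_ms and stop_ms after feedback onset."""
    return sum(start_ms <= t < stop_ms for t in spike_times_ms)


def split_by_median(prediction_errors, responses):
    """Mean response for large/small positive and negative prediction errors.

    Requires at least one positive and one negative prediction error.
    """
    pos_median = median(pe for pe in prediction_errors if pe > 0)
    neg_median = median(pe for pe in prediction_errors if pe < 0)
    groups = {
        "large negative": [], "small negative": [],
        "small positive": [], "large positive": [],
    }
    for pe, r in zip(prediction_errors, responses):
        if pe > 0:
            # Assumption: values equal to the median count as "small".
            key = "large positive" if pe > pos_median else "small positive"
        elif pe < 0:
            key = "large negative" if pe < neg_median else "small negative"
        else:
            continue  # no change in expected reward
        groups[key].append(r)
    return {k: mean(v) for k, v in groups.items() if v}


# Hypothetical values for one participant.
pes = [0.1, 0.2, 0.3, 0.4, -0.1, -0.2, -0.3, -0.4]
resp = [1, 2, 3, 4, -1, -2, -3, -4]
levels = split_by_median(pes, resp)
```

Computing the medians separately for each participant, as the caption specifies, keeps the large and small categories balanced within a participant even when the size of the changes differs across participants.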
