Action and outcome encoding in the primate caudate nucleus

Brian Lau et al. J Neurosci. 2007.

Abstract

The basal ganglia appear to have a central role in reinforcement learning. Previous experiments, focusing on activity preceding movement execution, support the idea that dorsal striatal neurons bias action selection according to the expected values of actions. However, many phasically active striatal neurons respond at a time too late to initiate or select movements. Given the data suggesting a role for the basal ganglia in reinforcement learning, postmovement activity may therefore reflect evaluative processing important for learning the values of actions. To better understand these postmovement neurons, we determined whether individual striatal neurons encode information about saccade direction, whether a reward had been received, or both. We recorded from phasically active neurons in the caudate nucleus while monkeys performed a probabilistically rewarded delayed saccade task. Many neurons exhibited peak responses after saccade execution (77 of 149) that were often tuned for the direction of the preceding saccade (61 of 77). Of those neurons responding during the reward epoch, one subset showed direction tuning for the immediately preceding saccade (43 of 60), whereas another subset responded differentially on rewarded versus unrewarded trials (35 of 60). We found that there was relatively little overlap of these properties in individual neurons. The encoding of action and outcome was performed by largely separate populations of caudate neurons that were active after movement execution. Thus, striatal neurons active primarily after a movement appear to be segregated into two distinct groups that provide complementary information about the outcomes of actions.

Figures

Figure 1.

Timeline of delayed saccade task. The fixation point and the peripheral cue were coilluminated during the delay, which was randomly selected to be 1, 1.2, or 1.4 s in duration.

Figure 2.

Temporal response profiles for four example PANs. a, Neuron responding before and during cue presentation (Gaussian smoothing σ = 100 ms). Rasters show five trials for each direction arranged according to the right axis (only a subset of the total trials recorded is plotted for clarity). Spike density functions were estimated from the three cue directions (italicized directions on the right axis) that elicited the largest responses. The gray regions represent the first times to half-maximal response before and after the peak response time (see Materials and Methods). b, Neuron responding to cue presentation (σ = 55 ms). c, Neuron responding before saccade (σ = 45 ms). d, Neuron responding immediately after the completion of a saccade, but before the onset of reward (σ = 32 ms).
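The Materials and Methods referred to in this caption are not reproduced on this page, so the following is only a minimal sketch of how a Gaussian-smoothed spike density function and its half-max window could be computed. The function names (`spike_density`, `half_max_window`), the 1 ms time grid, and the absence of baseline subtraction are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def spike_density(spike_times_ms, t_start_ms, t_stop_ms, sigma_ms, n_trials=1):
    """Gaussian-smoothed firing rate (spikes/s) on a 1 ms grid.

    Each spike contributes a unit-area Gaussian of width sigma_ms;
    the sum is divided by the number of trials pooled (assumed layout).
    """
    t = np.arange(t_start_ms, t_stop_ms + 1, dtype=float)
    spikes = np.asarray(spike_times_ms, dtype=float)
    diffs = t[:, None] - spikes[None, :]
    kernel = np.exp(-0.5 * (diffs / sigma_ms) ** 2) / (sigma_ms * np.sqrt(2 * np.pi))
    # Factor 1000 converts spikes per ms into spikes per second.
    return t, 1000.0 * kernel.sum(axis=1) / n_trials

def half_max_window(t, rate):
    """Return (start, peak, end): the first times, walking outward from
    the peak, at which the rate has fallen to half of its maximum."""
    peak = int(np.argmax(rate))
    half = rate[peak] / 2.0
    lo = peak
    while lo > 0 and rate[lo] > half:
        lo -= 1
    hi = peak
    while hi < len(rate) - 1 and rate[hi] > half:
        hi += 1
    return t[lo], t[peak], t[hi]
```

A larger σ, as in panel a, yields a smoother but temporally broader estimate, so the half-max window widens with the kernel as well as with the underlying response.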

Figure 3.

Temporal response profiles of all neurons during the delayed saccade task (n = 149). a, Each row represents a spike density function for a single neuron, estimated using trials from the three target directions that elicited the greatest response for each neuron. Each spike density function was normalized by its peak response, and the data are sorted by peak response time. b, Individual peak response times (symbols) with first times to half-maximal response before and after the peak response time (half-max windows, beginning and end of the line for each row). Neurons are plotted in the same order as in a. The different colors represent the neuronal categories defined in the Materials and Methods.

Figure 4.

Phasically active neurons exhibit a variety of tuning for cue or saccade direction. Polar plots of direction tuning for eight example neurons (two neurons each for Precue, Cue/Delay, Saccade, and Postsaccade categories; examples for the Reward category are shown in Figs. 6, 8). The TS and response strength (M) are also listed for each neuron. Contralateral movements are to the right. The panels marked with asterisks correspond to the neurons in Figure 2. Firing rates were estimated using the times to half-maximal response before and after the peak response time for each neuron. The half-max windows were as follows: −757 to 301 ms and −758 to 246 ms relative to cue onset for the Precue neurons; 166 to 1262 ms and −2 to 287 ms relative to cue onset for the Cue/Delay neurons; −518 to 50 ms and −642 to 73 ms relative to saccade completion for the Saccade neurons; 13 to 238 ms and −53 to 208 ms relative to saccade completion for the Postsaccade neurons.

Figure 5.

Direction-tuning summary for neurons with peak responses preceding reward onset. a, Histograms of TS. Colored bars indicate statistically significant direction tuning (p < 0.05, bootstrap test for TS). Arrows indicate means for significant TS. There were no significantly tuned Precue neurons. b, Stacked polar histogram of preferred directions for significantly tuned neurons. The population preferred directions are significantly biased toward contralateral directions (p < 0.01, Rayleigh test), with a circular mean of 352 ± 29° (± 95% CI, triangle and arc along outer edge). c, Population tuning functions. Each category is plotted separately, and except for the Precue category, only significantly tuned neurons are included in the average. The symbols represent the mean (±1 SEM), and the lines represent the best-fitting von Mises (circular Gaussian) function. The firing rate of each neuron was normalized to its average response to all directions; a value of 1 indicates no modulation from the mean (i.e., untuned).
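As a rough illustration of the statistics named in panel b, the sketch below computes a circular mean and a Rayleigh test for a set of preferred directions. It assumes directions are given in degrees and uses a commonly used closed-form approximation for the Rayleigh p-value; the function names are hypothetical, and the confidence interval on the mean reported in the caption is not computed here.

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Mean direction of a sample of angles, returned in [0, 360) degrees."""
    a = np.deg2rad(np.asarray(angles_deg, dtype=float))
    mean = np.arctan2(np.sin(a).mean(), np.cos(a).mean())
    return float(np.rad2deg(mean) % 360.0)

def rayleigh_test(angles_deg):
    """Rayleigh test for non-uniformity of circular data.

    Returns (z, p), where z = n * r**2 and r is the mean resultant length.
    The p-value uses a standard closed-form approximation (an assumption
    here; the source does not state which form was used).
    """
    a = np.deg2rad(np.asarray(angles_deg, dtype=float))
    n = a.size
    r = np.hypot(np.cos(a).mean(), np.sin(a).mean())
    big_r = n * r
    z = big_r ** 2 / n
    p = np.exp(np.sqrt(1 + 4 * n + 4 * (n ** 2 - big_r ** 2)) - (1 + 2 * n))
    return float(z), float(min(p, 1.0))
```

Directions clustered around one angle give a mean resultant length near 1 and a small p-value, whereas directions spread evenly around the circle give a resultant near 0 and a p-value near 1.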

Figure 6.

Neurons with responses during the reward epoch respond differentially to reward delivery. a, Example neurons with peak responses after reward onset. The reward responsiveness index (RR) for each neuron is listed in the top left corner of each polar plot. b, Histogram of RRs for all neurons whose half-max windows extended at least 150 ms into the reward epoch (n = 60, 4 Postsaccade neurons, 56 Reward neurons). Only one Postsaccade neuron exhibited a significant RR (tuning functions plotted in Fig. 8).

Figure 7.

Neurons responding after saccade execution are direction tuned. a, Tuning strength estimated from rewarded trials plotted against TS estimated from unrewarded trials for neurons with a significant tuning strength for either condition (light gray symbols, n = 43, 3 Postsaccade neurons, 40 Reward neurons). The dark filled symbols represent neurons that have jointly significant tuning strengths for rewarded and unrewarded trials (n = 22, 2 Postsaccade neurons, 20 Reward neurons). b, Preferred direction estimated from rewarded trials plotted against preferred direction estimated from unrewarded trials.

Figure 8.

Relation between direction tuning and absolute reward responsiveness for neurons with peak responses after saccade execution. The tuning sharpness along the abscissa is taken as an average of the tuning sharpness for the rewarded and unrewarded conditions, weighted by the reciprocal of the width of the associated confidence interval. Only neurons that are significant for either RR or TS are plotted (n = 54, 3 Postsaccade neurons, 51 Reward neurons). Four example neurons that were jointly significant for both TS and RR are shown (the single Postsaccade neuron differentially responsive to reward, and 3 Reward neurons). The dashed lines represent the marginal medians. The quadrants defined by the intersection of these lines would each contain 25% of the data points under the hypothesis of independence between tuning sharpness and reward responsiveness.
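The two computations described in this caption can be sketched as follows, assuming each neuron has a TS estimate and a confidence-interval width per condition. The function names, the input layout, and the convention of counting points that fall exactly on a median as "low" are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def ci_weighted_mean(values, ci_widths):
    """Average of values weighted by the reciprocal of each CI width,
    so that more precisely estimated values count for more."""
    values = np.asarray(values, dtype=float)
    weights = 1.0 / np.asarray(ci_widths, dtype=float)
    return float(np.sum(weights * values) / np.sum(weights))

def median_quadrant_fractions(x, y):
    """Fraction of points in each quadrant defined by the marginal medians.

    Under independence of x and y, each fraction is expected to be
    about 0.25. Points exactly on a median are counted as 'low'.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    hi_x = x > np.median(x)
    hi_y = y > np.median(y)
    n = x.size
    return {
        "high_x_high_y": float(np.sum(hi_x & hi_y)) / n,
        "high_x_low_y": float(np.sum(hi_x & ~hi_y)) / n,
        "low_x_high_y": float(np.sum(~hi_x & hi_y)) / n,
        "low_x_low_y": float(np.sum(~hi_x & ~hi_y)) / n,
    }
```

Fractions concentrated in the two off-diagonal quadrants (high tuning with low reward responsiveness, and the reverse) would be consistent with the largely separate populations described in the abstract.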

Figure 9.

Postmovement responses align better to the primary instructed saccade than to the secondary saccade. a, Example postmovement neurons, showing the metrics used to quantify differences between the primary saccade to the instructed target and the secondary saccade after reward delivery. b, Summary for postmovement neurons (21 Postsaccade, open symbols; 56 Reward, shaded symbols). For the majority of neurons, the response is higher and narrower when aligned to the primary saccade. c, Neurons most active immediately preceding the primary saccade are plotted for comparison.

Figure 10.

Dynamic tuning analysis, population averages. a, Population significance of TS as a function of time. The solid black line represents the percentage significance from the entire population (n = 149). The dashed line shows percentage significance from all neurons responding above baseline in each window. The mean population response (blue line) was estimated by averaging response strength (M) across neurons for each 200 ms window. b, Average response (M) for each neuron category. c, Average TS for each neuron category. Data are plotted for those windows in which at least 15 neurons responded at least 1.5 times above baseline.

Figure 11.

Dynamic tuning analysis, individual data. a, TS for each neuron plotted as a function of time. The order of neurons is the same as in Figure 3. TS is plotted only for those windows in which direction tuning is significant (p < 0.05) and the neuronal response was above baseline. b, Neurons with significant RR, re-sorted according to increasing |RR|.

Figure 12.

Recording locations. a, Structural MRI image for monkey B. We used vitamin E-filled capillary tubes as grid markers, which allowed us to section the image in the plane of the recording grid. The white line through the center of the grid represents the angle of approach, and its length is the same as the guide tube used during recording. Cd, Caudate nucleus; CS, cingulate sulcus; Put, putamen. b, Camera lucida drawings of histological sections for monkey H. The position of each section is estimated relative to the anterior commissure (AC). The asterisk in each section indicates the location of electrolytic lesions recovered in the caudate nucleus. LV, Lateral ventricle. c, Recording locations for task-related PANs, plotted separately for each neuronal category. The caudate outline represents the boundaries of the nucleus when viewed from above as taken from the atlas of Francois et al. (1996).

References

    1. Alexander GE, DeLong MR, Strick PL. Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annu Rev Neurosci. 1986;9:357–381.
    2. Anderson ME, Horak FB. Influence of the globus pallidus on arm movements in monkeys. III. Timing of movement-related information. J Neurophysiol. 1985;54:433–448.
    3. Aosaki T, Tsubokawa H, Ishida A, Watanabe K, Graybiel AM, Kimura M. Responses of tonically active neurons in the primate's striatum undergo systematic changes during behavioral sensorimotor conditioning. J Neurosci. 1994;14:3969–3984.
    4. Apicella P, Ljungberg T, Scarnati E, Schultz W. Responses to reward in monkey dorsal and ventral striatum. Exp Brain Res. 1991;85:491–500.
    5. Apicella P, Scarnati E, Ljungberg T, Schultz W. Neuronal activity in monkey striatum related to the expectation of predictable environmental events. J Neurophysiol. 1992;68:945–960.
