Punishing an Error Improves Learning: The Influence of Punishment Magnitude on Error-Related Neural Activity and Subsequent Learning

Articles, Behavioral/Systems/Cognitive

Journal of Neuroscience 17 November 2010, 30 (46) 15600-15607; https://doi.org/10.1523/JNEUROSCI.2565-10.2010

Abstract

Punishing an error to shape subsequent performance is a major tenet of individual and societal level behavioral interventions. Recent work examining error-related neural activity has identified that the magnitude of activity in the posterior medial frontal cortex (pMFC) is predictive of learning from an error, whereby greater activity in this region predicts adaptive changes in future cognitive performance. It remains unclear how punishment influences error-related neural mechanisms to effect behavior change, particularly in key regions such as pMFC, which previous work has demonstrated to be insensitive to punishment. Using an associative learning task that provided monetary reward and punishment for recall performance, we observed that when recall errors were categorized by subsequent performance—whether the failure to accurately recall a number–location association was corrected at the next presentation of the same trial—the magnitude of error-related pMFC activity predicted future correction. However, the pMFC region was insensitive to the magnitude of punishment an error received and it was the left insula cortex that predicted learning from the most aversive outcomes. These findings add further evidence to the hypothesis that error-related pMFC activity may reflect more than a prediction error in representing the value of an outcome. The novel role identified here for the insular cortex in learning from punishment appears particularly compelling for our understanding of psychiatric and neurologic conditions that feature both insular cortex dysfunction and a diminished capacity for learning from negative feedback or punishment.

Introduction

One tenet of human learning that permeates society is the understanding that punishing an error reduces the likelihood of an undesirable behavior being repeated. Manipulating the magnitude of punishment for an error has also been shown to influence behavioral change, whereby larger penalties increase the likelihood of adaptation (Martin, 1963). Recent work examining error-related neural activity has identified that the magnitude of activity in the posterior medial frontal cortex (pMFC) is predictive of learning from an error (Klein et al., 2007; Hester et al., 2008), whereby greater activity in this region predicts adaptive changes in future cognitive performance. Models of reinforcement learning have suggested that such error-related pMFC activity may represent the evaluation of a reward prediction error—the difference between the expected and actual reward outcomes (Holroyd and Coles, 2002; Brown and Braver, 2005). The pMFC then transmits this value to the region(s) in the brain responsible for coordinating the response, so as to reinforce the desirable, or extinguish the undesirable, response. The magnitude of activity is considered important to the likelihood of the reinforcement resulting in behavioral change.

The outcome expectancy models suggest that increasing the magnitude of punishment will result in a larger reward prediction error and a higher level of error-related pMFC activity and, hence, translate to greater levels of behavioral change. To date, however, neuroimaging studies suggest that the pMFC region is not sensitive to increases in the magnitude of punishment (Yeung and Sanfey, 2004), whereas regions such as the rostral anterior cingulate (Taylor et al., 2006), insula, orbitofrontal cortex, and ventrolateral prefrontal cortex do show such a relationship (Seymour et al., 2007).

The aim of the current experiment was to identify what neural mechanisms complement the role of pMFC in the error punishment effect—where performance improves as a function of the level of negative feedback or punishment. If pMFC activity is predictive of learning from an error but is not sensitive to punishment magnitude, what neural mechanism(s) underlie the error punishment effect? We administered a multirepetition, paired-associate learning task during fMRI data collection that allowed us to manipulate the level of punishment during an error.

Performance information was provided after each recall response in two separate fMRI epochs: feedback, the participant's accuracy and the level of reward/punishment (5 or 50¢); and re-encoding, the correct response. The latter afforded participants an opportunity to re-encode the correct number–location association, which was then tested several trials later during a subsequent recall probe. Recall error events were categorized into “corrected” and “repeated” on the basis of subsequent performance.

Providing feedback on incorrect responses was expected to elicit significant blood oxygen level-dependent (BOLD) signals from the pMFC during all recall errors. Within this functionally defined, error-related pMFC region, we examined whether the magnitude of BOLD activity differed between corrected and repeated errors, different penalty magnitudes (5 or 50¢ penalties), and the interaction between penalty magnitude and subsequent correction.

Materials and Methods

Participants.

Sixteen healthy volunteers (11 females; mean age, 24.4 years; range, 21–29 years) participated in the experiment. All participants provided written informed consent, which was approved by ethics committees at The University of Melbourne and Wesley Hospital (Auchenflower, Australia).

Experimental protocols.

A spatial learning task consisting of an array of location–number associations that were to be learned by participants was administered (Fig. 1). All aspects of stimulus delivery and response recording were controlled by E-Prime software (version 1.1, Psychology Software Tools), running on a laptop PC (Celeron 2 GHz, 128 MB Nvidia video card) that was interfaced with the MR scanner during acquisition of fMRI data. The task began with an encoding phase in which eight locations designated as gray squares were presented simultaneously on a black background. The locations of the squares on the background were selected in a quasi-random fashion from an 8 × 8 matrix, with two locations randomly chosen from each of the four quadrants of the display.
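The quasi-random placement described above can be sketched as follows. This is a minimal illustration, not the E-Prime procedure used in the study: the function name `sample_locations` and the assumption that each quadrant is a 4 × 4 sub-block of the 8 × 8 matrix are ours.

```python
import random

GRID = 8          # 8 x 8 matrix of candidate positions (from the text)
PER_QUADRANT = 2  # two locations per quadrant (from the text)

def sample_locations(rng):
    """Return eight (row, col) cells, two drawn from each quadrant.

    Assumption: a quadrant is one of the four 4 x 4 sub-blocks of the grid.
    """
    half = GRID // 2
    locations = []
    for row0 in (0, half):
        for col0 in (0, half):
            # All cells belonging to the current quadrant.
            cells = [(r, c)
                     for r in range(row0, row0 + half)
                     for c in range(col0, col0 + half)]
            # Draw two distinct cells without replacement.
            locations.extend(rng.sample(cells, PER_QUADRANT))
    return locations

locations = sample_locations(random.Random(0))
```

Sampling within each quadrant separately guarantees that the eight squares are spread across the display rather than clustered in one region.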

Figure 1.

The spatial paired-associate learning task used in the present study, represented by the screen transitions for the encoding and recall phases of the task. Each block of trials began with an encoding phase that presented the two-digit number associated with each location (2000 ms) and an intertrial interval display (1000 ms). All eight number–location associations were presented once during the encoding phase and were immediately followed by the recall phase. A single trial in the recall phase began by highlighting a location in yellow to cue the participant to respond with the two-digit number they associated with the location. Following a variable interstimulus delay, feedback was provided that consisted of presenting the accuracy of the response (red background for an error, blue for correct) and the magnitude of the reward/punishment (an Australian 5 or 50¢ coin). Following a second variable interstimulus delay, participants were presented with the actual number associated with the location to enable encoding of the correct response (re-encoding epoch), regardless of prior recall accuracy.

At the commencement of the task, each location in turn had a two-digit number superimposed upon it. The number remained visible for 2 s and was followed by an interstimulus interval of 1 s. The digits of each number consisted of 1, 2, 3, or 4, and participants identified the number by entering each digit using the appropriate buttons on a pair of MR-compatible response boxes (fiber optic response pads; Current Designs). Two-digit numbers were used to reduce the probability of guessing the correct answer to approximately 6% (1 in 16).
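The guessing probability follows directly from the response set. The short check below enumerates every two-digit number that can be built from the digits 1 to 4; the variable names are illustrative only.

```python
from itertools import product

DIGITS = (1, 2, 3, 4)  # the four permitted digits (from the text)

# Every two-digit number whose digits are both drawn from DIGITS.
candidates = [10 * tens + units for tens, units in product(DIGITS, repeat=2)]

# Chance of guessing correctly if all candidates are equally likely.
p_guess = 1 / len(candidates)
```

With 16 possible answers the chance level is 1/16 = 6.25%, which the text reports as 6%.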

Following the encoding phase in which numbers were shown for each of the eight locations, a series of recall trials was presented. During a recall trial, one of the eight locations was highlighted in yellow, cueing the participant to respond with the two-digit number associated with that location. Participants were required to respond within 3 s, after which a variable interstimulus interval (ISI) was presented for 2–6 s. During the ISI, the location remained highlighted by a yellow border. Feedback (2 s) was then provided on the accuracy of the response and also the magnitude of reward/penalty. The location square turned blue to indicate a correct response and turned red to indicate an incorrect response. An Australian 5 or 50¢ coin was superimposed over the colored background. Feedback magnitude was randomly assigned to each location (following a participant's response) but modeled to ensure equal amounts of 5 or 50¢ feedback magnitudes for correct trials and error trials (separately). Once assigned, the feedback magnitude of a location was fixed for round 2 recall trials, ensuring that round 1 feedback predicted the future reward and punishment value of a location. Following the feedback epoch, a second ISI was presented for 2–6 s, during which the target square remained colored (in either blue or red, depending on accuracy). Following the second ISI, the correct two-digit number was presented on the colored location to allow the participant the opportunity to re-encode the correct answer. The variable ISI delays had the effect of jittering the onset of each task epoch (feedback, re-encoding), which is necessary for event-related fMRI designs in which BOLD changes are modeled for single trials. Each of the eight locations in the array was highlighted once before a second round of highlighting began, in a different pseudorandom order. The recall trials were pseudorandomly ordered across the two rounds of presentation for a single task block to ensure that the interval between the two presentations of any trial was 7–9 trials. The average interval between the two presentations of corrected error and repeated error recall trials was 8.11 and 8.02 trials, respectively. The order of stimulus presentation for both encoding and recall trials was consistent across and within blocks. Nine blocks of the encoding/recall cycle were administered to each participant, with each block involving a different array of locations and two-digit numbers. This provided 72 first round recall trials within which we could examine feedback and re-encoding related activity. No location in the array was used more than once throughout the nine runs, and the two-digit numbers were not repeated on consecutive blocks.
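One way to read the 7–9 trial constraint is sketched below. It is an illustration under stated assumptions, not the authors' ordering routine: we assume the interval is the difference in serial position between the two presentations of an item, and the function name `valid_second_round_orders` is hypothetical.

```python
from itertools import permutations
import random

N_ITEMS = 8               # eight locations per block (from the text)
MIN_LAG, MAX_LAG = 7, 9   # allowed interval between presentations (from the text)

def valid_second_round_orders(n=N_ITEMS):
    """All round-2 orders whose lag to round 1 lies in [MIN_LAG, MAX_LAG].

    Items are labelled by their round-1 position 0..n-1. An item shown at
    round-2 position j reappears (n + j) - i trials after its first
    presentation at position i (assumed definition of the interval).
    """
    orders = []
    for order in permutations(range(n)):
        lags = [(n + j) - item for j, item in enumerate(order)]
        if all(MIN_LAG <= lag <= MAX_LAG for lag in lags):
            orders.append(order)
    return orders

orders = valid_second_round_orders()

# The task used a different order in round 2, so drop the identity order.
identity = tuple(range(N_ITEMS))
different = [o for o in orders if o != identity]
second_round = random.Random(0).choice(different)
```

Under this reading each item may shift by at most one serial position between rounds, and the lag averaged over all eight items in a block is exactly 8 trials, which is in line with the reported averages of 8.11 and 8.02 for the corrected and repeated subsets.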

Image acquisition.

Functional MR images were acquired at the Wesley Hospital using a whole-body 4 tesla Bruker Medspec Varian scanner with a gradient-echo echo-planar imaging (EPI) sequence. The scanner was equipped with a standard radiofrequency birdcage head coil for signal transmission and reception. Lateral head stabilizers were used to minimize head movement. EPI images were acquired using a gradient-echo pulse sequence and sequential slice acquisition [repetition time (TR), 2000 ms; echo time, 30 ms; flip angle, 90°; 32 contiguous slices of 3 mm thickness, 10% gap, in-plane resolution of 3.5 × 3.5 mm in a field of view of 224 mm]. Each functional run began with two volume acquisitions that were later discarded, to allow for steady-state tissue magnetization. A total of 150 EPI volumes were collected for each functional run, and a total of nine functional runs were performed for each participant. Activation data were registered to high-resolution T1-weighted isotropic (1 mm³) structural magnetization-prepared rapid-acquisition gradient echo images to localize the pattern of physiological changes associated with the task.

Data analysis.

All analyses were conducted using AFNI software (http://afni.nimh.nih.gov/afni/) (Cox, 1996). Following image reconstruction, the time-series data were time-shifted using Fourier interpolation to remove differences in slice acquisition times, and motion-corrected using three-dimensional volume registration (least-squares alignment of three translational and three rotational parameters). Activation outside the brain was also removed using edge detection techniques.

Behavioral data from each participant were used to categorize the recall events as follows: successful responses; errors receiving a 5¢ penalty that were correctly recalled on the subsequent round of recall trials (corrected 5¢ errors); corrected 50¢ errors; errors receiving a 5¢ penalty that were again incorrectly recalled on the subsequent round of recall trials (repeated 5¢ errors); and repeated 50¢ errors. Errors were classified in this way according to the response made on the subsequent presentation of the same location–number pair (Fig. 2). An incorrect response that was followed by another incorrect response for the same location in the subsequent round was classed as a repeated error, whereas an error that was followed by a correct response in the following round was classed as a corrected error. Errors in the second round of presentations could therefore not be included in this analysis because they did not precede another attempt at recall.
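The classification rule can be expressed compactly. The sketch below is illustrative: the data layout (dictionaries keyed by location) and the function name `classify_round1_errors` are assumptions made for the example, not the format of the study's behavioral logs.

```python
def classify_round1_errors(round1, round2):
    """Label each round-1 error by its penalty and its round-2 outcome.

    round1: dict mapping location -> (correct: bool, magnitude_cents: int)
    round2: dict mapping location -> correct: bool
    Round-1 correct responses are skipped; round-2 errors are never
    classified because no later recall attempt follows them.
    """
    labels = {}
    for location, (correct, cents) in round1.items():
        if correct:
            continue
        outcome = "corrected" if round2[location] else "repeated"
        labels[location] = f"{outcome} {cents}c error"
    return labels

# Hypothetical block with four locations (made-up responses).
round1 = {"A": (False, 5), "B": (False, 50), "C": (True, 50), "D": (False, 50)}
round2 = {"A": True, "B": False, "C": True, "D": True}
labels = classify_round1_errors(round1, round2)
```

Here location A yields a corrected 5¢ error, B a repeated 50¢ error, and D a corrected 50¢ error, while C is excluded because the round-1 response was correct.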

Figure 2.

Method used to classify corrected and repeated errors. Feedback for a participant's response involved presentation of the correct number on either a red background, indicating an error, or a blue background, indicating an accurate recall response. Categorization as either a corrected or repeated error was determined by the participant's performance for the same trial during the next round. In the corrected error example, the participant incorrectly recalled the digits associated with the top left location (responding with 33 rather than 44) during round 1, but correctly recalled these digits during round 2 and so the initial round 1 error is categorized as a corrected error. In the repeated error example, the participant incorrectly responded to the presentation of the top right location during both round 1 and 2, and therefore the initial error is categorized as a repeated error. Dots represent the intervening trials.

A first-level analysis calculated hemodynamic impulse-response functions (IRFs) at 2 s temporal resolution using deconvolution techniques. Separate IRFs were calculated for each of the event types (correct, error), during both the feedback and re-encoding epochs. Response functions for all regressor events were initiated at individual epoch onsets (e.g., separately for feedback and re-encoding) because the presentation of all epochs-of-interest was timed to coincide with the beginning of the 2 s TR cycle. Additional regressors were included to model the activity related to feedback and re-encoding for correct trials, the recall period for errors (before feedback), second round trials, and other inconsequential task events (e.g., instruction screens), to avoid contamination of the baseline and event-related activity estimates, but were not subjected to further analysis. A nonlinear regression program determined the best-fitting gamma-variate function for these IRFs, as previously described (Garavan et al., 1999). The area under the curve of the gamma-variate function was expressed as a percentage of the area under the baseline. The baseline estimate was the mean activation recorded during the variable delay periods between recall trials. This period consisted of viewing the eight gray locations on the screen while waiting for the next memory probe, thus having similar stimulus and memory load requirements as the events of interest.
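To make the area measure concrete, the sketch below evaluates a gamma-variate curve of the commonly used form k·t^r·e^(−t/b) and its area. The functional form, the parameter values, and the way the area is scaled by the baseline (baseline level multiplied by the window length) are all assumptions for illustration; the text does not specify them, and the fitting itself was done with AFNI's nonlinear regression program.

```python
import math

def gamma_variate(t, k, r, b):
    """Assumed response shape: k * t^r * exp(-t / b) for t >= 0."""
    return k * t ** r * math.exp(-t / b)

def area_closed_form(k, r, b):
    """Integral from 0 to infinity: k * b^(r + 1) * Gamma(r + 1)."""
    return k * b ** (r + 1) * math.gamma(r + 1)

def area_numeric(k, r, b, t_max=60.0, dt=0.01):
    """Trapezoidal approximation of the same area on [0, t_max]."""
    steps = int(t_max / dt)
    total = 0.0
    for i in range(steps):
        left = gamma_variate(i * dt, k, r, b)
        right = gamma_variate((i + 1) * dt, k, r, b)
        total += 0.5 * (left + right) * dt
    return total

def percent_area(area, baseline_level, window_s):
    """Hypothetical scaling: area as a percentage of baseline * window."""
    return 100.0 * area / (baseline_level * window_s)

# Illustrative parameters only (not values fitted in the study).
k, r, b = 1.0, 8.6, 0.547
closed = area_closed_form(k, r, b)
numeric = area_numeric(k, r, b)
```

The closed-form and numerical areas agree closely, which is a quick sanity check that a fitted curve has been integrated over a long enough window.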

The percentage area (event-related activation) map voxels were resampled at 1 mm³ resolution, then spatially normalized to standard MNI space (MNI 152 template) and spatially blurred with a 3 mm isotropic root-mean-squared Gaussian kernel. Group activation maps for the two event types (correct, error) were determined with one-sample t tests against the null hypothesis of zero event-related activation changes (i.e., no change relative to baseline). Significant voxels passed a voxelwise statistical threshold (t = 4.28, p ≤ 0.001) and were required to be part of a larger 144 μl cluster of contiguous significant voxels. Combining probability thresholding with cluster thresholding aims to maximize the power of the statistical test while holding the likelihood of false positives to a minimum. The AlphaSim program (http://afni.nimh.nih.gov/pub/dist/doc/program_help/AlphaSim.html) was used to determine the cluster threshold. The program is provided with the number of voxels in the group map, the spatial correlation of voxels (must be contiguous on three sides), and the voxelwise threshold (in this study, p = 0.001). Using these values, the program conducts a series of Monte Carlo simulations (1000 iterations for our study) to determine the frequency of each conforming cluster size produced purely by chance. From this frequency distribution, the cluster size (144 μl given our parameters) that occurs <1% of the time by chance can be selected, giving a threshold of p = 0.01 (corrected).
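The logic of a Monte Carlo cluster-size threshold can be illustrated with a deliberately simplified simulation. This is not AlphaSim: it assumes independent voxels (no spatial smoothing), a small cubic volume instead of a brain mask, and face-connectivity between voxels, and the function names are hypothetical.

```python
import random

NEIGHBOURS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
              (0, -1, 0), (0, 0, 1), (0, 0, -1))

def max_cluster_size(shape, p_voxel, rng):
    """Largest face-connected cluster in one simulated null volume."""
    nx, ny, nz = shape
    # Each voxel exceeds the voxelwise threshold with probability p_voxel.
    active = {(x, y, z)
              for x in range(nx) for y in range(ny) for z in range(nz)
              if rng.random() < p_voxel}
    best = 0
    while active:
        # Flood-fill one cluster, removing voxels as they are visited.
        stack = [active.pop()]
        size = 0
        while stack:
            x, y, z = stack.pop()
            size += 1
            for dx, dy, dz in NEIGHBOURS:
                nb = (x + dx, y + dy, z + dz)
                if nb in active:
                    active.remove(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

def cluster_threshold(shape, p_voxel, alpha, n_iter, seed=0):
    """Smallest cluster size whose chance rate across volumes is below alpha."""
    rng = random.Random(seed)
    maxima = [max_cluster_size(shape, p_voxel, rng) for _ in range(n_iter)]
    for k in range(1, max(maxima) + 2):
        # Fraction of null volumes containing a cluster of size >= k.
        exceed = sum(m >= k for m in maxima) / n_iter
        if exceed < alpha:
            return k

threshold = cluster_threshold((10, 10, 10), p_voxel=0.01, alpha=0.01, n_iter=500)
```

Because real fMRI noise is spatially smooth, chance clusters are larger than in this independent-voxel sketch, which is why the actual program also takes the spatial correlation of the data as input.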

The activation clusters from whole-brain analyses of errors during either the feedback or re-encoding epoch were used to create an OR map for the purpose of a region-of-interest (ROI) analysis. An OR map includes the voxels of activation indicated as significant from either of the constituent maps. The events of interest for the group map were the errors from round 1 of recall, of which participants made, on average, 40 (range = 21–63). A second analysis was then performed, which entered additional regressors into the deconvolution process to separately estimate activity related to each of the four types of error-related events, relative to baseline. The mean activation for clusters in the OR map was then calculated for the purposes of an ROI analysis, deriving mean activation levels for corrected 5¢, corrected 50¢, repeated 5¢, and repeated 50¢ errors. These estimates were compared using 2 × 2 repeated-measures ANOVA, corrected for the number of ROIs via a modified Bonferroni procedure for multiple comparisons (Keppel, 1991).
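In a fully within-subject 2 × 2 design, each effect has one degree of freedom, so its F(1, n − 1) statistic equals the squared one-sample t statistic of the corresponding per-participant contrast. The sketch below uses that identity; it is an illustration with made-up ROI values for four hypothetical participants, not the analysis software used in the study, and it omits the Bonferroni step.

```python
import math
from statistics import mean, stdev

def contrast_F(scores):
    """F(1, n - 1) for a one-df within-subject contrast (squared t)."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / math.sqrt(n))
    return t * t

def rm_anova_2x2(c5, c50, r5, r50):
    """F statistics for correction, magnitude, and their interaction.

    Each argument is a list of per-participant ROI means for one cell:
    corrected 5c, corrected 50c, repeated 5c, repeated 50c.
    """
    cells = list(zip(c5, c50, r5, r50))
    correction = [(a + b) / 2 - (c + d) / 2 for a, b, c, d in cells]
    magnitude = [(b + d) / 2 - (a + c) / 2 for a, b, c, d in cells]
    interaction = [(b - d) - (a - c) for a, b, c, d in cells]
    return {
        "correction": contrast_F(correction),
        "magnitude": contrast_F(magnitude),
        "interaction": contrast_F(interaction),
    }

# Made-up percentage-change values, for illustration only.
result = rm_anova_2x2(
    c5=[0.20, 0.25, 0.18, 0.30],
    c50=[0.35, 0.40, 0.28, 0.41],
    r5=[0.15, 0.22, 0.17, 0.21],
    r50=[0.12, 0.20, 0.19, 0.18],
)
```

With 16 participants, as in the study, each of these statistics would be evaluated against an F distribution with 1 and 15 degrees of freedom.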

Although the cognitive task was designed specifically to examine errors and error-related neural activity, correct recall trials from round 1 were also of interest. However, the high rate of successful retention for correct trials from round 1 (i.e., correct recall during both rounds of presentation), 92% for correct recall responses receiving a 50¢ reward and 85% for 5¢ reward trials, provided insufficient trial numbers to conduct the same two-factor (subsequent accuracy, feedback magnitude) analysis that was conducted for errors. For example, the average number of round 1 correctly answered recall trials, which received a 50¢ reward, that were incorrectly recalled in round 2 was 2.06 trials (SD = 1.4), with four participants having zero trials in this category.

To examine the relationship between pMFC activity during the feedback epoch and subsequent activity in the hippocampus during the re-encoding epoch, we also performed intraindividual single-trial analysis. Regressors were constructed for each feedback and re-encoding epoch for each separate trial by inserting a single standard hemodynamic response function at the appropriate time point into an all-zero regressor. All 144 regressors (72 trials × 2 epochs) were included in a general linear model analysis along with additional regressors to model the activity related to other inconsequential task events (recall, instructions, etc.) that were not subjected to further analysis. Beta weights were calculated for each regressor that indexed the percentage change in activation to these single events. Utilizing the functionally defined pMFC and hippocampal ROIs from the group analysis, an average beta weight was calculated for the pMFC ROI during feedback and the hippocampal ROI during re-encoding for each trial. Pairs of BOLD activity estimates for each trial were concatenated and contrasted using a Pearson correlation coefficient analysis to examine the relationship between the values for each participant, and again for the group. To clarify the specificity of this effect, we replicated this analysis using the feedback-related activity for the four other regions differentiating corrected from repeated errors: the left insula, left occipital, left middle temporal, and right inferior frontal (IFG) gyri (Table 1), using a Bonferroni correction for p values to compensate for multiple comparisons.
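The single-trial correlation step can be sketched as follows, assuming the per-trial beta weights have already been averaged within each ROI. The data layout, the toy values, and the function names are hypothetical; the study's estimates came from the AFNI general linear model described above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def intraindividual_correlations(trials):
    """Correlate feedback and re-encoding betas within each participant.

    trials: dict mapping participant -> list of
            (pmfc_feedback_beta, hippocampus_reencoding_beta) pairs.
    """
    return {pid: pearson_r(*zip(*pairs)) for pid, pairs in trials.items()}

def group_correlation(trials):
    """Correlation over the pairs concatenated across participants."""
    pooled = [pair for pairs in trials.values() for pair in pairs]
    return pearson_r(*zip(*pooled))

# Toy single-trial estimates for two hypothetical participants.
trials = {
    "p01": [(0.1, 0.2), (0.4, 0.3), (0.3, 0.5), (0.6, 0.6)],
    "p02": [(0.2, 0.1), (0.1, 0.3), (0.5, 0.4), (0.4, 0.2)],
}
per_participant = intraindividual_correlations(trials)
pooled_r = group_correlation(trials)
```

Reporting both the per-participant coefficients and the pooled coefficient mirrors the text's description of estimating the relationship for each participant and again for the group.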

Table 1.

Regions of error-related activity differentiating corrected from repeated errors during the feedback epoch

Finally, due to a priori interest in the activity of the ventral striatum in response to performance outcomes, anatomically defined ROI analyses were conducted on right (130 μl; x = 11, y = 9, z = −8) and left (157 μl; x = −13, y = 9, z = −8) nucleus accumbens (NAcc), defined by the Montreal Neurological Institute atlas of the AFNI toolbox.

Results

Behavioral results

Recall of the number–location associations significantly improved across the two rounds of presentations, t(15) = 114.6, p = 0.01. Participants were accurate on 45.6% of recall trials in round 1 and 67.1% in round 2. Of those recall trials that were unsuccessfully recalled during round 1, 56.6% were corrected on the next presentation (during round 2). The magnitude of punishment associated with first round errors significantly influenced the level of subsequent performance correction, F(1,15) = 6.23, p < 0.05. On average, 50.2% of round 1 errors receiving a 5¢ penalty were corrected, compared with 63.7% error correction for errors receiving a 50¢ penalty. Reaction times (in ms) for the four categories of errors (corrected 5¢, 2192.3; corrected 50¢, 2204.5; repeated 5¢, 2166.5; repeated 50¢, 2286.2) did not demonstrate a significant main effect of correction (F(1,15) = 0.24, p = 0.631), penalty magnitude (F(1,15) = 1.12, p = 0.306), or an interaction therein (F(1,15) = 0.699, p = 0.416).

fMRI BOLD activity

Corrected versus repeated errors

Feedback indicating erroneous recall was associated with significant activity in the posterior medial frontal cortex (Fig. 3A). The center of mass for this cluster of activity was located at MNI coordinates x = −2, y = 0, z = 47, which fall within the rostral cingulate zone highlighted by Ridderinkhof and colleagues' (2004) review of performance monitoring. Within this functionally defined ROI, corrected errors were associated with significantly higher levels of BOLD activity compared with repeated errors (F(1,15) = 6.80, p = 0.02). Significant activity was also detected during the re-encoding period; however, activity patterns were inversely related to those seen during feedback, with significantly greater activity for repeated errors compared with corrected errors (p = 0.01).

Figure 3.

A, Three-dimensional rendering from the axial perspective of the medial prefrontal cortex functionally derived region of interest (MNI coordinates: x = −2; y = 0; z = 47) activated during the feedback epoch for recall errors. Cluster activity was determined relative to averages across intertrial delay periods. B, Estimates of mean percentage change in BOLD activity during the feedback epoch for corrected and repeated errors receiving either 5 or 50¢ monetary punishment.

Activity in several other regions also differentiated corrected from repeated errors (Table 1), including the left insula, left occipital, left middle temporal, and right IFG. All differences indicated significantly greater activity for corrected errors compared with repeated errors, with p values less than or equal to 0.023 (corrected for multiple comparisons using a modified Bonferroni procedure). pMFC activity during corrected errors also correlated with activity in the left insula (r = 0.49, p = 0.02). Activity in the other regions predicting error correction was not significantly related to pMFC activity.

During the re-encoding epoch, only one region, the right hippocampus (x = 37, y = −31, z = −9) (Fig. 4), demonstrated a significant main effect for the difference between corrected and repeated errors (F(1,15) = 5.79, p = 0.020). Activity during corrected errors was significantly higher than during repeated errors.

Figure 4.

A–C, Three-dimensional rendering of the right hippocampal functionally derived region of interest (MNI coordinates: x = 37, y = −31, z = −9) activated during the re-encoding epoch for recall errors, from sagittal (A), axial (B), and coronal (C) perspectives. BOLD activity estimates were derived for each individual trial from the pMFC and left insula clusters during feedback and the hippocampus cluster during re-encoding. D, The correlation coefficient for these paired values across all trials was calculated to estimate the relationship. A significant positive correlation was identified between pMFC and hippocampal cortex (HC) activity estimates (mean intraindividual correlation, r = 0.22; range, r = −0.16–0.51; p = 8.6135 × 10⁻¹⁴), and between left insula and hippocampal activity estimates (mean intraindividual correlation, r = 0.18; range, r = −0.11–0.52; p = 7.1 × 10⁻²¹). The intraindividual correlation value is plotted in D for each participant.

To explore the relationship between feedback-related pMFC activity and re-encoding-related hippocampal activity, we conducted an intraindividual correlation analysis of single-trial activity. BOLD activity estimates were derived for each individual trial from the pMFC cluster during feedback and the hippocampus cluster during re-encoding. The correlation coefficient on these paired values across all trials was calculated to estimate the relationship. A significant positive correlation was identified between the two activity estimates (mean intraindividual correlation, r = 0.22; range, r = −0.16–0.51; p = 8.6135 × 10⁻¹⁴), which indicates that a higher magnitude of pMFC activity during feedback was associated with higher levels of hippocampal cluster activity during re-encoding. To clarify the specificity of this effect, we replicated this analysis using the feedback-related activity for the four other regions differentiating corrected from repeated errors. Only the left insula demonstrated a significant relationship, with a significant positive correlation between the two activity estimates (mean intraindividual correlation, r = 0.18; range, r = −0.11–0.52; p = 7.1 × 10⁻²¹), indicating that higher levels of left insular activity during feedback were associated with higher levels of hippocampal activity during re-encoding.

Penalty magnitude

A comparison of errors punished with 5 or 50¢ examined the main effect of penalty magnitude on feedback and re-encoding activity. Activity in the pMFC during the feedback epoch was not differentiated by the magnitude of penalty an error had received, nor was the activity in any other error-related regions. Repeating the analysis with the inclusion of only corrected errors (5 vs 50¢) identified two regions sensitive to punishment magnitude. The left insula cortex (x = −41, y = 8, z = 7) had significantly greater activity for corrected errors receiving a 50¢ penalty compared with a 5¢ penalty (p = 0.011), whereas the right inferior frontal gyrus was significantly higher for 5¢ penalties compared with 50¢ (p = 0.009).

Examination of activity during the re-encoding epoch did not identify any regions significantly sensitive to penalty magnitude.

Interaction between penalty magnitude and correction

A test of the interaction between penalty magnitude and error correction status was conducted on the BOLD activity from the four types of errors (5¢ corrected, 50¢ corrected, 5¢ repeated, 50¢ repeated). pMFC activity during the feedback epoch did not show a significant interaction effect (F(1,15) = 0.284, p = 0.60), with only the left insula region (F(1,15) = 9.67, p = 0.007) (Fig. 3) and a region in the right cerebellum (cerebellar vermis: x = 5, y = −67, z = −46) having significant interaction terms. The pattern of activity in the left insula indicated that 50¢ penalties significantly increased the difference between activity for corrected and repeated errors compared with 5¢ penalties (Fig. 5). Right cerebellar activity was significantly higher for corrected 50¢ errors compared with repeated 50¢ errors, but showed no difference between corrected and repeated 5¢ errors. Activity during the re-encoding epoch revealed one region demonstrating a significant interaction effect in the right caudate (F(1,15) = 14.16, p = 0.002). A significant difference was evident between corrected and repeated errors for the 50¢ condition, but not the 5¢ condition.

Figure 5.

A, Three-dimensional rendering, from both the coronal and sagittal perspective, of the left anterior insula cortex functionally derived region of interest (MNI coordinates: x = −41, y = 8, z = 7) activated during the feedback epoch for recall errors. Cluster activity was determined relative to baseline activity averaged across intertrial delay periods. B, Estimates from the feedback epoch of mean percentage change in BOLD activity during corrected and repeated errors receiving either 5 or 50¢ monetary punishment. C, Pearson correlation coefficient scatterplot of the significant relationship between feedback epoch activity for corrected errors in the pMFC and left insula cortex regions. L, Left; R, right.

Nucleus accumbens

In the absence of a functionally defined ROI and given the previous findings of a role for the NAcc in response to reward and punishment, we examined mean activity estimates from anatomically defined NAcc ROIs during the feedback epoch for the four types of errors. The 2 × 2 ANOVA indicated a main effect of penalty magnitude for the left NAcc ROI (F(1,15) = 5.40, p = 0.03), whereby activity was significantly higher during errors receiving a 50¢ penalty compared with a 5¢ penalty. The penalty magnitude main effect in the right NAcc cluster only approached significance (p = 0.12). The data indicated no main effect of error correction (corrected vs repeated errors) or an interaction between penalty magnitude and error correction.

Discussion

Our results indicate that delayed adjustments to behavior, in the form of learning arbitrary associations, were associated with the magnitude of error-related activity in the pMFC. We found that when recall errors were categorized by subsequent performance—essentially whether the failure to accurately recall a number–location association was corrected at the next presentation of the same trial—the magnitude of error-related pMFC activity predicted future correction. A higher level of pMFC activity was observed during feedback for corrected errors compared with repeated errors, consistent with previous findings (Cohen and Ranganath, 2007; Klein et al., 2007; Hester et al., 2008). The predictive relationship between error-related pMFC activity and adaptive future performance was present only during performance feedback. Activity levels in the pMFC during the re-encoding epoch for the exact same events—when participants were provided an opportunity to encode the correct number-location association—did not differentiate corrected from repeated errors. The specificity of this relationship appears consistent with the theories positing the pMFC's role in monitoring outcomes and communicating the value of behavior (Holroyd and Coles, 2002; Nieuwenhuis et al., 2004; Brown and Braver, 2005; Rushworth and Behrens, 2008), rather than a reflection of applying greater attention or effort to correcting the mistake, for example, when re-encoding the correct answer (Paus, 2001; Critchley et al., 2003).

Our manipulation of error penalties, with both small (5¢) and large (50¢) monetary fines imposed randomly on participants following an error, significantly influenced recall performance. Errors given a 50¢ penalty were corrected significantly more often than those receiving a 5¢ penalty (63% vs 50%), whereas error-related pMFC activity was greater for corrected errors compared with repeated errors regardless of the penalty imposed (Fig. 3). Models of error-related pMFC activity hypothesize that the magnitude of activity may represent the evaluation of a reward prediction error, that is, the difference between the expected and actual reward outcomes (Holroyd and Coles, 2002; Brown and Braver, 2005). Therefore, unpredictable increases in the magnitude of punishment should result in a larger prediction error and a higher level of error-related pMFC activity. In the present task, small and large penalties were both equiprobable and unpredictable and we hypothesized that receiving a 50¢ error penalty would be considered an outcome that was worse than expected. Contrary to this hypothesis but consistent with previous neuroimaging studies (Yeung and Sanfey, 2004; Taylor et al., 2006), we found error-related pMFC activity to be insensitive to increases in the magnitude of punishment. Another activation cluster, in the left anterior insula cortex, did show such a relationship, demonstrating a significant interaction between penalty magnitude and error correction, whereby activity levels were significantly higher for corrected than for repeated errors; the magnitude of this effect increased with a larger monetary penalty (Fig. 5). It is also worth noting that error-related activity in the pMFC and insula regions positively correlated, with activity increases in response to feedback in one region paralleled by the other.

The pattern of error-related pMFC activity observed in the current study appears more consistent with theories suggesting that, in addition to reflecting the prediction error, pMFC activity represents the integrated value of an action, including the degree to which feedback will guide future behavior (Rushworth and Behrens, 2008; Jocham et al., 2009). Previous studies have focused on learning environments that vary the reliability and validity of feedback to assess the response of the pMFC, finding that this region was able to assess the relative weight of error feedback. The present data, where the reliability and validity of the feedback were consistent across trials but the monetary value of an error was varied, would support this contention insofar as error-related pMFC activity reflected, regardless of the error penalty magnitude, when feedback information had influenced learning and subsequent performance.

The error punishment effect seen in participants' behavioral performance, with increased recall correction rates for errors receiving larger monetary punishment, was most closely related to activity in the left insular cortex. BOLD activity in this region was consistently greater for corrected errors compared with repeated errors, and increasing the magnitude of monetary punishment for an error inflated this difference. Similarly, left insula activity was significantly greater for corrected 50¢ errors compared with corrected 5¢ errors, indicating a relationship between the magnitude of insula activity and the likelihood of learning from high-value errors.

Previous studies have identified insular cortex activity during punishment (Sanfey et al., 2003; Ullsperger and von Cramon, 2003; Seymour et al., 2004; Wächter et al., 2009), with the level of activity linked to the magnitude of (Elliott et al., 2000) and individual sensitivity to (Samanez-Larkin et al., 2008) punishment. Insula activity in response to punishment has also been shown to be greatest when it precedes a change in decision making (O'Doherty et al., 2003), with recent evidence suggesting that it may represent an aversive prediction error (Pessiglione et al., 2006). The region is thought to be generally representative of negative emotional states, including diverse states such as hunger, pain, anger, and disgust (Naqvi and Bechara, 2009). Craig (2009) has hypothesized that, rather than a reaction to punishment directly, anterior insula activity represents awareness of an outcome's salience and our emotional reaction to it. Greater activity in the insula therefore indicates heightened awareness of the emotional significance of an outcome. This interpretation appears consistent with our finding of heightened insula activity during feedback for corrected 50¢ errors and with the absence of activity for repeated 50¢ errors, a pattern that would appear not to reflect a direct representation of the magnitude of punishment.

The coactivation and correlation of activity in the anterior insula and pMFC during corrected errors is intriguing in the context of theories about the hypothesized role of each. The coactivation of these regions is a common phenomenon in cognitive tasks (Dosenbach et al., 2006; Heimer and Van Hoesen, 2006) and has been taken as support for the hypothesis that insula activity represents awareness of the salience, or value, of an outcome, and the pMFC represents volitional agency, or the control of directed effort toward dealing with the ramifications of the outcome (Craig, 2009). Similarly, authors attempting to explain error-related pMFC activity have suggested that activity in this region reflects an assessment of the value of an action (Rushworth and Behrens, 2008; Jocham et al., 2009). The present results appear consistent with the value hypothesis, in that the left insula activity reflected the most aversive events whereas the pMFC activity reflected those aversive events that prompted correction or learning.

Previous studies have highlighted that error-related pMFC activity, in addition to reflecting outcomes that were worse than expected, is influenced by the reliability and validity of feedback (Kennerley et al., 2006; Behrens et al., 2007; Walton et al., 2007; Jocham et al., 2009) as part of a more integrated assessment of an action outcome's value. An evaluation of the emotional salience of an outcome from the insula may contribute to the pMFC's assessment of the value of outcome feedback. Having assessed such a range of factors, it may be that activity in the pMFC indicates engagement with the cortical regions critical to the action execution so as to reinforce or extinguish the behavior. Such functionality would be consistent with the control filter role first proposed by Holroyd and Coles (2002) for pMFC activity in response to negative prediction error information relayed from the mesencephalic dopamine system. For example, both group-level and single-trial within-subject analyses indicated that increased pMFC activity during the feedback epoch preceded elevated levels of hippocampal activity during the re-encoding period, which in turn predicted encoding of the correct number–location association. The hippocampus is critical to the successful encoding of arbitrary associations (Small et al., 2001) and activity in this region has previously been shown to predict accurate future recall (Stark and Okado, 2003; Degonda et al., 2005). The relationship between pMFC and hippocampus was not unique, with a similar strength of relationship identified between feedback-related insula activity and re-encoding-related hippocampal activity. The high correspondence within individuals (Fig. 4) between feedback-related pMFC and insula activity, and its subsequent association with re-encoding-related hippocampal activity, may to some degree reflect the high level of pMFC and insula coactivation. Craig's (2009) hypothesis suggests that pMFC activity should initiate remedial behavior necessitated by awareness from the insula of an aversive event. To discriminate these roles, we would ideally have had trials that afforded or prevented the remediation of an aversive outcome to examine the relative levels of pMFC and insula activity.

The support from the present data for the insular cortex having a critical role in learning from errors, particularly in learning from errors that result in the poorest or most aversive outcomes, has implications for a range of clinical conditions. For example, a common feature of addiction is an increased sensitivity to reward and a diminished sensitivity to punishment that manifests as a failure to learn from, or a disregard of, negative or aversive outcomes (Bechara et al., 2002; Franken et al., 2005). Recent work has consistently demonstrated that insular cortex dysfunction is associated with addiction (Paulus, 2007; Goldstein et al., 2009; Naqvi and Bechara, 2009), particularly poor decision making that may contribute to continued drug taking in the face of significant negative consequences (Paulus et al., 2005a, 2008). The opposite pattern, increased insula activity in response to punishment and heightened sensitivity to learning from aversive outcomes (Paulus et al., 2005b; Samanez-Larkin et al., 2008), is a feature of generalized anxiety disorder and other psychiatric conditions that feature anxiety such as obsessive-compulsive disorder and posttraumatic stress disorder. The present result may offer a functional neuroanatomical correlate for the relationship between insular cortex dysfunction and the alterations to learning from negative outcomes, particularly those of greater magnitude.

Footnotes

This research was supported by Australian Research Council Grant DP1092852 (to R.H.) and National Health and Medical Research Council Fellowship (519730) (to R.H.). The assistance of Drs. Katie McMahon, Mark Strudwick, Mark Bellgrove, Jason Mattingley, and Matt Meredith is gratefully acknowledged.
Correspondence should be addressed to Dr. Robert Hester, Department of Psychological Sciences, Redmond Barry Building, University of Melbourne, Melbourne, Victoria, 3010, Australia. hesterr@unimelb.edu.au
