Molecular and Circuit-Dynamical Identification of Top-Down Neural Mechanisms for Restraint of Reward Seeking

Christina K Kim et al. Cell. 2017.

Abstract

Reward-seeking behavior is fundamental to survival, but suppression of this behavior can be essential as well, even for rewards of high value. In humans and rodents, the medial prefrontal cortex (mPFC) has been implicated in suppressing reward seeking; however, despite vital significance in health and disease, the neural circuitry through which mPFC regulates reward seeking remains incompletely understood. Here, we show that a specific subset of superficial mPFC projections to a subfield of nucleus accumbens (NAc) neurons naturally encodes the decision to initiate or suppress reward seeking when faced with risk of punishment. A highly resolved subpopulation of these top-down projecting neurons, identified by 2-photon Ca2+ imaging and activity-dependent labeling to recruit the relevant neurons, was found capable of suppressing reward seeking. This natural activity-resolved mPFC-to-NAc projection displayed unique molecular-genetic and microcircuit-level features concordant with a conserved role in the regulation of reward-seeking behavior, providing cellular and anatomical identifiers of behavioral and possible therapeutic significance.

Keywords: 2-photon Ca2+ imaging; activity-dependent labeling; medial prefrontal cortex; nucleus accumbens; optogenetics; reward seeking; ventral tegmental area.

Copyright © 2017 Elsevier Inc. All rights reserved.

Figures

Figure 1. Molecular and Anatomical Characterization of mPFC Projections to NAc and VTA

(A) Schematic for molecular profiling experiment. (B) Volcano plot illustrating genes enriched in mPFC→NAc cells (shown as positive fold enrichment, green dots) or enriched in mPFC→VTA cells (shown as negative fold enrichment, magenta dots), respectively. Fold enrichment is plotted in linear space to describe how much the expression differs from one group to the other group. Gray dots denote genes with p ≥ 0.05 or fold enrichment ≤ 1.5. One-way between-subjects ANOVA. (C) Schematic of viral strategy for dual-projection labeling of mPFC→NAc and mPFC→VTA neurons in the same animal. (D) Coronal section showing mPFC→NAc (green) and mPFC→VTA (magenta) cell bodies. Scale bar: 600 μm. (E) Probability distribution function of lateral distances of cell bodies from midline. mPFC→NAc cells are more superficial than mPFC→VTA cells (n = 237 mPFC→NAc and 500 mPFC→VTA cells from 5 mice; Kruskal-Wallis test, H1 = 319.46, *p < 1e–10). (F) Single-projection labeling of mPFC→NAc or mPFC→VTA in separate mice. (G and H) Example mPFC→NAc (G) or mPFC→VTA (H) cells labeled with eYFP. CTIP2 stain overlapped with mPFC→VTA but not mPFC→NAc cells (n = 3 mice per projection). White dashed line, superficial boundary of CTIP2 stain. Scale bar: 100 μm. See also Figure S2.
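
The color rule described for the volcano plot in (B) can be sketched as a small function. This is an illustrative sketch only: the function name and inputs are hypothetical, and it assumes the 1.5 cutoff applies to the magnitude of the signed fold enrichment, which the caption does not state explicitly.

```python
def volcano_color(fold_enrichment, p_value, fold_cutoff=1.5, p_cutoff=0.05):
    """Assign a dot color following the caption's thresholds.

    Assumption: fold_enrichment is signed (positive = mPFC->NAc enriched,
    negative = mPFC->VTA enriched) and the cutoff is applied to its magnitude.
    """
    # Gray: p >= 0.05 or fold enrichment at or below the cutoff.
    if p_value >= p_cutoff or abs(fold_enrichment) <= fold_cutoff:
        return "gray"
    # Otherwise color by the sign of the enrichment.
    return "green" if fold_enrichment > 0 else "magenta"


# Hypothetical values, not data from the study.
for fold, p in [(2.0, 0.01), (-3.0, 0.001), (1.2, 0.001)]:
    print(fold, p, volcano_color(fold, p))
```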

Figure 2. mPFC→NAc but Not mPFC→VTA Cells Are Suppressed prior to Reward Seeking with Punishment

(A) Schematic and example FIP Ca2+ trace for mPFC→NAc (top) or mPFC→VTA populations (bottom). Vertical scale bar: 2 z-scores; horizontal scale bar: 25 s. (B) Behavioral protocol. Upon stable lever pressing for liquid reward (max 50 rewards/day), a protocol was instituted wherein 30% of lever presses instead resulted in 1 s foot-shock. (C and D) Cumulative lever presses across all mPFC→NAc (C) and mPFC→VTA (D) mice on baseline and shock days. Each line, individual mouse. (E) mPFC→NAc FIP activity aligned to lever presses followed by reward (top) or shock (bottom). Black dashed line, lever press time; gray vertical line, average reward-retrieval time (0.87 ± 0.04 s); gray shaded rectangle, shock duration. Mean ± SEM (n = 250 reward and 25 shock trials; 5 mice). Vertical scale bars: 0.2 z-scores for reward, 2 z-scores for shock. Horizontal scale bar: 1 s. (F) Mean mPFC→NAc response following lever press resulting in reward or shock. Activity suppressed during reward and increased during shock (Reward: n = 250 trials, 5 mice; Wilcoxon’s signed-rank test, *p < 1e–10. Shock: n = 25 trials, 5 mice; Wilcoxon’s signed-rank test, *p = 1.23e–5). (G and H) As in (E) and (F), except mPFC→VTA FIP activity (n = 250 reward trials, 31 shock trials, 5 mice). Average reward-retrieval time: 0.80 ± 0.06 s after the lever press. mPFC→VTA activity is suppressed during reward and increased during shock (Reward: n = 250 trials from 5 mice; Wilcoxon’s signed-rank test, *p = 4.96e–4. Shock: n = 31 trials, 5 mice; Wilcoxon’s signed-rank test, *p = 1.17e–6). (I) mPFC→NAc activity preceding lever presses on baseline (black) or shock day (green). Mean ± SEM (n = 250 baseline and 58 shock-day trials; 5 mice). (J) Mean suppression in mPFC→NAc activity prior to lever press was larger on shock-risk day compared to baseline day (n = 250 baseline and 58 shock day trials, 5 mice; Wilcoxon’s rank-sum test, *p = 0.020). (K and L) As in (I) and (J), except mPFC→VTA. Mean mPFC→VTA suppression prior to lever press was not different between baseline and shock days (n = 250 baseline and 86 shock day trials, 5 mice; Wilcoxon’s rank-sum test, p = 0.68). (M) Same data in panel (J) averaged across mice instead of trials. Each pair: individual mouse (n = 5 mice; paired t test, t4 = −3.85, *p = 0.018). (N) On shock day, a positive correlation between number of lever presses made and mean relative suppression of mPFC→NAc prior to lever press (n = 5 mice; Pearson’s r = 0.96, p = 0.011). (O) Same data in panel (L), averaged across mPFC→VTA mice instead of trials. Each pair: individual mouse (n = 5 mice; paired t test, t4 = −0.048, p = 0.65). (P) No correlation between number of lever presses made and mean relative suppression in mPFC→VTA activity prior to lever press (n = 5 mice; Pearson’s r = −0.17, p = 0.79). All bar graphs plotted as mean ± SEM. See also Figure S3.

Figure 3. Optogenetic Stimulation of mPFC→NAc Projections Does Not Suppress Reward Seeking

(A) Schematic for mPFC→NAc stimulation. (B) Example RTPP locomotor traces during baseline and test days: mPFC→NAc mice. Orange bar, stimulation side. (C) % time spent on stimulation side (stim): baseline (base) and test days. mPFC→NAc mice spent less time on the stim side (test versus baseline days; n = 8 mice; paired t test, t7 = 2.67, *p = 0.032). Grey lines, individual mice. (D) No change in velocity on test day (neutral versus stim side, mPFC→NAc; n = 8 mice; paired t test, t7 = 0.50, p = 0.63). (E–H) As in (A)–(D), for mPFC→VTA. No difference between % time spent on stim side (baseline versus test day; n = 5 mice; paired t test, t4 = 0.17, p = 0.87). No change in velocity on test day (neutral versus stim side; n = 5 mice; paired t test, t4 = −0.48, p = 0.66). (I) Protocol: on baseline and stim days, 100% of lever presses gave liquid reward. On stim day, each press also resulted in 5 s bReaChES stimulation. (J) Cumulative # lever presses, baseline and stim sessions, mPFC→NAc mice. Each line: individual mouse. (K) Average rate of pressing, baseline and stim days, mPFC→NAc mice. No difference in rate of pressing on baseline versus stim days (n = 5 mice; paired t test, t4 = −1.28, p = 0.27). (L and M) As in (J) and (K), except mPFC→VTA mice. No difference in press rate, baseline versus stim day (n = 5 mice; paired t test, t4 = −0.80, p = 0.47). (N) Protocol. Days 1 and 3: 100% of presses gave liquid reward. Days 2 and 4: 10% of presses gave 1 s foot shock instead. Day 4: each press also resulted in 5 s bReaChES stimulation. (O) Cumulative # of presses: day 2 (shock), day 4 (shock + stim), mPFC→NAc mice. (P) Average rate of pressing across days, mPFC→NAc. No difference, day 2 (shock) versus day 4 (shock + stim; n = 5 mice; one-way ANOVA, F3,16 = 28.59, p = 1.16e–6; Tukey-Kramer multiple comparisons test, day 2 versus day 4, p = 0.69). (Q and R) As in (O) and (P), for mPFC→VTA. No difference in pressing, day 2 versus day 4 (n = 5 mice; one-way ANOVA, F3,16 = 13.76, p = 1.07e–4; Tukey-Kramer multiple comparisons test, day 2 versus day 4, p = 0.30). All bar graphs plotted as mean ± SEM.

Figure 4. mPFC→NAc and mPFC→VTA Population Dynamics Can Discriminate Reward and Shock Trials

(A) Schematic. After 5 s tone, lever extended for 5 s; lever press gave 80% chance of reward/20% chance of 1 s foot-shock. 1 s delay between lever press and reward/shock. If no press in 5 s, lever retracted. (B and C) % press trials (two baseline and two shock days, 50 trials/day). mPFC→NAc (B) and mPFC→VTA (C) mice suppressed pressing during shock days versus baseline days (mPFC→NAc: n = 5 mice; one-way ANOVA, F3,16 = 17.66, p = 2.49e–5; Tukey-Kramer multiple comparisons test, p < 0.05, shock versus baseline days. mPFC→VTA: n = 5 mice; one-way ANOVA, F3,16 = 28.58, p = 1.13e–5; Tukey-Kramer multiple comparisons test, p < 0.05, shock versus baseline days). (D) Example 2-photon image, GCaMP6f in mPFC→NAc neurons. Active cells outlined as masks. (E) Example reward/shock trial trajectories projected onto first 3 PC dimensions; single mPFC→NAc mouse. Thin lines, individual trials; thick lines, mean. (F) Averaged trajectory-selectivity index across reward/shock trials (all mice) calculated as (d_shock − d_reward)/(d_shock + d_reward), where d = Euclidean distance of trial to either mean reward or shock trajectory. Mean ± SEM (n = 22 reward/22 shock trials; 5 mice). (G) Classifier accuracy for all trials across mice; trajectory selectivity indices discriminated reward/shock trials (1,000 shuffled distributions plotted as mean ± 2 SD; *p < 0.05). (H–K) As in (D)–(G), for mPFC→VTA cells. Reward/shock trials could be discriminated using trajectory-selectivity indices (n = 23 reward/23 shock trials; 5 mice). (L) Heatmaps of normalized z-scored activity for mPFC→NAc cells correlated with lever press, reward, or shock. (M) Mean fractions mPFC→NAc lever cells, reward cells, and shock cells (more shock cells seen than lever or reward cells; n = 5 mice; one-way ANOVA, F2,12 = 12.25, p = 0.0013; Tukey-Kramer multiple comparisons test *p < 0.05). (N and O) As in (L) and (M), for mPFC→VTA cells. No difference in fraction of lever, reward, or shock cells (n = 5 mice; one-way ANOVA, F2,12 = 1.97, p = 0.18). All bar graphs plotted as mean ± SEM. See also Figures S4 and S5.
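
The trajectory-selectivity index in (F) can be illustrated with a short sketch. The caption gives only the formula, so several choices here are assumptions: the index is computed separately at each time point, trajectories are lists of 3-D points in PC space, and the mean trajectories are computed from all trials without holding out the trial being scored. All function names are hypothetical.

```python
import math


def mean_trajectory(trials):
    """Average a list of trials, each a list of time points in PC space."""
    n_trials = len(trials)
    n_time = len(trials[0])
    n_dim = len(trials[0][0])
    return [
        [sum(trial[t][k] for trial in trials) / n_trials for k in range(n_dim)]
        for t in range(n_time)
    ]


def selectivity_index(trial, mean_reward, mean_shock):
    """(d_shock - d_reward) / (d_shock + d_reward) at each time point.

    Assumption: d is the Euclidean distance from the trial's point to the
    mean reward or mean shock trajectory at the same time point.
    """
    index = []
    for point, reward_point, shock_point in zip(trial, mean_reward, mean_shock):
        d_reward = math.dist(point, reward_point)
        d_shock = math.dist(point, shock_point)
        total = d_shock + d_reward
        # Where the two mean trajectories coincide with the trial, report 0.
        index.append((d_shock - d_reward) / total if total > 0 else 0.0)
    return index


# Toy trajectories (two time points, three PCs), not data from the study.
reward_mean = mean_trajectory([[[0, 0, 0], [1, 0, 0]], [[0, 0, 0], [3, 0, 0]]])
shock_mean = mean_trajectory([[[0, 0, 0], [-2, 0, 0]]])
print(selectivity_index([[0, 0, 0], [2, 0, 0]], reward_mean, shock_mean))
```

Under this sign convention, values near +1 mean the trial lies close to the mean reward trajectory and values near −1 mean it lies close to the mean shock trajectory.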

Figure 5. mPFC→NAc Population Dynamics Predict Individual Reward-Seeking or Suppression Decisions

(A) Heatmaps of normalized activity during 5 s tone for mPFC→NAc cells that discriminate missed versus pressed trials (positive Lasso regression weight cells (Cells with w > 0) predicted missed trials, while negative Lasso regression weight cells (Cells with w < 0) predicted pressed trials). Horizontal ticks along left vertical axis separate cells from different mice. (B) Classification accuracy from LDA of missed versus pressed trials using only Lasso regression-identified cells. All models for mPFC→NAc could predict trial-by-trial lever pressing (1,000 shuffled distributions plotted as mean ± 2 SD, *p < 0.05). (C and D) As in (A) and (B), except mPFC→VTA. Only models from two out of five mice could predict trial-by-trial pressing, both with classification accuracy < all accuracies of mPFC→NAc mice. (E) Left: Heatmap of activity during foot-shock for mPFC→NAc neurons with positive Lasso regression weight (missed cells). First dashed vertical line, lever press time; second dashed line, shock time. Horizontal ticks (left vertical axis) separate cells from different mice. Right: Average activity during foot-shock, all missed cells. Mean ± SEM; scale: 0.1 z-scores (n = 44 cells, 5 mice). (F) As in (E), for mPFC→NAc cells with negative Lasso regression weight (pressed cells; n = 53 cells, 5 mice). (G) Activity of three example mPFC→NAc shock cells during shock, reward, and 5 s tone preceding missed or pressed trials (these shock cells were more active during 5 s tone preceding missed versus pressed trials). Scale: 2 z-scores. Average activity during shock and rewards plotted as mean ± SEM. (H) Difference in mPFC→NAc shock-cell activity during 5 s tone preceding missed/pressed trials. Horizontal black lines, cells with average activity difference of 0.1 and −0.1. (I) Mean activity of mPFC→NAc shock cells was higher during 5 s tones prior to missed versus pressed trials (n = 91 cells, 5 mice; Wilcoxon’s signed-rank test, *p = 0.0023). (J and K) As in (H) and (I), for mPFC→VTA shock cells. No difference in mean shock-cell activity during 5 s tones prior to missed versus pressed trials (n = 32 cells, 5 mice; Wilcoxon’s signed-rank test, p = 0.85). All bar graphs plotted as mean ± SEM. See also Figure S6.
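
The comparison of classifier accuracy against 1,000 shuffled distributions (mean ± 2 SD) in (B) and (D) can be sketched as a label-shuffling test. This is a simplified illustration, not the study's pipeline: it holds a set of hypothetical predictions fixed and shuffles the trial labels, whereas the actual analysis (Lasso cell selection followed by LDA) may have refit models on each shuffle; the caption does not say. All names and data are invented for the example.

```python
import random
import statistics


def accuracy(predicted, actual):
    """Fraction of trials where the predicted label matches the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def shuffled_band(predicted, actual, n_shuffles=1000, seed=0):
    """Return (mean - 2 SD, mean + 2 SD) of accuracy under shuffled labels.

    Assumption: predictions stay fixed and only the labels are permuted.
    """
    rng = random.Random(seed)
    shuffled = list(actual)
    accuracies = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        accuracies.append(accuracy(predicted, shuffled))
    mean = statistics.mean(accuracies)
    sd = statistics.stdev(accuracies)
    return mean - 2 * sd, mean + 2 * sd


# Toy labels: 20 trials with 2 misclassified (hypothetical, not study data).
actual = ["missed"] * 10 + ["pressed"] * 10
predicted = ["missed"] * 9 + ["pressed"] + ["pressed"] * 9 + ["missed"]
observed = accuracy(predicted, actual)
low, high = shuffled_band(predicted, actual)
print(observed, observed > high)
```

An observed accuracy above the upper edge of the shuffled band corresponds to the kind of above-chance prediction marked with an asterisk in the caption.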

Figure 6. Optogenetic Recruitment of Shock-Labeled PFC→NAc Projections Reduces Reward Seeking

(A) Timeline for activity-dependent labeling. Mice injected in mPFC with viral mixture of E-SARE-CreER and DIO-bReaChES-mCherry and implanted with bilateral optical fibers in NAc and VTA (for clarity, schematic illustrates only unilateral injection and implant). (B) Example histology of fiber tip in NAc (white arrow) and mCherry-expressing mPFC axons. Scale: 150 μm. (C) Example RTPP locomotor traces during baseline day (no optogenetic stim) and test day (bReaChES stim of mPFC→NAc shock axons). Orange bar: stim side. (D) Mice spent less time on stim side on test versus baseline day (n = 6 mice, Wilcoxon’s signed-rank test, *p = 0.031). Grey lines, individual mice. (E) No difference in velocity, neutral versus stim side, test day (n = 6 mice, Wilcoxon’s signed-rank test, p = 0.16). (F–I) As in (B)–(E), for mPFC→VTA shock-axon stim. No difference in preference for stim side on baseline versus test day (n = 6 mice; Wilcoxon’s signed-rank test, p = 0.84). No difference in velocity, neutral versus stim side on test day (n = 6 mice, Wilcoxon’s signed-rank test, p = 1). (J) Stim paradigm during lever press. Light delivered during 5 s tone and terminated when lever was pressed or retracted (after 5 s). (K) % trials resulting in lever press during consecutive light OFF, ON, and OFF epochs. Reduction in pressing seen during mPFC→NAc shock-axon stim on test versus baseline day (n = 6 mice; two-way ANOVA interaction, F2,30 = 3.78, p = 0.034; Bonferroni test during light ON, *p = 0.0028). (L) As in (K), for stimulation of mPFC→VTA shock axons. No change in pressing during mPFC→VTA shock-axon stim on test versus baseline day (n = 6 mice; two-way ANOVA interaction, F2,30 = 0.82, p = 0.45). (M) Difference score calculated during “light on” epoch on baseline and stimulation days (Stimulation–Baseline). Difference score for mPFC→NAc shock condition was lower than difference score for all other conditions (n = 6 for mPFC→NAc/VTA shock, n = 6 for mPFC→NAc/VTA vehicle, n = 5 mice for mPFC→NAc home cage; N-way ANOVA F4,28 = 3.76, p = 0.016, Dunnett’s multiple comparisons test for mPFC→NAc shock versus all other conditions, *p < 0.05). All bar graphs plotted as mean ± SEM. See also Figure S7.
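
The difference score in (M) can be read as simple arithmetic on the light-ON epoch. The sketch below assumes the score is the percentage of light-ON trials with a lever press on the stimulation day minus the same percentage on the baseline day; the caption states only "Stimulation–Baseline", so the units and function names are assumptions.

```python
def percent_pressed(trial_outcomes):
    """Percentage of trials in which the lever was pressed (True = pressed)."""
    return 100.0 * sum(trial_outcomes) / len(trial_outcomes)


def difference_score(baseline_light_on, stim_light_on):
    """Stimulation minus baseline, both restricted to the light-ON epoch."""
    return percent_pressed(stim_light_on) - percent_pressed(baseline_light_on)


# Hypothetical light-ON trials for one mouse, not data from the study.
baseline = [True] * 9 + [False]      # 90% pressed on baseline day
stim = [True] * 6 + [False] * 4      # 60% pressed on stimulation day
print(difference_score(baseline, stim))
```

With this convention, a negative score indicates less pressing during the light-ON epoch on the stimulation day than on the baseline day.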
