Neural correlates of weighted reward prediction error during reinforcement learning classify response to cognitive behavioral therapy in depression


Science Advances  31 Jul 2019:
Vol. 5, no. 7, eaav4962
DOI: 10.1126/sciadv.aav4962
  • Fig. 1 CONSORT (Consolidated Standards of Reporting Trials) diagram for patients in the study.

  • Fig. 2 Probabilistic reversal-learning task, computational model comparison, and behavioral fit.

    (A) Each trial began with a jittered interstimulus interval (1 to 4 s) displaying a fixation cross. Two abstract visual stimuli then appeared randomly on either side of the screen for 1.25 s. For each participant, the two stimuli were randomly chosen from a pool of 18 different geometrical shapes. Participants were given 1 s to choose a stimulus via a button press. Following a second jittered interstimulus interval (1 to 4 s), participants were presented with the outcome of their decision for 0.65 s. The outcome was either positive (+10) or negative (−10). To maximize design efficiency, the duration of the jittered interstimulus intervals was optimized using a genetic algorithm. ISI, interstimulus interval. (B) Stimulus-outcome contingencies were asymmetrically skewed (70 to 30%) so that the expected values of the two stimuli were of the same magnitude but of opposite sign. Thus, one stimulus [here referred to as the high-probability stimulus (HPS)] was associated with a greater likelihood of a positive outcome, whereas the other stimulus [here referred to as the low-probability stimulus (LPS)] was associated with a greater likelihood of a negative outcome. Reversals were self-paced and occurred when participants chose the high-probability stimulus five times over the last six trials. To prevent participants from figuring out the underlying reversal rule, we ran a number of buffer trials, drawn at random from a zero-truncated Poisson distribution, before reversing stimulus-outcome contingencies. The stimulus-outcome association strength was chosen to enable detection of reversals. (C) Summed integrated Bayesian Information Criterion (BICint) scores for all models. Lower scores indicate better fit. DYNA stick is the winning model (BICint = 5799). DYNA, Krugel et al.’s model (BICint = 5897); DYNA stick, Krugel et al.’s model with additional choice autocorrelation parameter stick; HGF, hierarchical Gaussian filter (BICint = 6957); PH, Pearce-Hall (BICint = 6875); K1, Kalman filter K1 variant (BICint = 6498). (D) Scatterplot showing linear correlation between the empirical and predicted choice probabilities. r, Pearson’s correlation coefficient; P, P value.
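
The task logic described in (A) and (B) can be sketched as a small simulation. This is an illustrative sketch, not the authors' task code: the function names, the Poisson mean `buffer_mean`, the reading of the reversal rule as "at least five HPS choices in a full six-trial window", and the win-stay/lose-shift agent are all assumptions introduced here.

```python
import math
import random


def zero_truncated_poisson(lam, rng):
    """Draw k >= 1 from Poisson(lam) by redrawing whenever k == 0."""
    while True:
        limit, k, p = math.exp(-lam), 0, rng.random()
        while p > limit:          # Knuth's multiplication method
            k += 1
            p *= rng.random()
        if k > 0:
            return k


def win_stay_lose_shift(last, rng):
    """Hypothetical agent: repeat a rewarded choice, otherwise switch."""
    if last is None:
        return rng.randrange(2)
    choice, outcome = last
    return choice if outcome > 0 else 1 - choice


def simulate_task(n_trials, choose, p_high=0.7, window=6, criterion=5,
                  buffer_mean=2.0, seed=0):
    """Return a list of (choice, hps, outcome) tuples, one per trial.

    buffer_mean is a placeholder; the caption does not report the Poisson rate.
    """
    rng = random.Random(seed)
    hps = 0             # index of the current high-probability stimulus
    recent = []         # 1 where the HPS was chosen, last `window` trials
    buffer_left = None  # buffer trials remaining before a pending reversal
    last = None         # (choice, outcome) of the previous trial
    trials = []
    for _ in range(n_trials):
        choice = choose(last, rng)
        p_win = p_high if choice == hps else 1.0 - p_high
        outcome = 10 if rng.random() < p_win else -10
        trials.append((choice, hps, outcome))
        last = (choice, outcome)
        recent = (recent + [int(choice == hps)])[-window:]
        if buffer_left is None:
            # Assumed rule: criterion met once the window is full.
            if len(recent) == window and sum(recent) >= criterion:
                buffer_left = zero_truncated_poisson(buffer_mean, rng)
        else:
            buffer_left -= 1
            if buffer_left == 0:  # buffer exhausted: swap contingencies
                hps = 1 - hps
                recent = []
                buffer_left = None
    return trials


trials = simulate_task(200, win_stay_lose_shift, seed=1)
n_reversals = sum(a[1] != b[1] for a, b in zip(trials, trials[1:]))
print(n_reversals, "reversals in", len(trials), "trials")
```

Because the buffer length is at least one trial, a reversal never follows immediately after the criterion is met, which is what makes the rule hard to infer from the participant's side.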

  • Fig. 3 Computational parameters.

    Bar plots showing computational model parameter estimates. Mean estimates ± SEM (error bars) of ρ (A), β (B), stick (C), and γ (D) in the responders (blue; n = 19), nonresponders (red; n = 18), nonresponders noncompleters (fuchsia; n = 7), and nonresponders completers (magenta; n = 11) groups. Parameter estimates are shown in their native space (logit for ρ and log for γ and β). White circles represent individual subjects. *P < 0.05 and **P < 0.001. a.u., arbitrary units.
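
Since the estimates are plotted in native space, a reader may want to map them back to the constrained scale. A minimal sketch, assuming the standard inverse transforms (inverse logit for ρ, exponential for γ and β); the function names are hypothetical and no transform is assumed for stick, because the caption does not state one.

```python
import math


def inv_logit(x):
    """Map a logit-space value back to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def to_constrained(rho_logit, gamma_log, beta_log):
    """Convert native-space estimates to their constrained scale."""
    return {
        "rho": inv_logit(rho_logit),   # logit space -> (0, 1)
        "gamma": math.exp(gamma_log),  # log space -> positive reals
        "beta": math.exp(beta_log),    # log space -> positive reals
    }


# Hypothetical native-space values, for illustration only.
print(to_constrained(rho_logit=0.0, gamma_log=0.0, beta_log=0.0))
```

Group differences tested in native space do not translate linearly to the constrained scale, so bar heights in the figure should be compared in the space in which they are plotted.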

  • Fig. 4 Univariate fMRI analysis.

    Top: Contrast image representing between-group differences (responders > nonresponders) of neural encoding of the weighted RPE (P < 0.05 FWE). Responders exhibit greater activity in the right amygdala (A) and right striatum (B). Coordinates are given in the MNI (Montreal Neurological Institute) space. Bottom: Scatterplots [n = 26 (19 responders and 7 nonresponders)] representing robust linear correlation between posttreatment residualized BDI scores (that is, adjusted for the pretreatment BDI score) and subject-specific average parameter estimates extracted from two clusters pertaining to the right amygdala (C) and right striatum (D).
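
The residualized BDI score used in (C) and (D) is the part of the posttreatment score not explained by the pretreatment score. A minimal sketch of one way to compute it, assuming an ordinary least-squares regression of posttreatment on pretreatment BDI (the caption does not state which regression was used); the function name and the example scores are hypothetical.

```python
def residualize(post, pre):
    """Residuals of post after regressing it on pre (simple OLS)."""
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    # Positive residual: higher posttreatment score than baseline predicts.
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Hypothetical scores, for illustration only.
pre_bdi = [20, 25, 30, 35, 28]
post_bdi = [4, 10, 22, 30, 6]
print([round(r, 2) for r in residualize(post_bdi, pre_bdi)])
```

By construction the residuals are uncorrelated with the pretreatment score, so a correlation between them and the neural parameter estimates cannot be driven by baseline severity alone.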

  • Fig. 5 Multivariate fMRI analysis.

    Top: Weight maps obtained by performing feature selection. The right amygdala (A) and right striatum (B) were the most discriminative features. The coronal and axial slices shown here are the same as for the univariate analysis (see Fig. 4) to allow for direct comparison between multi- and univariate analyses. (C) Receiver operating characteristic curves [and respective areas under the curve (AUCs)] for L2-loss and L2-regularized support vector classifier (SVC; blue), L2-regularized logistic regression (LR; purple), and relevance vector machine (RVM; yellow). We used a leave-one-subject-out nested cross-validation scheme and performed hyperparameter tuning using the Nelder-Mead optimization routine. (D) Scatterplot showing significant robust linear correlations between subject-wise likelihood of treatment response as estimated by the RVM classifier and posttreatment BDI percentage change.
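
The structure of a leave-one-subject-out nested cross-validation, and the AUC reported in (C), can be sketched as follows. This is an assumption-laden illustration rather than the authors' pipeline: the function names are hypothetical, the classifiers and the Nelder-Mead tuning step are omitted, and the AUC is computed with the rank-based (Mann-Whitney) formulation.

```python
def nested_loso_splits(subjects):
    """Yield (test, outer_train, inner_splits) for each held-out subject.

    inner_splits is a list of (inner_train, validation_subject) pairs built
    only from outer_train, so the test subject never informs tuning.
    """
    for test in subjects:
        outer_train = [s for s in subjects if s != test]
        inner_splits = [
            ([s for s in outer_train if s != val], val) for val in outer_train
        ]
        yield test, outer_train, inner_splits


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical subject identifiers, for illustration only.
for test, outer_train, inner in nested_loso_splits(["s1", "s2", "s3", "s4"]):
    print(test, outer_train, len(inner))
```

In a full pipeline, hyperparameters would be chosen on the inner splits, the model refit on `outer_train`, and a single score produced for `test`; the held-out scores across all outer folds would then be passed to `roc_auc`.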

  • Table 1 Demographic and clinical characteristics of the sample.

    Means and SD (in parentheses) are shown.

    | | Responders (n = 19) | Nonresponders (all) (n = 18) | Nonresponders (dropouts) (n = 11) | Nonresponders (retained) (n = 7) |
    | Age (years) | 38.99 (12.03) | 39.34 (13.43) | 35.51 (14.33) | 45.35 (9.99) |
    | Gender (male/female) | 9/10 | 10/8 | 6/5 | 4/3 |
    | BDI baseline | 23.89 (8.99) | 31.94 (8.01) | 35.45 (4.27) | 26.42 (9.67) |
    | BDI follow-up | 4 (3.43) | Not available | Not available | 24.28 (12.48) |

Supplementary Materials

    This PDF file includes:

    • fMRI analysis of posttreatment data
    • Fig. S1. Posttreatment activity change.
