Task 4: Reward and value update to lOFC + MCC
thalamic nuclei = modulate communication in lOFC + MCC
--> kind of knew that already because habenula = modulates dopa release of VTA + SNc :D!!
---> if error made = habenula suppresses them, so no dopa even though it was expected by MCC/lOFC = leads them to update (which we see as increased activity haha :D!!)
Unexpected reward (Schultz)
--> dopa increase through limbic loop of BG
---> its effects in humans (Daniel + Pollmann)
--> reward effect on striatum (Willuhn) ❗
-----> super important, shows how the reward cocaine affects BG through spirals (limbic to cognitive/motor :D!!)
addiction not just BG but also brain areas modulated by BG (Kravitz)
Dopa activity + stimulus-value associations + prediction error (Schultz)
- prediction error = Rescorla-Wagner model (LEARNING AND MEMORY LOOLL :D !!!)
- slow firing increase during learning
- depression in firing if negative prediction error
- phasic activity = burst firing/ activity :D
---> the way dopa neurons fire :3!!
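Sketch of the Rescorla-Wagner idea behind the prediction error bullets above (a toy, not Schultz's actual analysis): the error is received reward minus expected value, and the value moves a fraction of that error on each trial. The function name, the learning rate alpha = 0.2 and the 30-trial schedule are made-up assumptions for illustration.

```python
def rescorla_wagner(rewards, alpha=0.2, v0=0.0):
    """Return (values, errors) for one CS over a list of trial outcomes.

    alpha (learning rate) and v0 (starting value) are illustrative assumptions.
    """
    v = v0
    values, errors = [], []
    for r in rewards:
        delta = r - v          # prediction error: received minus expected
        v = v + alpha * delta  # value moves a fraction alpha toward the outcome
        values.append(v)
        errors.append(delta)
    return values, errors


# 30 rewarded trials, then one trial where the reward is omitted
values, errors = rescorla_wagner([1.0] * 30 + [0.0])
print(errors[0])    # first rewarded trial: large positive error
print(errors[29])   # late in learning: error close to zero
print(errors[30])   # reward omitted: negative error
```

The value climbs slowly over trials, and the error at reward time shrinks as the CS takes over the prediction; the omitted reward gives the negative error that matches the firing depression.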
Positive prediction error
Partly paired CS
- CS = low spike activity
- partly predicted reward = medium high spike activity
- --> CS only partly predicts reward (80% prediction error)
- uncertainty = if it's uncertain whether CS predicts reward (US), dopa firing slowly builds up during the delay, and if reward is present = spikes strongly :D !!
❗ --> modulated by habenula? (task 6: pre-SMA response conflict + uncertainty :D!!)
Fully paired CS
- CS = high spike activity
- fully predicted reward = no spike activity
- --> cause no prediction error !!
Unpaired CS
- CS = no activity
- Unexpected reward = high spike activity
- --> cause 100% prediction error :D!!
Negative prediction error
- CS = fully predicts reward, but no reward happens
- CS = high spike activity
- time of absent reward = activity reduced below baseline (depression!)
- TIME SENSITIVE !! --> already happens if reward/US is moved by only 500 ms :D!!
Special cases
--> blocking (à la Rescorla-Wagner)
--> aversive stimuli
blocking
--> if another stimulus (CS2) is added after CS1 is fully paired, CS2 is fully blocked because no prediction error happens: reward is already fully predicted by CS1 :D!!
---> if the blocked one is presented alone and followed by reward = 100% prediction error, because it was never associated with reward haha
- conditioned inhibitor CS = CS given alongside a fully predicting CS, but this inhibitor CS signals = no reward
- if predictive CS is given together with inhibitor CS = NO CS SPIKING AND NO REWARD SPIKING (because no reward given)
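Blocking and conditioned inhibition both fall out of the Rescorla-Wagner rule once all cues present on a trial share one prediction error. A minimal sketch under assumptions: the `train` helper, the cue names, alpha = 0.3 and the trial counts are invented for illustration, not taken from the studies.

```python
def train(trials, alpha=0.3, weights=None):
    """Rescorla-Wagner with several cues.

    trials: list of (cues_present, reward) pairs.
    All cues present on a trial share the same prediction error.
    """
    w = dict(weights or {})
    for cues, reward in trials:
        prediction = sum(w.get(c, 0.0) for c in cues)  # summed prediction of cues present
        delta = reward - prediction                    # shared prediction error
        for c in cues:
            w[c] = w.get(c, 0.0) + alpha * delta
    return w


# Blocking: CS1 fully paired first, then CS1+CS2 compound with the same reward.
w = train([(("CS1",), 1.0)] * 50)
w = train([(("CS1", "CS2"), 1.0)] * 50, weights=w)
print(w)  # CS1 close to 1, CS2 stays close to 0 (blocked)

# Blocked CS2 alone followed by reward: almost the full reward is unpredicted.
error_cs2_alone = 1.0 - w["CS2"]
print(error_cs2_alone)

# Control: compound trained from scratch, so both cues split the prediction.
w_control = train([(("CS1", "CS2"), 1.0)] * 50)
print(w_control)

# Conditioned inhibition: CS1 alone -> reward, CS1 + inhibitor -> no reward.
w_inh = train([(("CS1",), 1.0), (("CS1", "INH"), 0.0)] * 200)
print(w_inh)  # inhibitor ends up negative and cancels CS1's prediction
```

In this toy the inhibitor gets a negative weight, so the compound predicts roughly zero reward and the missing reward produces no error, which fits the "no CS spiking, no reward spiking" bullet.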
aversive stimuli
- time of aversive stimuli = dopa firing reduced below baseline (depression) !!
Spike activity increase for CS = proportional to value of expected reward
Spike activity increase for reward/US = depends on prediction error / how expected the reward was :D !!
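The two summary lines above can be put into a toy calculation: CS response tracks expected value, response at reward time tracks the prediction error. This is only a sketch under assumptions; `dopamine_response`, the 0.5 pairing strength and the unit reward magnitude are hypothetical, and the slow build-up during uncertainty is not modeled.

```python
def dopamine_response(p_reward, reward_delivered, magnitude=1.0):
    """Toy phasic responses (relative to baseline) at CS time and at US time.

    p_reward: how strongly the CS predicts reward (0 = unpaired, 1 = fully paired).
    """
    expected = p_reward * magnitude        # value signalled by the CS
    cs_response = expected                 # CS burst scales with expected value
    outcome = magnitude if reward_delivered else 0.0
    us_response = outcome - expected       # prediction error at reward time
    return cs_response, us_response


cases = {
    "unpaired CS, reward": (0.0, True),
    "partly paired CS, reward": (0.5, True),
    "fully paired CS, reward": (1.0, True),
    "fully paired CS, no reward": (1.0, False),
}
for name, (p, delivered) in cases.items():
    cs, us = dopamine_response(p, delivered)
    print(f"{name}: CS {cs:+.1f}, US {us:+.1f}")
```

A negative US value stands for the dip below baseline when a fully predicted reward is omitted.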
Ventral striatum important for all kinds of learning :D
--> evidence fMRI studies
--> important for cocaine study below, because it starts with input to VS (motivation) but later activity is only in DS (behavior) / cognitive loops :3 !!
Learning + ventral striatum (basal ganglia)
--> depends on cortex input (glutamate)
--> depends on SNc input (dopamine)
----> called "three-factor learning rule" :D!! (bridge to task 3!!)
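One common way to write a three-factor style rule, as a rough sketch: the weight of a cortex-to-striatum synapse changes only when cortical input, striatal activity and a dopamine signal coincide. The notes only name the glutamate and dopamine inputs; treating postsynaptic striatal activity as the third factor, the multiplicative form, and the names `three_factor_update` and `lr` are assumptions here.

```python
def three_factor_update(w, pre, post, dopamine, lr=0.1):
    """Update one corticostriatal weight (illustrative form only).

    pre: cortical (glutamate) input activity
    post: striatal neuron activity (assumed third factor)
    dopamine: phasic dopamine relative to baseline (+ burst, - dip, 0 none)
    """
    return w + lr * pre * post * dopamine


w = 0.5
print(three_factor_update(w, pre=1.0, post=1.0, dopamine=+1.0))  # strengthened
print(three_factor_update(w, pre=1.0, post=1.0, dopamine=0.0))   # unchanged
print(three_factor_update(w, pre=1.0, post=1.0, dopamine=-1.0))  # weakened
print(three_factor_update(w, pre=0.0, post=1.0, dopamine=+1.0))  # no cortical input: unchanged
```

The product form means dopamine alone changes nothing; it only gates plasticity at synapses that were active.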
Primary vs secondary reinforcers
--> fMRI studies possible because dopa injections increased activity in the same areas as in monkeys (VTA + SNc)
- targeting these areas = also increased activity to expected reward + prediction error :D !!
- although most studies only find activity in ventral striatum (VS)
- --> all increase activity in VS, in accordance with grey box and reader above :D!!
Primary rewards
--> food, drink (juice), sex, safety
--> unconditioned
Secondary reinforcers
--> value only by being paired with primary rewards
Social rewards
-> smiling faces (emoji as well :'D ?)
cognitive feedback
--> feedback how well you did (works on social approval)
Learning through observation
--> increased activity in VS only if the observed knowledge is later important for the person watching, or if the observed person is perceived as similar
Monetary rewards
--> well money haha
--> NAcc = more active compared to cognitive reward / feedback (see grey box on right!)
Striatum + drug addiction
- from VMS (limbic loop) to DLS (sensorimotor loop)
- needs VMS activity (lesioning prevents DLS activity forming on same side)
- DLS dopa inhibition = more reward seeking / cocaine taking (both reinforced = active port, and not reinforced = inactive port!)
Design
- rats, 3 weeks + recording of VMS + DLS
- could get as much cocaine as they wanted each day for 1 hour!
- active port =
- cocaine (reinforced)
- 20 s CS (during these 20 s the port cannot be reactivated = nonreinforcement)
- inactive port =
- nothing (nonreinforcement)
---> reinforced responding (active port) stayed at the same level during the 3 weeks!!, while non-reinforced responding (inactive port / during 20 s CS) = went down :3!! (except in the DLS dopa inhibition part of the study :p !!)
Ventromedial Striatum (VMS)
- strong dopa activity after nosepoke in active port (cocaine)
- decreased gradually during the 3 weeks
- nothing happened for inactive port
- if cocaine given unexpectedly (not caused by nosepoking the active port) = response as strong as if they had actively taken cocaine (active port)
- makes sense because prediction error = huge :D !! so naturally strong dopa response
Dorsolateral Striatum (DLS)
- dopa increase begins in week 2 for active port!
- no difference between week 2 and 3 though! (sneaky people!!!)
- nothing happens if inactive port is pressed
- if unexpected cocaine = response as strong as if active port was pressed
- makes sense because of prediction error
Role of DLS (dopa inhibition study)
- dopa inhibitor into DLS either in week 1 or week 3
- increase in cocaine seeking (active port!) in both conditions (reinforced)
- increase in pressing inactive port / active port during 20 s CS presentation (non-reinforced)
--> DLS = mediates consumption / behavior / habit of drug use even in absence of feedback from VMS (automaticity of reward / drug seeking :3 !)
Role of VMS in generating DLS activity (lesion study)
- lesion to right VMS
- ipsilateral / same side =
- no DLS activity to cocaine at all (active port / getting it off schedule)
- contralateral / opposite side of lesion =
- normal DLS response (increase in dopa activity starting 2nd week + when getting cocaine off schedule)
- ---> VMS = needed for DLS response development !!
VMS = motivation phase, feedback of cocaine = required
--> after a few weeks
DLS = doesn't need feedback from cocaine anymore to engage reward seeking behavior (can be seen in dopa inhibition study, where it becomes habit; but really this study gives no evidence for that, just that sensorimotor = important for drug habit. They say so but cite another study for it :P !! so yeah haha)
-----> they say DLS = regulates automaticity of drug taking / reward seeking :3 !!
Fun fact: VMS = core of the nucleus accumbens :o !!! (cool for task 3 :D!!)
Dopa receptors in striatum + drugs, brain areas
- D1 receptor = direct pathway in BG (center of center-surround model)
- stimulation = increases reward seeking behavior + making correct value choice (task 3 BG)
- D2 receptor = indirect pathway in BG (surround of center-surround model)
- fewer D2 receptors in addiction
- stimulation = decreases reward seeking behavior
- in line with fewer D2 receptors and also less active dlPFC in drug abuse! because less active dlPFC = less active cognitive loop as well
vlPFC = ILC (infralimbic cortex) in monkeys
- S-R learning + habit behavior
- inhibition = reduced habit learning for cocaine + reduced moving to get cocaine?
dmPFC = PLC (prelimbic cortex) in monkeys
- instrumental conditioning + goal directed behavior
- inhibition = reduced cocaine seeking