Inst-Conditioning - Neurobiology
Instrumental Behaviours
We learn about predictive relationships by interacting with our environment through instrumental behaviours.
Instrumental behaviours are ones we perform voluntarily. They are called “instrumental behaviours” because the performance of an action is “instrumental” in obtaining an outcome.
Instrumental learning occurs as a result of performing actions and learning about their consequences.
Instrumental actions are initially very cognitively demanding > Each behaviour is directed towards a specific goal > goal-directed.
Over time, when the same actions are performed over and over, they become habitual.
Goal-directed actions
Goal-directed actions have two main characteristics.
They rely on the presence of a causal relationship with their consequences > contingency requirement
meaning that you perform an action because you know that doing so will produce the particular outcome you have in mind.
They depend on the value attributed to their consequences > goal requirement
meaning that the outcome of performing that action is valuable to you.
Goal-directed actions are said to be driven by action-outcome, or A-O, associations.
Goal-directed actions are essential to survival, allowing us to interact with the environment, and they are flexible.
Cognitively demanding
Habits
Habits help ease cognitive load
Habits emerge after goal-directed actions have been repeated over and over; i.e., after extensive training.
Performance of a habitual response depends neither on its causal relationship with the outcome nor on the value of that outcome > habits are insensitive to the contingency and goal requirements that characterize goal-directed actions.
Habits are driven by their antecedent stimuli > environmental and internal cues drive the performance of the habitual response.
Habits are therefore characterized as being driven by stimulus-response, or S-R, associations.
They occur without cognitive oversight, freeing up resources for other tasks.
They are essential for us to function efficiently in a fast-paced world, where we have to perform actions and make decisions constantly and very rapidly.
They are also important for refining and perfecting motor skills, and they are stable and long-lasting.
Habits are relatively inflexible, and it can be difficult to refrain from performing them once they’ve been established.
The S-R association that leads to the performance of habits can also lead to maladaptive behaviours such as addiction (consuming drugs regardless of their negative consequences).
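The contrast between A-O and S-R control described above can be sketched as a toy model. This is only an illustration, not a model from the source: the function names, lever and cue labels, and all numbers are hypothetical. It shows that lowering the value of an outcome changes the goal-directed choice but leaves the habitual response untouched.

```python
# Toy sketch: every name and number here is a hypothetical illustration.

def goal_directed_choice(action_outcome, outcome_value):
    """A-O control: pick the action whose outcome is currently most valued."""
    return max(action_outcome, key=lambda action: outcome_value[action_outcome[action]])


def habitual_choice(stimulus, sr_strength):
    """S-R control: pick the response most strongly tied to the stimulus."""
    responses = sr_strength[stimulus]
    return max(responses, key=responses.get)


action_outcome = {"left_lever": "sucrose", "right_lever": "pellets"}
outcome_value = {"sucrose": 1.0, "pellets": 0.8}
sr_strength = {"chamber_cues": {"left_lever": 0.9, "right_lever": 0.4}}

before = (
    goal_directed_choice(action_outcome, outcome_value),
    habitual_choice("chamber_cues", sr_strength),
)

# Devalue sucrose; only the outcome value changes, not the S-R strengths.
outcome_value["sucrose"] = 0.0

after = (
    goal_directed_choice(action_outcome, outcome_value),
    habitual_choice("chamber_cues", sr_strength),
)

print("before devaluation:", before)
print("after devaluation:", after)
```

The habitual chooser never consults `outcome_value`, which is the sense in which habits are insensitive to the goal requirement.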
Assessing instrumental behaviours in the lab
In the lab, we can assess instrumental behaviours by training rodents, such as rats, to perform actions to obtain outcomes.
Often, we train them to press a lever in a conditioning chamber to gain access to solid food or sucrose solution.
These outcomes are delivered into what’s called a magazine, which is a food dispenser in the wall of the conditioning chamber. Most of the time, these chambers are equipped with two different levers on opposite sides of the food magazine, so two different actions can be performed to get to different outcomes. The chambers also have a variety of visual and auditory stimuli.
Goal-directed actions vs habits
A fundamental question in instrumental conditioning is whether a behaviour is goal-directed or habitual.
Goal-directed behaviours are sensitive to outcome value and to the contingency between the action and the outcome. Habits, on the other hand, are not. Therefore, to determine whether behaviours are goal-directed or habitual, we can manipulate each of these.
Manipulating the value of the outcome is known as outcome devaluation.
Consists of 3 stages.
First, rats are trained to perform two different actions for two different outcomes.
The second stage is to devalue one of the outcomes, using one of two methods.
Sensory-specific satiety:
The animal is given free access to one of the foods, either the pellets or the sucrose solution, for a period of time, typically 30 minutes to an hour. These hungry rats usually gorge themselves on the outcome and, after filling up on it, probably don’t want any more of that specific food, but could still eat something else.
The test is done immediately after sensory-specific satiety.
Conditioned taste aversion (CTA):
One of the outcomes is given freely to the rats for some period of time, such as 30 minutes to an hour. They usually happily fill up on it. But immediately after eating it, they are injected with a substance called lithium chloride, which makes them feel mildly sick. This process is repeated over a few days, after which the food that has been paired with sickness loses its appeal.
The test is done the day after the last CTA pairing.
The third stage is a choice test, where the rats are given the opportunity to press the two levers.
Successful devaluation is shown when the rats select the “valued” lever (the one that delivered the food outcome they were not sated on, or the one whose outcome was not paired with lithium chloride) over the devalued lever (the one that delivered the food outcome that has now been devalued).
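The choice test described above can be summarised with a simple preference score. The sketch below is an assumption for illustration only: `devaluation_index` is a hypothetical helper, not a measure named in the source, and the press counts are made up. A positive score means more responding on the valued lever; a score near zero means equivalent responding on both levers.

```python
def devaluation_index(valued_presses, devalued_presses):
    """Hypothetical score in [-1, 1]: (valued - devalued) / total presses."""
    total = valued_presses + devalued_presses
    if total == 0:
        raise ValueError("no lever presses recorded in the choice test")
    return (valued_presses - devalued_presses) / total


# Made-up press counts for two illustrative rats.
goal_directed_rat = devaluation_index(valued_presses=30, devalued_presses=10)
habitual_rat = devaluation_index(valued_presses=20, devalued_presses=20)

print(goal_directed_rat)  # prefers the valued lever
print(habitual_rat)       # responds equally on both levers
```

In practice, sensitivity to devaluation is judged by comparing responding statistically across animals, so a single score like this is only a convenient summary.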
Manipulating the contingency between the action and outcome is called contingency degradation.
For example, rats might initially learn that pressing the right lever produces food pellets and pressing the left lever produces sucrose solution.
In such positive contingencies, the probability of earning the outcome given performance of the action is greater than the probability of gaining the outcome in the absence of the action.
We can degrade the contingency between one action and its outcome by arranging it so that the outcome is freely delivered without the rat having to press the lever.
When contingencies are degraded, goal-directed animals will reduce or cease performing the action with the degraded contingency. So, if the left lever earns sucrose solution and the right lever earns pellets, but sucrose solution then becomes freely delivered, goal-directed rats will reduce their responding on the left lever.
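The probability comparison behind a positive contingency, and what free outcome delivery does to it, can be made concrete. The sketch below assumes the session is split into time bins, each recorded as a pair (action performed, outcome delivered). The helper name `delta_p` and the bin counts are hypothetical. It computes P(outcome | action) minus P(outcome | no action).

```python
def delta_p(observations):
    """Return P(outcome | action) - P(outcome | no action).

    `observations` is a sequence of (acted, outcome) boolean pairs,
    one per time bin (a hypothetical recording format).
    """
    observations = list(observations)
    with_action = [outcome for acted, outcome in observations if acted]
    without_action = [outcome for acted, outcome in observations if not acted]
    if not with_action or not without_action:
        raise ValueError("need bins both with and without the action")
    p_outcome_given_action = sum(with_action) / len(with_action)
    p_outcome_given_no_action = sum(without_action) / len(without_action)
    return p_outcome_given_action - p_outcome_given_no_action


# Intact contingency: sucrose only ever follows a lever press.
intact = [(True, True)] * 5 + [(True, False)] * 5 + [(False, False)] * 10

# Degraded contingency: sucrose is also delivered freely, without a press.
degraded = (
    [(True, True)] * 5 + [(True, False)] * 5
    + [(False, True)] * 5 + [(False, False)] * 5
)

print(delta_p(intact))
print(delta_p(degraded))
```

With these made-up counts the difference is positive while the contingency is intact and falls to zero once the outcome is just as likely without the press.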
If animals show sensitivity to these manipulations, they are said to be goal-directed; if not, they are said to be habitual.
From actions to habits
Goal-directed actions develop into habits in one of two ways
schedules of reinforcement
These relationships between the performance of an action and the rate of reinforcement are called schedules of reinforcement.
Rats that were trained on ratio schedules showed a devaluation effect: those that had the pellet devalued by conditioned taste aversion showed a decrease in responding compared with the valued group, i.e. the rats that were given saline injections. By contrast, rats that were trained on interval schedules showed insensitivity to outcome devaluation: they responded equivalently whether the outcome had been paired with sickness or not.
ratio
Ratio schedules determine outcome delivery based on a specified number of responses.
The more the animal performs the action, the more rewards it will receive.
interval
Interval schedules deliver an outcome upon a response, but only after a specified period of time has elapsed.
Goal-directed behaviours rely on the feedback between the response rate and reinforcement rate.
Under interval schedules, the rate of responding does not necessarily correlate with the rate of reward delivery, particularly if the specified interval is long and the rate of responding is high. Thus, interval schedules have been shown to produce habits more rapidly than ratio schedules. This is because interval schedules provide less feedback between the response and reinforcement rate.
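The difference in feedback between the two schedule types can be illustrated with a small deterministic simulation. This is a sketch under stated assumptions: it uses fixed-ratio and fixed-interval versions for simplicity (experiments often use random or variable schedules), responses are evenly spaced, and all function names and parameter values are hypothetical.

```python
def rewards_on_ratio(n_responses, ratio):
    """Fixed ratio: one reward for every `ratio` responses."""
    return n_responses // ratio


def rewards_on_interval(response_times, interval):
    """Fixed interval: reward the first response made once `interval`
    seconds have elapsed since the session start or the last reward."""
    rewards = 0
    next_available = interval
    for t in sorted(response_times):
        if t >= next_available:
            rewards += 1
            next_available = t + interval
    return rewards


def evenly_spaced(n_responses, session_length):
    """Response times spread evenly over the session (a simplification)."""
    return [session_length * (i + 1) / n_responses for i in range(n_responses)]


SESSION = 600  # seconds, hypothetical session length

for n in (60, 600):  # slow versus fast responding
    ratio_rewards = rewards_on_ratio(n, ratio=10)
    interval_rewards = rewards_on_interval(evenly_spaced(n, SESSION), interval=60)
    print(n, "responses:", ratio_rewards, "ratio rewards,",
          interval_rewards, "interval rewards")
```

Under these assumptions, a tenfold increase in responding yields ten times as many rewards on the ratio schedule but no extra rewards on the interval schedule, which is the weak response-reinforcement feedback described above.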
overtraining
As actions are repeatedly performed, they become more and more automatic and habitual.
Rats that received 2 and 5 days of training showed a devaluation effect: they responded more on the valued lever than the devalued lever. Rats that were given 20 days of training did not show the devaluation effect: they responded equivalently on the devalued and valued levers.
Overtrained rats were insensitive to outcome devaluation.
Neural Structures
Neural structures of goal-directed actions
Goal-directed behaviours require the interaction of several brain regions > the amygdala, the prelimbic region (PL) of the medial prefrontal cortex, and the dorsal striatum.
Other regions, such as the ventral striatum, the mediodorsal thalamus and the orbitofrontal cortex, have also been implicated in processes involved in goal-directed behaviour.
The striatum is the rodent homologue of the human caudate/putamen. It receives dense projections from the cortex and processed information from the thalamus.
The medial aspect of the dorsal striatum (the dorsomedial striatum, DMS) receives projections from the prelimbic cortex, as well as from the basolateral amygdala (BLA).
This medial aspect of the dorsal striatum, the PL and the BLA are all important for different aspects of goal-directed behaviours.
Neural structures of habits
The infralimbic cortex, which lies just ventral to the prelimbic cortex; the dorsolateral striatum, which lies just lateral to the DMS; and the central amygdala, which sits medial to the BLA.