Chapter 6 – Schedules of Reinforcement

Author: Zachary Herz
Course: Expl Psych Learning
Institution: Binghamton University

Definition

• We’ve focused on perfect contingency
• Schedule of reinforcement: a rule determining whether a response will be followed by a reinforcer
• Schedules influence how a response is learned and maintained

Simple Schedules

• A single factor determines the occurrence of the reinforcer
• Ratio schedules: reinforcement depends upon the number of responses performed
  o Continuous reinforcement – every time the response occurs, so does the reinforcer
    § Every time someone smiles, a person says “I like your smile”
    § Every time you start your car, the car turns on
  o Partial reinforcement – the response is reinforced only some of the time
    § Example: saying “I like your smile” only one out of every four times

Simple Schedules – Fixed Ratio

• Fixed ratio – a fixed number of responses is required to produce the reinforcer

  o Continuous reinforcement (FR1) – produces steady, moderate responding
  o Partial reinforcement (FR50) – produces vigorous responding
• Fixed ratio characteristics
  o Post-reinforcement pause – a decrease in responding just after the reinforcer
  o Ratio run – a high, steady rate of responding that completes the ratio
  o Ratio strain – a rapid increase in the fixed-ratio requirement results in long pre-reinforcement pauses
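To make the ratio rule concrete, here is a minimal sketch of a fixed-ratio schedule (my own illustration, not from the lecture; the class and names are assumptions):

    # Minimal fixed-ratio (FR) schedule: reinforce every Nth response.
    # FR1 is continuous reinforcement; FR50 is partial reinforcement.

    class FixedRatio:
        def __init__(self, ratio):
            self.ratio = ratio    # responses required per reinforcer
            self.count = 0        # responses since the last reinforcer

        def respond(self):
            """Register one response; return True if it earns the reinforcer."""
            self.count += 1
            if self.count >= self.ratio:
                self.count = 0    # reset; the post-reinforcement pause happens here
                return True
            return False

    fr1 = FixedRatio(1)    # continuous reinforcement: every response pays off
    fr50 = FixedRatio(50)  # partial reinforcement: only every 50th response pays off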
Simple Schedules – Variable Ratio

• In a fixed-ratio schedule, the subject knows how many responses are needed
• BUT reinforcement is often unpredictable
  o Example 1: slot machines
  o Example 2: working on commission
• Variable ratio – a different number of responses is required for each reinforcer
  o Example – VR10, with the criterion order 9, 6, 11, 14
    § The average number of responses needed is 10
• Characteristics of VR schedules vs. FR schedules:
  o Fewer post-reinforcement pauses
  o Fewer ratio runs
  o More resistance to ratio strain

Simple Schedules – Interval Schedules

• Interval schedules – responses are reinforced if they occur after a certain amount of time
  o Fixed interval schedule – the time between reinforcers is constant
    § Example – washing clothes in a washing machine
  o Variable interval schedule – the time between reinforcers is variable
    § Example – waiting for the dealership to fix your car
• Characteristics of fixed interval schedules
  o Responses cluster around reinforcer delivery (the FI scallop)
    § A silly freshman (not you) might wait to study until just before the exam
  o Depend upon the ability to perceive time
    § Ferster and Skinner (1957) – visual stimuli increased “scalloping”

Simple Schedules – Variable Interval

• Characteristics of VI schedules
  o VI schedules produce steady, stable rates of responding
  o Once the interval has passed, the next response will be reinforced (whether the interval is 2 min or 90 min)
  o Limited hold – a restriction on the length of time a reinforcer remains available
    § Example – seeing the sunset on the beach
    § Example – a lion waiting for an impala
    § A limited hold adds some degree of regularity to the schedule
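The interval rule can be sketched the same way (again my own illustration; the limited-hold handling is one reasonable reading of the definition above):

    # Minimal interval schedule: the first response AFTER the interval has
    # elapsed is reinforced. An optional limited hold makes the reinforcer
    # expire if the response comes too long after it was set up.

    class IntervalSchedule:
        def __init__(self, interval, limited_hold=None):
            self.interval = interval          # seconds until the reinforcer is set up
            self.limited_hold = limited_hold  # seconds the reinforcer stays available
            self.start = 0.0                  # time the current interval began

        def respond(self, t):
            """Register a response at time t; return True if it is reinforced."""
            elapsed = t - self.start
            if elapsed < self.interval:
                return False                  # too early: responding earns nothing
            if self.limited_hold is not None and elapsed > self.interval + self.limited_hold:
                self.start = t                # reinforcer expired; a new interval begins
                return False
            self.start = t                    # reinforced; the next interval begins
            return True

    fi2min = IntervalSchedule(120)            # FI 2 min: constant interval
    # A VI schedule would redraw self.interval around a mean after each
    # reinforcer, e.g. random.expovariate(1 / 120) for a VI 2 min.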

Ratio vs. Interval Schedules

• Inter-response time (IRT): the interval between responses
  o If short IRTs are reinforced, responding increases
  o If long IRTs are reinforced, responding decreases
• Ratio schedules depend upon response accumulation: the faster responses accumulate, the sooner the criterion is met and reinforcement occurs
• Interval schedules provide no advantage for increased responding until around the time of reinforcer delivery

Response-Rate Schedules

• Require a certain number of responses at a specified rate
• Example: assembly line work
  o Work too slow – slow down the line – get fired
  o Work too fast – piss off others – greased glove
  o Work just right – team player – paycheck
• Differential reinforcement of high rates (DRH)
  o Responses are reinforced only if enough of them accumulate before a given time
    § Encourages a high rate of responding
    § Example – on DRH12, a rat must press the lever 12 or more times within the time limit in order to receive a reinforcer
• Differential reinforcement of low rates (DRL)
  o Encourages a low rate of responding
  o Example – on DRL3, a pigeon may peck 3 or fewer times per minute for reinforcement (both rules are sketched below)
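One way to picture the two response-rate rules, as a sketch under the definitions above (the one-minute window is taken from the DRL example; function names are mine):

    # Response-rate schedules judge how MANY responses occur in a time window.
    # DRH: reinforce only if the count reaches a threshold (high rates win).
    # DRL: reinforce only if the count stays at or below it (low rates win).

    def drh(responses_in_window, threshold):
        """DRH: e.g. DRH12 -> reinforce 12 or more responses in the window."""
        return responses_in_window >= threshold

    def drl(responses_in_window, threshold):
        """DRL: e.g. DRL3 -> reinforce 3 or fewer responses in the window."""
        return responses_in_window <= threshold

    print(drh(14, 12))  # True: rapid pressing meets the DRH12 criterion
    print(drl(5, 3))    # False: 5 pecks in the minute is too many for DRL3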

Choice Behavior – Concurrent Schedules

• Often, we choose between response-reinforcer pairs based on their consequences
• Common technique for studying choice:
  o Skinner box – 2 keys or levers
  o Concurrent schedules – 2 reinforcement schedules available at once
    § Mr. Pigeon has a choice
      • Key A – VI60
      • Key B – FR10
      • He may choose both!

Choice Behavior – Measuring Choice

• Choice is influenced by the schedule of reinforcement
  o If both schedules are FR5, pigeons are predicted to spend equal time at each key, given equal reinforcement
  o The relative rate of reinforcement (rr) is calculated the same way as the relative rate of responding (RR), as shown below
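Written out explicitly (the B/r notation is mine, not the lecture’s), with B = responses made and r = reinforcers earned on each key:

    RR_A = \frac{B_A}{B_A + B_B} \qquad rr_A = \frac{r_A}{r_A + r_B}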

Choice Behavior – Matching Law

• Matching law – the relative rate of responding (RR) on a given alternative is approximately equal to the relative rate of reinforcement (rr) earned on that alternative
• The relative rates of responding MATCH the relative rates of reinforcement (see the equations below)
• The matching law is about choice
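In equation form (the standard statement of the law, in the notation above), together with the generalized form whose sensitivity (s) and bias (b) parameters are unpacked below:

    \frac{B_A}{B_A + B_B} = \frac{r_A}{r_A + r_B}
    \qquad \text{generalized:} \qquad
    \frac{B_A}{B_B} = b \left( \frac{r_A}{r_B} \right)^{s}

With s = 1 and b = 1 the generalized form reduces to strict matching; s < 1 gives undermatching and s > 1 gives overmatching.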

• The equation is affected by 3 variables:
  o Sensitivity (s) – the tendency to choose a particular schedule, despite loss of reinforcement
    § Example: monkeys have a choice between VI3 and VI6
    § Undermatching – choice responding is less extreme than predicted, s < 1.0
      • The matching law predicts a 2:1 ratio; choice < 2:1 (e.g., 1:1)
      • Choosing VI3 and VI6 equally
    § Overmatching – choice responding is more extreme than predicted, s > 1.0
      • The matching law predicts a 2:1 ratio; choice > 2:1 (e.g., 3:1)
      • Choosing VI3 even more than expected relative to VI6

  o Bias (b) – natural tendencies toward one of the alternative responses and/or reinforcers (allows comparison between studies)
    § b > 1.0 – A is more preferred
    § b < 1.0 – A is less preferred
    § Response bias – how would you rather respond?
    § Reinforcer bias – what would you rather receive?
  o Reinforcer value – features of the reinforcer influence the rate of responding
    § Ex 1 – amount
    § Ex 2 – palatability
    § Ex 3 – immediacy
• Vollmer and Bourret (2000) – basketball
  o Players choose between
    § RRa (3-pointer): farther shot but more points (rrA)
    § RRb (2-pointer): closer shot but fewer points (rrB)
  o Shots taken (RR) were proportional to the shooting percentage of those shots (rr)
    § i.e., good long-range shooting teams took more 3-pointers
  o Similar results in football (Reed et al., 2006), with rushing and passing plays
  o UNDERSTANDING WHICH SCHEDULE THEY ARE USING AS THEIR CHOICE

Mechanisms

• The study of choice behavior suggests that we maximize levels of reinforcement
• Levels of choice:
  o Molecular: individual responses
    § Molecular maximizing – we choose the response that is best at a single point in time
    § Example – which key light does a pigeon choose in a single instant, given 2 alternatives?
  o Molar: the summed distribution of responses
    § Molar maximizing – we choose the response that maximizes reinforcement over the long run
    § Example – how many presses does a rat make on each lever over 3 days in a Skinner box?
  o Melioration: local rates of responding
    § We alter responses based on local rates of reinforcement
    § Choices improve the immediate situation
    § Local rate: the response rate during the period a subject devotes to a particular alternative
      • Example: a rat presses the lever 60 times in 60 min, but all presses occur in the first 30 min (see the quick calculation after this list)
        o Overall rate = 1/min, BUT the local rate for the first 30 min = 2/min
        o The overall rate is the molar perspective
      • Example: choosing the more expensive beer during the reduced-price 7–8 pm happy hour
        o Overall rate for Galaxy Andromeda = 1/hr; local rate for Andromeda = 3/hr
  o Molecular -> Melioration -> Molar
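A quick check of the local-rate arithmetic from the lever-pressing example (the minute-by-minute layout is my own framing of the lecture’s numbers):

    # Overall (molar) vs. local response rates: 60 presses in a 60-min
    # session, all of them made during the first 30 minutes.

    presses_per_min = [2] * 30 + [0] * 30   # 60 presses total, all early

    overall_rate = sum(presses_per_min) / len(presses_per_min)  # 60/60 = 1.0/min
    local_rate = sum(presses_per_min[:30]) / 30                 # 60/30 = 2.0/min

    print(overall_rate)  # 1.0 press/min: the molar perspective
    print(local_rate)    # 2.0 presses/min: the local rate melioration tracks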

Choice Commitment and Self-Control

• Concurrent chain schedule – used to test choice and self-control
• Choice link: A or B? Not reinforced
• Terminal link: reinforced
  o A leads to VR10
  o B leads to FR10
  o After 10 minutes in the terminal link, the subject can return to the choice link, because this is a free-operant procedure
  o THE TERMINAL LINKS ARE REINFORCED
• The subject is committed to a schedule in the terminal link
  o Do they prefer VR10 or FR10? Why?
    § VR10 is preferred because the ratio is more novel and unpredictable than FR10
• Self-control: choosing a large, delayed reward over an immediate small reward
  o Self-control is easier if the tempter is not readily available
• Rachlin and Green (1972) – self-control procedures
  o Direct choice procedure – pigeons prefer the small immediate reward over the large, delayed reward
  o Concurrent chain – with delays in the chain, pigeons choose the large, delayed reward
  o Consistent across species


Consequences of the Value Discounting Function (VDF)

• V = M / (1 + KD), where V is the value of a reward, M its magnitude, D the delay to it, and K the rate of discounting
• As reward value decays over time (KD grows), choice shifts (a worked example follows this list):
  o T0 (onset) – the value of the “large” reward is greater; no decay yet
  o T1 (early) – the immediate small reward is preferred when the large reward’s value decays with delay, as in the “direct choice” procedure
  o T2 (late) – at long delays, the large reward retains relatively more value and is preferred, as in the “concurrent chain” procedure

Choice Commitment and Self-Control

• Madden et al. (1997): “Stay off the Horse”
  o Subjects: controls and heroin users
  o Choice between $1,000 at later times (1 week to 25 years) or smaller amounts right away; heroin users chose the smaller, immediate amounts
  o In heroin users, K is large and the decay is steep – decreased self-control
  o When K is small, the function is shallow – increased self-control
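A quick worked check of this preference reversal (the reward sizes, delays, and K are made-up numbers for illustration): a small reward of 2 units is always available 10 s before a large reward of 3 units, and both are discounted by V = M / (1 + KD):

    # Hyperbolic value discounting: V = M / (1 + K * D).

    def value(M, K, D):
        return M / (1 + K * D)

    K = 0.5
    for wait in (0, 1, 20):             # time from now until the SMALL reward
        small = value(2, K, wait)       # small, sooner reward
        large = value(3, K, wait + 10)  # large reward arrives 10 s later
        pick = "large" if large > small else "small"
        print(f"wait={wait:>2}s  small={small:.2f}  large={large:.2f}  -> {pick}")

    # wait= 0s: small=2.00, large=0.50 -> small  (immediate temptation wins, like T1)
    # wait=20s: small=0.18, large=0.19 -> large  (both far off, like T2 / concurrent chain)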



Can Self-Control Be Trained?

• Eisenberger and Adornetto (1986):
  o Subjects: 2nd and 3rd graders
  o Pretest: choice between
    § 2 cents immediately
    § 3 cents at day’s end
  o Training: reinforcement with delay
    § Group 1: correct answers earned 2 cents immediately
    § Group 2: correct answers earned 3 cents at day’s end
  o At post-test, delay training increased choice of the larger, delayed reward

