Chapter 6 in Learning - Lecture Notes 4
Author: Katherine Seguin
Course: Fundamentals of Learning
Institution: Concordia University


Chapter 6: Schedules of Reinforcement and Choice Behavior (PSYC 351)

Casual reflection suggests that every occurrence of the instrumental response invariably results in delivery of the reinforcer. In fact, such a perfect response-reinforcer relation is rare in the real world.

Ex: you don’t always get a good grade even when you study hard.



In most cases, the relation between instrumental responses and consequent reinforcement is rather complex. Laboratory investigations have been examining how these complex relations determine the rate and pattern of instrumental behavior.



A SCHEDULE OF REINFORCEMENT is a program or rule that determines which occurrence of a response is followed by the reinforcer. There are an infinite number of ways that such a program could be set up.



The delivery of a reinforcer could depend on the occurrence of a certain number of responses, the passage of time, the presence of certain stimuli, the occurrence of other responses, or any combination of these and other factors.



Despite this variety, cataloging schedules is quite manageable: reinforcement schedules that involve similar relations between responses and reinforcers usually produce similar patterns of behavior.



The exact rate of responding may differ from one situation to another, but the pattern of behavior is highly predictable. This regularity has made the study of reinforcement schedules interesting and useful.



Schedules of reinforcement influence both how an instrumental response is learned and how it is then maintained by reinforcement. They are highly relevant to the motivation of behavior.



Ex: managers who have to make sure their employees continue to perform a job after having learned it



Laboratory studies of schedules of reinforcement are typically conducted using a Skinner box that has a clearly defined response that can occur repeatedly, so that changes in the rate of responding can be readily observed and analyzed.



The focus, here, is the timing and repetition of the operant response and NOT how a rat lever presses.



Class notes. In a perfect world, there is always milk for cereal, the most wrinkled dollar bills are accepted by machines, stickers always peel off cleanly, etc.

Simple Schedules of Intermittent Reinforcement 

In simple schedules, a single factor determines which occurrence of the instrumental response is reinforced. That factor can be how many responses have occurred or how much time has passed before the target response can be reinforced.



Class notes. When reinforcement is delivered after a behavior influences whether or not that behavior is learned, how it is learned, and how it is maintained.



A schedule of reinforcement is the rule that determines how and when a response will be reinforced.



Though there are many ways to arrange such rules, most can be categorized into a limited number of types that share similar relations between responses and reinforcers.

Reinforcement Rates 

CONTINUOUS reinforcement: reinforcing every correct response (the most efficient way to condition a new response, but the response extinguishes quickly once reinforcement stops; rare in real life).


PARTIAL reinforcement: reinforcing some, but not all, responses (more effective at maintaining or increasing the rate of response).

Partial Reinforcement Schedules 

Different schedules produce distinct rates and patterns of responses & varying degrees of resistance to extinction.



TWO basic types:
1. Ratio: requires that a certain number of responses be made before one is reinforced.
2. Interval: requires that a certain amount of time must pass before a reinforcer is given.



TWO basic categories:
1. Fixed (consistent number)
2. Variable (number varies)

Ratio Schedules 

The defining characteristic of a RATIO SCHEDULE is that reinforcement depends only on the number of responses the organism has performed.



A ratio schedule requires merely counting the number of responses that have occurred and delivering the reinforcer each time the required number is reached.



If the required number is one, every response results in delivery of the reinforcer. Such a schedule is called a CONTINUOUS REINFORCEMENT (CRF). Can also be referred to as an FR 1.



Ex: drug addicts trying to get clean go to a clinic several times a week to be tested for drug use. If the test indicates that they have not used drugs since the last visit, they receive a voucher, which can be exchanged for money.



This can also occur in everyday things: unlocking the car door enables you to get in, entering the correct code at the ATM enables you to withdraw money.



Situations in which responding is reinforced only some of the time are said to involve PARTIAL REINFORCEMENT or INTERMITTENT REINFORCEMENT.



Ex: the lock on your door malfunctions, you don’t have enough money in your bank account.



Class notes. The delivery of reinforcement depends on the number of responses performed. Therefore, a ratio between work and reinforcement is established.

Fixed-Ratio Schedules 

Consider delivering the reinforcer after every 10th lever-press response in a laboratory study with rats. Here, there is a fixed ratio between the number of responses the rat makes and the number of reinforcers it gets (i.e.: 10 responses per reinforcer).



This makes the procedure a FIXED-RATIO SCHEDULE (FR). More specifically, the procedure would be called a fixed-ratio 10 or FR 10.
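The counting rule that defines an FR schedule can be sketched in a few lines of Python. This is a minimal sketch for illustration only; the class and method names are not from the text:

```python
class FixedRatio:
    """Fixed-ratio schedule: every `ratio`-th response is reinforced."""

    def __init__(self, ratio):
        self.ratio = ratio  # responses required per reinforcer (FR 1 = continuous reinforcement)
        self.count = 0      # responses since the last reinforcer

    def respond(self):
        """Register one response; return True if it earns the reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0  # reset the counter after reinforcement
            return True
        return False

# FR 10: only every 10th lever press is reinforced
fr10 = FixedRatio(10)
outcomes = [fr10.respond() for _ in range(20)]
print(outcomes.count(True))  # 2 reinforcers for 20 responses
```

On this sketch, setting ratio to 1 reinforces every response, which matches the definition of continuous reinforcement (FR 1) above.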




Class notes. Reinforcement is given if the subject completes a pre-set number of responses.

A continuous reinforcement schedule is also an FR schedule: continuous reinforcement involves a fixed ratio of one response per reinforcer. Organisms typically respond at a steady and moderate rate on this schedule; only brief and unpredictable pauses occur. 

A very different pattern of responding occurs when an FR schedule is in effect that requires a larger number of responses. There is a steady and high rate of responding once the behavior gets under way but there may be a pause before the start of the required number of responses.



This is evident in a CUMULATIVE RECORD of behavior. This is a special way of representing how a response is repeated over time. It shows the total/cumulative number of responses that have occurred up to a particular point in time.



When Ferster and Skinner did their research on schedules of reinforcement, cumulative records were obtained with the use of a CHART RECORDER (check image on p.158 & notes p.9): 

The recorder consisted of a rotating drum that pulled paper out of the recorder at a constant speed.



A pen rested on the surface of the paper. If no responses occurred, the pen remained at the same level and made a horizontal line as the paper came out of the machine.



If the pigeon performed a key-peck response, the pen moved one step vertically on the paper. Because each key-peck response caused the pen to move one step up the paper, the total vertical distance traveled by the pen represented the cumulative/total number of responses the participant made.



Because the paper came out of the recorder at a constant speed, the horizontal distance on the cumulative record provided a measure of how much time has elapsed in the session.



The slope of the line made by the cumulative recorder represents the participant’s RATE OF RESPONDING (number of responses per unit of time).



The cumulative record provides a complete visual representation of when and how frequently the participant responds during a session.
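Because the cumulative record is just a running total of responses over time, the rate of responding over any window is simply the slope of that record. A minimal Python sketch of the arithmetic (the function names are illustrative):

```python
def cumulative_record(response_times, t):
    """Height of the cumulative record at time t: total responses made so far."""
    return sum(1 for r in response_times if r <= t)

def response_rate(response_times, start, end):
    """Slope of the cumulative record over [start, end]: responses per unit time."""
    rise = cumulative_record(response_times, end) - cumulative_record(response_times, start)
    return rise / (end - start)

# a subject that presses once per second for 10 s, then pauses
presses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(response_rate(presses, 0, 10))   # 1.0 while responding (steep slope)
print(response_rate(presses, 10, 20))  # 0.0 during the pause (flat record)
```

A steep segment of the record thus corresponds to a ratio run, and a flat segment to a post-reinforcement pause.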



Animals usually stop responding briefly after each food delivery; afterwards, they resume the behavior at a high and steady rate.



The zero rate of responding that typically occurs just after reinforcement on a fixed ratio schedule is called the POST-REINFORCEMENT PAUSE. The high and steady rate of responding that completes each ratio requirement is called the RATIO RUN.



If the ratio requirement is increased a little (i.e.: from FR 120 to FR 150), the rate of responding during the ratio run may remain the same. However, with higher ratio requirements, longer post-reinforcement pauses occur.



If the ratio requirement is suddenly increased a great deal (i.e.: FR 120 to FR 500), the animal is likely to pause periodically before the completion of the ratio requirement. This is called the RATIO STRAIN.



In extreme cases, the ratio strain may be so great that the animal stops responding altogether. To avoid ratio strain during training, one must be careful not to raise the ratio requirement too quickly when approaching the desired FR response requirement.


Class notes. Ratio strain is a pause during the ratio run, following a sudden significant increase in the ratio requirement (i.e.: FR 5 to FR 50). *check image in notes p.10 

Although the pause that occurs before a ratio run in FR schedules is historically called the post-reinforcement pause, research has shown that the length of the pause is controlled by the upcoming ratio requirement.

Variable-Ratio Schedule 

In an FR schedule, a predictable number of responses or amount of effort is required for each reinforcer. This predictability can be disrupted by varying the number of responses required for reinforcement from one occasion to the next.



Ex: working at a car wash where you have to work on cars of different sizes. This is still a ratio schedule because washing each car still depends on a set number of responses or amount of effort. However, a different number of responses is now required to obtain successive reinforcers.



Such a procedure is called a VARIABLE-RATIO SCHEDULE (VR).



Ex: require a pigeon to make 10 responses to earn the first reinforcer, 13 to earn the second, 7 for the next one, etc. Such a schedule requires on AVERAGE 10 responses per reinforcer and would be a variable-ratio 10 schedule (10 + 13 + 7 = 30, 30/3= 10, VR 10).
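The VR 10 arithmetic above can be turned into a short sketch in which the schedule cycles through a preset list of requirements, one plausible way of programming such a schedule (the class name is illustrative):

```python
from itertools import cycle

class VariableRatio:
    """Variable-ratio schedule defined by a preset list of requirements.

    The schedule's value is the average of the list: [10, 13, 7] -> VR 10.
    """

    def __init__(self, requirements):
        self.requirements = requirements
        self._upcoming = cycle(requirements)
        self.required = next(self._upcoming)  # responses needed for the next reinforcer
        self.count = 0

    @property
    def value(self):
        return sum(self.requirements) / len(self.requirements)

    def respond(self):
        """Register one response; return True if it earns the reinforcer."""
        self.count += 1
        if self.count == self.required:
            self.count = 0
            self.required = next(self._upcoming)  # unpredictable from the subject's view
            return True
        return False

vr10 = VariableRatio([10, 13, 7])
print(vr10.value)  # 10.0 -> a VR 10 schedule
print(sum(vr10.respond() for _ in range(30)))  # 3 reinforcers for 30 responses
```

The subject cannot tell which requirement is in effect, which is why responding stays steady rather than falling into a pause-run pattern.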



VR schedules are found in daily life whenever an unpredictable amount of effort is required to obtain a reinforcer.



Ex: gamblers, they never know when they will win so they continue to play.



Because the number of responses required for reinforcement is not predictable, predictable pauses in the rate of responding are less likely with VR schedules than with FR schedules.



Post-reinforcement pauses can occur on VR schedules but such pauses are longer and more prominent with FR schedules.



The overall response rates on FR and VR schedules are similar, provided that, on average, similar numbers of responses are required.



However, the overall response rate tends to be distributed in a pause-run pattern with FR schedules, whereas a steady pattern of responding is observed with VR schedules.



Class notes. The number of responses required to get each reinforcer is not fixed; it varies around an average. The reinforcer is therefore less predictable (the subject doesn't know when it is coming, so it works extra hard for it), and regular pauses in responding are less likely. The numerical value of the ratio indicates the average number of responses required per reinforcer.

Interval Schedules 

A response is reinforced only if the response occurs after a certain amount of time has passed. This is the case for INTERVAL SCHEDULES.

Fixed-Interval Schedules 

In a simple interval schedule, a response is reinforced only if it occurs more than a set amount of time after a reference point: the last delivery of the reinforcer or the start of the trial.


In a FIXED-INTERVAL SCHEDULE (FI), the amount of time that has to pass before a response is reinforced is constant from one trial to the next. 

Class notes. A response is only reinforced if a constant or fixed amount of time has elapsed from the previous delivery of a reinforcer.



Ex: a washing machine, a fixed amount of time is required to complete the wash cycle. No matter how many times you open the washing machine before the required time has passed, you will not be reinforced with clean clothes. Only once the cycle is finished will the clothes be clean and ready for you to take them out.



This can also be done in a laboratory setting: e.g., a fixed interval of 4 minutes before a pigeon's response earns the reinforcer.



As the time for the availability of the next reinforcer draws closer, the response rate increases. This increase in response rate is evident as an acceleration in the cumulative record toward the end of each fixed interval and is called the FIXED-INTERVAL SCALLOP.



Class notes. The fixed-interval scallop: as the end of the interval approaches, the rate of responding increases. Once the time is up, the very next response is reinforced (the contingency aspect).



It is important to realize that an FI schedule does not guarantee that the reinforcer will be provided at a certain point in time. Pigeons on an FI 4-min schedule do not automatically gain access to grain every four minutes. The interval determines only when the reinforcer becomes available, not when it is delivered. To receive the reinforcer after it has become available, the participant still has to make the instrumental response.
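The distinction between a reinforcer becoming available and being delivered can be made concrete in a short Python sketch. It assumes the interval is timed from the last delivery (or the start of the trial); the class name is illustrative:

```python
class FixedInterval:
    """Fixed-interval schedule: the first response made at least `interval`
    time units after the last reinforcer (or trial start) is reinforced."""

    def __init__(self, interval):
        self.interval = interval
        self.reference = 0.0  # trial start, updated at each delivery

    def respond(self, t):
        """Response at time t; return True if the reinforcer is delivered."""
        if t - self.reference >= self.interval:
            self.reference = t  # the next interval is timed from this delivery
            return True
        return False

# FI 4 min (240 s)
fi = FixedInterval(240)
print(fi.respond(100))  # False: the reinforcer is not yet available
print(fi.respond(250))  # True: first response after 240 s is reinforced
print(fi.respond(300))  # False: a new 240-s interval started at t = 250
```

Note that if no response ever occurs, nothing is delivered: availability at 240 s does not by itself produce the reinforcer.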



Ex: students spend little effort studying at the beginning of the semester or just after the midterm. Rather, they begin to study a week before each exam, and the rate of studying rapidly increases as the day of the exam approaches.

Variable-Interval Schedule 

Interval schedules can also be unpredictable. With a VARIABLE-INTERVAL SCHEDULE (VI), the time required to set up the reinforcer varies from one trial to the next.



The subject has to respond to obtain the reinforcer that has been set up, but now the set-up time is not as predictable.



VI schedules are found in situations where an unpredictable amount of time is required to prepare the reinforcer.



Ex: a mechanic who cannot tell you how long it will take to fix your car.



As with FI schedules, the participant has to perform the instrumental response to obtain the reinforcer. Reinforcers are not just given because a certain amount of time has passed.



They are given if the individual responds after the variable interval has timed out. VI schedules maintain steady and stable rates of responding without regular pauses.



Class notes. A response is reinforced only if it occurs more than a variable amount of time after the delivery of an earlier reinforcer. As in VR schedules, the reinforcer is less predictable, so the subject shows a steady rate of responding.

Interval Schedules and Limited Hold


In simple interval schedules, once the reinforcer becomes available, it remains available until the required response is made, no matter how long that may take. 

Ex: on an FI 2-min schedule, the reinforcer becomes available 2 minutes after the start of the schedule cycle. If the animal responds at exactly this time, it will be reinforced. If it waits and responds 90 minutes later, it will still get reinforced. Once the reinforcer has been set up, it remains available until the response occurs.



Interval schedules can instead be set up so that the reinforcer remains available only for a limited time after it has been set up; if the response does not occur within that period, the opportunity for reinforcement is lost. This kind of restriction on how long a reinforcer remains available is called a LIMITED HOLD. Limited-hold restrictions can be added to either FI or VI schedules.
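A limited hold can be sketched by giving each availability period an expiry. This sketch makes one illustrative assumption not stated in the notes: after a window lapses unclaimed, the next interval is timed from the end of that window. The class name is hypothetical:

```python
class FixedIntervalLimitedHold:
    """FI schedule with a limited hold: the reinforcer becomes available
    `interval` time units after the reference point, but stays available
    only for `hold` units; a later response misses that opportunity."""

    def __init__(self, interval, hold):
        self.interval = interval
        self.hold = hold
        self.reference = 0.0

    def respond(self, t):
        """Response at time t; return True if it falls inside an open window."""
        # assumption: any window that lapsed unclaimed starts a fresh interval
        while t >= self.reference + self.interval + self.hold:
            self.reference += self.interval + self.hold
        if t - self.reference >= self.interval:  # inside the hold window
            self.reference = t
            return True
        return False

# FI 120 s with a 10-s limited hold
s = FixedIntervalLimitedHold(120, 10)
print(s.respond(125))  # True: within the 120-130 s window
print(s.respond(260))  # False: the window at 245-255 s lapsed unclaimed
```

Compared with a plain interval schedule, slow responding now costs reinforcers, which is why limited holds increase response rates.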

Four Basic Schedules of Reinforcement



The VR schedule produces both the highest rates of responding (because the very next response may lead to reinforcement) and the greatest resistance to extinction.



Ex: nagging and whining for something from parents can be hard to extinguish if it is reinforced here and there (VR).



Interval schedules (FI and VI) produce slower rates of responding.

For each schedule of reinforcement: response rate, pattern of responses, and resistance to extinction.

Fixed-ratio: Very high response rate. Steady responding with low ratios; brief pause after each reinforcement with very high ratios. The higher the ratio, the more resistant to extinction.

Variable-ratio: Highest response rate. Constant response pattern, no pauses. Most resistant to extinction.

Fixed-interval: Lowest response rate. Long pause after reinforcement followed by gradual acceleration. The longer the interval, the more resistant to extinction.

Variable-interval: Moderate response rate. Stable, uniform responding. More resistant to extinction than an FI schedule with the same average interval.

Comparison of Ratio and Interval Schedules 



Similarities: 

Fixed schedules (FR and FI) both produce a post-reinforcement pause after each delivery of the reinforcer, and high rates of responding just before the delivery of the next reinforcer.

Variable schedules (VR and VI) both produce a steady response rate without predictable pauses.

Difference: 

Ratio and interval schedules produce different rates of responding even when the reinforcement frequency is similar.

Evidence for a fundamental difference between ratio and interval schedules was provided by an experiment by Reynolds: 

Compared the rate of key pecking in pigeons reinforced on VR and VI schedules.



Two pigeons were trained to peck the response key for food reinforcement. One of the birds was reinforced on a VR schedule. Therefore, for this bird the frequency of reinforcement was entirely determined by how many responses it made.



The other bird was reinforced on a VI schedule. To make sure that the opportunities for reinforcement would be identical for the two birds, the VI schedule was controlled by the behavior of the bird reinforced on the VR schedule.



Each time the VR pigeon was just one response short of the requirement for reinforcement on that trial, the experimenter set up the reinforcer for the VI bird.



With this arrangement, the next response made by each bird was reinforced. Thus, the frequency of reinforcement was virtually identical for the two animals.



Results: even though the birds received the same frequency and distribution of reinforcers, they behaved very differently. The VR pigeon responded ...

