Notes on Prisoners Dilemma Tutorial PDF

Title Notes on Prisoners Dilemma Tutorial
Author Ali Karim
Course Microeconomics 3
Institution University of Strathclyde
Pages 7
File Size 174 KB
File Type PDF


Description

EC315 Topics in Microeconomics with Cross Section Econometrics
Tutorial Worksheet 2
The Prisoners’ Dilemma

The purpose of tutorials is to cement your learning in lectures by applying the knowledge gained there, giving you the opportunity to develop a better understanding of fundamental concepts. Please come to tutorials prepared to participate in discussing the questions “for discussion in class” in small groups.

Questions for discussion in class:

(1) Two individuals, Alice and Bob, have to decide on whether to throw their litter on the floor, or put it in a bin. Both individuals have a preference for a clean environment, but each dislikes the cost of finding a bin. If both individuals litter, their payoff is zero. The benefit to each individual from any one of them putting litter in the bin is 3, but the cost to an individual from doing so is 4.

(a) Justify why this game can be represented by the following normal form.

                    Alice
                 Bin        Litter
Bob   Bin        2,2        -1,3
      Litter     3,-1       0,0

(In each cell the payoffs are listed as Bob, Alice.)
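As a cross-check on the table, the payoffs can be rebuilt from the stated benefit (3 to each player for every person who uses the bin) and cost (4, borne only by whoever finds a bin). The following is a minimal Python sketch; the function and variable names are illustrative and do not come from the worksheet.

```python
BENEFIT = 3  # benefit to EACH player for every person who uses the bin
COST = 4     # private cost of finding a bin

ACTIONS = ("Bin", "Litter")


def payoff(own, other):
    """Payoff to a player choosing `own` when the other player chooses `other`."""
    bins_used = (own == "Bin") + (other == "Bin")  # 0, 1 or 2
    return BENEFIT * bins_used - (COST if own == "Bin" else 0)


# Keys are (Bob's action, Alice's action); values are (Bob's payoff, Alice's payoff).
table = {(b, a): (payoff(b, a), payoff(a, b)) for b in ACTIONS for a in ACTIONS}

for cell, payoffs in table.items():
    print(cell, payoffs)
```

Changing BENEFIT or COST shows how the dilemma depends on the cost of finding a bin exceeding the private benefit but not the joint benefit.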

If both litter the payoffs are zero, as noted. If one finds a bin whilst the other litters then the benefit is 3 to both players, but there is a cost of 4 to the one that found the bin. If both find a bin, the benefit is 3 × 2 = 6 to each player, and each incurs a cost of 4.

(b) Explain why this game has the features of a prisoners’ dilemma. What is the equilibrium of the game?

Each player has a dominant strategy (litter), but when both play it the outcome is Pareto inferior to the outcome when both use the alternative strategy (bin). By littering, each individually receives a higher payoff but does harm to the other player. Thus, this is a prisoners’ dilemma, where cooperating is finding a bin, and defecting is littering. The dominant strategy equilibrium is that both players litter, with payoffs of 0 each.

(c) Suppose now that each player perceives that there is a ‘social norm’ that littering is frowned upon in society. If a person does litter, they incur a psychological cost of 2 payoff units since their behaviour contrasts with the social norm. Construct the new normal form with the modified payoffs, and consider what the equilibria of this game are. How do social norms change the situation?

With the modified payoffs the game is now

                    Alice
                 Bin          Litter
Bob   Bin        2,2          -1,3-2
      Litter     3-2,-1       0-2,0-2

(In each cell the payoffs are listed as Bob, Alice.)

There is a single equilibrium, which is that both players cooperate. Notice that for this to be an equilibrium the social norm needs to be strong enough: the psychological cost must reduce the payoff from littering sufficiently to deter defection.

(d) Now suppose that if an individual discovered that they had littered whilst the other individual had found a bin they feel bad about this. The psychological cost that the individual feels is given by half of the harm done to the other individual by their actions (i.e. half of the reduction in their adversary’s payoff). Modify the payoffs in the original game (ignore social norms for this part of the question), construct the new normal form and deduce what the equilibria of the game are. How do ‘social preferences’ change the situation?

The game is now

                    Alice
                 Bin                           Litter
Bob   Bin        2,2                           -1, 3 − (1/2)(2 − (−1))
      Litter     3 − (1/2)(2 − (−1)), -1       0,0

(In each cell the payoffs are listed as Bob, Alice.)
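The equilibria of the three versions of the littering game (original, social norm, social preferences) can be checked by brute force. The sketch below is one way to do this in Python; the helper names are hypothetical, and it relies on the game being symmetric so that a single payoff function serves both players.

```python
from itertools import product

ACTIONS = ("Bin", "Litter")


def base(own, other):
    """Original game: benefit 3 per bin used, cost 4 for using the bin oneself."""
    return 3 * ((own == "Bin") + (other == "Bin")) - (4 if own == "Bin" else 0)


def social_norm(own, other):
    """Part (c): littering carries a psychological cost of 2."""
    return base(own, other) - (2 if own == "Litter" else 0)


def social_preference(own, other):
    """Part (d): a lone litterer bears half the harm done to the other player."""
    if own == "Litter" and other == "Bin":
        harm = base("Bin", "Bin") - base("Bin", "Litter")  # 2 - (-1) = 3
        return base(own, other) - harm / 2
    return base(own, other)


def pure_nash(payoff):
    """Profiles (Bob, Alice) from which neither player gains by deviating alone."""
    equilibria = []
    for bob, alice in product(ACTIONS, repeat=2):
        bob_ok = all(payoff(bob, alice) >= payoff(d, alice) for d in ACTIONS)
        alice_ok = all(payoff(alice, bob) >= payoff(d, bob) for d in ACTIONS)
        if bob_ok and alice_ok:
            equilibria.append((bob, alice))
    return equilibria


for name, game in [("original", base), ("social norm", social_norm),
                   ("social preferences", social_preference)]:
    print(name, pure_nash(game))
```

The check only covers pure strategies; it is not meant to find any mixed-strategy equilibrium.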

The game with social preferences has the structure of a stag hunt, in which there are two equilibria: both find a bin, or both litter. Individuals want to cooperate in such an environment, and cooperation can be the outcome so long as individuals have sufficient confidence that others will cooperate.

(2) Henry Ford famously introduced the ‘$5 work day’ at the Ford Motor Co. when everyone else in the industry was paying a wage of $3. The idea was that it would encourage workers to work hard, generating increased profit for the company. Upon receiving a wage, workers decide whether to work hard or to shirk. If a worker works hard this generates a profit per worker (before accounting for wage costs) of 7 per day, whereas if a worker shirks the profit per day (before wage costs) is 4. Assume that a worker receives a net payoff of w − 1 if they work hard (working hard is costly) and w if they shirk, where w is the wage received.

(a) Fully justify why this scenario can be modelled using the following normal form game. If the game is played just once, what is the likely outcome? Why?

                     Henry Ford
                   5          3
Worker   Work      2,4        4,2
         Shirk     -1,5       1,3

(In each cell the payoffs are listed as Henry Ford, Worker.)
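The entries of this table follow directly from the numbers in the question: Ford earns the per-worker profit (7 if the worker works hard, 4 if the worker shirks) minus the wage, and the worker earns the wage minus an effort cost of 1 when working hard. A short Python sketch of that arithmetic, with illustrative names that are not part of the worksheet:

```python
PROFIT = {"Work": 7, "Shirk": 4}       # profit per worker per day, before wages
EFFORT_COST = {"Work": 1, "Shirk": 0}  # cost to the worker of working hard
WAGES = (5, 3)


def ford_payoff(wage, effort):
    """Ford's payoff: profit generated by the worker minus the wage paid."""
    return PROFIT[effort] - wage


def worker_payoff(wage, effort):
    """Worker's payoff: w - 1 when working hard, w when shirking."""
    return wage - EFFORT_COST[effort]


# Keys are (worker's effort, Ford's wage); values are (Ford's payoff, worker's payoff).
table = {
    (effort, wage): (ford_payoff(wage, effort), worker_payoff(wage, effort))
    for effort in PROFIT
    for wage in WAGES
}

for cell, payoffs in table.items():
    print(cell, payoffs)
```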

Justification of the normal form is straightforward (note that the analysis works equally well if the wage is set first). Inspection of the payoffs reveals that each player has a dominant strategy: HF is always better off paying 3 rather than 5, and the worker is always better off shirking rather than working. As such, if the game is played only once, we should expect that HF pays workers only 3 and they shirk. Notice, however, that both players could be made better off if HF pays 5 and the worker works.

(b) Suppose the relationship between Henry Ford and a worker continues indefinitely. Might it be possible that Henry Ford was correct to introduce a high wage far above that paid in the rest of the industry? [Hint: study whether tacit cooperation is possible and whether it is easy in the repeated game.]

If the game is repeated indefinitely, there is the chance for dynamic tacit relationships to emerge. If the worker believes that HF starts by paying 5 but then reverts to paying 3 if one instance of shirking is observed (a grim trigger strategy), the worker would continue to work if

    4/(1 − δ_w) > 5 + 3δ_w/(1 − δ_w),

or δ_w > 1/2, so r_w < 1 (where r is the discount rate, with δ = 1/(1 + r)). This is likely to be the case, so from the worker’s perspective cooperation is possible. If the worker believed HF was playing tit-for-tat, they would continue to cooperate if 1 < 2δ_w, or again δ_w > 1/2, so r_w < 1 (the 1 is the bonus payoff from a one-period defection (5 − 4), the 2 is the cost of restoring cooperation (4 − 2)). Since the worker’s discount rate is likely to be less than this, cooperation is likely to be sustained, even with beliefs about being punished only weakly.

We now need to check that HF will cooperate as well. Suppose he believes the worker is playing a grim trigger strategy: if HF were to ‘cheat’ and pay 3 then the worker shirks forever more. Then HF will choose to pay 5 so long as

    2/(1 − δ_HF) > 4 + 1·δ_HF/(1 − δ_HF),

or δ_HF > 2/3, so r_HF < 1/2. If tit-for-tat is HF’s belief, then he will continue to pay 5 so long as 2 < 3δ_HF, or again δ_HF > 2/3, so r_HF < 1/2. HF needs a higher discount factor than the worker in order to cooperate, but his discount factor is likely to fall into this range and so cooperation is likely. So yes, he was correct! He paid high wages and, so long as the workers had some belief that they would be punished for slacking, they worked hard, on the basis that he would be punished if he cheated on the workers by paying them low wages.

(c) Thinking practically, what elements of the environment would be more conducive to the $5 work day being a success at the Ford Motor Co.?

The ‘hard’ factors of changes in payoffs are important in sustaining cooperation: lower benefits from defecting and higher punishments for cheating will be more conducive to cooperation being sustained. But there are other ‘softer’ factors of the environment that are also conducive to sustaining cooperation, such as HF being able to identify shirking in order to punish it; if that’s not possible (e.g. because employees work in teams) then a worker might not believe they will be punished for defecting on cooperation.

(3) Consider a scenario where two individuals, Robert and Stuart, are undertaking a joint project. The value of the project depends on both Robert’s effort (x) and Stuart’s effort (y), and is given by V(x, y) = 4(x + y) + xy. Each individual receives the full value from the joint project, but incurs a cost of effort, which is 3x² for Robert and 3y² for Stuart. [This is a rather lengthy question, but it is well worth the effort!]

(a) Suppose that each player has the choice between either 1 or 2 units of effort, which they decide simultaneously. After calculating the value of the project and the payoff to each player from each combination of strategies, construct the normal form that represents this game. Explain the nature of the game that is played by considering the incentives the players face.

The value of the project under different effort combinations: V(1, 1) = 9; V(1, 2) = V(2, 1) = 14; V(2, 2) = 20. Taking into account the cost of effort, the normal form is thus


                   Robert
                 1          2
Stuart   1       6,6        11,2
         2       2,11       8,8

(In each cell the payoffs are listed as Stuart, Robert.)
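The cell entries come from V(x, y) = 4(x + y) + xy less each player's own effort cost of three times their effort squared. A small Python sketch of the calculation, using illustrative function names that are not part of the worksheet:

```python
EFFORTS = (1, 2)


def value(x, y):
    """Value of the joint project, V(x, y) = 4(x + y) + xy."""
    return 4 * (x + y) + x * y


def payoff(own, other):
    """A player's payoff: full project value minus own effort cost 3 * own^2."""
    return value(own, other) - 3 * own ** 2


# Keys are (Stuart's effort, Robert's effort); values are (Stuart's payoff, Robert's payoff).
table = {(s, r): (payoff(s, r), payoff(r, s)) for s in EFFORTS for r in EFFORTS}

for cell, payoffs in table.items():
    print(cell, payoffs)
```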

The game is a prisoners’ dilemma: each player has a dominant strategy to choose low effort, but mutually better payoffs can be achieved by undertaking high effort. In pursuing their own self-interest individuals exert negative externalities on each other, leading to mutual harm.

(b) Now suppose that both Robert and Stuart have complete freedom over their effort choices (i.e. effort is a continuous strategy), which are again chosen simultaneously. Deduce the reaction function of each player and illustrate using an appropriate diagram.

Robert’s payoff is π_R = 4(x + y) + xy − 3x², which he seeks to maximise over his choice of x, taking y as given. The first-order condition is 4 + y − 6x = 0 and therefore his reaction function is x̂(y) = 2/3 + (1/6)y. Similarly, Stuart’s payoff is π_S = 4(x + y) + xy − 3y², so the first-order condition is 4 + x − 6y = 0 and his reaction function is ŷ(x) = 2/3 + (1/6)x. These reaction functions should be plotted in (x, y) space, where they will intersect (which identifies the Nash equilibrium).

(c) Explaining your reasoning, show that the Nash equilibrium effort levels in this game are 4/5 units of effort for each player. What is the equilibrium value of the partnership? What are the players’ payoffs in equilibrium?

The Nash equilibrium is found by substituting ŷ(x) into the expression for x̂(y), which after rearranging gives x* = 4/5; substituting this back into ŷ(x) gives y* = 4/5 as well. The equilibrium value of the partnership is therefore V* = 176/25 and the equilibrium payoff for each player is 128/25 = 5.12.

(d) Consider next a situation where there is a third party, Tony, whose objective is to maximise the total payoff of Robert and Stuart (social welfare, if you like). After writing down Tony’s objective function, show that Tony would recommend that both Robert and Stuart use 2 units of effort, and calculate their payoffs at this ‘social optimum’. Compare and contrast with the Nash equilibrium.

[Hint: you need to differentiate Tony’s objective function first with respect to Robert’s strategy and set this equal to zero; then with respect to Stuart’s strategy and set this equal to zero; then find the value of Robert and Stuart’s strategies that satisfy these two equations.]

Tony’s objective function is the sum of Robert and Stuart’s payoffs, i.e. J = 2(4(x + y) + xy) − 3x² − 3y², which he seeks to maximise over x and y. The first-order conditions are ∂J/∂x = 8 + 2y − 6x = 0 and ∂J/∂y = 8 + 2x − 6y = 0. Solving these equations gives x̃ = 2 and ỹ = 2. The value generated is Ṽ = 20 and the players’ payoffs are π_R = 8 and π_S = 8. At the social optimum, both players put in more effort, and their payoff is higher, compared to the Nash equilibrium.

(e) Consider a game in which the individuals choose whether to ‘cooperate’ and exert the effort that Tony suggests, or to ‘defect’ on this agreement. Unilateral defection involves choosing an effort level that is a best response to the cooperative level of effort used by the other player. If both individuals defect they use the Nash equilibrium effort levels. Construct the normal form for this game and explain its features.

Robert’s best response to Stuart’s cooperative effort is x̂(2) = 1; the players’ payoffs with these strategies are π_R(1, 2) = 11; π_S(2, 1) = 2. Stuart’s best response to Robert’s cooperative effort is ŷ(2) = 1; π_S(1, 2) = 11; π_R(2, 1) = 2. Hence, the normal form is

                          Robert
                     Cooperate     Defect
Stuart   Cooperate   8,8           2,11
         Defect      11,2          5.12,5.12

(In each cell the payoffs are listed as Stuart, Robert.)
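The numbers in this table can be reproduced with exact fractions from the reaction function, the Nash equilibrium effort and the social optimum derived in parts (b) to (d). The following Python sketch does so; the names are illustrative, and the symmetric shortcuts (solving x = 2/3 + x/6 and 8 + 2x − 6x = 0) assume, as in the worked answer, that both players choose the same effort in each case.

```python
from fractions import Fraction


def payoff(own, other):
    """Payoff 4(own + other) + own*other - 3*own^2 to the player choosing `own`."""
    return 4 * (own + other) + own * other - 3 * own ** 2


def best_response(other):
    """Reaction function from the first-order condition 4 + other - 6*own = 0."""
    return Fraction(2, 3) + Fraction(1, 6) * other


# Symmetric Nash equilibrium: x = 2/3 + x/6, so x = (2/3) / (1 - 1/6).
x_nash = Fraction(2, 3) / (1 - Fraction(1, 6))

# Symmetric social optimum: 8 + 2x - 6x = 0, so x = 8/4.
x_opt = Fraction(8, 4)

# Unilateral defection: best response to the other player's cooperative effort.
x_defect = best_response(x_opt)

# Keys are (Stuart's choice, Robert's choice); values are (Stuart's payoff, Robert's payoff).
table = {
    ("Cooperate", "Cooperate"): (payoff(x_opt, x_opt), payoff(x_opt, x_opt)),
    ("Cooperate", "Defect"): (payoff(x_opt, x_defect), payoff(x_defect, x_opt)),
    ("Defect", "Cooperate"): (payoff(x_defect, x_opt), payoff(x_opt, x_defect)),
    ("Defect", "Defect"): (payoff(x_nash, x_nash), payoff(x_nash, x_nash)),
}

for cell, (stuart, robert) in table.items():
    print(cell, float(stuart), float(robert))
```

Using Fraction rather than floats keeps 4/5 and 128/25 exact, so the results can be compared directly with the hand calculation.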

This game is a prisoners’ dilemma: there are mutual benefits to be gained from cooperating, but if each expects the other to cooperate there are individual gains from defecting on cooperation.

Additional questions:

(4) Two racing cyclists regularly compete in a competition. In the past there have been issues with “doping” in cycling, where athletes take performance-enhancing drugs. By doping, a cyclist can improve her probability of winning, but this is nullified if her opponent also dopes, and of course both cyclists doping is inferior to neither taking performance-enhancing drugs. This situation can be modelled using the following prisoners’ dilemma game.

                     Speedy
                  refrain     dope
Pedalo   refrain  6,6         3,8
         dope     8,3         5,5

(In each cell the payoffs are listed as Pedalo, Speedy.)

The game is repeated indefinitely. Both cyclists refraining from doping is the cooperative solution in this context.

(a) What values of the discount factor can sustain cooperation when a player considers a single-period deviation on cooperation when her opponent is playing tit-for-tat?

(b) For what values of the discount factor can the grim trigger strategy sustain cooperation between the players?

(c) Is tacit cooperation (on refraining from doping) possible in this game? Is it easy? What determines whether cooperation is possible, and whether it is easy?

(d) Repeat the question when the payoffs are asymmetric, and take the form

                     Speedy
                  refrain     dope
Pedalo   refrain  6,4         1,9
         dope     7,1         3,3

(In each cell the payoffs are listed as Pedalo, Speedy.)

(5) Two neighboring countries are considering which of two technologies to adopt in reducing emissions; country A is more industrialized than country B, which has a large service sector. Technology X is not very expensive to implement but also has limited effectiveness, whereas technology Y is very effective but is more expensive. The payoffs to the two countries, taking into account the cost of the technology, the reduction in emissions induced by the technology and the reduction in emissions from the adoption decision of the other country, can be represented in the following normal form.

              B
           Y        X
A    Y     4,5      1,8
     X     5,2      2,4

(In each cell the payoffs are listed as A, B.)

(a) Explain why this game has the features of a prisoners’ dilemma. What is the prediction of the outcome if the game is played only once?

(b) If the game is indefinitely repeated, derive the conditions under which tacit cooperation is possible, and when it is easy. Explain why the conditions are not the same for both countries.

(c) Explain why country B might be perceived as being more likely to break a period of cooperation in reducing emissions.
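The discount-factor conditions worked through in question (2)(b) follow a common pattern that may also help when setting up the repeated-game questions above. The Python sketch below reproduces the Henry Ford thresholds only. The labels temptation, reward, punishment and sucker are standard shorthand rather than terms used in this worksheet, and the function names are hypothetical.

```python
from fractions import Fraction


def grim_trigger_threshold(temptation, reward, punishment):
    """Cooperation holds for δ above this value.

    Rearranges reward/(1 - δ) > temptation + δ*punishment/(1 - δ)
    to δ > (temptation - reward) / (temptation - punishment).
    """
    return Fraction(temptation - reward, temptation - punishment)


def tit_for_tat_threshold(temptation, reward, sucker):
    """Cooperation holds for δ above this value.

    A one-period deviation gains (temptation - reward) today and costs
    (reward - sucker) next period, so it does not pay when
    δ > (temptation - reward) / (reward - sucker).
    """
    return Fraction(temptation - reward, reward - sucker)


def discount_rate(delta):
    """Discount rate r implied by the discount factor δ = 1/(1 + r)."""
    return 1 / delta - 1


# Worker in question (2): shirk at wage 5 -> 5, work at 5 -> 4,
# shirk at 3 -> 3, work at 3 -> 2.
worker_grim = grim_trigger_threshold(5, 4, 3)
worker_tft = tit_for_tat_threshold(5, 4, 2)

# Henry Ford in question (2): pay 3 to a hard worker -> 4, pay 5 to a hard
# worker -> 2, pay 3 to a shirker -> 1, pay 5 to a shirker -> -1.
ford_grim = grim_trigger_threshold(4, 2, 1)
ford_tft = tit_for_tat_threshold(4, 2, -1)

print(worker_grim, worker_tft, discount_rate(worker_grim))
print(ford_grim, ford_tft, discount_rate(ford_grim))
```

The two thresholds coincide for each player in the Ford game, but that is a feature of those particular payoffs rather than a general result.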
