Exercises Statistics II Chapter 1 - solutions PDF

Title Exercises Statistics II Chapter 1 - solutions
Author Estefanía González Carbonell
Course Statistics
Institution Universidad Carlos III de Madrid


Statistics II Exercises Chapter 1 Academic year 2013/14



Solutions



Answers

1. A simple random sample of ten X-cars achieved the following fuel consumption figures (in miles per gallon):

27.2 27.2 26.8 26.9 25.3 26.0 26.4 25.7 28.1 25.7

Another independent simple random sample of twelve Y-cars achieved the following results:

24.2 24.3 25.3 24.8 25.1 25.0 24.9 23.9 26.0 26.1 26.0 26.3

Using the following information obtained in Excel, answer the questions below:

(a) Use an unbiased estimation procedure to find point estimates for the population mean $\mu_X$ and population variance $\sigma_X^2$.

(b) Use an unbiased estimation procedure to find a point estimate for the population proportion $p_X$ of those X-cars whose fuel consumption exceeds 25.8.

(c) Use an unbiased estimation procedure to find point estimates for the population mean $\mu_Y$ and population variance $\sigma_Y^2$.

(d) Use an unbiased estimation procedure to find a point estimate for the population proportion $p_Y$ of those Y-cars whose fuel consumption exceeds 25.8.

(e) Use an unbiased estimation procedure to obtain a point estimate of the difference in population mean fuel consumption between X-cars and Y-cars, that is, of $\mu_X - \mu_Y$.

(f) Use an unbiased estimation procedure to obtain a point estimate of the difference between the population proportion of X-cars achieving more than 25.8 and the population proportion of Y-cars achieving more than 25.8, that is, of $p_X - p_Y$.

Solution. Let X = “fuel consumption of an X-car” and Y = “fuel consumption of a Y-car”, both in miles per gallon.

(a) An unbiased estimator of $\mu_X$ is $\bar{X}$. The corresponding estimate is

$$\bar{x} = \frac{\sum_{i=1}^{10} x_i}{10} = \frac{265.3}{10} = 26.53.$$

An unbiased estimator of $\sigma_X^2$ is $s_X^2$. The corresponding estimate is

$$s_x^2 = \frac{\sum_{i=1}^{10} x_i^2 - 10(\bar{x})^2}{10 - 1} = \frac{7045.17 - 10(26.53)^2}{9} = \frac{6.761}{9} \approx 0.75.$$

(b) An unbiased estimator of $p_X$ is $\hat{p}_X$ = “proportion of the sampled X-cars whose mpg exceeds 25.8”. The corresponding estimate is

$$\hat{p}_x = \frac{1+1+1+1+0+1+1+0+1+0}{10} = 0.7.$$

(c) An unbiased estimator of $\mu_Y$ is $\bar{Y}$. The corresponding estimate is

$$\bar{y} = \frac{\sum_{i=1}^{12} y_i}{12} = \frac{301.9}{12} = 25.158333.$$

An unbiased estimator of $\sigma_Y^2$ is $s_Y^2$. The corresponding estimate is

$$s_y^2 = \frac{\sum_{i=1}^{12} y_i^2 - 12(\bar{y})^2}{12 - 1} = \frac{7602.39 - 12(25.158333)^2}{11} = \frac{7.0894}{11} \approx 0.6445.$$

(We used more decimals in $\bar{y}$ to obtain the same value as in the Excel output of Ex. 22.)
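The remark about decimals can be checked numerically. The following is a minimal sketch, not part of the original solution: it recomputes the means and quasi-variances with Python's standard `statistics` module and then applies the shortcut formula with the mean of the Y sample rounded to two and to six decimals. The helper name `shortcut_variance` is an illustrative assumption.

```python
from statistics import mean, variance

x_cars = [27.2, 27.2, 26.8, 26.9, 25.3, 26.0, 26.4, 25.7, 28.1, 25.7]
y_cars = [24.2, 24.3, 25.3, 24.8, 25.1, 25.0, 24.9, 23.9, 26.0, 26.1, 26.0, 26.3]


def shortcut_variance(data, supplied_mean):
    """Quasi-variance via (sum of x_i^2 - n * mean^2) / (n - 1), using a supplied mean."""
    n = len(data)
    return (sum(v * v for v in data) - n * supplied_mean ** 2) / (n - 1)


# statistics.variance divides by n - 1, so it is the unbiased quasi-variance.
x_bar, s2_x = mean(x_cars), variance(x_cars)
y_bar, s2_y = mean(y_cars), variance(y_cars)

# Same shortcut formula, with the Y mean rounded to two and to six decimals.
s2_y_two_decimals = shortcut_variance(y_cars, 25.16)
s2_y_six_decimals = shortcut_variance(y_cars, 25.158333)

print(f"x: mean = {x_bar:.2f}, quasi-variance = {s2_x:.4f}")
print(f"y: mean = {y_bar:.6f}, quasi-variance = {s2_y:.4f}")
print(f"shortcut with mean 25.16:     {s2_y_two_decimals:.4f}")
print(f"shortcut with mean 25.158333: {s2_y_six_decimals:.4f}")
```

The shortcut formula subtracts two large, nearly equal numbers, so a small rounding error in the mean is magnified in the result. That is why the two-decimal mean gives a visibly different variance.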

(d) An unbiased estimator of $p_Y$ is $\hat{p}_Y$ = “proportion of the sampled Y-cars whose mpg exceeds 25.8”. The corresponding estimate is

$$\hat{p}_y = \frac{0+0+0+0+0+0+0+0+1+1+1+1}{12} \approx 0.33.$$

(e) An unbiased estimator of $\mu_X - \mu_Y$ is $\bar{X} - \bar{Y}$. The corresponding estimate is $\bar{x} - \bar{y} = 26.53 - 25.16 = 1.37$.

(f) An unbiased estimator of $p_X - p_Y$ is $\hat{p}_X - \hat{p}_Y$. The corresponding estimate is $\hat{p}_x - \hat{p}_y = 0.7 - 0.33 = 0.37$.

Answer. Let X = “fuel consumption of an X-car” and Y = “fuel consumption of a Y-car”, both in miles per gallon.
(a) 26.53, 0.75
(b) 0.7
(c) 25.158333, 0.6445
(d) 0.33
(e) 1.37
(f) 0.37

2. We take a simple random sample of size n = 400 from a population X and construct a 95% confidence interval for the unknown population mean $\mu_X$ (the population standard deviation is assumed to be known, $\sigma_X = 1$). Based on the same sample we construct an 80% confidence interval for the unknown population mean $\mu_X$.

We repeat this scheme 300 times. As a result we obtain 300 confidence intervals at the 95% level and the “twin” 300 confidence intervals at the 80% level. Both sets of intervals are plotted below.

[Figure: two panels (left and right), each plotting 300 confidence intervals. Horizontal axis: “Confidence Interval” (ticks from −7.7 to −7.3). Vertical axis: “Index” (0 to 300).]

(a) Which graph, left or right, shows the 95% confidence intervals for $\mu_X$ and which shows the 80% confidence intervals for $\mu_X$? Justify your answer.

(b) For each graph, say how many (approximately) of those 300 intervals will cover $\mu_X$ and how many will not.

Solution.
(a) Left: 80% confidence intervals. Right: 95% confidence intervals. Justification: the higher the level, the longer the interval (keeping everything else the same).
(b) Left: Approximately 300(0.80) = 240 will cover $\mu_X$ and the remaining 60 will not. Right: Approximately 300(0.95) = 285 will cover $\mu_X$ and the remaining 15 will not.

Answer.
(a) Left: 80% confidence intervals. Right: 95% confidence intervals.
(b) Left: 240 will and 60 will not. Right: 285 will and 15 will not.

3. The average age for all Spanish cabinet ministers at the time of their appointment is 55. We take a simple random sample of 30 ministers. Would it make sense to construct a 95% confidence interval for the population mean age?

Solution. No, because we already know the population mean.

Answer. No, because we already know the population mean.

4. Let $z_\alpha$ be an upper $\alpha \in (0, 1)$ quantile of the standard normal distribution, that is, $z_\alpha$ satisfies $P(Z > z_\alpha) = \alpha$, where $Z \sim N(0, 1)$. If $\alpha$ increases, does $z_\alpha$ increase or decrease? Justify.

Solution. As $\alpha$ increases $z_\alpha$ decreases. This is because $\alpha$ equals the right-tail area, and so as it increases the quantile moves to the left.

Answer. As $\alpha$ increases $z_\alpha$ decreases.
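The monotonic behaviour in Exercise 4 can be illustrated with a short sketch. It assumes only Python's standard `statistics.NormalDist`; the helper name `upper_quantile` is an illustrative choice, not notation from the exercise.

```python
from statistics import NormalDist


def upper_quantile(alpha: float) -> float:
    """Return z_alpha such that P(Z > z_alpha) = alpha for Z ~ N(0, 1)."""
    # The upper-alpha quantile is the ordinary (lower) quantile at 1 - alpha.
    return NormalDist().inv_cdf(1.0 - alpha)


alphas = [0.01, 0.025, 0.05, 0.10, 0.25, 0.50]
quantiles = [upper_quantile(a) for a in alphas]

for a, z in zip(alphas, quantiles):
    print(f"alpha = {a:5.3f}  ->  z_alpha = {z:6.3f}")
```

Reading the printed values from top to bottom, the right-tail area grows while the quantile shrinks, reaching 0 at $\alpha = 0.5$.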

5. Consider two normally distributed populations, $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$. We take 200 simple random samples of size n = 100 from X and, assuming that the population standard deviation is known, construct a 95% confidence interval for the population mean for each sample. We take 200 simple random samples of size n = 100 from Y and, assuming that the population standard deviation is known, construct a 95% confidence interval for the population mean for each sample.

The constructed intervals are:

[Figure: two panels (left and right), each plotting 200 confidence intervals. Horizontal axis: “Confidence interval”. Vertical axis: “Index” (0 to 200).]

(a) Knowing that $\sigma_X = 1$ and $\sigma_Y = 3$, identify each graph with its respective population (either X or Y). Justify your answer.

(b) For each graph, say how many (approximately) of those 200 intervals will cover either $\mu_X$ or $\mu_Y$ and how many will not.

Solution.
(a) Left: X. Right: Y. Justification: the larger the population standard deviation, the longer the interval (keeping everything else the same).
(b) Left: Approximately 200(0.95) = 190 will cover $\mu_X$ and the remaining 10 will not. Right: Approximately 200(0.95) = 190 will cover $\mu_Y$ and the remaining 10 will not.

Answer.
(a) Left: X. Right: Y.
(b) Left: Approximately 200(0.95) = 190 will cover $\mu_X$ and the remaining 10 will not. Right: Approximately 200(0.95) = 190 will cover $\mu_Y$ and the remaining 10 will not.
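Both parts of Exercise 5 can be explored with a small simulation. This is a hedged sketch rather than the method used to draw the original plots. The true mean `MU = 0.0` is a hypothetical value (the exercise does not state $\mu_X$ or $\mu_Y$), the seed is arbitrary, and the function names are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample_mean, sigma, n, level):
    """Known-sigma interval: sample_mean -/+ z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def count_covering(mu, sigma, n, level, repetitions, rng):
    """Simulate `repetitions` samples and count the intervals that contain mu."""
    covered = 0
    for _ in range(repetitions):
        sample_mean = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        lower, upper = z_interval(sample_mean, sigma, n, level)
        covered += lower < mu < upper
    return covered


rng = random.Random(2014)  # arbitrary seed, only for reproducibility
MU = 0.0                   # hypothetical true mean; not given in the exercise

# Interval lengths depend on sigma but not on the observed sample mean.
lo_x, hi_x = z_interval(0.0, 1, 100, 0.95)
lo_y, hi_y = z_interval(0.0, 3, 100, 0.95)
length_x, length_y = hi_x - lo_x, hi_y - lo_y

covered_x = count_covering(MU, 1, 100, 0.95, 200, rng)
covered_y = count_covering(MU, 3, 100, 0.95, 200, rng)

print(f"length with sigma = 1: {length_x:.3f}, with sigma = 3: {length_y:.3f}")
print(f"intervals covering mu: {covered_x} and {covered_y} out of 200")
```

The interval for $\sigma = 3$ is three times as long as the one for $\sigma = 1$, while the number of covering intervals should be close to 190 in both cases, since coverage depends on the confidence level and not on $\sigma$.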

6. The following four graphs show confidence intervals (at a 95% confidence level) for the unknown population proportion $p_X$, based on 50 simple random samples of size n from X, under four scenarios I-IV. Identify each scenario with its graph.

I. n = 400, X ∼ Bernoulli(0.3)
II. n = 400, X ∼ Bernoulli(0.7)
III. n = 100, X ∼ Bernoulli(0.3)
IV. n = 100, X ∼ Bernoulli(0.7)

[Figure: four panels (top left, top right, bottom left, bottom right), each plotting 50 confidence intervals. Horizontal axis: “Confidence interval” (0.0 to 1.0). Vertical axis: “Index” (0 to 50).]

Solution.
Top left: n = 100, X ∼ Bernoulli(0.7) (scenario IV)
Top right: n = 400, X ∼ Bernoulli(0.3) (scenario I)
Bottom left: n = 400, X ∼ Bernoulli(0.7) (scenario II)
Bottom right: n = 100, X ∼ Bernoulli(0.3) (scenario III)

Answer.
Top left: n = 100, X ∼ Bernoulli(0.7) (scenario IV)
Top right: n = 400, X ∼ Bernoulli(0.3) (scenario I)
Bottom left: n = 400, X ∼ Bernoulli(0.7) (scenario II)
Bottom right: n = 100, X ∼ Bernoulli(0.3) (scenario III)
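One way to tell the four scenarios of Exercise 6 apart is to compare where the intervals are centred and how long they are. The sketch below is an illustration under a simplifying assumption: it evaluates the margin of error at the true proportion p, whereas each plotted interval uses its own sample proportion. The names `scenarios` and `typical_margin` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # z_{0.025}, about 1.96 for a 95% level

# Scenario label -> (sample size n, Bernoulli parameter p), as listed in the exercise.
scenarios = {"I": (400, 0.3), "II": (400, 0.7), "III": (100, 0.3), "IV": (100, 0.7)}


def typical_margin(n, p):
    """Margin of error z * sqrt(p(1 - p) / n) when the sample proportion equals p."""
    return Z * sqrt(p * (1 - p) / n)


margins = {name: typical_margin(n, p) for name, (n, p) in scenarios.items()}

for name, (n, p) in scenarios.items():
    print(f"Scenario {name}: centred near {p}, margin about {margins[name]:.3f}")
```

Intervals centred near 0.3 or 0.7 identify p, and intervals roughly twice as long identify n = 100 rather than n = 400, because the margin scales with $1/\sqrt{n}$.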

7. The confidence interval for the population mean $\mu_X$ is symmetrical about the sample mean $\bar{x}$. Is the confidence interval for the population variance $\sigma_X^2$ symmetrical about the sample quasi-variance $s_x^2$? Justify.

Solution. No. The end-points of the interval are obtained by multiplying $s_x^2$ by $n - 1$ and then dividing this value by the quantiles of a $\chi^2_{n-1}$ distribution. These quantiles are not symmetric with respect to $n - 1$ (the mean of the distribution).

Answer. No.
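The asymmetry in Exercise 7 can be made concrete with a worked case. As an illustrative choice (not part of the exercise), take n = 15 and a 95% level, for which standard tables give $\chi^2_{14;0.025} = 26.12$ and $\chi^2_{14;0.975} = 5.63$, with the first subscript after the semicolon denoting the right-tail area.

```latex
\[
CI_{1-\alpha}(\sigma_X^2)
  = \left( \frac{(n-1)\,s_x^2}{\chi^2_{n-1;\alpha/2}},\;
           \frac{(n-1)\,s_x^2}{\chi^2_{n-1;1-\alpha/2}} \right)
\]
% Illustration: n = 15, 1 - alpha = 0.95
\[
\frac{14}{26.12}\, s_x^2 \approx 0.54\, s_x^2
\qquad\text{and}\qquad
\frac{14}{5.63}\, s_x^2 \approx 2.49\, s_x^2
\]
```

In this case the lower end-point lies about $0.46\,s_x^2$ below $s_x^2$, while the upper end-point lies about $1.49\,s_x^2$ above it, so the interval is not centred at $s_x^2$.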


8. A 99% confidence interval for the population mean level of satisfaction of students with their cars was found to be (75.7, 82.5). Would it therefore be proper to say that the probability is 0.99 that the population mean satisfaction level is between 75.7 and 82.5? If not, replace this statement by a valid one.

Solution. No, it is wrong to say $P(75.7 < \mu_X < 82.5) = 0.99$! $\mu_X$ is a fixed number, not a random variable, so this probability is either 1 (if $\mu_X$ is between the end-points) or 0 (if not). The correct statement is to say that we are 99% confident that $\mu_X$ is between the end-points. Or that, if we were to construct many intervals for $\mu_X$, approximately 99% of them would contain $\mu_X$ and the remaining ones would not.

Answer. No, it is wrong to say $P(75.7 < \mu_X < 82.5) = 0.99$! $\mu_X$ is a fixed number, not a random variable, so this probability is either 1 (if $\mu_X$ is between the end-points) or 0 (if not). The correct statement is to say that we are 99% confident that $\mu_X$ is between the end-points. Or that, if we were to construct many intervals for $\mu_X$, approximately 99% of them would contain $\mu_X$ and the remaining ones would not.

9. A manufacturer is concerned about the variability of the levels of impurity contained in consignments of raw material from a supplier. A random sample of fifteen consignments showed a quasi-standard deviation of 2.36% in the impurity concentration levels. Assume a normal population distribution.
(a) Find and interpret a 95% confidence interval for the population variance.
(b) Would a 99% confidence interval for this variance be wider or narrower than that found in the previous part? Justify without doing any calculations.
(c) Find and interpret a 95% confidence interval for the population standard deviation.

Solution.
(a) Assumptions: SRS, normality of the population.
Population: X = “concentration of impurity levels in a consignment of raw material (in %)”, $X \sim N(\mu_X, \sigma_X^2)$
SRS: n = 15
Sample: $s_x = 2.36$
Objective: $CI_{0.95}(\sigma_X^2) = \left( \frac{(n-1)s_x^2}{\chi^2_{n-1;\alpha/2}},\ \frac{(n-1)s_x^2}{\chi^2_{n-1;1-\alpha/2}} \right)$

$s_x^2 = 2.36^2 \approx 5.57$
$1 - \alpha = 0.95 \Rightarrow \alpha/2 = 0.025$
$n = 15 \Rightarrow \chi^2_{14;0.975} = 5.63$ and $\chi^2_{14;0.025} = 26.12$

$$CI_{0.95}(\sigma_X^2) = \left( \frac{14(5.57)}{26.12},\ \frac{14(5.57)}{5.63} \right) = (2.99, 13.85)$$

We can be 95% confident that the variance in the impurity concentration levels takes a value between 2.99 and 13.85.
(b) It would be wider because as the confidence level increases the size of the interval increases.
(c) $CI_{0.95}(\sigma_X) = (\sqrt{2.99}, \sqrt{13.85}) = (1.73, 3.72)$ and so we can be 95% confident that the standard deviation in the impurity concentration levels is between 1.73 and 3.72.

Answer.
(a) $CI_{0.95}(\sigma_X^2) = (2.99, 13.85)$. We can be 95% confident that the variance in the impurity concentration levels is between 2.99 and 13.85.
(b) It would be wider because as the confidence level increases the size of the interval increases.
(c) $CI_{0.95}(\sigma_X) = (1.73, 3.72)$. We can be 95% confident that the standard deviation in the impurity concentration levels is between 1.73 and 3.72.

10. The members of a random sample of fifty four union stewards were asked how often they talked employees out of filing grievances. Of these sample members, fourteen answered “never” to this question. Based on this information, a Statistics student calculated a confidence interval running from 16% to 36% for the population percentage of union stewards who never talk employees out of filing grievances.
(a) Find the level of confidence associated with this interval.

(b) Construct and interpret an 80% confidence interval for the population percentage in question.

Solution.
(a) Assumptions: SRS, large n.
Population: X = 1 if a union steward never talks employees out of filing grievances and 0 otherwise, $X \sim \text{Bernoulli}(p_X)$, $p_X$ = proportion of all union stewards who never talk employees out of filing grievances
SRS: n = 54
Sample: $\hat{p}_x = 14/54 \approx 0.26$, $CI_{1-\alpha}(p_X) = (0.16, 0.36)$
Objective: $1 - \alpha$

One way of finding $1 - \alpha$ is to use the relationship $\text{width} = 2 z_{\alpha/2} \sqrt{\hat{p}_x (1 - \hat{p}_x)/n}$, solve this equation for $z_{\alpha/2}$, identify $\alpha$ and eventually $(1 - \alpha)$.

From theory: CI width $= 2 z_{\alpha/2} \sqrt{0.26(1 - 0.26)/54}$, where $\sqrt{0.26(1 - 0.26)/54} \approx 0.06$.
From the problem data: CI width $= 0.36 - 0.16 = 0.2$.

$$2 z_{\alpha/2}(0.06) = 0.2 \;\Rightarrow\; z_{\alpha/2} = \frac{0.2}{2(0.06)} \approx 1.67 \;\Rightarrow\; \alpha/2 = 0.047 \;\Rightarrow\; \alpha = 0.094$$

The confidence level used was $1 - \alpha = 1 - 0.094 = 90.6\%$.

(b) Assumptions: SRS, large n.
Population: X = 1 if a union steward never talks employees out of filing grievances and 0 otherwise, $X \sim \text{Bernoulli}(p_X)$, $p_X$ = proportion of all union stewards who never talk employees out of filing grievances
SRS: n = 54
Sample: $\hat{p}_x = \frac{14}{54} \approx 0.26$
Objective: $CI_{0.8}(p_X) = \left( \hat{p}_x \mp z_{\alpha/2} \sqrt{\frac{\hat{p}_x (1 - \hat{p}_x)}{n}} \right)$

$1 - \alpha = 0.8 \Rightarrow \alpha/2 = 0.10$
$z_{\alpha/2} = z_{0.10} = 1.28$
$n = 54$

$$CI_{0.8}(p_X) = \left( 0.26 \mp 1.28 \sqrt{\frac{0.26(1 - 0.26)}{54}} \right) = (0.26 \mp 1.28(0.06)) = (0.26 \mp 0.0768) = (0.1832, 0.3368)$$

We can be 80% confident that the population percentage of union stewards who never talk employees out of filing grievances is between 18.32% and 33.68%.
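Both parts of Exercise 10 can be reproduced with a few lines of code. This is a sketch using Python's standard `statistics.NormalDist`, and the function names `implied_level` and `proportion_ci` are illustrative. It works with the unrounded $\hat{p}_x = 14/54$, so its results differ from the hand calculation in the third decimal.

```python
from math import sqrt
from statistics import NormalDist


def implied_level(successes, n, lower, upper):
    """Confidence level implied by a reported interval (lower, upper) for a proportion."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    z = (upper - lower) / (2 * standard_error)  # width = 2 * z * standard error
    return 2 * NormalDist().cdf(z) - 1           # 1 - alpha = P(-z < Z < z)


def proportion_ci(successes, n, level):
    """Large-sample interval p_hat -/+ z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


level_a = implied_level(14, 54, 0.16, 0.36)   # part (a)
ci_b = proportion_ci(14, 54, 0.80)            # part (b)

print(f"(a) implied confidence level: {level_a:.3f}")
print(f"(b) 80% interval: ({ci_b[0]:.4f}, {ci_b[1]:.4f})")
```

Part (a) inverts the width formula to recover $z_{\alpha/2}$ and then converts it to a level, while part (b) applies the same formula in the forward direction.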

Answer.
(a) Roughly 90.6%
(b) $CI_{0.8}(p_X) = (0.1832, 0.3368)$

11. Scores obtained by a large group of students taking a test are known to be normally distributed. A random sample of twenty five test scores yielded the following statistics:

$$\sum_{i=1}^{25} x_i = 1508 \qquad \sum_{i=1}^{25} x_i^2 = 95628$$

(a) Find and interpret a 90% confidence interval for the population mean.
(b) Without doing any calculations, say if a 95% confidence interval for the population mean will be narrower or wider than the preceding one.

Solution.

(a) Assumptions: SRS, normality of the population.
Population: X = “test score”, $X \sim N(\mu_X, \sigma_X^2)$, $\sigma_X^2$ unknown
SRS: n = 25 (small)
Sample: $\bar{x} = 1508/25 = 60.32$, $s_x^2 = \frac{95628 - 25(60.32)^2}{25 - 1} = 194.3933$
Objective: $CI_{0.9}(\mu_X) = \left( \bar{x} \mp t_{n-1;\alpha/2} \frac{s_x}{\sqrt{n}} \right)$

$s_x = \sqrt{194.3933} \approx 13.94$
$\bar{x} = 60.32$
$1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
$n = 25 \Rightarrow t_{n-1;\alpha/2} = t_{24;0.05} = 1.711$

$$CI_{0.9}(\mu_X) = \left( 60.32 \mp 1.711 \cdot \frac{13.94}{\sqrt{25}} \right) = (60.32 \mp 1.711(2.788)) = (60.32 \mp 4.77) = (55.55, 65.09)$$
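The arithmetic above can be checked from the two sums alone. The sketch below is illustrative: the function name `t_interval_from_sums` is an assumption, and the t quantile 1.711 is entered as the tabulated value quoted in the solution, because Python's standard library has no Student t quantile function.

```python
from math import sqrt


def t_interval_from_sums(n, sum_x, sum_x2, t_quantile):
    """Interval for the mean from n, the sum of x, the sum of x^2 and a tabulated t quantile."""
    x_bar = sum_x / n
    s2 = (sum_x2 - n * x_bar ** 2) / (n - 1)    # quasi-variance via the shortcut formula
    margin = t_quantile * sqrt(s2) / sqrt(n)
    return x_bar, s2, (x_bar - margin, x_bar + margin)


T_24_005 = 1.711  # t_{24;0.05}, table value quoted in the solution

x_bar, s2, (lower, upper) = t_interval_from_sums(25, 1508, 95628, T_24_005)

print(f"mean = {x_bar:.2f}, quasi-variance = {s2:.4f}")
print(f"90% interval: ({lower:.2f}, {upper:.2f})")
```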

We can be 90% confident that the population mean test score is between 55.55 and 65.09.
(b) It will be wider because the CI length increases with increasing confidence levels.

Answer.
(a) $CI_{0.9}(\mu_X) = (55.55, 65.09)$
(b) It will be wider because the length increases with increasing confidence levels.

12. UC3M is concerned about the amount of time its students spend studying each week. A random sample of sixteen students had mean weekly study time 18.36 h and a sample quasi standard deviation of 3.92 h. Assume that the study times are normally distributed.
(a) Find and interpret a 90% confidence interval for the average amount of study time per week for all students at UC3M.
(b) State without doing the calculations, whether the interval would be wider or narrower under each of the following conditions:
i. The sample contained thirty students (with everything else the same).
ii. The sample quasi standard deviation was 4.15 h (with everything else the same).
iii. An 80% confidence interval was required (with everything else the same).
(c) If the population was not normally distributed, would you be able to construct the confidence interval for the population mean? Justify. What kind of remedy would you suggest to fix the problem?
(d) Construct and interpret a 90% confidence interval for the population standard deviation.

Solution.
(a) Assumptions: SRS, normality of the population.
Population: X = “weekly study time of a student at UC3M (in h)”, $X \sim N(\mu_X, \sigma_X^2)$, $\sigma_X^2$ unknown
SRS: n = 16 small<...

