Exam 2018, questions and answers PDF

Title Exam 2018, questions and answers
Course Data Analysis I
Institution Loughborough University
Pages 7

ECA003 DATA ANALYSIS
Specimen Examination Paper
Attempt Question 1 (worth 50%) and TWO others
Time Allowed: 2 hours

1.

You are provided with the following statistical information concerning the annual rate of U.K. unemployment (in %) between 1855 and 2004.

[Figure: plot of the distribution of annual U.K. unemployment rates (%), with boxplot.]

Observations  150
Mean          4.9
Median        3.9
Maximum       15.6
Minimum       0.4
Std. Dev.     3.4
Skewness      0.8

Produce a short report discussing the general features of the distribution of unemployment rates.

The following points might be made:

(i) The values range from 0.4% to 15.6%. There are 3 outliers of 14% or more; all other values are below 12%. The bulk of the values lie between approximately 2% and 7% (see boxplot).

(ii) The mean (approximately 5) is larger than the median (approximately 4) and the skewness coefficient is positive, both of which imply that the distribution is skewed to the right. This is not surprising, as the variable is bounded below by zero.

You are now provided with a scatterplot of unemployment (Y) and inflation (X) for this period, along with the following statistical information:

$$\bar{Y} = 4.93, \quad \bar{X} = 3.06, \quad s_Y^2 = 11.22, \quad s_X^2 = 34.47, \quad s_{XY} = -6.71$$

[Figure: scatterplot of UNEMP (Y, axis from 0 to 16) against INFL (X, axis from -20 to 30).]

Compute the sample correlation coefficient between unemployment and inflation and construct and interpret a 95% confidence interval for the population correlation coefficient.

$$r_{XY} = \frac{s_{XY}}{s_X s_Y} = \frac{-6.71}{\sqrt{34.47 \times 11.22}} = -0.341$$

To construct a confidence interval for $\rho_{XY}$ we must first calculate

$$f = \frac{1}{2} \log\left(\frac{1 + r_{XY}}{1 - r_{XY}}\right) = \frac{1}{2} \log\left(\frac{1 - 0.341}{1 + 0.341}\right) = -0.355$$

and then the interval

$$-0.355 \pm 1.96 \sqrt{\frac{1}{150 - 3}} = -0.355 \pm 0.162$$

i.e. $f_L = -0.517$ and $f_U = -0.193$.

Transforming back to correlations:

$$\rho_U = \frac{\exp(2 \times (-0.193)) - 1}{\exp(2 \times (-0.193)) + 1} = \frac{-0.320}{1.680} = -0.191$$

$$\rho_L = \frac{\exp(2 \times (-0.517)) - 1}{\exp(2 \times (-0.517)) + 1} = \frac{-0.644}{1.356} = -0.475$$

Thus the 95% c.i. is $-0.475 < \rho_{XY} < -0.191$. Since this does not contain 0 we conclude that Y and X are significantly negatively correlated.

Compute the intercept and slope coefficients in the regression of Y on X. If the standard error of the slope coefficient is 0.044, test the hypothesis that the true value of the slope is zero, using the 5% level of significance. Explain how the result of this test is related to the confidence interval constructed above.

The coefficients in the regression $Y = a + bX$ are calculated as

$$b = \frac{s_{XY}}{s_X^2} = \frac{-6.71}{34.47} = -0.195$$

$$a = \bar{Y} - b\bar{X} = 4.93 + 0.195 \times 3.06 = 5.53$$

To test the hypothesis that the slope is zero, construct

$$t = \frac{-0.195}{0.044} = -4.43 \sim t(148)$$

Since $|t| = 4.43 > t_{0.025}(148) \approx 1.98$ we can reject the null in favour of the slope being nonzero. This is consistent with finding that zero is excluded from the c.i. above.

Can this regression model be used to establish that changes in inflation cause changes in unemployment?

No, because correlation does not imply causation.

2.

The prices of different U.K. fuels for 2000-2002 are shown below.

        Coal   Gas   Electricity   Petroleum
2000    128    104   106           195
2001    134    107   105           185
2002    141    114   105           179

The quantities consumed of each fuel in 2000 were: coal 3.5; gas 57.3; electricity 28.3; and petroleum 66.2. Calculate the Laspeyres energy price index for 2001 and 2002, based on 2000 = 100.

The Laspeyres indices for 2001 and 2002 are

$$P^L_{2001} = 100 \times \frac{134 \times 3.5 + 107 \times 57.3 + 105 \times 28.3 + 185 \times 66.2}{128 \times 3.5 + 104 \times 57.3 + 106 \times 28.3 + 195 \times 66.2} = 97.77$$

$$P^L_{2002} = 100 \times \frac{141 \times 3.5 + 114 \times 57.3 + 105 \times 28.3 + 179 \times 66.2}{128 \times 3.5 + 104 \times 57.3 + 106 \times 28.3 + 195 \times 66.2} = 97.90$$

Calculate the Paasche price index based on the following quantities consumed and compare and contrast the two sets of indices.

        Coal   Gas    Electricity   Petroleum
2001    4.3    58.0   28.6          67.5
2002    3.3    55.3   28.6          66.3

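The Paasche indices for 2001 and 2002 are

$$P^P_{2001} = 100 \times \frac{134 \times 4.3 + 107 \times 58.0 + 105 \times 28.6 + 185 \times 67.5}{128 \times 4.3 + 104 \times 58.0 + 106 \times 28.6 + 195 \times 67.5} = 97.79$$

$$P^P_{2002} = 100 \times \frac{141 \times 3.3 + 114 \times 55.3 + 105 \times 28.6 + 179 \times 66.3}{128 \times 3.3 + 104 \times 55.3 + 106 \times 28.6 + 195 \times 66.3} = 97.77$$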

Both show falls in prices since 2000 but, because relative prices and quantities have remained fairly stable, little difference in the two indices is found.
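The Paasche index differs from the Laspeyres index only in its weights: it uses the quantities of the year being compared rather than those of the base year. A short sketch, again with illustrative names and the data from the question:

```python
# Prices by year, and current-year quantities for 2001 and 2002.
prices = {
    2000: {"coal": 128, "gas": 104, "electricity": 106, "petroleum": 195},
    2001: {"coal": 134, "gas": 107, "electricity": 105, "petroleum": 185},
    2002: {"coal": 141, "gas": 114, "electricity": 105, "petroleum": 179},
}
current_quantities = {
    2001: {"coal": 4.3, "gas": 58.0, "electricity": 28.6, "petroleum": 67.5},
    2002: {"coal": 3.3, "gas": 55.3, "electricity": 28.6, "petroleum": 66.3},
}


def paasche(year, base_year=2000):
    """Price index for `year`, weighting prices by that year's quantities."""
    weights = current_quantities[year]
    numerator = sum(prices[year][fuel] * q for fuel, q in weights.items())
    denominator = sum(prices[base_year][fuel] * q for fuel, q in weights.items())
    return 100 * numerator / denominator


for year in (2001, 2002):
    print(year, f"{paasche(year):.2f}")
```

Here both the numerator and the denominator change from year to year, because the weights themselves move with consumption.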

3.

(i) Experience has shown that 30% of all persons affected by a certain illness recover. A drug company has developed a new vaccine. Ten people with the illness were selected at random and injected with the vaccine; nine recovered shortly afterwards. Suppose that the vaccine was completely useless. What is the probability that at least nine out of ten injected with the vaccine will recover?

Let X denote the number of people who recover. If the vaccine is worthless, the probability that a single ill person will recover is $p = 0.3$. With $n = 10$ trials, X is binomially distributed and

$$P(X = 9) = {}^{10}C_9 \times 0.3^9 \times 0.7 = 0.000138$$

$$P(X = 10) = 0.3^{10} = 0.000006$$

$$P(X \geq 9) = 0.000138 + 0.000006 = 0.000144$$

This probability is so small that either we have observed a very rare event or the vaccine is indeed very useful in curing the illness: we would adhere to the latter point of view.

(ii) Suppose that a policeman visits a given location on his beat randomly $X = 0, 1, 2, \ldots$ times per half-hour and that, on average, he visits each location once per half-hour. Calculate the probability that the policeman will miss a given location during a half-hour period. What is the probability that he will visit it at least twice?

X will be Poisson distributed with a mean of 1,

$$P(X = r) = \frac{1^r e^{-1}}{r!}$$

Thus,

$$P(X = 0) = e^{-1} = 0.368$$

$$P(X = 1) = e^{-1} = 0.368$$

$$P(X > 1) = 1 - P(X = 0) - P(X = 1) = 1 - 0.368 - 0.368 = 0.264$$

(iii) The marks on a maths exam are normally distributed with mean 60 and standard deviation 10. To get a first on the exam a mark of at least 70 must be obtained, while to pass the exam a mark of at least 40 must be obtained. What is the proportion of students that obtain a first and what is the proportion that fail?

The exam mark is $X \sim N(60, 10^2)$. Thus

$$P(X \geq 70) = P\left(Z \geq \frac{70 - 60}{10}\right) = P(Z \geq 1) = 0.1587$$

$$P(X < 40) = P\left(Z < \frac{40 - 60}{10}\right) = P(Z < -2) = P(Z > 2) = 0.0228$$
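Instead of standardising and reading the values from tables, the two proportions can be computed from the normal distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative.

```python
from statistics import NormalDist

# Exam marks: normal with mean 60 and standard deviation 10.
marks = NormalDist(mu=60, sigma=10)

p_first = 1 - marks.cdf(70)  # proportion scoring at least 70
p_fail = marks.cdf(40)       # proportion scoring below 40

print(f"P(first) = {p_first:.4f}")
print(f"P(fail)  = {p_fail:.4f}")
```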

4.

A sample of 16 observations has been taken from a normally distributed population and a sample mean of 50 and a sample standard deviation of 10 have been obtained. Construct 95% confidence intervals for the population mean and variance and test the null hypotheses that the population mean is 52 and the population variance is 250 using the 5% level of significance.

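A confidence interval for the population mean is calculated using the result that

$$t = \sqrt{n}\,\frac{\bar{X} - \mu}{s} \sim t(n - 1)$$

A 95% c.i. for $\mu$ is given by

$$\mu = \bar{X} \pm t_{0.025}(15) \frac{s}{\sqrt{n}} = 50 \pm 2.131 \times \frac{10}{\sqrt{16}} = 50 \pm 2.131 \times 2.5 = 50 \pm 5.33$$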

The distribution of the sample variance is

$$s^2 \sim \frac{\sigma^2}{n - 1} \chi^2(n - 1)$$

Thus a 95% c.i. for $\sigma^2$ is

$$\frac{15 \times 100}{\chi^2_{0.025}(15)} < \sigma^2 < \frac{15 \times 100}{\chi^2_{0.975}(15)}$$

With $\chi^2_{0.975}(15) = 6.262$ and $\chi^2_{0.025}(15) = 27.49$, this gives

$$54.6 < \sigma^2 < 239.5$$
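Both intervals can be reproduced in a few lines. This sketch takes the critical values 2.131, 6.262 and 27.49 from the worked answer as given constants rather than computing them, so it is an assumption of the sketch that they are read from tables; the variable names are illustrative.

```python
import math

# Sample information from the question.
n, sample_mean, sample_sd = 16, 50.0, 10.0

# Tabulated critical values for 15 degrees of freedom (assumed from tables).
t_crit = 2.131                                   # t_{0.025}(15)
chi2_lower_crit, chi2_upper_crit = 6.262, 27.49  # chi2_{0.975}(15), chi2_{0.025}(15)

# Interval for the mean: sample mean plus or minus t times the standard error.
standard_error = sample_sd / math.sqrt(n)
mean_ci = (sample_mean - t_crit * standard_error,
           sample_mean + t_crit * standard_error)

# Interval for the variance: (n - 1) s^2 divided by each chi-squared value.
sum_of_squares = (n - 1) * sample_sd**2
var_ci = (sum_of_squares / chi2_upper_crit, sum_of_squares / chi2_lower_crit)

print(f"mean:     {mean_ci[0]:.2f} to {mean_ci[1]:.2f}")
print(f"variance: {var_ci[0]:.1f} to {var_ci[1]:.1f}")
```

Note that the larger chi-squared value gives the lower limit for the variance, because it appears in the denominator.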

To test $H_0: \mu = 52$ against the alternative $H_A: \mu \neq 52$, we calculate

$$t_0 = \frac{50 - 52}{2.5} = -0.8, \qquad |t_0| = 0.8 < t_{0.025}(15) = 2.131$$

so that we do not reject $H_0$.

To test $H_0: \sigma^2 = 250$ against the alternative $H_A: \sigma^2 \neq 250$, we calculate

$$\chi^2_0 = \frac{15 \times 100}{250} = 6$$

Now, since $\chi^2_0 < \chi^2_{0.975}(15) = 6.262$, we may reject $H_0$.
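The two tests can be sketched as simple comparisons against tabulated critical values. The critical values for 15 degrees of freedom are entered as constants assumed from tables, and the variable names are illustrative.

```python
import math

# Sample information and hypothesised values from the question.
n, sample_mean, sample_sd = 16, 50.0, 10.0
mu_0, var_0 = 52.0, 250.0

# Tabulated two-sided 5% critical values for 15 degrees of freedom (assumed).
t_crit = 2.131
chi2_lower_crit, chi2_upper_crit = 6.262, 27.49

# Test of the mean: reject when the t-ratio is large in absolute value.
t_stat = (sample_mean - mu_0) / (sample_sd / math.sqrt(n))
reject_mean = abs(t_stat) > t_crit

# Test of the variance: reject when the statistic falls in either tail.
chi2_stat = (n - 1) * sample_sd**2 / var_0
reject_variance = chi2_stat < chi2_lower_crit or chi2_stat > chi2_upper_crit

print(f"t = {t_stat:.2f}, reject H0: {reject_mean}")
print(f"chi2 = {chi2_stat:.2f}, reject H0: {reject_variance}")
```

Both outcomes agree with the confidence intervals for this sample: 52 lies inside the interval for the mean, while 250 lies just above the upper limit for the variance.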

5.

Write notes on THREE out of FIVE topics, a list of which might include

(i) Measures of location and dispersion.
(ii) Decomposing a time series.
(iii) Seasonal adjustment.
(iv) Alternative types of index numbers.
(v) Alternative measures of inflation.
(vi) Real and nominal variables.
(vii) The Lorenz Curve and the Gini Coefficient.
(viii) Sampling distributions and alternative methods of sampling.
(ix) Properties of estimators.
(x) Relationships between distributions.
