Statistics Exam Solutions 2019 PDF

Title	Statistics Exam Solutions 2019
Course	Statistics 1
Institution	University of Bristol
Pages	18
File Size	439 KB
File Type	PDF

Description

UNIVERSITY OF BRISTOL Examination for the Degrees of B.Sc. and M.Sci. (Level C/4) STATISTICS MATH-10013/11400 (Paper Code MATH-10013)

May/June 2019, 1 hour and 30 minutes

This paper contains two sections: Section A and Section B. Each section should be answered in a separate answer book.

Section A contains five questions, ALL of which will be used for assessment. This section is worth 40% of the marks for the paper. Section B contains two questions, ALL of which will be used for assessment. This section is worth 60% of the marks for the paper.

On this examination, the marking scheme is indicative and is intended only as a guide to the relative weighting of the questions. Calculators of an approved type (non-programmable, no text facility) are allowed in this examination.

THIS PAPER MUST NOT BE REMOVED FROM THE EXAMINATION ROOM.

Do not turn over until instructed.

Cont...

Math 10013–MayJune19

A1. (8 marks) Consider the test scores of 9 students: scores [...]

[...] 1 but students not penalized for not mentioning this. ∎

(b) Still assuming $\beta = 1$ known:

i. (4 marks) Derive the likelihood equation for this distribution, assuming $x_i \ge 1$ for $i = 1, \dots, n$. You should explain every step of your reasoning.

Answer. From the assumed independence (1 mark) and the fact that the $X_i$'s are identically distributed (1 mark),

$$f_{X_1,\dots,X_n}(x_1, x_2, \dots, x_n; \alpha) = \prod_{i=1}^{n} f_{X_i}(x_i; \alpha) = \prod_{i=1}^{n} f(x_i; \alpha).$$

This is strictly larger than zero since $x_i \ge 1$ (1 mark) and the log-likelihood is therefore

$$\ell(\alpha; x_1, x_2, \dots, x_n) = \sum_{i=1}^{n} \left[ \ln(\alpha) - (\alpha + 1) \ln x_i \right]$$

and the likelihood equation (1 mark) is

$$\partial_\alpha \ell(\alpha; x_1, x_2, \dots, x_n) = n/\alpha - \sum_{i=1}^{n} \ln x_i = 0.$$

∎

ii. (2 marks) Find the maximum likelihood estimator $\hat\alpha_{mle}$ of $\alpha$. You should explicitly confirm that this is a maximum.

Answer. We find that $\partial^2_\alpha \ell(\alpha; x_1, x_2, \dots, x_n) = -n/\alpha^2 < 0$ (1 mark) and by solving the likelihood equation we find $\hat\alpha_{mle} = n / \sum_{i=1}^{n} \ln x_i$ (1 mark). It is therefore a unique maximum. Requires $\sum_{i=1}^{n} \ln x_i > 0$, but students not penalized for not mentioning this. ∎
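The closed-form estimator can be checked numerically. The following is a minimal sketch, not part of the marking scheme: the values of `true_alpha` and `n`, the seed, and the helper names `loglik` and `score` are all assumptions chosen for illustration. It draws a Pareto($\alpha$, $\beta = 1$) sample by inverse-transform sampling and compares $n / \sum \ln x_i$ with a direct numerical maximisation of the log-likelihood.

```r
# Sketch: check the Pareto(alpha, beta = 1) MLE numerically.
set.seed(1)
true_alpha <- 3      # hypothetical true parameter
n <- 1000            # hypothetical sample size

# Inverse-transform sampling: if U ~ Uniform(0, 1), then U^(-1/alpha)
# has survival function x^(-alpha) for x >= 1, i.e. Pareto(alpha, 1).
x <- runif(n)^(-1 / true_alpha)

# Log-likelihood and its derivative (the left side of the likelihood equation).
loglik <- function(a) n * log(a) - (a + 1) * sum(log(x))
score  <- function(a) n / a - sum(log(x))

# Closed-form MLE versus a numerical maximiser of the log-likelihood.
alpha_mle <- n / sum(log(x))
alpha_num <- optimize(loglik, interval = c(0.01, 50), maximum = TRUE)$maximum

print(c(closed_form = alpha_mle, numerical = alpha_num))
print(score(alpha_mle))  # should be zero up to rounding error
```

The two estimates should agree to within the tolerance of `optimize`, and the score evaluated at the closed-form estimate should vanish up to floating-point error.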

(c) (3 marks) For either estimator $\hat\alpha_{mle}$ or $\hat\alpha_{mom}$ (call it $\hat\alpha(X_1, X_2, \dots, X_n)$ for simplicity):

i. what is the definition of the sampling distribution of $\hat\alpha(X_1, X_2, \dots, X_n)$?

ii. define the bias and mean squared error (mse) of $\hat\alpha(X_1, X_2, \dots, X_n)$.

Answer. It is the distribution of $\hat\alpha(X_1, X_2, \dots, X_n)$ for $X_1, X_2, \dots, X_n$ iid distributed according to the Pareto($\alpha$, $\beta = 1$) distribution (1 mark). For any $\alpha$ the bias is $E(\hat\alpha; \alpha, \beta = 1) - \alpha$ (1 mark) and the mse is $E\left((\hat\alpha - \alpha)^2; \alpha, \beta = 1\right)$ (1 mark). ∎
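These definitions can be approximated by Monte Carlo. The sketch below is an illustration under assumed settings (`true_alpha`, `n`, `reps` and the seed are hypothetical choices, and it is not the simulation code from the exam): it repeatedly draws Pareto($\alpha$, $\beta = 1$) samples, computes $\hat\alpha_{mle}$ for each, and averages to estimate the bias and mse.

```r
# Sketch: Monte Carlo approximation of the bias and mse of the Pareto MLE.
set.seed(2)
true_alpha <- 3    # hypothetical true parameter
n <- 20            # hypothetical sample size
reps <- 10000      # number of simulated data sets

# Each replicate draws one sample of size n and returns its MLE,
# so alpha_hat approximates the sampling distribution of the estimator.
alpha_hat <- replicate(reps, {
  x <- runif(n)^(-1 / true_alpha)   # Pareto(alpha, 1) by inverse transform
  n / sum(log(x))
})

bias_hat <- mean(alpha_hat) - true_alpha        # estimates E(alpha_hat) - alpha
mse_hat  <- mean((alpha_hat - true_alpha)^2)    # estimates E((alpha_hat - alpha)^2)

print(c(bias = bias_hat, mse = mse_hat))
```

As a sanity check, since $\sum \ln X_i$ follows a Gamma distribution with shape $n$ and rate $\alpha$, the exact bias of the MLE is $\alpha/(n-1)$, so the simulated bias should be close to $3/19$ here.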

(d) Given the expressions for $\hat\alpha_{mom}$ and $\hat\alpha_{mle}$ it seems unlikely that we can get expressions for the bias and mean squared error and we turn to a numerical simulation.

true.alpha [...]

[...] $x; \beta)^n = \left[(\beta/x)^\alpha\right]^n$ for $x \ge \beta$, otherwise zero. Finally $P(\hat\beta_{mle} \le x; \beta) = 1 - \left[(\beta/x)^\alpha\right]^n$ for $x \ge \beta$ and 0 otherwise. ∎

B2. Linear regression

(a) Modelling assumptions and estimation

i. (3 marks) In a standard linear regression model $Y_i = \alpha + \beta x_i + e_i$, what assumptions do we make about the residuals $\{e_1, \dots, e_n\}$?

Answer. [Theory seen in the notes.] We assume that the $e_i$ are uncorrelated (1 mark), of expectation 0 (1 mark) and variance $\sigma^2$ unknown (1 mark). No need for normality. ∎

ii. (2 marks) What is meant by least squares estimates $\hat\alpha$ and $\hat\beta$?

Answer. We minimise the sum of the squares of the residuals $F(\alpha, \beta) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$. ∎

iii. (2 marks) Define $ss_{xx}$ and $ss_{xy}$ as used in the least squares estimate $\hat\beta = ss_{xy}/ss_{xx}$. Give an expression for the least squares estimate $\hat\alpha$ in terms of $\hat\beta$ and sample means $\bar{x}$ and $\bar{y}$.

Answer. (1 mark) for the first two and (1 mark) for $\hat\alpha$. From the lecture notes $ss_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$, $ss_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ and $\hat\alpha = \bar{y} - \hat\beta \bar{x}$. ∎
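The formulas above can be evaluated directly in R. As a sketch, the code below applies them to R's built-in `cars` data (the data set used in part (c) of this question) and compares the result with the coefficients returned by `lm`; the variable names `ss_xx`, `ss_xy`, `alpha_hat` and `beta_hat` are chosen here for illustration.

```r
# Sketch: least squares estimates from ss_xx and ss_xy on the built-in cars data.
x <- cars$speed
y <- cars$dist

ss_xx <- sum((x - mean(x))^2)
ss_xy <- sum((x - mean(x)) * (y - mean(y)))

beta_hat  <- ss_xy / ss_xx                 # slope estimate
alpha_hat <- mean(y) - beta_hat * mean(x)  # intercept estimate

# lm() minimises the same sum of squares, so the coefficients should agree.
fit <- lm(dist ~ speed, data = cars)
print(c(alpha_hat = alpha_hat, beta_hat = beta_hat))
print(coef(fit))
```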

(b) Confidence interval and model testing

i. (3 marks) Define precisely what a 95% confidence interval for $\alpha$ is. How would you explain this informally to a non-mathematical person?

Answer. It is a random interval $[c_L(X_1, X_2, \dots, X_n), c_U(X_1, X_2, \dots, X_n)]$ such that for all $\alpha$, $\beta$ and $\sigma^2$ (2 marks)

$$P(\alpha \in [c_L, c_U]; \alpha, \beta, \sigma^2) = 0.95.$$

Informally it means that if we were to observe a large number of data sets generated by the same model $\alpha, \beta, \sigma^2$, and compute the interval from each of them, then about 95% of these intervals would contain $\alpha$ (1 mark), but we do not know which ones. ∎

ii. (2 marks) What additional assumption does one usually make about the residuals $\{e_1, \dots, e_n\}$ in order to derive a confidence interval for $\alpha$? Why are the assumptions for a standard linear regression model not sufficient?

Answer. One usually assumes further that they are normally distributed (1 mark). To define the sampling distribution of $\hat\alpha$ one needs a distribution for the observations (or equivalently the residuals) and normality is standard: it allows one to do calculations (1 mark). Some students may mention the CLT if n is large enough and should be awarded 1 mark if that's the case. ∎
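The repeated-sampling reading of a 95% confidence interval can be illustrated by simulation. The sketch below makes several assumptions that are not in the exam: the parameter values, the design points `x`, the number of replications and the seed are hypothetical, and the residuals are drawn from a normal distribution. It uses R's `confint`, which computes the usual t-based interval for a fitted `lm`.

```r
# Sketch: empirical coverage of the 95% confidence interval for the intercept.
set.seed(3)
alpha <- 1; beta <- 2; sigma <- 1.5   # hypothetical true model
n <- 30
x <- seq(0, 10, length.out = n)       # fixed design points (assumption)
reps <- 2000

# For each simulated data set, record whether the interval contains alpha.
covered <- replicate(reps, {
  y <- alpha + beta * x + rnorm(n, sd = sigma)   # normal residuals
  ci <- confint(lm(y ~ x), parm = "(Intercept)", level = 0.95)
  ci[1] <= alpha && alpha <= ci[2]
})

coverage <- mean(covered)
print(coverage)   # proportion of intervals containing the true alpha
```

With normal residuals the proportion should be close to 0.95, up to Monte Carlo error.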

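iii. (2 marks) Define

$$s_{\hat\alpha}^2 = \hat\sigma^2 \left( 1/n + \bar{x}^2 / ss_{xx} \right), \quad \text{where} \quad \hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat\alpha - \hat\beta x_i)^2.$$

Assuming that with your modelling assumptions you can establish that $(\hat\alpha - \alpha)/s_{\hat\alpha} \sim t_{n-2}$, find a 95% confidence interval for $\alpha$. You should explain your reasoning carefully.

Answer. Using the above result we know that

$$P\left(-t_{n-2;0.025} \le (\hat\alpha - \alpha)/s_{\hat\alpha} \le t_{n-2;0.025}\right) = 0.95.$$

We can make $\alpha$ the subject of this in the usual way, to obtain

$$P\left(\hat\alpha - t_{n-2;0.025}\, s_{\hat\alpha} \le \alpha \le \hat\alpha + t_{n-2;0.025}\, s_{\hat\alpha}\right) = 0.95,$$

and so we can take $c_L = \hat\alpha - t_{n-2;0.025}\, s_{\hat\alpha}$ and $c_U = \hat\alpha + t_{n-2;0.025}\, s_{\hat\alpha}$ where for $T \sim t_{n-2}$ and $\alpha \in [0, 1]$, $P(T \ge t_{n-2;\alpha}) = \alpha$. ∎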
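As a worked illustration, the interval can be computed step by step in R. This sketch applies the formulas to the built-in `cars` data from part (c) and compares the result with `confint`; the variable names are illustrative, and `qt(0.975, df = n - 2)` is used as the upper 0.025 point $t_{n-2;0.025}$.

```r
# Sketch: 95% confidence interval for the intercept, computed from the formulas.
x <- cars$speed
y <- cars$dist
n <- length(x)

ss_xx <- sum((x - mean(x))^2)
ss_xy <- sum((x - mean(x)) * (y - mean(y)))
beta_hat  <- ss_xy / ss_xx
alpha_hat <- mean(y) - beta_hat * mean(x)

# Residual variance estimate and standard error of alpha_hat.
sigma2_hat <- sum((y - alpha_hat - beta_hat * x)^2) / (n - 2)
s_alpha <- sqrt(sigma2_hat * (1 / n + mean(x)^2 / ss_xx))

# Upper 0.025 point of the t distribution with n - 2 degrees of freedom.
t_crit <- qt(0.975, df = n - 2)

ci <- c(lower = alpha_hat - t_crit * s_alpha,
        upper = alpha_hat + t_crit * s_alpha)
print(ci)

# Cross-check against R's built-in interval for the fitted model.
ci_ref <- confint(lm(dist ~ speed, data = cars))["(Intercept)", ]
print(ci_ref)
```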

(c) The cars data set contains the stopping distances (in feet) for different car speeds (in miles per hour) recorded in the 1920s. Here are the first 6 records:

    head(cars)
    ##   speed dist
    ## 1     4    2
    ## 2     4   10
    ## 3     7    4
    ## 4     7   22
    ## 5     8   16
    ## 6     9   10

i. (2 marks) We type the following R commands:

    attach(cars)
    plot(speed, dist, xlab='speed (mph)', ylab='stopping distance (ft)')

[Figure: plot of stopping distance (ft) against speed (mph).]

What is the name for this type of plot? What does it suggest?

Answer. [Students familiar with this type of code.] This is a scatter plot (1 mark) of distance against speed, and it tells us that there is correlation, since when speed increases distance also increases. A linear model seems natural. (1 mark) ∎

ii. (2 marks) Explain what the following R code does. What can you deduce from this plot?

fit [...]

[...] 0 is not evidence that we are departing from H0 in the direction of H1: [...] < 0. ∎

End of examination....