Title | Statistics Exam Solutions 2019 |
---|---|
Course | Statistics 1 |
Institution | University of Bristol |
Pages | 18 |
UNIVERSITY OF BRISTOL Examination for the Degrees of B.Sc. and M.Sci. (Level C/4) STATISTICS MATH-10013/11400 (Paper Code MATH-10013)
May/June 2019, 1 hour and 30 minutes
This paper contains two sections: Section A and Section B. Each section should be answered in a separate answer book.
Section A contains five questions, ALL of which will be used for assessment. This section is worth 40% of the marks for the paper. Section B contains two questions, ALL of which will be used for assessment. This section is worth 60% of the marks for the paper.
On this examination, the marking scheme is indicative and is intended only as a guide to the relative weighting of the questions. Calculators of an approved type (non-programmable, no text facility) are allowed in this examination.
THIS PAPER MUST NOT BE REMOVED FROM THE EXAMINATION ROOM.
Do not turn over until instructed.
Cont...
Math 10013–MayJune19
A1. (8 marks) Consider the test scores of 9 students: scores [...] 1 but students not penalized for not mentioning this. ∎
(b) Still assuming $\beta = 1$ known:
i. (4 marks) Derive the likelihood equation for this distribution, assuming $x_i \ge 1$ for $i = 1, \dots, n$. You should explain every step of your reasoning.
Answer. From the assumed independence (1 mark) and the fact that the $X_i$'s are identically distributed (1 mark),
$$f_{X_1,\dots,X_n}(x_1, x_2, \dots, x_n; \alpha) = \prod_{i=1}^{n} f_{X_i}(x_i; \alpha) = \prod_{i=1}^{n} f(x_i; \alpha).$$
This is strictly larger than zero since $x_i \ge 1$ (1 mark) and the log-likelihood is therefore
$$\ell(\alpha; x_1, x_2, \dots, x_n) = \sum_{i=1}^{n} \left[ \ln(\alpha) - (\alpha + 1) \ln x_i \right]$$
and the likelihood equation (1 mark) is
$$\partial_\alpha \ell(\alpha; x_1, x_2, \dots, x_n) = n/\alpha - \sum_{i=1}^{n} \ln x_i = 0.$$
∎
ii. (2 marks) Find the maximum likelihood estimator $\hat\alpha_{mle}$ of $\alpha$. You should explicitly confirm that this is a maximum.
Answer. We find that $\partial^2_\alpha \ell(\alpha; x_1, x_2, \dots, x_n) = -n/\alpha^2 < 0$ (1 mark) and by solving the likelihood equation we find $\hat\alpha_{mle} = n / \sum_{i=1}^{n} \ln x_i$ (1 mark). It is therefore a unique maximum. Requires $\sum_{i=1}^{n} \ln x_i > 0$, but students not penalized for not mentioning this. ∎
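As a numerical sanity check on the closed form above, the following R sketch draws a Pareto($\alpha$, $\beta = 1$) sample by inverse-transform sampling and compares $n / \sum \ln x_i$ with a direct numerical maximisation of the log-likelihood. The seed, the sample size and the value given to `true.alpha` are illustrative assumptions, not values taken from the exam.

```r
# Sketch only: numerical check of the Pareto(alpha, beta = 1) MLE.
set.seed(1)                 # arbitrary seed, for reproducibility
true.alpha <- 3             # illustrative value (assumption)
n <- 1000                   # illustrative sample size (assumption)

# Inverse transform: F(x) = 1 - x^(-alpha) for x >= 1,
# so U^(-1/alpha) with U ~ Unif(0, 1) is Pareto(alpha, 1).
x <- runif(n)^(-1 / true.alpha)

# Log-likelihood from part (b)i: sum over i of ln(alpha) - (alpha + 1) ln(x_i)
loglik <- function(a) n * log(a) - (a + 1) * sum(log(x))

alpha.closed  <- n / sum(log(x))                      # closed-form MLE
alpha.numeric <- optimize(loglik, interval = c(0.01, 50),
                          maximum = TRUE)$maximum     # numerical maximiser

# The score n/alpha - sum(ln x_i) should vanish at the closed-form MLE
score.at.mle <- n / alpha.closed - sum(log(x))
```

The two estimates should agree up to the tolerance of `optimize`, and `score.at.mle` should be zero up to rounding error.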
(c) (3 marks) For either estimator $\hat\alpha_{mle}$ or $\hat\alpha_{mom}$ (call it $\hat\alpha(X_1, X_2, \dots, X_n)$ for simplicity):
i. what is the definition of the sampling distribution of $\hat\alpha(X_1, X_2, \dots, X_n)$?
ii. define the bias and mean squared error (mse) of $\hat\alpha(X_1, X_2, \dots, X_n)$.
Answer. It is the distribution of $\hat\alpha(X_1, X_2, \dots, X_n)$ for $X_1, X_2, \dots, X_n$ iid distributed according to the Pareto($\alpha$, $\beta = 1$) distribution (1 mark). For any $\alpha$ the bias is $E(\hat\alpha; \alpha, \beta = 1) - \alpha$ (1 mark) and the mse is $E\left[(\hat\alpha - \alpha)^2; \alpha, \beta = 1\right]$ (1 mark). ∎
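The bias and mse defined above are expectations over the sampling distribution, so they can be approximated by averaging over many simulated samples. The R sketch below does this for $\hat\alpha_{mle} = n / \sum \ln X_i$ only. It is not the code from the exam paper: the values of `true.alpha`, `n` and `n.rep` are assumptions chosen for illustration.

```r
# Sketch only: Monte Carlo approximation of the bias and mse of the MLE.
set.seed(2019)              # arbitrary seed
true.alpha <- 3             # illustrative value (assumption)
n     <- 20                 # sample size per data set (assumption)
n.rep <- 10000              # number of simulated data sets (assumption)

# One draw from the sampling distribution per replicate
alpha.hat <- replicate(n.rep, {
  x <- runif(n)^(-1 / true.alpha)   # Pareto(true.alpha, 1) by inverse transform
  n / sum(log(x))                   # MLE for this data set
})

bias.hat <- mean(alpha.hat) - true.alpha       # approximates E(alpha.hat) - alpha
mse.hat  <- mean((alpha.hat - true.alpha)^2)   # approximates E[(alpha.hat - alpha)^2]
```

Increasing `n.rep` reduces the Monte Carlo error of both approximations; increasing `n` changes the sampling distribution itself.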
(d) Given the expressions for $\hat\alpha_{mom}$ and $\hat\alpha_{mle}$ it seems unlikely that we can get expressions for the bias and mean squared error and we turn to a numerical simulation. true.alpha [...]
[...] $x; \beta)^n = \left[(\beta/x)^\alpha\right]^n$ for $x \ge \beta$, otherwise zero. Finally $P(\hat\beta_{mle} \le x; \beta) = 1 - \left[(\beta/x)^\alpha\right]^n$ for $x \ge \beta$ and 0 otherwise. ∎
B2. Linear regression
(a) Modelling assumptions and estimation
i. (3 marks) In a standard linear regression model $Y_i = \alpha + \beta x_i + e_i$, what assumptions do we make about the residuals $\{e_1, \dots, e_n\}$?
Answer. [Theory seen in the notes.] We assume that the $e_i$ are uncorrelated (1 mark), of expectation 0 (1 mark) and of unknown variance $\sigma^2$ (1 mark). No need for normality. ∎
ii. (2 marks) What is meant by least squares estimates $\hat\alpha$ and $\hat\beta$?
Answer. We minimise the sum of the squares of the residuals $F(\alpha, \beta) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$. ∎
iii. (2 marks) Define $ss_{xx}$ and $ss_{xy}$ as used in the least squares estimate $\hat\beta = ss_{xy}/ss_{xx}$. Give an expression for the least squares estimate $\hat\alpha$ in terms of $\hat\beta$ and sample means $\bar{x}$ and $\bar{y}$.
Answer. (1 mark) for the first two and (1 mark) for $\hat\alpha$. From the lecture notes $ss_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$, $ss_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ and $\hat\alpha = \bar{y} - \hat\beta \bar{x}$. ∎
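The formulas above can be checked against R's own least squares fit. The sketch below uses R's built-in `cars` data (columns `speed` and `dist`) purely as an illustration; the variable names `ssxx`, `ssxy`, `alpha.hat` and `beta.hat` are chosen here to mirror the notation and are not part of any library.

```r
# Sketch only: least squares estimates from the ss formulas, checked against lm().
x <- cars$speed
y <- cars$dist

ssxx <- sum((x - mean(x))^2)               # sum of squared deviations of x
ssxy <- sum((x - mean(x)) * (y - mean(y))) # sum of cross deviations

beta.hat  <- ssxy / ssxx                   # slope estimate
alpha.hat <- mean(y) - beta.hat * mean(x)  # intercept estimate

# Reference fit: coef() returns the intercept first, then the slope
fit <- lm(dist ~ speed, data = cars)
same.as.lm <- isTRUE(all.equal(c(alpha.hat, beta.hat), unname(coef(fit))))
```

If the formulas are implemented correctly, `same.as.lm` is `TRUE`.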
(b) Confidence interval and model testing
i. (3 marks) Define precisely what a 95% confidence interval for $\alpha$ is. How would you explain this informally to a non-mathematical person?
Answer. It is a random interval $[c_L(X_1, X_2, \dots, X_n), c_U(X_1, X_2, \dots, X_n)]$ such that for all $\alpha$, $\beta$ and $\sigma^2$ (2 marks)
$$P(\alpha \in [c_L, c_U]; \alpha, \beta, \sigma^2) = 0.95.$$
Informally it means that if we were to observe a large number of data sets generated by the same model $\alpha, \beta, \sigma^2$, and computed the interval for each one, then about 95% of these intervals would contain $\alpha$ (1 mark), but we do not know which ones. ∎
ii. (2 marks) What additional assumption does one usually make about the residuals $\{e_1, \dots, e_n\}$ in order to derive a confidence interval for $\alpha$? Why are the assumptions for a standard linear regression model not sufficient?
Answer. One usually assumes further that they are normally distributed (1 mark). To define the sampling distribution of $\hat\alpha$ one needs a distribution for the observations (or equivalently the residuals), and normality is the standard choice because it makes the calculations tractable (1 mark). Some students may mention the CLT for $n$ large enough and should be awarded 1 mark if that is the case. ∎
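The informal reading of a 95% confidence interval in (b)i can be illustrated by simulation under the normality assumption of (b)ii. The R sketch below generates many data sets from one fixed linear model, computes a 95% interval for the intercept with `confint()`, and records how often the interval contains the true value. All parameter values, the design points `x` and the number of replicates are assumptions made for this illustration.

```r
# Sketch only: empirical coverage of the 95% confidence interval for the intercept.
set.seed(42)                          # arbitrary seed
alpha <- 2; beta <- 0.5; sigma <- 1   # illustrative model parameters (assumptions)
x <- seq(1, 10, length.out = 20)      # fixed design points (assumption)
n.rep <- 2000                         # number of simulated data sets (assumption)

covered <- replicate(n.rep, {
  # Normally distributed residuals, as in (b)ii
  y  <- alpha + beta * x + rnorm(length(x), sd = sigma)
  ci <- confint(lm(y ~ x), "(Intercept)", level = 0.95)
  ci[1] <= alpha && alpha <= ci[2]    # does this interval contain the true alpha?
})

coverage <- mean(covered)             # proportion of intervals containing alpha
```

The value of `coverage` should be close to 0.95, while for any single simulated data set we cannot tell from the data alone whether its interval is one of those that contain $\alpha$.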
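iii. (2 marks) Define
$$s^2_{\hat\alpha} = \hat\sigma^2 \left( 1/n + \bar{x}^2 / ss_{xx} \right), \quad \text{where} \quad \hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} (Y_i - \hat\alpha - \hat\beta x_i)^2.$$
Assuming that with your modelling assumptions you can establish that $(\hat\alpha - \alpha)/s_{\hat\alpha} \sim t_{n-2}$, find a 95% confidence interval for $\alpha$. You should explain your reasoning carefully.
Answer. Using the above result we know that
$$P\left(-t_{n-2;0.025} \le (\hat\alpha - \alpha)/s_{\hat\alpha} \le t_{n-2;0.025}\right) = 0.95.$$
We can make $\alpha$ the subject of this in the usual way, to obtain
$$P\left(\hat\alpha - t_{n-2;0.025}\, s_{\hat\alpha} \le \alpha \le \hat\alpha + t_{n-2;0.025}\, s_{\hat\alpha}\right) = 0.95,$$
and so we can take $c_L = \hat\alpha - t_{n-2;0.025}\, s_{\hat\alpha}$ and $c_U = \hat\alpha + t_{n-2;0.025}\, s_{\hat\alpha}$, where for $T \sim t_{n-2}$ and $\alpha \in [0, 1]$, $P(T \ge t_{n-2;\alpha}) = \alpha$. ∎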
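The interval above can be computed step by step and compared with R's `confint()`. This sketch again uses the built-in `cars` data as an illustration; `s.alpha`, `t.crit` and the other variable names are chosen here to follow the notation of the question.

```r
# Sketch only: 95% confidence interval for the intercept from the formulas above.
x <- cars$speed
y <- cars$dist
n <- length(x)

ssxx      <- sum((x - mean(x))^2)
beta.hat  <- sum((x - mean(x)) * (y - mean(y))) / ssxx
alpha.hat <- mean(y) - beta.hat * mean(x)

# Residual variance estimate with n - 2 degrees of freedom
sigma2.hat <- sum((y - alpha.hat - beta.hat * x)^2) / (n - 2)

# Standard error of alpha.hat: sqrt(sigma2.hat * (1/n + xbar^2 / ssxx))
s.alpha <- sqrt(sigma2.hat * (1 / n + mean(x)^2 / ssxx))

# t_{n-2; 0.025} is the upper 2.5% point, i.e. the 0.975 quantile
t.crit <- qt(0.975, df = n - 2)

ci <- c(alpha.hat - t.crit * s.alpha, alpha.hat + t.crit * s.alpha)

# Reference interval from the fitted model
ci.lm <- confint(lm(dist ~ speed, data = cars), level = 0.95)["(Intercept)", ]
```

The vector `ci` holds $(c_L, c_U)$ and should match `ci.lm` up to rounding error.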
(c) The cars data set contains the stopping distances (in feet) for different car speeds (in miles per hour) recorded in the 1920s. Here are the first 6 records:

    head(cars)
    ##   speed dist
    ## 1     4    2
    ## 2     4   10
    ## 3     7    4
    ## 4     7   22
    ## 5     8   16
    ## 6     9   10

i. (2 marks) We type the following R commands:

    attach(cars)
    plot(speed, dist, xlab='speed (mph)', ylab='stopping distance (ft)')
[Figure: scatter plot of stopping distance (ft) against speed (mph) for the cars data.]

What is the name for this type of plot? What does it suggest?
Answer. [Students familiar with this type of code.] This is a scatter plot (1 mark) of distance against speed, and tells us that there is positive correlation, since distance tends to increase as speed increases. A linear model seems natural. (1 mark) ∎
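The visual impression from the scatter plot can be backed up by the sample correlation coefficient. The short R sketch below is an addition for illustration and not part of the marking scheme; it uses `with()` rather than `attach()` so that no columns are left on the search path.

```r
# Sketch only: sample correlation between speed and stopping distance.
r <- with(cars, cor(speed, dist))   # Pearson correlation, between -1 and 1

# A positive r agrees with distance tending to increase with speed
positive.association <- r > 0
```

A value of `r` close to 1 would support fitting a straight line, although the correlation alone does not show whether the relationship is exactly linear.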
ii. (2 marks) Explain what the following R code does. What can you deduce from this plot? fit [...] 0 is not evidence that we are departing from $H_0$ in the direction of $H_1$: [...] < 0. ∎
End of examination.