Statistics 1 Exam Solutions 2020 PDF

Title Statistics 1 Exam Solutions 2020
Course Statistics 1
Institution University of Bristol
Pages 13

Description

UNIVERSITY OF BRISTOL Examination for the Degrees of B.Sc. and M.Sci. (Level C/4) STATISTICS MATH-10013/11400 (Paper Code MATH-10013)

May/June 2020, 1 hour and 30 minutes

This paper contains two sections: Section A and Section B. Each section should be answered in a separate answer book.

Section A contains five questions, ALL of which will be used for assessment. This section is worth 40% of the marks for the paper. Section B contains two questions, ALL of which will be used for assessment. This section is worth 60% of the marks for the paper.

On this examination, the marking scheme is indicative and is intended only as a guide to the relative weighting of the questions. Calculators of an approved type (non-programmable, no text facility) are allowed in this examination.

THIS PAPER MUST NOT BE REMOVED FROM THE EXAMINATION ROOM.


Do not turn over until instructed.


Math 10013–MayJune20

A1. (8 marks) Consider the following dataset [...] is $M_X(t) = \beta^\alpha/(\beta - t)^\alpha$ for $t < \beta$. When $\alpha = 1$ then $X$ is said to be distributed according to an Exponential($\beta$). When $\alpha = r/2$ and $\beta = 1/2$ then $X$ is said to be distributed according to a $\chi^2_r$.]

Answer. (Bookwork) We use the fact that for independent and identically distributed random variables [(2 marks)]

\[ M_{\sum_{i=1}^n X_i}(t) = M_{X_1}(t) M_{X_2}(t) \cdots M_{X_n}(t) = M_{X_1}(t)^n \]

and from the hint [(1 mark)]

\[ M_{\sum_{i=1}^n X_i}(t) = \theta^n/(\theta - t)^n \]

which is the mgf of Gamma($n$, $\theta$).


(c) (3 marks) Show that if $Y = aX$ for $a \in \mathbb{R}$, then $M_Y(t) = M_X(at)$. Deduce that $2\theta \sum_{i=1}^n X_i \sim \chi^2_r$ for $r$ to be specified.

Answer. From the definition [(1 mark)]

\[ M_Y(t) = E \exp(tY) = E \exp(taX) = M_X(at). \]

From above we deduce that [(1 mark)]

\[ M_{2\theta \sum_{i=1}^n X_i}(t) = M_{\sum_{i=1}^n X_i}(2\theta t) = \frac{\theta^n}{(\theta - 2\theta t)^n} = \frac{1}{(1 - 2t)^n} = \frac{(1/2)^{2n/2}}{(1/2 - t)^{2n/2}} \]

that is, the mgf of a Gamma with parameters [(1 mark)] $\alpha = 2n/2$ and $\beta = 1/2$, or a $\chi^2_{2n}$, so $r = 2n$.




A4. (8 marks) Linear regression.

(a) (2 marks) In a standard linear regression model $Y_i = \alpha + \beta x_i + e_i$, what assumptions do we make about the residuals $\{e_1, \ldots, e_n\}$, assumed to be of common variance $\sigma^2 > 0$?

Answer. [Theory seen in the notes.] We assume that the $e_i$ are uncorrelated (1 mark), have expectation 0 (1 mark) and have unknown variance $\sigma^2$. There is no need for normality.

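(b) (2 marks) What is meant by the least squares estimates $\hat\alpha$ and $\hat\beta$?

Answer. They are the values of $(\alpha, \beta)$ that minimise the sum of the squares of the residuals

\[ (\alpha, \beta) \mapsto F(\alpha, \beta) = \sum_{i=1}^n e_i^2 = \sum_{i=1}^n (y_i - \alpha - \beta x_i)^2. \]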

(c) (4 marks) State the expressions for $\hat\alpha$, $\hat\beta$ and state an estimator of $\sigma^2$.

Answer. [Derivation not expected from the student] One can differentiate and find the normal equations. These equations are

\[ \frac{\partial}{\partial\alpha} \sum_{i=1}^n (y_i - \alpha - \beta x_i)^2 = 0 \quad\text{and}\quad \frac{\partial}{\partial\beta} \sum_{i=1}^n (y_i - \alpha - \beta x_i)^2 = 0, \]

which simplify to $\alpha n + \beta \sum_i x_i = \sum_i y_i$ and $\alpha \sum_i x_i + \beta \sum_i x_i^2 = \sum_i x_i y_i$. The first of these gives $\hat\alpha = \bar{y} - \hat\beta \bar{x}$. Using this to substitute for $\alpha$ in the second equation gives

\[ \hat\beta = \frac{n \sum_i x_i y_i - (\sum_i x_i)(\sum_i y_i)}{n \sum_i x_i^2 - (\sum_i x_i)^2} = \frac{ss_{xy}}{ss_{xx}} \]

where $ss_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}) = \sum_i x_i y_i - n \bar{x} \bar{y}$ and similarly for $ss_{xx}$. (2 marks)

A sensible estimator for the variance (suggested in the lecture notes) is (2 marks)

\[ \hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^n (Y_i - \hat\alpha - \hat\beta x_i)^2. \]
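The least squares formulae of A4 can be checked numerically. The following is a minimal Python sketch, not part of the exam answer; the function name `least_squares` and the small data set are made up for illustration.

```python
def least_squares(xs, ys):
    """Return (alpha_hat, beta_hat, sigma2_hat) for the model y = alpha + beta*x + e."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # ss_xy and ss_xx in their centred form
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = ss_xy / ss_xx
    alpha_hat = y_bar - beta_hat * x_bar
    # residual sum of squares divided by n - 2
    rss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    sigma2_hat = rss / (n - 2)
    return alpha_hat, beta_hat, sigma2_hat


# Illustrative data, invented for this sketch
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
alpha_hat, beta_hat, sigma2_hat = least_squares(xs, ys)
print(alpha_hat, beta_hat, sigma2_hat)
```

The centred form of $ss_{xy}$ and $ss_{xx}$ is used here because it tends to be numerically more stable than the expanded form with $n\bar{x}\bar{y}$.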

A5. (8 marks) Method of moment estimation.

(a) (1 mark) Let $X$ be a real valued random variable of probability density $f(x; \theta)$ for some $\theta \in \Theta$. What is the definition of the $k$-th order population moment of $X$ in terms of $f(x; \theta)$?

(b) (1 mark) For $n \in \mathbb{N}$ let $x_1, x_2, \ldots, x_n$ be the observed values of a simple random sample. What is the $k$-th order sample moment of $x_1, x_2, \ldots, x_n$?

(c) Let $x_1, x_2, \ldots, x_n$ be the observed values of a simple random sample corresponding to a real valued random variable $X$ of probability density, defined for $\theta \in (-1, 1)$,

\[ f(x; \theta) = \begin{cases} \frac{1}{2}(1 + \theta x) & \text{for } x \in [-1, 1] \\ 0 & \text{otherwise.} \end{cases} \]


i. (2 marks) Find an expression for $E(X; \theta)$ in terms of $\theta \in (-1, 1)$.

ii. (2 marks) Find the method of moments estimate $\hat\theta_{mom}$ in terms of $x_1, x_2, \ldots, x_n$.

Answer.

(a) [Bookwork] (1 mark) From the lecture notes

\[ E(X^k; \theta) = \int_{-\infty}^{\infty} x^k f(x; \theta)\,dx. \]

(b) [Bookwork] (1 mark) From the lecture notes $m_k = n^{-1} \sum_{i=1}^n x_i^k$.

(c) i. [New, but standard] (2 marks) We have that the population mean is

\[ E(X; \theta) = \frac{1}{2} \int_{-1}^{1} x(1 + \theta x)\,dx = \left[ \frac{\theta x^3}{6} \right]_{-1}^{1} = \frac{2\theta}{6} = \frac{\theta}{3}, \]

where the term in $x$ integrates to zero by symmetry.

(c) ii. [Bookwork] (2 marks) We equate population and sample moments, $\theta/3 = m_1$, and therefore $\hat\theta_{mom} = 3 m_1$.

(d) (2 marks) Is this estimator unbiased? You should carefully justify your answer.

Answer. [Bookwork (2 marks)] For any $\theta \in (-1, 1)$ we have $E(3\bar{X}; \theta) = 3 \times \theta/3 = \theta$, so it is unbiased.
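The unbiasedness of $3\bar{X}$ can be illustrated by simulation. The sketch below is not part of the marking scheme: it assumes sampling by inverting the cdf $F(x) = (x+1)/2 + \theta(x^2-1)/4$ of the density in (c), and the helper names `sample_x`, `theta_mom` and `mean_of_estimates` are hypothetical.

```python
import math
import random


def sample_x(theta, rng):
    """Draw one value from f(x; theta) = (1 + theta*x)/2 on [-1, 1] by cdf inversion."""
    u = rng.random()
    if theta == 0.0:
        return 2.0 * u - 1.0  # the density is uniform on [-1, 1]
    # Root in [-1, 1] of theta*x^2 + 2*x + (2 - theta - 4*u) = 0
    disc = (1.0 - theta) ** 2 + 4.0 * theta * u
    return (-1.0 + math.sqrt(disc)) / theta


def theta_mom(xs):
    """Method of moments estimate: three times the sample mean."""
    return 3.0 * sum(xs) / len(xs)


def mean_of_estimates(theta, n, reps, seed=0):
    """Average of theta_mom over `reps` simulated data sets of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += theta_mom([sample_x(theta, rng) for _ in range(n)])
    return total / reps


avg = mean_of_estimates(theta=0.5, n=50, reps=2000)
print(avg)  # should be close to the true value 0.5
```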



Remember to start a new answer book for Section B.

B1. (Total 30 marks) Let $x_1, x_2, \ldots, x_n$ be the observed values of $X_1, X_2, \ldots, X_n$ for $n \in \mathbb{N}$, assumed to be independent and identically distributed according to a Beta($\theta^*$, 1) distribution for some unknown $\theta^*$. The probability density of a Beta($\theta$, 1) for $\theta > 0$ is

\[ f(x; \theta) = \begin{cases} \theta x^{\theta - 1} & \text{for } x \in [0, 1] \\ 0 & \text{otherwise.} \end{cases} \]

It can be shown that the mean of this distribution is $\theta/(1 + \theta)$.

(a) (5 marks) Derive the method of moments estimate $\hat\theta_{mom}$ of $\theta$.

Answer. We are told that $E(X; \theta) = \theta/(1 + \theta)$ and we set this equal to the sample moment, that is

\[ \frac{\theta}{1 + \theta} = m_1 = \bar{x} \quad\text{with}\quad \bar{x} = n^{-1} \sum_{i=1}^n x_i \]

(2 marks) and we solve (2 marks)

\[ \hat\theta_{mom} = \frac{\bar{x}}{1 - \bar{x}}. \]

This requires $\bar{x} < 1$, but students receive no penalty for not mentioning this.

(b) Maximum likelihood estimator.

i. (5 marks) Derive the likelihood equation for this distribution, assuming $x_i \in\, ]0, 1[$ for $i = 1, \ldots, n$. You should explain every step of your reasoning carefully.

Answer. From the assumed independence (1 mark) and the fact that the $X_i$'s are identically distributed (1 mark)

\[ f_{X_1, \ldots, X_n}(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^n f_{X_i}(x_i; \theta) = \prod_{i=1}^n f(x_i; \theta). \]

This is strictly larger than zero since $x_i \in\, ]0, 1[$ (1 mark) and the log-likelihood is therefore

\[ \ell(\theta; x_1, x_2, \ldots, x_n) = \sum_{i=1}^n \left[ \log\theta + (\theta - 1) \log x_i \right] \]

and the likelihood equation is (2 marks)

\[ \partial_\theta \ell(\theta; x_1, x_2, \ldots, x_n) = \frac{n}{\theta} + \sum_{i=1}^n \log x_i = 0. \]




ii. (3 marks) Find the maximum likelihood estimator $\hat\theta_{mle}$ of $\theta$. You should explicitly confirm that this is a maximum.

Answer. We find that $\partial_\theta^2 \ell(\theta; x_1, x_2, \ldots, x_n) = -n/\theta^2 < 0$ (1 mark) and by solving the likelihood equation we find $\hat\theta_{mle} = -n / \sum_{i=1}^n \log x_i$ (2 marks). It is therefore a unique maximizer. This requires $\sum_{i=1}^n \log x_i < 0$, which holds when $x_i \in\, ]0, 1[$, but students are not penalized for not mentioning this.
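For concreteness, the two Beta($\theta$, 1) estimators derived in B1, $\bar{x}/(1-\bar{x})$ and $-n/\sum_i \log x_i$, can be computed as in the following Python sketch. The function names and the data values are invented for illustration and do not come from the exam.

```python
import math


def theta_mom(xs):
    """Method of moments estimate x_bar / (1 - x_bar); needs 0 < x_bar < 1."""
    x_bar = sum(xs) / len(xs)
    if not 0.0 < x_bar < 1.0:
        raise ValueError("sample mean must lie strictly between 0 and 1")
    return x_bar / (1.0 - x_bar)


def theta_mle(xs):
    """Maximum likelihood estimate -n / sum(log x_i); needs a negative log-sum."""
    log_sum = sum(math.log(x) for x in xs)
    if log_sum >= 0.0:
        raise ValueError("sum of log x_i must be negative")
    return -len(xs) / log_sum


# Illustrative observations in ]0, 1[, invented for this sketch
xs = [0.90, 0.50, 0.80, 0.70, 0.95]
print(theta_mom(xs), theta_mle(xs))
```

The explicit checks mirror the two side conditions noted in the answers: the method of moments estimate needs $\bar{x} < 1$ and the likelihood estimate needs a strictly negative sum of logarithms.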

(c) Confidence interval

i. (3 marks) Define precisely what a 95% confidence interval for $\theta$ is. How would you explain this informally to a non-mathematical person?

Answer. It is a random interval $[c_L(\alpha, X_1, X_2, \ldots, X_n), c_U(\alpha, X_1, X_2, \ldots, X_n)]$ such that for all $\theta$ (2 marks)

\[ P(\theta \in [c_L, c_U]; \theta) \ge 0.95. \]

Informally it means that if we were to observe a large number of data sets generated by the same model $\theta$ and compute the interval for each of them, then about 95% (or more) of these intervals would contain $\theta$ (1 mark), but we do not know which ones.

ii. (4 marks) Let $X \sim$ Beta($\theta$, 1) for $\theta > 0$. Show that the random variable $Y = -\theta \log X$ is distributed according to an Exponential(1) distribution. You should justify your reasoning carefully. Note that you can assume this result in the questions below.

Answer. Various ways to get to the answer. We start with the cdf, i.e. for $y \in \mathbb{R}$

\[ F_Y(y; \theta) = P(-\theta \log X \le y; \theta) = P(X \ge \exp(-y/\theta); \theta) = 1 - F_X(\exp(-y/\theta)) \]

and we can either differentiate, for $y \in \mathbb{R}_+$ (for $y < 0$ this is zero),

\[ f_Y(y; \theta) = \theta^{-1} \exp(-y/\theta)\, \theta \exp(-y(\theta - 1)/\theta) = \exp(-y\theta/\theta) = \exp(-y). \]

Or we could use the cdf

\[ P(X \le x; \theta) = \begin{cases} 0 & \text{for } x < 0 \\ x^\theta & \text{for } x \in [0, 1] \\ 1 & \text{for } x > 1 \end{cases} \]

which would have led to

\[ F_Y(y; \theta) = \begin{cases} 1 - \exp(-y) & \text{for } y \ge 0 \\ 0 & \text{for } y < 0. \end{cases} \]




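(d) (5 marks) For $\alpha \in [0, 1]$ deduce the expressions for the limits $c_L(\alpha, X_1, \ldots, X_n)$ and $c_U(\alpha, X_1, \ldots, X_n)$ of a $(1 - \alpha)100\%$ interval for $\theta$ in terms of $\hat\theta_{mle}$ and percentage points of a $\chi^2_{2n}$. You should justify every step of your reasoning carefully.

[Hints:

• For $\lambda > 0$ the cumulative distribution of $Y \sim$ Exponential($\lambda$) is

\[ F_Y(y; \lambda) = \begin{cases} 1 - \exp(-\lambda y) & \text{for } y \ge 0 \\ 0 & \text{otherwise.} \end{cases} \]

• If $Y_i \overset{iid}{\sim}$ Exponential(1) for $i \in \{1, \ldots, m\}$ then $2 \sum_{i=1}^m Y_i \sim \chi^2_{2m}$, where $\chi^2_r$, $r \in \mathbb{N}$, is the Chi-squared distribution with $r$ degrees of freedom.

• If $W \sim \chi^2_r$, $r \in \mathbb{N}$, for any $\alpha \in [0, 1]$ we let $\chi^2_{r;\alpha}$ be the real number such that $P(W \ge \chi^2_{r;\alpha}) = \alpha$.]

Answer. From part (c) ii the variables $Y_i = -\theta \log X_i$ are independent Exponential(1), so from the hint we find that $Z = 2 \sum_{i=1}^n Y_i = -2\theta \sum_{i=1}^n \log X_i \sim \chi^2_{2n}$, and we use the general recipe to construct a CI from a pivot:

\[ P\left( -2\theta \sum_{i=1}^n \log X_i \le \chi^2_{2n;\alpha/2}; \theta \right) - P\left( -2\theta \sum_{i=1}^n \log X_i \le \chi^2_{2n;1-\alpha/2}; \theta \right) = (1 - \alpha/2) - \alpha/2 = 1 - \alpha \]

from which we deduce that

\[ P\left( \chi^2_{2n;1-\alpha/2} \le -2\theta \sum_{i=1}^n \log X_i \le \chi^2_{2n;\alpha/2}; \theta \right) = 1 - \alpha. \]

Rearranging (recall that $\sum_{i=1}^n \log X_i < 0$) we find

\[ P\left( -\frac{\chi^2_{2n;1-\alpha/2}}{2 \sum_{i=1}^n \log X_i} \le \theta \le -\frac{\chi^2_{2n;\alpha/2}}{2 \sum_{i=1}^n \log X_i}; \theta \right) = 1 - \alpha \]

so, since $\hat\theta_{mle} = -n / \sum_{i=1}^n \log X_i$, we can say that

\[ c_L(\alpha, X_1, \ldots, X_n) = \hat\theta_{mle}\, \chi^2_{2n;1-\alpha/2} / (2n), \qquad c_U(\alpha, X_1, \ldots, X_n) = \hat\theta_{mle}\, \chi^2_{2n;\alpha/2} / (2n). \]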
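As a sanity check on an interval of this kind, one can estimate its coverage by simulation. The Python sketch below is illustrative only and rests on stated assumptions: it uses the standard fact that $-2\theta \sum_i \log X_i \sim \chi^2_{2n}$ (twice a sum of $n$ independent Exponential(1) variables), so the limits are taken as $\hat\theta_{mle}\,\chi^2_{2n;\cdot}/(2n)$; it fixes $n = 10$ and $\alpha = 0.05$; and the two percentage points of $\chi^2_{20}$ are assumed values read from standard tables. All function and constant names are hypothetical.

```python
import math
import random

# Assumed table values for chi-squared with 2n = 20 degrees of freedom:
# P(W >= 34.170) = 0.025 and P(W >= 9.591) = 0.975.
CHI2_20_UPPER_0025 = 34.170
CHI2_20_UPPER_0975 = 9.591


def confidence_interval(xs, chi2_low, chi2_high):
    """Interval [theta_mle*chi2_low/(2n), theta_mle*chi2_high/(2n)] for Beta(theta, 1) data."""
    n = len(xs)
    theta_mle = -n / sum(math.log(x) for x in xs)
    return theta_mle * chi2_low / (2 * n), theta_mle * chi2_high / (2 * n)


def coverage(theta, reps, n=10, seed=1):
    """Fraction of simulated data sets whose interval contains the true theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # If U is uniform on (0, 1] then U**(1/theta) has cdf x**theta, i.e. Beta(theta, 1)
        xs = [(1.0 - rng.random()) ** (1.0 / theta) for _ in range(n)]
        low, high = confidence_interval(xs, CHI2_20_UPPER_0975, CHI2_20_UPPER_0025)
        if low <= theta <= high:
            hits += 1
    return hits / reps


cov = coverage(theta=2.0, reps=5000)
print(cov)  # should be close to 0.95
```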

(e) (5 marks) Describe precisely a simulation algorithm which would allow you to compare the mean squared error of $\hat\theta_{mle}$ and $\hat\theta_{mom}$ for any value of $\theta$ and $n \in \mathbb{N}$. How would you empirically decide on which estimator is best? You are not asked to write R code here, clean pseudo-code is sufficient.

Answer. (1 mark) per bullet point.

• Set the true value $\theta$ and $n$ to some values.
• Sample $B \gg 1$ independent data sets of length $n$ from a Beta($\theta$, 1).
• For each data set compute $\hat\theta^i_{mle}$ and $\hat\theta^i_{mom}$ using the formulae above, for $i \in \{1, \ldots, B\}$.
• Compute
\[ \frac{1}{B} \sum_{i=1}^B (\hat\theta^i_{mle} - \theta)^2 \quad\text{and}\quad \frac{1}{B} \sum_{i=1}^B (\hat\theta^i_{mom} - \theta)^2. \]
• Empirically we would try this for many values of $\theta$ and see whether one of the estimators (probably $\hat\theta_{mle}$) systematically returns a smaller estimated mean squared error.
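The bullet points above can be turned into a short program. The question only asks for pseudo-code, so the Python sketch below is one possible rendering under stated assumptions: data are drawn from Beta($\theta$, 1) as $U^{1/\theta}$ with $U$ uniform, the estimators are $\bar{x}/(1-\bar{x})$ and $-n/\sum_i \log x_i$, and the names, the values of $\theta$, $n$ and $B$ are chosen purely for illustration.

```python
import math
import random


def sample_beta_theta_1(theta, n, rng):
    """n draws from Beta(theta, 1): U**(1/theta) has cdf x**theta on [0, 1]."""
    return [(1.0 - rng.random()) ** (1.0 / theta) for _ in range(n)]


def theta_mom(xs):
    x_bar = sum(xs) / len(xs)
    return x_bar / (1.0 - x_bar)


def theta_mle(xs):
    return -len(xs) / sum(math.log(x) for x in xs)


def estimate_mse(theta, n, B, seed=0):
    """Monte Carlo estimates of the MSE of theta_mle and theta_mom (in that order)."""
    rng = random.Random(seed)
    se_mle = 0.0
    se_mom = 0.0
    for _ in range(B):
        xs = sample_beta_theta_1(theta, n, rng)
        se_mle += (theta_mle(xs) - theta) ** 2
        se_mom += (theta_mom(xs) - theta) ** 2
    return se_mle / B, se_mom / B


# Repeat for several true values of theta and compare the two columns
for theta in (0.5, 1.0, 2.0, 5.0):
    mse_mle, mse_mom = estimate_mse(theta, n=20, B=2000)
    print(theta, mse_mle, mse_mom)
```

Using the same simulated data sets for both estimators, as done here, makes the comparison less noisy than drawing separate samples for each.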


B2. (Total 30 marks) Hypothesis testing.

(a) (8 marks) We are given observations $x_1, x_2, \ldots, x_n$ from a specified distribution with unknown parameter $\theta^*$. Consider a test of the null hypothesis $H_0: \theta = \theta_0$ against the alternative $H_1: \theta = \theta_1$. To fix ideas, assume that we have designed a test statistic $T(x_1, x_2, \ldots, x_n)$ such that large values of $T(x_1, x_2, \ldots, x_n)$ indicate inconsistency of the observations with $H_0$. Define the following terms precisely:

i. Critical region and critical value.
ii. Type I error and type II error.
iii. The significance level and the power of the test.
iv. $p$-value.

Answer. [All bookwork]

i. (2 marks) The critical region is the set of observations (equivalently, of values of $T$) for which we reject $H_0$, so here $C = \{x_1, x_2, \ldots, x_n : T(x_1, x_2, \ldots, x_n) \ge c^*\}$ for the critical value $c^*$.

ii. (2 marks) From the lecture notes
A. Type I error is the error of deciding the null hypothesis $H_0$ is false when $H_0$ is actually true,
B. Type II error is the error of deciding the null hypothesis $H_0$ is true when $H_1$ is actually true.

iii. (2 marks) From the lecture notes
A. The significance level $\alpha$ is the probability of making a type I error for a critical region $C$: P(Type I error) $= P(T \in C; \theta_0) = \alpha$.
B. The power is one minus the probability of a type II error: $1 -$ P(Type II error) $= 1 - P(T \notin C; \theta_1) = P(T \in C; \theta_1)$.

iv. (2 marks) The $p$-value is $P(T \ge t_{obs}; \theta_0)$, the probability under the null hypothesis of observing values of the statistic at least as extreme as what we have actually observed. Equivalently it is the smallest significance level at which we would reject the null hypothesis. Either of the answers is fine.

(b) We assume here that for $i \in \{1, \ldots, n\}$, $X_i \overset{iid}{\sim}$ Exponential($\lambda^*$) for some $\lambda^* > 0$. We are interested in testing $H_0: \lambda^* = 3$ vs. $H_1: \lambda^* > 3$. We know that for $X \sim$ Exponential($\lambda$) of parameter $\lambda > 0$, $E(X; \lambda) = 1/\lambda$ and $\mathrm{var}(X; \lambda) = 1/\lambda^2$.

i. (4 marks) Explain why the test statistic $T(x_1, \ldots, x_n) := \bar{x} = n^{-1} \sum_{i=1}^n x_i$ is a possible choice to address this hypothesis test.

Answer. [Similar ideas covered] We know that $\hat\lambda_{mom} = \hat\lambda_{mle} = 1/\bar{x}$, so small values of $T(x_1, \ldots, x_n)$ (much smaller than 1/3) will be evidence that one should reject $H_0$ in favour of the alternative.


ii. (4 marks) For a given critical value $c \in \mathbb{R}$ what is the critical region for the hypothesis test and test statistic $T$ above?

Answer. [Context slightly new] The critical region is therefore of the form $C = \{x_1, \ldots, x_n : T(x_1, \ldots, x_n) \le c\}$.



iii. (4 marks) It can be shown that $2\lambda^* n \bar{X} \sim \chi^2_{2n}$. Give an expression for the $p$-value for the test above in terms of the cumulative distribution function $F_W$ of the random variable $W \sim \chi^2_{2n}$, $n$, $\bar{x}$ and $\lambda^*$.

Answer. [Slightly new] From above we look for the probability of obtaining more extreme values (i.e. smaller values of $T$), so the $p$-value is

\[ p = P\left( T(X_1, \ldots, X_n) \le T(x_1, \ldots, x_n); \lambda^* = 3 \right) = P\left( 2\lambda^* n \bar{X} \le 2\lambda^* n \bar{x}; \lambda^* = 3 \right) = F_W(2\lambda^* n \bar{x}) \]

since $X_1, \ldots, X_n \overset{iid}{\sim}$ Exponential($\lambda^* = 3$) and from the hint.
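The $p$-value $F_W(2\lambda^* n \bar{x})$ can be evaluated without statistical tables because the degrees of freedom $2n$ are even: in that case $F_W(w) = 1 - e^{-w/2} \sum_{k=0}^{n-1} (w/2)^k / k!$. The Python sketch below uses this closed form; the function names and the values of $\bar{x}$ and $n$ are invented for illustration.

```python
import math


def chi2_cdf_even_df(w, n):
    """cdf at w of a chi-squared distribution with 2n degrees of freedom."""
    if w <= 0.0:
        return 0.0
    half = w / 2.0
    term = 1.0   # (w/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total


def exact_p_value(x_bar, n, lam0=3.0):
    """p-value F_W(2 * lam0 * n * x_bar) for H0: lambda = lam0 against lambda > lam0."""
    return chi2_cdf_even_df(2.0 * lam0 * n * x_bar, n)


# Illustrative numbers: a sample mean of 0.25 from n = 20 observations
p = exact_p_value(x_bar=0.25, n=20)
print(p)
```

A sample mean below 1/3 gives a $p$-value below roughly one half, and the $p$-value shrinks as $\bar{x}$ decreases, in line with the form of the critical region.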

(c) We assume the same setup as in part (b) but now we approximate the exact distribution of the test statistic.

i. (4 marks) Let $Z_1, Z_2, \ldots, Z_n$ be independent and identically distributed random variables of mean $E(Z_1) = \mu$ and finite variance $\sigma^2 = \mathrm{var}(Z_1)$. State the normal approximation to the distribution of $\bar{Z} := n^{-1} \sum_{i=1}^n Z_i$ given by the central limit theorem in terms of the cumulative distribution function $\Phi(\cdot)$ of a $N(0, 1)$ random variable.

Answer. [Bookwork] From the lecture notes, for $n$ sufficiently large

\[ P\left( \frac{\bar{Z} - \mu}{\sigma/\sqrt{n}} \le x \right) \approx P(N(0, 1) \le x) = \Phi(x). \]



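ii. (6 marks) Find an approximation for the $p$-value in part (b) in terms of the distribution of a normal random variable.

Answer. [Application slightly new] Here we have that $\mu = 1/\lambda^*$ and $\sigma^2 = 1/\lambda^{*2}$ and therefore

\[ p = P\left( T(X_1, \ldots, X_n) \le t_{obs}; \lambda^* = 3 \right) = P\left( \bar{X} \le \bar{x}; \lambda^* = 3 \right) = P\left( \sqrt{n}\, \frac{\bar{X} - 1/\lambda^*}{1/\lambda^*} \le \sqrt{n}\, \frac{\bar{x} - 1/\lambda^*}{1/\lambda^*}; \lambda^* = 3 \right) \approx \Phi\left( \sqrt{n}(3\bar{x} - 1) \right). \]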
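The normal approximation $\Phi(\sqrt{n}(3\bar{x} - 1))$ is straightforward to evaluate. The Python sketch below uses `statistics.NormalDist` from the standard library; the function name and the example values of $\bar{x}$ and $n$ are made up for illustration.

```python
import math
from statistics import NormalDist


def approx_p_value(x_bar, n, lam0=3.0):
    """Normal approximation Phi(sqrt(n) * (lam0 * x_bar - 1)) to the p-value."""
    z = math.sqrt(n) * (lam0 * x_bar - 1.0)
    return NormalDist().cdf(z)


# Illustrative numbers: a sample mean of 0.25 from n = 20 observations
p_approx = approx_p_value(x_bar=0.25, n=20)
print(p_approx)
```

For small $n$ the approximation can differ noticeably from the exact $\chi^2_{2n}$ value because the distribution of $\bar{X}$ is skewed; the two should agree more closely as $n$ grows.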



End of examination.