Statistics Exam Solutions 2018 PDF

Title Statistics Exam Solutions 2018
Course Statistics 1
Institution University of Bristol
Pages 14


UNIVERSITY OF BRISTOL Examination for the Degrees of B.Sc. and M.Sci. (Level C/4) STATISTICS 1 MATH 11400 (Paper Code MATH-11400)

May/June 2018, 1 hour and 30 minutes

This paper contains two sections: Section A and Section B. Each section should be answered in a separate answer book.

Section A contains five questions, ALL of which will be used for assessment. This section is worth 40% of the marks for the paper. Section B contains two questions, ALL of which will be used for assessment. This section is worth 60% of the marks for the paper.

On this examination, the marking scheme is indicative and is intended only as a guide to the relative weighting of the questions. Calculators of an approved type (non-programmable, no text facility) are allowed in this examination.

THIS PAPER MUST NOT BE REMOVED FROM THE EXAMINATION ROOM.


Do not turn over until instructed.


Math 11400–MayJune18

A1. (8 marks) Suppose CEO yearly compensations are sampled and the following are found (in millions of dollars). This is before being indicted for cooking the books.

sals = c(12, .4, 5, 2, 50, 8, 3, 1, 4, .25, .15)

(a) Define the median and the lower and upper hinges H1 and H3 of a data set of size 2m + 1.

Answer. (2 marks) [Standard material from very early in the course, this example is new]

The median is the 'middle observation' in the ranked order $x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$. If n = 2m + 1 is odd, the median is $x_{(m+1)}$; otherwise (n = 2m) it is $(x_{(m)} + x_{(m+1)})/2$. The lower hinge H1 is the median of {data values ≤ sample median}. The upper hinge H3 is the median of {data values ≥ sample median}.

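(b) Give a five number summary of the CEO salaries given above. (3 marks)

Answer. [Standard material from very early in the course, this example is new]

The ranked data are 0.15, 0.25, 0.4, 1, 2, 3, 4, 5, 8, 12, 50, so with n = 11 the five number summary (minimum, H1, median, H3, maximum) is (0.15, 0.7, 3, 6.5, 50).

Since (H3 − median) = 3.5 > 2.3 = (median − H1), the dataset is skewed to the right.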
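The hinge definitions from A1(a) can be checked numerically on the salary data. The following is a minimal sketch in Python rather than the R used in the paper; the function name `five_number_summary` is an assumption made for illustration and is not part of the exam.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, H1, median, H3, max) using the hinge definition from A1(a)."""
    xs = sorted(data)
    med = median(xs)
    lower_half = [x for x in xs if x <= med]  # data values <= sample median
    upper_half = [x for x in xs if x >= med]  # data values >= sample median
    return xs[0], median(lower_half), med, median(upper_half), xs[-1]

# CEO compensations from the question, in millions of dollars
sals = [12, .4, 5, 2, 50, 8, 3, 1, 4, .25, .15]
minimum, h1, med, h3, maximum = five_number_summary(sals)
print(minimum, h1, med, h3, maximum)

# Compare the two hinge-to-median distances to judge skewness
print("H3 - median:", h3 - med, "median - H1:", med - h1)
```

A larger upper distance than lower distance is the usual informal indication of right skew.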

A2. (8 marks) Let $x_1, x_2, \dots, x_n$ be the observed values of a random sample of size n from the Pareto distribution with unknown parameter θ > 0 and probability density function

$$f(x; \theta) = \begin{cases} \dfrac{\theta}{(1+x)^{\theta+1}} & \text{if } x > 0, \\ 0 & \text{otherwise.} \end{cases}$$

(a) Derive the likelihood equation for this distribution.

(b) Hence find the maximum likelihood estimate of θ.

Answer.

[Fairly standard material]

(a) (6 marks) $\log f(x; \theta) = \log\theta - (\theta+1)\log(x+1)$, so the likelihood equation is

$$0 = \frac{d}{d\theta} \sum_{i=1}^{n} \log f(x_i; \theta) = \frac{n}{\theta} - \sum_{i=1}^{n} \log(x_i + 1).$$

(b) (1 mark) The solution is clearly $\theta = n \big/ \sum_{i=1}^{n} \log(x_i + 1)$, and (1 mark)

$$\frac{d^2}{d\theta^2} \sum_{i=1}^{n} \log f(x_i; \theta) = -\frac{n}{\theta^2} < 0.$$
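As a quick numerical check of the A2 solution, the sketch below (Python, not the paper's R) evaluates the estimate n / Σ log(1 + x_i) and confirms that the score vanishes there. The data values and the function names `pareto_mle` and `score` are hypothetical, chosen only for illustration.

```python
import math

def pareto_mle(xs):
    """MLE n / sum(log(1 + x_i)) for the density theta / (1 + x)^(theta + 1), x > 0."""
    return len(xs) / sum(math.log1p(x) for x in xs)

def score(theta, xs):
    """Derivative of the log-likelihood: n / theta - sum(log(1 + x_i))."""
    return len(xs) / theta - sum(math.log1p(x) for x in xs)

# Hypothetical positive observations (not taken from the exam)
xs = [0.3, 1.2, 0.05, 2.7, 0.8]
theta_hat = pareto_mle(xs)

print("theta_hat:", theta_hat)
print("score at theta_hat:", score(theta_hat, xs))
# The score is positive below theta_hat and negative above it,
# consistent with a negative second derivative -n / theta^2.
print(score(0.5 * theta_hat, xs) > 0, score(2.0 * theta_hat, xs) < 0)
```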




A3. (8 marks) Let $x_1, x_2, \dots, x_n$ be the observed values of a random sample of size n from a distribution of density $f(x; \theta)$ for some parameter θ. Let $\hat\theta = \hat\theta(x_1, x_2, \dots, x_n)$ be an estimator of θ.

(a) What is meant by the sampling distribution of $\hat\theta$? Why is it important in Statistics?

(b) Define the bias and mean squared error of $\hat\theta$. Explain what these quantities represent.

(c) State and prove a well known relation between the bias, mean squared error and variance of an estimator.

Answer.

[All bookwork.]

(a) (3 marks) It is the distribution of $\hat\theta(X_1, X_2, \dots, X_n)$ where $X_i \overset{\text{iid}}{\sim} f(\cdot; \theta)$. It is important because it provides us with information about the fluctuations of the estimator as a function of the data and can be used e.g. to tell us about the quality of the estimator (does it depend a lot on a particular sample?).

(b) (2 marks)

$$\mathrm{bias}(\hat\theta) = E(\hat\theta - \theta; \theta) = E(\hat\theta; \theta) - \theta, \qquad \mathrm{mse}(\hat\theta) = E\big((\hat\theta - \theta)^2; \theta\big).$$

These represent, respectively, the systematic deviation from the true value and the spread around the true value.

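(c) (3 marks) The well known relation is

$$\mathrm{mse}(\hat\theta) = \mathrm{Var}(\hat\theta) + \mathrm{bias}(\hat\theta)^2$$

and it follows from (writing $E = E\hat\theta$, which is just a number, as is θ):

$$\begin{aligned}
\mathrm{mse}(\hat\theta) &= E(\hat\theta - \theta)^2 \\
&= E\big[(\hat\theta - E) + (E - \theta)\big]^2 \\
&= E(\hat\theta - E)^2 + 2E(\hat\theta - E)(E - \theta) + E(E - \theta)^2 \\
&= \mathrm{Var}(\hat\theta) + \mathrm{bias}(\hat\theta)^2,
\end{aligned}$$

where the cross term vanishes because $E(\hat\theta - E) = 0$.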
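The decomposition in A3(c) also holds exactly for the empirical bias, variance and mean squared error of a set of simulated estimates, which gives a simple sanity check. The sketch below is in Python rather than R, and everything in it is an assumption made for illustration: the true value `theta`, the normal sampling model, and the deliberately biased estimator 0.9 times the sample mean.

```python
import random

random.seed(0)
theta = 2.0          # hypothetical true parameter (a normal mean)
n, reps = 10, 2000   # sample size and number of simulated samples

# Hypothetical biased estimator: shrink the sample mean by a factor 0.9
estimates = []
for _ in range(reps):
    sample = [random.gauss(theta, 1.0) for _ in range(n)]
    estimates.append(0.9 * sum(sample) / n)

mean_est = sum(estimates) / reps
bias = mean_est - theta                                    # empirical bias
var = sum((e - mean_est) ** 2 for e in estimates) / reps   # empirical variance
mse = sum((e - theta) ** 2 for e in estimates) / reps      # empirical MSE

print("mse:", mse)
print("var + bias^2:", var + bias ** 2)
```

The two printed numbers agree up to floating-point rounding because the same algebra, with averages in place of expectations, applies to the simulated values.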

A4. (8 marks) A random variable W is defined to have a $\chi^2_r$ distribution with r > 0 degrees of freedom ($W \sim \chi^2_r$) if its moment generating function is $E(e^{tW}) = (1 - 2t)^{-r/2}$.

(a) Show that if Z has a standard normal distribution N(0, 1), then $Z^2 \sim \chi^2_1$.

(b) Let $Z_1, Z_2, \dots, Z_n$ be independent random variables with common distribution N(0, 1). What is the moment generating function of $Z_1^2 + Z_2^2 + \dots + Z_n^2$? You should justify your reasoning carefully.

(c) Name the distribution of $Z_1^2 + Z_2^2 + \dots + Z_n^2$.

Answer. [8 marks possible, bookwork]

(a) (4 marks)

$$E(e^{tZ^2}) = \int e^{tz^2} \frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,dz = \frac{1}{\sqrt{2\pi}} \int e^{-z^2(1-2t)/2}\,dz = (1 - 2t)^{-1/2} \frac{1}{\sqrt{2\pi}} \int e^{-u^2/2}\,du$$

by change of variable $z\sqrt{1 - 2t} = u$, so result = $(1 - 2t)^{-1/2}$, MGF of $\chi^2_1$ as required.

(b) (3 marks) The MGF of a sum of independent random variables is the product of the individual MGFs, and since they share the same distribution

$$M_{Z_1^2 + Z_2^2 + \dots + Z_n^2}(t) = \big[M_{Z_1^2}(t)\big]^n = (1 - 2t)^{-n/2}.$$

(c) (1 mark) We recognise the MGF of a $\chi^2_n$.
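The closed form for the MGF of Z² can be compared against a direct numerical integration. This is only a sketch in Python (the paper uses R): the trapezoidal rule, the integration range, the step count and the values t = 0.2 and n = 3 are all assumptions chosen for illustration, and the check is meaningful only for t < 1/2.

```python
import math

def mgf_z_squared(t, half_width=12.0, steps=20000):
    """Trapezoidal approximation of E[exp(t * Z^2)] for Z ~ N(0, 1), with t < 1/2."""
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        z = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0   # end points get half weight
        total += weight * math.exp(t * z * z - z * z / 2)
    return total * h / math.sqrt(2 * math.pi)

t, n = 0.2, 3
numeric = mgf_z_squared(t)
closed_form = (1 - 2 * t) ** -0.5

print(numeric, closed_form)
# For n independent copies the MGFs multiply, giving (1 - 2t)^(-n/2)
print(numeric ** n, (1 - 2 * t) ** (-n / 2))
```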



A5. (8 marks) Let $x_1, x_2, \dots, x_n$ be the observed values of a simple random sample from the Gamma(α, λ) distribution, with expectation $\alpha/\lambda$ and variance $\alpha/\lambda^2$, where both α > 0 and λ > 0 are unknown.

(a) Write down two equations relating the method of moments estimates $\hat\alpha$ and $\hat\lambda$ to the sample moments $m_1$ and $m_2$.

(b) Find explicit expressions for $\hat\alpha$ and $\hat\lambda$ in terms of the observed values $x_1, x_2, \dots, x_n$.

Answer.

[Fairly standard]

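(a) (3 marks) $m_1 \approx E(X) = \alpha/\lambda$; $m_2 \approx E(X^2) = \alpha(\alpha + 1)/\lambda^2$, and set equality for $\hat\alpha$ and $\hat\lambda$.

(b) (5 marks) Solve the system of equations:

$$\hat\alpha = \frac{m_1^2}{m_2 - m_1^2} \quad\text{and}\quad \hat\lambda = \frac{m_1}{m_2 - m_1^2},$$

where $m_1 = n^{-1}\sum_{i=1}^{n} x_i$ and $m_2 = n^{-1}\sum_{i=1}^{n} x_i^2$.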
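The method of moments solution for A5 is easy to verify by substituting the estimates back into the two moment equations. The following Python sketch (not the paper's R) does this; the data values and the function name `gamma_method_of_moments` are hypothetical.

```python
def gamma_method_of_moments(xs):
    """Return (alpha_hat, lambda_hat) from the first two sample moments."""
    n = len(xs)
    m1 = sum(xs) / n                  # first sample moment
    m2 = sum(x * x for x in xs) / n   # second sample moment
    spread = m2 - m1 ** 2             # matches alpha / lambda^2
    return m1 ** 2 / spread, m1 / spread

# Hypothetical positive observations (not taken from the exam)
xs = [1.2, 0.7, 3.1, 2.4, 0.9, 1.8]
alpha_hat, lambda_hat = gamma_method_of_moments(xs)
print(alpha_hat, lambda_hat)

# Substitute back: the fitted moments should reproduce m1 and m2
m1 = sum(xs) / len(xs)
m2 = sum(x * x for x in xs) / len(xs)
print(alpha_hat / lambda_hat, m1)
print(alpha_hat * (alpha_hat + 1) / lambda_hat ** 2, m2)
```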



Remember to start a new answer book for Section B.

B1. In a study we are interested in checking whether drivers are impaired after drinking two beers. A sample of 20 drivers was chosen, and their reaction times in an obstacle course were measured before and after drinking two beers. For each driver, the two measurements are the total reaction time before drinking two beers, and after. The measurements for five drivers are given below.

beers[seq(1,5),]
##   Before After
## 1   6.25  6.85
## 2   2.96  4.78
## 3   4.95  5.57
## 4   3.94  4.01
## 5   4.85  5.91

(a) It is suggested to use either a pooled t-test or a paired t-test in order to decide whether drinking alcohol increases reaction times. For both tests formally state the following:

i. (2 marks) The model assumptions for the data.

Answer. Pooled: assume $X_i \sim N(\mu_X, \sigma^2)$ and $Y_j \sim N(\mu_Y, \sigma^2)$ with parameters unknown, and that the $X_i$ and $Y_j$ are independent. Paired: assume the differences $W_i = X_i - Y_i$ are independent with distribution $N(\delta, \sigma^2)$.

ii. (2 marks) The null and alternative hypotheses.

Answer. With X the population before and Y after, for the pooled test we assume independence between the $X_i$ and $Y_i$ and the hypotheses are $H_0: \mu_X = \mu_Y$ and $H_1: \mu_X - \mu_Y < 0$, while for the paired test we compute $W_i = X_i - Y_i$ and, assuming they are iid, test $H_0: EW = 0$ against (here) $H_1: EW < 0$.

iii. (2 marks) The form of the test statistic and its distribution under the null hypothesis. Answer.

Pooled: with m = n here

$$S_p^2 = \frac{\sum_{i=1}^{n} (X_i - \bar X)^2 + \sum_{j=1}^{m} (Y_j - \bar Y)^2}{n + m - 2} = \frac{(n - 1)S_X^2 + (m - 1)S_Y^2}{n + m - 2}.$$

The test statistic then becomes

$$T = \frac{\bar X - \bar Y}{S_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}},$$

where $T \sim t_{n+m-2}$ when $H_0$ is true.

Paired: Put $\bar W = (W_1 + \dots + W_n)/n$ and

$$\hat\sigma_W^2 = S_W^2 = \frac{\sum_{i=1}^{n} (W_i - \bar W)^2}{n - 1}.$$

The test statistic is then

$$T = \frac{\sqrt{n}\,\bar W}{\hat\sigma_W},$$

where $T \sim t_{n-1}$ when $H_0$ is true.

iv. (2 marks) Which of the two tests do you think is most appropriate here? Explain why. Answer. The paired test is the most natural in this set-up as we want to eliminate individual variations. 
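To illustrate how the two statistics in B1(a) differ, the sketch below evaluates both on the five drivers printed in the question. It is written in Python rather than the paper's R, and it is an illustration only: the full study has 20 drivers, so these values are not the exam's answer.

```python
import math
from statistics import mean, variance  # variance() uses the n - 1 divisor

# The five (Before, After) pairs shown in the question
before = [6.25, 2.96, 4.95, 3.94, 4.85]
after = [6.85, 4.78, 5.57, 4.01, 5.91]
n, m = len(before), len(after)

# Paired test: work with the within-driver differences W_i = X_i - Y_i
w = [x - y for x, y in zip(before, after)]
t_paired = math.sqrt(n) * mean(w) / math.sqrt(variance(w))

# Pooled test: treat the two columns as independent samples
sp2 = ((n - 1) * variance(before) + (m - 1) * variance(after)) / (n + m - 2)
t_pooled = (mean(before) - mean(after)) / math.sqrt(sp2 * (1 / n + 1 / m))

print("paired T:", t_paired, "on", n - 1, "degrees of freedom")
print("pooled T:", t_pooled, "on", n + m - 2, "degrees of freedom")
```

On these five rows the paired statistic is further from zero than the pooled one, which is what removing between-driver variation would be expected to do.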

(b) We use R in order to apply the theory of B1(a). i. (5 marks) Explain briefly in words what each line of the following R code is doing.

[Figure: plot with axis label q.w]

w...

