Introduction to probability and statistics- Exam 2014 questions and answers PDF

Title Introduction to probability and statistics- Exam 2014 questions and answers
Course Mathematics
Institution University of York
Pages 18

Summary

A 3-hour exam for the module Introduction to Probability and Statistics, with a total of 200 marks. Covers many topics, including distributions, conditional probability, etc.


Description

MAT00004C UNIVERSITY OF YORK

BA, BSc and MMath Examinations 2014

MATHEMATICS
Introduction to Probability and Statistics

Time allowed: 3 hours.

Answer ALL SIX questions. Please write your answers in ink; pencil is acceptable for graphs and diagrams. Do not use red ink. Standard calculators will be provided. Candidates are provided with copies of Statistics Tables by H. R. Neave (George Allen and Unwin 1978 / Routledge 1993).

Question 1 carries 46 marks, question 2 carries 32 marks, question 3 carries 32 marks, question 4 carries 30 marks, question 5 carries 30 marks and question 6 carries 30 marks. The marking scheme shown on each question is indicative only. Your final score for the paper will be calculated out of a possible 200 marks.

Page 1 (of 7)


1 (of 6).

(a) State the axioms of probability, i.e., the properties that a probability function P has to satisfy. [5]

(b) Show that if B is an event such that P(B) = 1, then for all events A, P(A ∩ B) = P(A). [6]

(c) Two fair dice are rolled; let the scores obtained be X1 and X2, and let S = X1 + X2 denote the total score.
(i) What is the probability that both dice show an odd score?
(ii) Calculate E[X1] and Var(X1).
(iii) Calculate E[S] and Var(S).
(iv) Calculate P(X1 ≤ 3 | S = 6).
(v) Calculate P(S = 6 | X1 ≤ 3). [21]

(d) The random variable X has the probability mass function
pX(k) = P(X = k) = e^(−λ) λ^k / k!  for k = 0, 1, 2, . . . ,
and pX(k) = 0 otherwise.
(i) Calculate P(X ≤ 2).
(ii) Calculate E[X(X − 1)]. [14]


2 (of 6).

(a) Let {B1, B2, . . . } be a partition of a sample space Ω such that P(Bi) > 0 for each i. Let A ⊆ Ω satisfy P(A) > 0. State and prove Bayes' Theorem for P(Bn|A). (You may use the law of total probability without proof, so long as you quote the result clearly.) [10]

(b) Suppose that a diagnostic test for a certain disease is 95% accurate when used on people who have the disease, and 99% accurate when used on people who do not have the disease. It is also known that 0.5% of the population actually has the disease.
(i) What is the probability that a particular individual has the disease, given that their test result is positive? [14]
(ii) What is the probability that a particular individual has the disease, given that their test result is negative? [8]


3 (of 6).

(a) Find c such that the following function is a density function of a continuous random variable X:
fX(x) = cx(2 − x) if x ∈ (0, 2), and fX(x) = 0 otherwise. [8]

(b) Given the density function from part (a), find the distribution function FX of X. [5]

(c) Calculate P(1/2 < X < 1). [5]

(d) Calculate E[X]. [7]

(e) Using that Var(X) = 1/5, what are the expectation and variance of Y = 5X + 2? [7]


4 (of 6).

(a) Indicate which of the following variables are quantitative and which are qualitative, and then classify the quantitative variables as discrete or continuous.
(i) Number of typographical errors in newspapers
(ii) Salaries of football players
(iii) Holiday destinations favoured by university students
(iv) Length of a frog's jump
(v) Number of cars owned by families
(vi) Favourite breed of dog for each of 20 children
[6]

(b) The following are the ages of six employees of an insurance company:
46 36 48 51 49 52
For these data:
(i) calculate the sample mean and median;
(ii) calculate the sample variance and standard deviation;
(iii) calculate the values of the upper and lower quartiles and the interquartile range;
(iv) sketch the boxplot. [10]

(c)
(i) State Chebyshev's inequality. [4]
(ii) The cars owned by all people living in York are, on average, 7.3 years old, with a standard deviation of 2.2 years. Using Chebyshev's theorem, find a lower bound for the probability that a randomly chosen car is 1.8 to 12.8 years old. [10]


5 (of 6).

(a) Consider a random sample X1, . . . , Xn from a normal distribution with mean 25 and variance 340. Let X̄n be the sample mean. Find the approximate value of n given that P(X̄n > 28) is approximately 0.005. [6]

(b) State the Central Limit Theorem. [5]

(c) Consider a random sample X1, . . . , Xn from a distribution with unknown mean µ and known standard deviation σ. Assume that the sample size is large. Show that a 100(1 − α)% confidence interval for the mean µ is approximately
X̄ ± z(α/2) σ/√n,
where z(α/2) is the upper α/2 point of the standard normal distribution, i.e. the area to the right of z(α/2) is α/2. [10]

(d) The standard deviation for a population is σ = 15.3. A sample of 36 observations selected from this population gave a mean equal to 74.8. Produce 90% and 80% confidence intervals for µ. [9]


6 (of 6).

Consider a bivariate dataset (x1, y1), (x2, y2), . . . , (xn, yn).

(a) Describe the simple linear regression model for such a dataset. [4]

(b) Give the least-squares estimators α̂ and β̂ of the regression line y = α + βx. [4]

(c) Suppose we have the following data set: (1, 2), (2, 2.5), (2, 2), (3, 2.5), (4.5, 3).
(i) Calculate estimates for α and β. You may use that
Σ xi = 12.5, Σ yi = 12, Σ xi² = 38.25, Σ xi yi = 32.
(ii) Draw in one figure the scatterplot of the data and the estimated regression line. [7]

(d) Given that the estimator β̂ is unbiased, show that the estimator α̂ is unbiased. [5]

(e) Assume that the yi are realisations of independent random variables Yi ∼ N(α + βxi, σ²). Determine the log-likelihood l(α, β, σ). Use it to find an expression for the maximum likelihood estimator of σ². (You are not required to show that the extremum of the likelihood is actually a maximum.) [10]


End of examination.

SOLUTIONS

1.

(a) A probability function P assigns to each event E a real number P(E) such that
(P1) P(E) ∈ [0, 1];
(P2) P(Ω) = 1;
(P3) P(∪_{i=1}^∞ Ei) = Σ_{i=1}^∞ P(Ei) if E1, E2, . . . are disjoint.
[5 Marks]

(b) Since A ∪ B ⊇ B, it holds that P(A ∪ B) ≥ P(B) = 1. Hence P(A ∪ B) = 1 and
P(A ∩ B) = P(A) + P(B) − P(A ∪ B) = P(A).
[6 Marks]

(c)
(i) The probability that a particular die shows an odd score is 1/2. Therefore the probability that both dice show an odd score is 1/4.
[2 Marks]

(ii) Straightforward calculation gives
E[X1] = (1/6)(1 + 2 + 3 + 4 + 5 + 6) = 7/2,
Var(X1) = (1/6)[(6 − 7/2)² + (5 − 7/2)² + (4 − 7/2)² + (3 − 7/2)² + (2 − 7/2)² + (1 − 7/2)²]
        = (1/3)[(5/2)² + (3/2)² + (1/2)²]
        = (25 + 9 + 1)/12 = 35/12.
[4 Marks]

(iii) We can use the fact that the scores on the two dice are independent and identically distributed to calculate:
E[S] = E[X1 + X2] = E[X1] + E[X2] = 7,
Var(S) = Var(X1 + X2) = Var(X1) + Var(X2) = 35/12 + 35/12 = 35/6.
[4 Marks]

(iv) The possibilities for (X1, X2) given S = 6 are (1,5), (2,4), (3,3), (4,2) and (5,1). So P(X1 ≤ 3 | S = 6) = 3/5.
[4 Marks]

(v) Using Bayes' Theorem, we have
P(S = 6 | X1 ≤ 3) = P(X1 ≤ 3 | S = 6) P(S = 6) / P(X1 ≤ 3) = (3/5)(5/36) / (1/2) = 1/6.
[7 Marks]
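All five answers in part (c) can be cross-checked by brute-force enumeration of the 36 equally likely outcomes. A minimal sketch in Python, using exact rational arithmetic (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice, each with probability 1/36.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

# (i) probability that both dice show an odd score
p_both_odd = sum(p for x1, x2 in outcomes if x1 % 2 == 1 and x2 % 2 == 1)

# (ii) mean and variance of a single die
e_x1 = sum(Fraction(k) for k in range(1, 7)) / 6
var_x1 = sum((Fraction(k) - e_x1) ** 2 for k in range(1, 7)) / 6

# (iii) mean and variance of the total S = X1 + X2
e_s = sum(p * (x1 + x2) for x1, x2 in outcomes)
var_s = sum(p * (x1 + x2 - e_s) ** 2 for x1, x2 in outcomes)

# (iv) and (v): conditional probabilities by counting outcomes
p_s6 = sum(p for x1, x2 in outcomes if x1 + x2 == 6)
p_x1_le3 = sum(p for x1, x2 in outcomes if x1 <= 3)
p_joint = sum(p for x1, x2 in outcomes if x1 + x2 == 6 and x1 <= 3)
p_iv = p_joint / p_s6      # P(X1 <= 3 | S = 6)
p_v = p_joint / p_x1_le3   # P(S = 6 | X1 <= 3)

print(p_both_odd, e_x1, var_x1, e_s, var_s, p_iv, p_v)
```

This reproduces 1/4, 7/2, 35/12, 7, 35/6, 3/5 and 1/6 exactly, since Fraction avoids any floating-point rounding.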

(d)
(i) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = e^(−λ)(1 + λ + λ²/2).
[4 Marks]

(ii) E[X(X − 1)] = Σ_{k=0}^∞ k(k − 1) pX(k) = Σ_{k=2}^∞ k(k − 1) e^(−λ) λ^k / k!
= e^(−λ) λ² Σ_{k=2}^∞ λ^(k−2)/(k − 2)!
= e^(−λ) λ² Σ_{j=0}^∞ λ^j / j! = e^(−λ) λ² e^λ = λ².
[10 Marks]
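The two closed forms in part (d) can be checked numerically by truncating the series; a sketch, where λ = 2.3 is an arbitrary illustrative value:

```python
import math

lam = 2.3  # arbitrary illustrative value of the parameter

def pmf(k):
    # p_X(k) = e^(-lam) * lam^k / k! from the question
    return math.exp(-lam) * lam ** k / math.factorial(k)

# (i) P(X <= 2) versus the closed form e^(-lam) (1 + lam + lam^2/2)
p_le2 = pmf(0) + pmf(1) + pmf(2)
closed = math.exp(-lam) * (1 + lam + lam ** 2 / 2)

# (ii) E[X(X-1)] by truncating the series well into the tail; should be lam^2
fact_moment = sum(k * (k - 1) * pmf(k) for k in range(60))

print(p_le2, closed, fact_moment)
```

The truncation at k = 60 is safe here because the terms decay faster than geometrically once k exceeds λ.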

Remarks. Part (a) is bookwork, part (b) has been seen on an exercise sheet, and parts (c) and (d) contain unseen but standard calculations. We did not discuss the Poisson distribution in the course.
[Total: 46 Marks]

2.

(a) Bayes' Theorem states that
P(Bn|A) = P(A|Bn) P(Bn) / Σ_{i=1}^∞ P(A|Bi) P(Bi),  for all n ∈ N.
The proof follows directly from the definition of conditional probability and the law of total probability, which states that, since the sets {B1, B2, . . . } form a partition of Ω,
P(A) = Σ_{i=1}^∞ P(A|Bi) P(Bi)  for all A ⊆ Ω.

We therefore find that
P(Bn|A) = P(Bn ∩ A) / P(A) = P(A|Bn) P(Bn) / Σ_{i=1}^∞ P(A|Bi) P(Bi).
[10 Marks]

(b) Let D be the event that the individual has the disease and let T be the event that the test is positive. Note that D and D′ form, by definition, a partition of the sample space.

(i) The question tells us that
P(D) = 0.005;  P(T|D) = 0.95;  P(T|D′) = 0.01.
Bayes' theorem then gives
P(D|T) = P(T|D) P(D) / [P(T|D) P(D) + P(T|D′) P(D′)]
       = (0.95)(0.005) / [(0.95)(0.005) + (0.01)(0.995)]
       ≈ 0.323.
[14 Marks]

(ii) Similarly,
P(D|T′) = P(T′|D) P(D) / [P(T′|D) P(D) + P(T′|D′) P(D′)]
        = (0.05)(0.005) / [(0.05)(0.005) + (0.99)(0.995)]
        ≈ 0.00025.
[8 Marks]
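The two posterior probabilities can be reproduced directly from Bayes' theorem; a minimal sketch:

```python
# Diagnostic-test posterior probabilities via Bayes' theorem.
p_d = 0.005     # P(D), prevalence of the disease
p_t_d = 0.95    # P(T | D), accuracy on people with the disease
p_t_nd = 0.01   # P(T | D'), false-positive rate
p_nd = 1 - p_d

# P(D | T): disease given a positive test
p_d_t = p_t_d * p_d / (p_t_d * p_d + p_t_nd * p_nd)

# P(D | T'): disease given a negative test
p_d_nt = (1 - p_t_d) * p_d / ((1 - p_t_d) * p_d + (1 - p_t_nd) * p_nd)

print(round(p_d_t, 3), round(p_d_nt, 5))
```

Note how strongly the low prevalence dominates: even a positive result from a fairly accurate test leaves the posterior below one third.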



Remarks. Part (a) is straight from lecture notes. Part (b) is a variant of a seen problem.
[Total: 32 Marks]

3.

(a) For f to be a density function, it must be non-negative everywhere and integrate to 1. f is clearly non-negative if c > 0. Furthermore,
∫₀² cx(2 − x) dx = c [x² − x³/3]₀² = 4c/3.
Therefore f is a density function if (and only if) c = 3/4.
[6 Marks]

(b) FX(x) = P(X ≤ x). Clearly FX(x) is zero for x ≤ 0 and one for x > 2. For x ∈ (0, 2) we have
FX(x) = (3/4) ∫₀ˣ u(2 − u) du = (3/4)(x² − x³/3).
Therefore, FX(x) = 0 for x ≤ 0, FX(x) = (3/4)(x² − x³/3) for 0 < x ≤ 2, and FX(x) = 1 for x > 2.
[5 Marks]

(c) P(1/2 < X < 1) = FX(1) − FX(1/2) = 1/2 − 5/32 = 11/32.
[5 Marks]

(d) We calculate
E[X] = (3/4) ∫₀² x²(2 − x) dx = (3/4) [2x³/3 − x⁴/4]₀² = (3/4)(16/3 − 16/4) = (3/4)(16/12) = 1.
[7 Marks]

(e) We can use the transformation properties of expectation and variance:
E[5X + 2] = 5 E[X] + 2 = 7,
Var(5X + 2) = 25 Var(X) = 5.
[7 Marks]
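The closed forms for parts (a) to (e) can be cross-checked with a crude midpoint-rule integration; a sketch under the derived values c = 3/4 and E[X] = 1:

```python
# Numerical cross-check of question 3: c = 3/4, F_X, P(1/2 < X < 1), E[X].
c = 0.75

def f(x):
    # density c x (2 - x) on (0, 2), zero elsewhere
    return c * x * (2 - x) if 0 < x < 2 else 0.0

def F(x):
    # distribution function F_X(x) = (3/4)(x^2 - x^3/3) on (0, 2)
    x = min(max(x, 0.0), 2.0)
    return c * (x ** 2 - x ** 3 / 3)

# Midpoint rule on (0, 2): total mass and mean
n = 200_000
h = 2 / n
mids = [(i + 0.5) * h for i in range(n)]
total = sum(f(x) * h for x in mids)      # should be approximately 1
mean = sum(x * f(x) * h for x in mids)   # should be approximately E[X] = 1

prob = F(1) - F(0.5)                     # 11/32
e_y, var_y = 5 * 1 + 2, 25 * 0.2         # part (e), using Var(X) = 1/5

print(total, mean, prob, e_y, var_y)
```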

Remarks. These are all standard calculations, similar to those performed in exercises.
[Total: 32 Marks]

4.

(a)
(i) Quantitative; discrete.
(ii) Quantitative; discrete or continuous are both marked as correct. One could argue that when expressed in terms of a currency, the values are restricted to a discrete set, because you never see salaries with fractional pence values.
(iii) Qualitative.
(iv) Quantitative; continuous.
(v) Quantitative; discrete.

[Figure 1: Boxplot of employee age]

(vi) Qualitative.
[6 Marks]

(b)
(i) From the given data, mean = 47, median = 48.5.
(ii) Variance = 168/5 = 33.6 and standard deviation ≈ 5.79655.
(iii) qn(0.25) = 43.5, qn(0.75) = 51.25, and IQR = 7.75.
(iv) For the boxplot see Figure 1.
[10 Marks]

(c)
(i) (Chebyshev's inequality) Let X be a random variable and let a ∈ R with a > 0. Then
P(|X − E[X]| ≥ a) ≤ Var(X)/a².
[4 Marks]

(ii) We model the age of a car as a random variable X with E[X] = 7.3 and Var(X) = (2.2)². Then the probability that a car is between 1.8 and 12.8 years old is
P(1.8 < X < 12.8) = P(|X − 7.3| < 5.5) = 1 − P(|X − E[X]| ≥ 5.5)
≥ 1 − Var(X)/(5.5)² = 1 − (2.2/5.5)² = 0.84.
[10 Marks]
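The summary statistics in part (b) and the Chebyshev bound in part (c)(ii) can be reproduced with the standard library; a sketch, assuming the default 'exclusive' (n + 1)-position quartile convention of `statistics.quantiles`, which matches the values above:

```python
import statistics

ages = [46, 36, 48, 51, 49, 52]

mean = statistics.mean(ages)
median = statistics.median(ages)
var = statistics.variance(ages)   # sample variance (divisor n - 1)
sd = statistics.stdev(ages)

# Quartiles with the (n + 1)-position convention used above
# ('exclusive' is the default method of statistics.quantiles).
q1, _, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1

# Chebyshev lower bound for P(1.8 < X < 12.8) with E[X] = 7.3, sd = 2.2
bound = 1 - (2.2 / 5.5) ** 2

print(mean, median, var, sd, q1, q3, iqr, bound)
```

Other quartile conventions (e.g. `method="inclusive"`) would give slightly different quartiles for a sample this small.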

Remarks. These are all standard calculations, similar to those performed in exercises.
[Total: 30 Marks]

5.

(a)

We know that X̄n ∼ N(25, 340/n) for a sample of size n. Thus
P(X̄n > 28) = P((X̄n − 25)/√(340/n) > (28 − 25)/√(340/n)) = P(Z > 3/√(340/n)) = 0.005,
and so P(Z ≤ 3/√(340/n)) = 0.995. This implies that 3/√(340/n) = 2.576 (from the table), and so n ≃ 250.
[7 Marks]
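The sample-size calculation in part (a) can be reproduced without tables using the standard library's normal quantile function; a sketch (the exact quantile 2.5758... replaces the tabulated 2.576, so the result agrees with n ≃ 250 up to rounding):

```python
from statistics import NormalDist

# Solve P(Xbar_n > 28) = 0.005 with Xbar_n ~ N(25, 340/n):
# 3 / sqrt(340/n) = z, where z is the upper 0.5% point, so n = z^2 * 340 / 3^2.
z = NormalDist().inv_cdf(0.995)   # about 2.576
n = z ** 2 * 340 / 9

print(z, n)
```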

(b) Let X1, . . . , Xn be a sequence of independent and identically distributed random variables with E[X1] = µ < ∞ and Var(X1) = σ² < ∞, and let
X̄n = (X1 + · · · + Xn)/n.
Define
Zn = (X̄n − µ)/(σ/√n).
Then at any point x ∈ R,
lim_{n→∞} F_{Zn}(x) = Φ(x),
where Φ is the distribution function of the standard normal distribution.
[5 Marks]

(c) If Z follows a standard normal distribution, then, by definition of z(α),
P(−z(α/2) ≤ Z ≤ z(α/2)) = 1 − α.
Since the sample is large, (X̄n − µ)/(σ/√n) is approximately N(0, 1). Therefore we have that
P(−z(α/2) ≤ (X̄n − µ)/(σ/√n) ≤ z(α/2)) ≈ 1 − α.
Elementary manipulation of the inequalities gives
P(X̄n − z(α/2) σ/√n ≤ µ ≤ X̄n + z(α/2) σ/√n) ≈ 1 − α.
Thus,
(X̄n − z(α/2) σ/√n, X̄n + z(α/2) σ/√n)
is an approximate 100(1 − α)% confidence interval for µ, when σ² is known.
[10 Marks]

(d)

From the given information, we know that x̄ = 74.8, σ = 15.3 and n = 36. Here σ is known, and although the shape of the distribution is unknown, the sample size is large (n > 30). Hence, using the CLT, we can use the normal distribution to make a confidence interval for µ.
With 1 − α = 0.90 we have α/2 = 0.05 and z(0.05) = 1.645, so
z(0.05) σ/√n = 1.645 · 15.3/√36 = 4.195.
Thus the 90% confidence interval for the population mean µ becomes
(x̄ − 1.645 σ/√n, x̄ + 1.645 σ/√n) = (74.8 − 4.195, 74.8 + 4.195),
or approximately (70.60, 78.99).
With 1 − α = 0.80 we have α/2 = 0.10 and z(0.10) = 1.28, so
z(0.10) σ/√n = 1.28 · 15.3/√36 = 3.264.
Thus the 80% confidence interval for µ becomes
(74.8 − 3.264, 74.8 + 3.264) = (71.536, 78.064).
[8 Marks]
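Both intervals in part (d) can be reproduced in a few lines; a sketch using exact normal quantiles rather than the rounded table values 1.645 and 1.28, so the endpoints agree with the above to about two decimal places:

```python
import math
from statistics import NormalDist

xbar, sigma, n = 74.8, 15.3, 36

def ci(conf):
    # z(alpha/2) is the upper alpha/2 point of the standard normal
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half

lo90, hi90 = ci(0.90)
lo80, hi80 = ci(0.80)
print((lo90, hi90), (lo80, hi80))
```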

Remarks. Parts (b) and (c) are bookwork. Parts (a) and (d) are unseen variants of seen problems.
[Total: 30 Marks]

6.

(a) In a simple linear regression model we assume that x1, x2, . . . , xn are non-random and that y1, y2, . . . , yn are realisations of the random variables
Yi = α + βxi + Ui,
where α, β ∈ R and U1, . . . , Un are independent random variables with E[Ui] = 0 and Var(Ui) = σ².
[4 Marks]

(b) The estimators are given by
β̂ = [n Σ xi Yi − (Σ xi)(Σ Yi)] / [n Σ xi² − (Σ xi)²],
α̂ = Ȳn − β̂ x̄n.
[4 Marks]

(c)
(i) Substituting the given values into the formulas for the estimators above gives
β̂ = (5 · 32 − 12.5 · 12)/(5 · 38.25 − 12.5²) = 2/7 ≈ 0.2857,
α̂ = 12/5 − (2/7)(12.5/5) = 59/35 ≈ 1.6857.
(ii) See Figure 2.
[7 Marks]

(d)

The estimator α̂ is unbiased if E[α̂] = α. We have
E[α̂] = E[Ȳn − β̂ x̄n] = E[Ȳn] − β x̄n,
using that β̂ is unbiased. Also
E[Ȳn] = (1/n) Σ E[Yi] = (1/n) Σ E[α + βxi + Ui] = α + β x̄n.
Substituting this into the previous expression gives the desired property E[α̂] = α.
[5 Marks]
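The estimates in part (c) can be recomputed from scratch, which also verifies the summary sums quoted in the question; a minimal sketch:

```python
# Least-squares estimates for the data in part (c), recomputed from scratch.
data = [(1, 2), (2, 2.5), (2, 2), (3, 2.5), (4.5, 3)]
n = len(data)

sx = sum(x for x, _ in data)       # sum of x_i, 12.5
sy = sum(y for _, y in data)       # sum of y_i, 12
sxx = sum(x * x for x, _ in data)  # sum of x_i^2, 38.25
sxy = sum(x * y for x, y in data)  # sum of x_i y_i, 32

# Closed-form least-squares estimators from part (b)
beta_hat = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # 2/7
alpha_hat = sy / n - beta_hat * sx / n                # 59/35

print(beta_hat, alpha_hat)
```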

[Figure 2: Scatterplot with regression line]

(e)

When Yi ∼ N(α + βxi, σ²), its probability density function is given by
fYi(y) = (1/(√(2π) σ)) exp(−(y − α − βxi)²/(2σ²)),
and thus its log is
log(fYi(yi)) = −log(σ) − log(√(2π)) − (yi − α − βxi)²/(2σ²).
The log-likelihood is
l(α, β, σ) = Σ_{i=1}^n log(fYi(yi)) = −n log(σ) − n log(√(2π)) − (1/(2σ²)) Σ_{i=1}^n (yi − α − βxi)².
To find the maximum likelihood estimate of σ we differentiate l(α, β, σ) with respect to σ and set the result to zero:
∂l/∂σ = −n/σ + (1/σ³) Σ_{i=1}^n (yi − α − βxi)² = 0.
It follows that the maximum likelihood estimator is
σ̂² = (1/n) Σ_{i=1}^n (Yi − α̂ − β̂ xi)².
[10 Marks]
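The stationarity argument above can be sanity-checked numerically: for fixed α and β, the log-likelihood in σ should peak at σ² = (1/n) Σ (yi − α − βxi)². A sketch, reusing the part (c) data and estimates purely as illustrative inputs:

```python
import math

# Check that sigma^2 = (1/n) sum (y_i - alpha - beta x_i)^2 maximises the
# log-likelihood in sigma, for fixed alpha and beta.
xs = [1, 2, 2, 3, 4.5]
ys = [2, 2.5, 2, 2.5, 3]
alpha, beta = 59 / 35, 2 / 7   # estimates from part (c)
n = len(xs)

# residual sum of squares
ss = sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))

def loglik(sigma):
    # l(alpha, beta, sigma) as derived above
    return (-n * math.log(sigma) - n * math.log(math.sqrt(2 * math.pi))
            - ss / (2 * sigma ** 2))

sigma2_mle = ss / n
s = math.sqrt(sigma2_mle)

# The log-likelihood should drop when sigma moves away from the MLE.
print(sigma2_mle, loglik(s), loglik(0.9 * s), loglik(1.1 * s))
```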

Remarks. Parts (a) and (b) are bookwork. Part (c) contains a very standard calculation. Parts (d) and (e) are unseen and will be challenging.
[Total: 30 Marks]
