Exam May 2014, questions PDF

Title Exam May 2014, questions
Course Introduction to Statistics
Institution University of Leeds
Pages 11


MATH1725

This question paper consists of 11 printed pages, each of which is identified by the reference MATH1725.

All calculators must carry an approval sticker issued by the School of Mathematics. Statistical tables are provided at the end of the exam paper.



Examination for the Module MATH1725 (May-June 2014) INTRODUCTION TO STATISTICS Time allowed:

A1. For n = 7 data values 2.1, 1.2, 3.5, 5.3, 1.4, 4.1, 4.0, which of the statements below are certainly true?
(i) The sample median equals 3.5.
(ii) The sample range equals 4.1.
(iii) The sample mean equals 3.5.
(iv) The data values are from a normal distribution.

A i, ii, and iii,
B i and ii,
C i and iv,
D i, iii, and iv,
E i only.
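Statements (i) to (iii) can be checked by direct calculation. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and are not part of the paper.

```python
import statistics

# The seven data values from the question.
data = [2.1, 1.2, 3.5, 5.3, 1.4, 4.1, 4.0]

sample_median = statistics.median(data)   # middle value of the sorted data
sample_range = max(data) - min(data)      # largest value minus smallest value
sample_mean = statistics.fmean(data)      # arithmetic mean

print(f"median = {sample_median:.1f}")
print(f"range  = {sample_range:.1f}")
print(f"mean   = {sample_mean:.4f}")
```

Statement (iv) cannot be settled by a calculation of this kind, since seven values alone do not determine the distribution they came from.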

A2. For $n$ observations $x_1, x_2, \ldots, x_n$ the sample median is $M$ and the lower and upper quartiles are $Q_1$ and $Q_3$ respectively. How is the semi-interquartile range defined?

A $\tfrac{1}{2}(Q_3 - Q_1)$,
B $Q_3 - Q_1$,
C $Q_1 + Q_3$,
D $\tfrac{1}{2}(Q_1 + Q_3)$,
E $\tfrac{1}{2}M$.

A3. A test of a null hypothesis $H_0$ has a 5% significance level. What does this mean?

A Pr(Reject $H_0$ when $H_0$ is true) = 0.05,
B Pr(Accept $H_0$ when $H_0$ is true) = 0.05,
C Pr(Reject $H_0$ when $H_0$ is false) = 0.05,
D Pr(Accept $H_0$ when $H_0$ is false) = 0.05.

A4. A least squares regression problem has $n$ pairs of data $(x_i, y_i)$, $i = 1, 2, \ldots, n$. The fitted $y$-value corresponding to $x = x_i$ is $\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$ and the corresponding residual is $r_i = y_i - \hat{y}_i$. The residual plot is a plot of which two quantities?

A $\hat{y}_i$ and $x_i$,
B $r_i$ and $y_i$,
C $r_i$ and $x_i$,
D $y_i$ and $x_i$,
E $y_i$ and $\hat{y}_i$.

A5. Two random variables $X$ and $Y$ have correlation $\rho_{XY} = 0$. Which of the phrases below must certainly apply to these random variables?
(i) $X$ and $Y$ are independent.
(ii) $X$ and $Y$ are uncorrelated.
(iii) $X$ and $Y$ have normal distributions.

A i only,
B i and ii,
C i and iii,
D ii and iii,
E ii only.

A6. If $X$ has mean $2\mu$ and variance 4, and $Y$ has mean $\mu$ and variance 1, what value of the constant $a$ makes $X + aY$ an unbiased estimator of $\mu$?

A $a = -\tfrac{1}{2}$,
B $a = -1$,
C $a = 0$,
D $a = \tfrac{1}{2}$,
E $a = 1$.

A7. In question A6 above, suppose that $X$ and $Y$ are such that $\mathrm{corr}(X, Y) = 0.25$. What value of the constant $b$ makes $Y$ and $X + bY$ uncorrelated?

A $b = -\tfrac{1}{2}$,
B $b = -1$,
C $b = 0$,
D $b = \tfrac{1}{2}$,
E $b = 1$.

A8. In question A7 above, if $c$ is a constant, what does the variance of $X + cY$ equal?

A $1 + c + 4c^2$,
B $4 + 4c + 4c^2$,
C $4 + c + 4c^2$,
D $4 + 4c + c^2$,
E $4 + c + c^2$.
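The three questions on X and Y above all rest on the standard rules for the mean, covariance and variance of a linear combination. The sketch below evaluates those rules for each candidate constant, assuming only the moments stated in the questions (variances 4 and 1, correlation 0.25); the helper function names are hypothetical.

```python
import math

var_x, var_y = 4.0, 1.0   # variances of X and Y given in the questions
corr_xy = 0.25            # correlation given in the second of the three questions

# cov(X, Y) = corr(X, Y) * sd(X) * sd(Y)
cov_xy = corr_xy * math.sqrt(var_x) * math.sqrt(var_y)

def mean_coefficient(a):
    """Coefficient of mu in E[X + aY] = 2*mu + a*mu."""
    return 2 + a

def cov_y_with_combination(b):
    """cov(Y, X + bY) = cov(X, Y) + b * var(Y)."""
    return cov_xy + b * var_y

def var_combination(c):
    """var(X + cY) = var(X) + 2c * cov(X, Y) + c^2 * var(Y)."""
    return var_x + 2 * c * cov_xy + c ** 2 * var_y

# Tabulate the three quantities for each candidate value of the constant.
for value in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(value, mean_coefficient(value),
          cov_y_with_combination(value), var_combination(value))
```

For X + aY to be unbiased for µ the mean coefficient must equal 1, and for Y and X + bY to be uncorrelated the covariance must equal 0.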

A9. Suppose $X_1, X_2, \ldots, X_{100}$ are independent random variables each with mean $2\mu$ and variance 4, and $U = X_1 + X_2 + \cdots + X_{100}$ is their sum. Which of the statements below are true?
(i) $U$ has mean $200\mu$.
(ii) $U$ has variance 40000.
(iii) $U$ has an approximate normal distribution.

A all of them,
B i only,
C ii and iii,
D i and iii,
E i and ii.

A10. A chi-squared test of goodness of fit has 9 groups, each with expected frequency greater than 10, and two parameters to be estimated from the data. What is the critical value of the $\chi^2$-test statistic $\chi^2_{\text{obs}}$ for a 5% significance level test?

A 10.645,
B 12.592,
C 14.067,
D 16.919,
E 18.548.
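The critical value depends on the degrees of freedom, which for a goodness-of-fit test is usually taken as the number of groups, minus one, minus the number of estimated parameters. A minimal sketch of that lookup follows; the dictionary holds upper 5% points copied from the chi-squared table at the end of the paper, and the function name is hypothetical.

```python
# Upper 5% points of the chi-squared distribution for 5 to 9 degrees of
# freedom, copied from the statistical tables at the end of the paper.
CHI2_UPPER_5_PERCENT = {5: 11.070, 6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919}

def goodness_of_fit_df(groups, estimated_parameters):
    """Degrees of freedom = groups - 1 - number of estimated parameters."""
    return groups - 1 - estimated_parameters

df = goodness_of_fit_df(groups=9, estimated_parameters=2)
critical_value = CHI2_UPPER_5_PERCENT[df]
print(df, critical_value)
```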

A11. Explain briefly what you understand by the terms “robust” and “measure of location” in the statement “the median is a robust measure of location”.

A12. For the ten observations 3, 3, 4, 4, 6, 7, 10, 15, 22, 125 what is the cumulative frequency at x = 6 and the cumulative relative frequency at x = 8?
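Cumulative frequency at x counts the observations less than or equal to x, and cumulative relative frequency divides that count by the number of observations. A minimal sketch under those definitions, with hypothetical function names:

```python
# The ten observations from the question.
observations = [3, 3, 4, 4, 6, 7, 10, 15, 22, 125]

def cumulative_frequency(data, x):
    """Number of observations less than or equal to x."""
    return sum(1 for value in data if value <= x)

def cumulative_relative_frequency(data, x):
    """Proportion of observations less than or equal to x."""
    return cumulative_frequency(data, x) / len(data)

print(cumulative_frequency(observations, 6))
print(cumulative_relative_frequency(observations, 8))
```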

A13. Suppose that $n$ observations $x_1, x_2, \ldots, x_n$ have sample mean $\bar{x}$ and sample variance
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 .$$
Prove that $s^2$ can be written as
$$s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right).$$
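A numerical check is not a proof, but it is a quick way to confirm that the two forms of the sample variance agree before attempting the algebra. The sketch below compares the definition, the shortcut form and Python's `statistics.variance` on an arbitrary illustrative sample; the sample values and function names are assumptions made for this example.

```python
import math
import statistics

def variance_definition(xs):
    """s^2 = sum((x_i - xbar)^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def variance_shortcut(xs):
    """s^2 = (sum(x_i^2) - n * xbar^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)

sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]   # arbitrary illustrative values

print(variance_definition(sample))
print(variance_shortcut(sample))
print(statistics.variance(sample))
```

The algebraic step behind the agreement is expanding the square and using the fact that the sum of the observations equals n times the sample mean. In floating-point arithmetic the shortcut form can lose precision when the mean is large relative to the spread.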

A14. In a least squares regression problem the sample correlation coefficient $r_{XY} = +1$. Describe how the data are scattered about the regression line. What can you say about the slope of the regression line?

A15. In a least squares regression problem one observation has associated Cook’s distance d = 4. Explain carefully what this tells you about that particular observation.

A16. The table below gives the joint probability function $p_{XY}(x, y)$ for two discrete random variables $X$ and $Y$ which each take values 0, 1. What do $E[XY]$ and $\mathrm{cov}(X, Y)$ equal?

           Y = 0   Y = 1
  X = 0     0.3     0.1
  X = 1     0.1     0.5
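Expectations, covariances and marginal probabilities for a small joint table can all be obtained by summing over its cells. The sketch below stores the table as a dictionary keyed by (x, y); because the table is symmetric, the result does not depend on which variable labels the rows. The variable names are illustrative.

```python
# Joint probability function p_XY(x, y) from the table, keyed by (x, y).
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.5}

# E[XY], E[X] and E[Y] as probability-weighted sums over the four cells.
e_xy = sum(x * y * p for (x, y), p in joint.items())
e_x = sum(x * p for (x, _), p in joint.items())
e_y = sum(y * p for (_, y), p in joint.items())

# cov(X, Y) = E[XY] - E[X] E[Y]
cov_xy = e_xy - e_x * e_y

# Marginal probability function of X: sum the joint probabilities over y.
p_x = {x0: sum(p for (x, _), p in joint.items() if x == x0) for x0 in (0, 1)}

print(e_xy, cov_xy, p_x)
```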

A17. In question A16 above, what is the marginal probability function $p_X(x)$ of $X$ for x = 0, 1?

A18. Two random variables $U$ and $V$ are independent. Are they uncorrelated?

A19. A random variable $X$ has a binomial distribution with parameters n = 25 and π = 0.2. Using a suitable normal approximation, obtain the probability that X ≤ 6.

A20. In a chi-squared test with two groups the observed value of the chi-squared test statistic under some null hypothesis $H_0$ is $\chi^2_{\text{obs}} = 0.000122$. What might this suggest about the observed data?
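For the binomial question above (n = 25, π = 0.2), a normal approximation matches the mean nπ and variance nπ(1 − π) and is commonly applied with a continuity correction of one half. The sketch below follows that approach using `statistics.NormalDist` and also computes the exact binomial probability for comparison; the use of a continuity correction is an assumption about what "suitable" means here.

```python
import math
from statistics import NormalDist

n, pi = 25, 0.2
mean = n * pi                          # binomial mean n*pi
sd = math.sqrt(n * pi * (1 - pi))      # binomial standard deviation

# Continuity correction: Pr(X <= 6) is approximated by Pr(Z <= (6.5 - mean) / sd).
z = (6 + 0.5 - mean) / sd
approx = NormalDist().cdf(z)

# Exact binomial probability Pr(X <= 6), summed term by term for comparison.
exact = sum(math.comb(n, k) * pi ** k * (1 - pi) ** (n - k) for k in range(7))

print(f"z = {z:.2f}, normal approximation = {approx:.4f}, exact = {exact:.4f}")
```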

At the mid-point of the MATH1725 module, the number $X$ of missed lectures out of 10 for a group of 268 students was as given in the table below.

Number x of missed lectures     0    1    2    3    4    5    6    7    8   ≥9
Observed frequency f          128   59   32   20    9   11    4    3    2    0

By displaying your calculations in a suitable table, obtain the sample mean $\bar{x}$ and sample variance $s^2$ for these data.

As a simple model to fit to these data it is supposed that each student has common constant probability π of missing any lecture and all absences are independent of each other. Explain briefly why this suggests modelling the number $X$ of missed lectures for any student using a binomial Bin(n = 10, π) distribution where π is here estimated by $\hat{\pi} = 0.125$.

The cumulative distribution function for a $\mathrm{Bin}(n = 10, \pi = \tfrac{1}{8})$ distribution satisfies:

x                0        1        2        3        4        5      ···
pr{X ≤ x}     0.2631   0.6389   0.8805   0.9725   0.9955   0.9995    ···

Use this table of cumulative binomial probabilities to complete the table below of observed and fitted frequencies.

Number x of missed lectures      0      1     2     3     4    ≥5    Total
Observed frequency f           128     59    32    20     9    20      268
Expected frequency           70.51                                  268.00

By suitably modifying your table in part above, construct a $\chi^2$-goodness-of-fit test to determine whether your fitted binomial distribution gives a good fit to these data.
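The calculations for the missed-lectures data can be sketched as follows: the mean and variance from the frequency table, the expected frequencies from the tabulated cumulative probabilities, and a chi-squared statistic. The pooling of the last two cells so that every expected frequency is at least 5 is an assumed convention, not something stated in the paper, and the variable names are illustrative.

```python
# Observed frequencies for x = 0, 1, ..., 8 missed lectures (none for x >= 9).
observed = [128, 59, 32, 20, 9, 11, 4, 3, 2]
n_students = sum(observed)

# Sample mean and variance from the frequency table.
sum_x = sum(x * f for x, f in enumerate(observed))
sum_x2 = sum(x * x * f for x, f in enumerate(observed))
mean = sum_x / n_students
variance = (sum_x2 - n_students * mean ** 2) / (n_students - 1)
pi_hat = mean / 10   # estimated probability of missing any one of 10 lectures

# Cumulative probabilities pr{X <= x} for x = 0..4, copied from the paper.
cdf = [0.2631, 0.6389, 0.8805, 0.9725, 0.9955]

# Cell probabilities for x = 0, 1, 2, 3, 4 and x >= 5, then expected frequencies.
probs = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, 5)] + [1 - cdf[4]]
expected = [n_students * p for p in probs]
obs_grouped = observed[:5] + [sum(observed[5:])]

# Assumed convention: pool the last two cells so each expected frequency is >= 5.
obs_pooled = obs_grouped[:4] + [sum(obs_grouped[4:])]
exp_pooled = expected[:4] + [sum(expected[4:])]

chi2_obs = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1 - 1   # groups - 1 - one estimated parameter

print(mean, variance, pi_hat)
print([round(e, 2) for e in expected])
print(chi2_obs, df)
```

The statistic would then be compared with the upper 5% point of the chi-squared distribution on `df` degrees of freedom from the tables at the end of the paper.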

The anxiety level of patients was assessed in a medical study using two different procedures:
(I) the state-trait anxiety inventory (STAI) consisting of twenty questions;
(II) asking the patients to indicate on a 100 mm scale their perceived anxiety level, with 0 mm on the scale corresponding to the statement “I do not feel anxious at all” and 100 mm on the scale corresponding to the statement “I could not feel more anxious”. This gives the linear analogue (LA) score of the STAI score.

For ten patients the STAI and LA scores were as follows:

Patient            1    2    3    4    5    6    7    8    9   10
(Y) STAI score    20   25   29   33   36   42   45   49   49   59
(X) LA score      10    0   37   28    8   47   38   39   94   78

$\sum y^2 = 16323$, $\sum x^2 = 22411$, $\sum xy = 17288$.

Fit a least squares regression line for predicting the STAI score given an LA score. Use your fitted regression line to predict the STAI score for an LA score LA = 38.

At the 5% level of significance, perform a test of the hypothesis that the slope of your regression line equals zero.

Obtain the sample correlation coefficient between the STAI and LA scores. Comment briefly on the value you obtain.

Hint: You may assume that for a fitted least squares regression line $y = \hat{\alpha} + \hat{\beta}x$ the estimated slope $\hat{\beta}$ has variance
$$\mathrm{Var}[\hat{\beta}] = \frac{\hat{\sigma}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where the estimated variance about the fitted line satisfies
$$\hat{\sigma}^2 = \frac{1}{n-2} \left( \sum_{i=1}^{n} (y_i - \bar{y})^2 - \hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 \right).$$
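The regression calculations for the STAI and LA scores can be sketched with the usual corrected sums of squares, using the variance formulas from the hint. This is an illustrative sketch rather than a model answer; the variable names are assumptions, and the critical value 2.306 is the 2.5% point of the t distribution on 8 degrees of freedom from the tables at the end of the paper.

```python
import math

stai = [20, 25, 29, 33, 36, 42, 45, 49, 49, 59]   # response y
la = [10, 0, 37, 28, 8, 47, 38, 39, 94, 78]       # predictor x
n = len(stai)

x_bar = sum(la) / n
y_bar = sum(stai) / n

# Corrected sums of squares and products.
sxx = sum(x * x for x in la) - n * x_bar ** 2
syy = sum(y * y for y in stai) - n * y_bar ** 2
sxy = sum(x * y for x, y in zip(la, stai)) - n * x_bar * y_bar

# Least squares estimates and the prediction at LA = 38.
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar
prediction = alpha_hat + beta_hat * 38

# Variance about the line and standard error of the slope, as in the hint.
sigma2_hat = (syy - beta_hat ** 2 * sxx) / (n - 2)
se_beta = math.sqrt(sigma2_hat / sxx)
t_stat = beta_hat / se_beta

T_CRITICAL = 2.306   # t_8(2.5), two-sided 5% test on n - 2 = 8 degrees of freedom
reject_zero_slope = abs(t_stat) > T_CRITICAL

# Sample correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

print(alpha_hat, beta_hat, prediction)
print(t_stat, reject_zero_slope, r)
```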

A simple computer program is made up of three sections: preliminary tasks A, one hundred independent looping cycles B, and finishing tasks C. A record of the times the computer program took for each section was kept over a long period. The following statistical information was obtained.

Section                   Mean time (ms)   Standard deviation (ms)
Preliminary tasks A            5.5                  2.5
100 looping cycles B           3.4                  2.6
Finishing tasks C              4.5                  1.3

Times for sections A and B are independent, as are times for sections B and C. Times for sections A and C have a correlation of 0.2.

Obtain the value of cov(A, C).

Obtain the mean and variance of the total time T = A + B + C to run the program.

The times for sections A and C both have normal distributions. Although a single looping cycle does not have a normal distribution, why will a normal distribution with mean 3.4 ms and standard deviation 2.6 ms give a good approximation for the time of 100 looping cycles?

Calculate the proportion of cases for which the computer program takes a total time: less than 10 ms, more than 20 ms.

Six runs of the program gave total running times (in ms) 12.0, 10.6, 18.1, 10.6, 10.6, 19.1.

Obtain the sample median, mean, variance and standard deviation of these six times. Construct a dotplot for these values. (Show your dotplot in your answer booklet.)
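The moment calculations for the total time T = A + B + C can be sketched as below. The sketch treats T as normally distributed, which assumes that A and C are jointly normal (the paper only states that each is normal); the variable names are illustrative.

```python
import math
import statistics
from statistics import NormalDist

mean_a, sd_a = 5.5, 2.5   # preliminary tasks A
mean_b, sd_b = 3.4, 2.6   # 100 looping cycles B
mean_c, sd_c = 4.5, 1.3   # finishing tasks C
corr_ac = 0.2             # correlation between A and C

# cov(A, C) = corr(A, C) * sd(A) * sd(C)
cov_ac = corr_ac * sd_a * sd_c

# Mean and variance of T; only the A-C pair contributes a covariance term.
mean_t = mean_a + mean_b + mean_c
var_t = sd_a ** 2 + sd_b ** 2 + sd_c ** 2 + 2 * cov_ac

# Assumed normal model for T.
total = NormalDist(mean_t, math.sqrt(var_t))
p_less_than_10 = total.cdf(10)
p_more_than_20 = 1 - total.cdf(20)

# Summary statistics for the six observed running times.
runs = [12.0, 10.6, 18.1, 10.6, 10.6, 19.1]
run_median = statistics.median(runs)
run_mean = statistics.fmean(runs)
run_variance = statistics.variance(runs)   # divisor n - 1
run_sd = statistics.stdev(runs)

print(cov_ac, mean_t, var_t)
print(p_less_than_10, p_more_than_20)
print(run_median, run_mean, run_variance, run_sd)
```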

The absenteeism rates in days and parts of days for nine employees of a large company were recorded in two consecutive years.

Employee      1      2      3      4      5      6      7      8      9    Total
Year 1      3.0    6.7   11.3    5.0    9.4   15.7    8.0   10.0    9.7     78.8
Year 2      2.8    5.1    8.4    5.0    6.2   12.2   10.0    6.8    6.0     62.5

Construct a test of hypothesis to determine if there is any evidence that the average absenteeism rate is different for the two years. Carry out your test.

Obtain a 95% confidence interval for the mean absenteeism rate in year 2.

State all assumptions you have used in your answer to part above.

Suppose now that nine results for year 1 for one group of employees and nine results for year 2 for a completely different group of employees are made available. Outline your method of analysis for determining whether the average absenteeism rate is different for the two years in this case. Give all the appropriate equations but do not do any numerical calculations.
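Because the same nine employees are measured in both years, one natural analysis of the table above is a paired t-test on the within-employee differences, with a one-sample t interval for the year 2 mean. The sketch below follows that approach as an illustration, not as the paper's model answer; it assumes the differences (and the year 2 values) are approximately normal, and takes 2.306 as the 2.5% point of the t distribution on 8 degrees of freedom from the tables.

```python
import math
import statistics

year1 = [3.0, 6.7, 11.3, 5.0, 9.4, 15.7, 8.0, 10.0, 9.7]
year2 = [2.8, 5.1, 8.4, 5.0, 6.2, 12.2, 10.0, 6.8, 6.0]

# Paired analysis: work with the within-employee differences.
diffs = [a - b for a, b in zip(year1, year2)]
n = len(diffs)
d_bar = statistics.fmean(diffs)
s_d = statistics.stdev(diffs)           # sample standard deviation, divisor n - 1
t_stat = d_bar / (s_d / math.sqrt(n))

T8_CRITICAL = 2.306                     # t_8(2.5) from the tables
reject_equal_means = abs(t_stat) > T8_CRITICAL

# 95% confidence interval for the year 2 mean.
mean2 = statistics.fmean(year2)
s2 = statistics.stdev(year2)
half_width = T8_CRITICAL * s2 / math.sqrt(n)
ci_year2 = (mean2 - half_width, mean2 + half_width)

print(d_bar, s_d, t_stat, reject_equal_means)
print(ci_year2)
```

For two completely different groups of employees the pairing is lost, so a two-sample comparison would be needed instead.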

The first table gives
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}t^2} \, dt$$
and this corresponds to the shaded area in the figure to the right. $\Phi(x)$ is the probability that a random variable, normally distributed with zero mean and unit variance, will be less than or equal to $x$. When $x < 0$ use $\Phi(x) = 1 - \Phi(-x)$, as the normal distribution with mean zero is symmetric about zero. To interpolate, use the formula
$$\Phi(x) \approx \Phi(x_1) + \frac{x - x_1}{x_2 - x_1} \left( \Phi(x_2) - \Phi(x_1) \right).$$

[Figure: standard normal density for x from −3 to 3, with the area Φ(x) to the left of x shaded.]

   x    Φ(x)      x    Φ(x)      x    Φ(x)      x    Φ(x)      x    Φ(x)      x    Φ(x)
 0.00  0.5000   0.50  0.6915   1.00  0.8413   1.50  0.9332   2.00  0.9772   2.50  0.9938
 0.05  0.5199   0.55  0.7088   1.05  0.8531   1.55  0.9394   2.05  0.9798   2.55  0.9946
 0.10  0.5398   0.60  0.7257   1.10  0.8643   1.60  0.9452   2.10  0.9821   2.60  0.9953
 0.15  0.5596   0.65  0.7422   1.15  0.8749   1.65  0.9505   2.15  0.9842   2.65  0.9960
 0.20  0.5793   0.70  0.7580   1.20  0.8849   1.70  0.9554   2.20  0.9861   2.70  0.9965
 0.25  0.5987   0.75  0.7734   1.25  0.8944   1.75  0.9599   2.25  0.9878   2.75  0.9970
 0.30  0.6179   0.80  0.7881   1.30  0.9032   1.80  0.9641   2.30  0.9893   2.80  0.9974
 0.35  0.6368   0.85  0.8023   1.35  0.9115   1.85  0.9678   2.35  0.9906   2.85  0.9978
 0.40  0.6554   0.90  0.8159   1.40  0.9192   1.90  0.9713   2.40  0.9918   2.90  0.9981
 0.45  0.6736   0.95  0.8289   1.45  0.9265   1.95  0.9744   2.45  0.9929   2.95  0.9984
 0.50  0.6915   1.00  0.8413   1.50  0.9332   2.00  0.9772   2.50  0.9938   3.00  0.9987
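The interpolation formula for Φ(x) can be sketched in a few lines. The example below interpolates between two tabulated points and compares the result with `statistics.NormalDist`; it assumes the tabulated x values run in steps of 0.05, so that 0.7881, 0.8023 and 0.8159 correspond to x = 0.80, 0.85 and 0.90. The function name is hypothetical.

```python
from statistics import NormalDist

# A few (x, Phi(x)) entries from the table; the x values assume a step of 0.05.
PHI_TABLE = {0.80: 0.7881, 0.85: 0.8023, 0.90: 0.8159}

def interpolate_phi(x, x1, x2, table=PHI_TABLE):
    """Linear interpolation: Phi(x1) + (x - x1)/(x2 - x1) * (Phi(x2) - Phi(x1))."""
    phi1, phi2 = table[x1], table[x2]
    return phi1 + (x - x1) / (x2 - x1) * (phi2 - phi1)

x = 0.83
approx = interpolate_phi(x, 0.80, 0.85)
reference = NormalDist().cdf(x)   # library value, for comparison only

print(f"interpolated = {approx:.4f}, reference = {reference:.4f}")
```

For a negative argument the symmetry rule Φ(x) = 1 − Φ(−x) would be applied first and the interpolation carried out on −x.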

The inverse function $\Phi^{-1}(p)$ is tabulated below for various values of $p$.

p             0.900    0.950    0.975    0.990    0.995    0.999    0.9995
Φ⁻¹(p)       1.2816   1.6449   1.9600   2.3263   2.5758   3.0902   3.2905

This table gives the percentage points $t_\nu(P)$ for various values of $P$ and degrees of freedom $\nu$, as indicated by the figure to the right. The lower percentage points are given by symmetry as $-t_\nu(P)$, and the probability that $|t| \ge t_\nu(P)$ is $2P/100$.

The limiting distribution of $t$ as $\nu \to \infty$ is the normal distribution with zero mean and unit variance.

[Figure: density of the t distribution, with the upper tail area P/100 to the right of t_ν(P) shaded.]

Percentage points P

   ν        10        5       2.5        1       0.5       0.1      0.05
   1     3.078    6.314    12.706   31.821    63.657   318.309   636.619
   2     1.886    2.920     4.303    6.965     9.925    22.327    31.599
   3     1.638    2.353     3.182    4.541     5.841    10.215    12.924
   4     1.533    2.132     2.776    3.747     4.604     7.173     8.610
   5     1.476    2.015     2.571    3.365     4.032     5.893     6.869

   6     1.440    1.943     2.447    3.143     3.707     5.208     5.959
   7     1.415    1.895     2.365    2.998     3.499     4.785     5.408
   8     1.397    1.860     2.306    2.896     3.355     4.501     5.041
   9     1.383    1.833     2.262    2.821     3.250     4.297     4.781
  10     1.372    1.812     2.228    2.764     3.169     4.144     4.587

  11     1.363    1.796     2.201    2.718     3.106     4.025     4.437
  12     1.356    1.782     2.179    2.681     3.055     3.930     4.318
  13     1.350    1.771     2.160    2.650     3.012     3.852     4.221
  14     1.345    1.761     2.145    2.624     2.977     3.787     4.140
  15     1.341    1.753     2.131    2.602     2.947     3.733     4.073

  16     1.337    1.746     2.120    2.583     2.921     3.686     4.015
  18     1.330    1.734     2.101    2.552     2.878     3.610     3.922
  21     1.323    1.721     2.080    2.518     2.831     3.527     3.819
  25     1.316    1.708     2.060    2.485     2.787     3.450     3.725
  30     1.310    1.697     2.042    2.457     2.750     3.385     3.646

  40     1.303    1.684     2.021    2.423     2.704     3.307     3.551
  50     1.299    1.676     2.009    2.403     2.678     3.261     3.496
  70     1.294    1.667     1.994    2.381     2.648     3.211     3.435
 100     1.290    1.660     1.984    2.364     2.626     3.174     3.390
   ∞     1.282    1.645     1.960    2.326     2.576     3.090     3.291

This table gives the percentage points $\chi^2_\nu(P)$ for various values of $P$ and degrees of freedom $\nu$, as indicated by the figure to the right, plotted in the case $\nu = 3$. If $X$ is a variable distributed as $\chi^2$ with $\nu$ degrees of freedom, $P/100$ is the probability that $X \ge \chi^2_\nu(P)$. For $\nu > 100$, $\sqrt{2X}$ is approximately normally distributed with mean $\sqrt{2\nu - 1}$ and unit variance.

[Figure: density of the χ² distribution with ν = 3, with the upper tail area P/100 to the right of χ²_ν(P) shaded.]

Percentage points P

   ν        10         5       2.5         1       0.5       0.1      0.05
   1     2.706     3.841     5.024     6.635     7.879    10.828    12.116
   2     4.605     5.991     7.378     9.210    10.597    13.816    15.202
   3     6.251     7.815     9.348    11.345    12.838    16.266    17.730
   4     7.779     9.488    11.143    13.277    14.860    18.467    19.997
   5     9.236    11.070    12.833    15.086    16.750    20.515    22.105

   6    10.645    12.592    14.449    16.812    18.548    22.458    24.103
   7    12.017    14.067    16.013    18.475    20.278    24.322    26.018
   8    13.362    15.507    17.535    20.090    21.955    26.124    27.868
   9    14.684    16.919    19.023    21.666    23.589    27.877    29.666
  10    15.987    18.307    20.483    23.209    25.188    29.588    31.420

  11    17.275    19.675    21.920    24.725    26.757    31.264    33.137
  12    18.549    21.026    23.337    26.217    28.300    32.909    34.821
  13    19.812    22.362    24.736    27.688    29.819    34.528    36.478
  14    21.064    23.685    26.119    29.141    31.319    36.123    38.109
  15    22.307    24.996    27.488    30.578    32.801    37.697    39.719

  16    23.542    26.296    28.845    32.000    34.267    39.252    41.308
  17    24.769    27.587    30.191    33.409    35.718    40.790    42.879
  18    25.989    28.869    31.526    34.805    37.156    42.312    44.434
  19    27.204    30.144    32.852    36.191    38.582    43.820    45.973
  20    28.412    31.410    34.170    37.566    39.997    45.315    47.498

  25    34.382    37.652    40.646    44.314    46.928    52.620    54.947
  30    40.256    43.773    46.979    50.892    53.672    59.703    62.162
  40    51.805    55.758    59.342    63.691    66.766    73.402    76.095
  50    63.167    67.505    71.420    76.154    79.490    86.661    89.561
  80    96.578   101.879   106.629   112.329   116.321   124.839   128.261
