Title | Exam May 2013, questions and answers |
---|---|
Author | Tendayi Sanyangare |
Course | Business Statistics |
Institution | University of the Witwatersrand, Johannesburg |
Pages | 13 |
File Size | 336.3 KB |
2013 semester 1 exam...
Page 1 of 19 pages
STAT1004
May 2013
INSTRUCTIONS TO STUDENTS
1. Only a soft, dark pencil may be used on the computer card.
2. Your name and student number should be entered on the computer card.
3. All students should enter their group number as a zero.
4. The student bears full responsibility for completing the computer card in the correct manner.
5. NO EXTRA TIME WILL BE PERMITTED FOR FILLING IN COMPUTER CARDS.
6. The multiple choice questions are those that are provided with sets of answers labelled (a), (b), (c), (d) and (e). The question number corresponds to the row number on the computer card. Multiple choice questions are labelled MCQ.
7. Each multiple choice question has only one correct answer. Each multiple choice question is worth 4 marks. Each multiple choice question has 5 possible answers. The correct answer scores +4, incorrect or multiple answers score –1, and an unanswered question scores zero.
8. All questions are to be answered (in full) in the answer book provided. Rows on the computer card corresponding to questions that are not multiple choice should be left blank.
9. All rough work and working for multiple choice questions is to be done in the answer book provided.
10. Final answers must be rounded off to 3 decimal places unless otherwise instructed.
Question 1 (4 marks) – MCQ
A new suburb outside Johannesburg has experienced rapid growth in the last 2 years, mainly because of affordable housing and the relocation of many businesses to surrounding areas. Table 1 shows the age distribution of a sample of residents between the ages of 14 and 64 years.
Age (X) | Frequency |
---|---|
14 ≤ x < 24 | 50 |
24 ≤ x < 34 | 362 |
34 ≤ x < 44 | 240 |
44 ≤ x < 54 | 79 |
54 ≤ x < 64 | 31 |
Table 1: Age distribution of residents.
Which one of the following statements is false?
a) The estimated mean age of residents is approximately 34,79 years.
b) The estimated variance of ages is approximately 82,76.
c) The modal class is 24 ≤ x < 34.
d) The 76th percentile is approximately 40,995.
e) Approximately 86% of residents are older than 45 years.
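The statements in Question 1 can be checked numerically from Table 1. The sketch below is only an illustration: it assumes class midpoints represent the ages in each class, a sample variance with divisor n − 1, and a percentile position of p(n + 1)/100 with linear interpolation inside the class. These conventions are assumptions chosen because they reproduce the figures quoted in the options; the helper names are hypothetical.

```python
# Grouped-data summaries for Table 1: (lower limit, upper limit, frequency).
classes = [(14, 24, 50), (24, 34, 362), (34, 44, 240), (44, 54, 79), (54, 64, 31)]

n = sum(f for _, _, f in classes)

# Estimated mean and sample variance, using class midpoints (assumption).
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n
variance = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes) / (n - 1)

# Modal class: the class with the highest frequency.
modal_class = max(classes, key=lambda c: c[2])[:2]


def percentile(p):
    """p-th percentile by linear interpolation, position p(n + 1)/100 (assumption)."""
    target = p / 100 * (n + 1)
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f


def share_older_than(age):
    """Estimated proportion of residents older than `age`, interpolating within a class."""
    count = 0.0
    for lo, hi, f in classes:
        if age <= lo:
            count += f
        elif age < hi:
            count += f * (hi - age) / (hi - lo)
    return count / n


print(round(mean, 2), round(variance, 2), modal_class)
print(round(percentile(76), 3), round(share_older_than(45), 3))
```

Under these assumptions only about 13% of residents are estimated to be older than 45, which is why statement (e) is the false one.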
Question 2 (4 marks) – MCQ
Many employers are finding that some of the people they hire are not who and what they claim to be. This has led to a boom of companies providing credential checking services. Suppose that your company is recruiting new staff and you receive five applications for the advertised position. Further, suppose that the probability that an applicant would falsify information on his/her application form is 0,15. Assume that applicants falsify information on their application forms independently of each other. What is the probability that at least one of the five application forms has been falsified?
a) 0,649
b) 0,556
c) 0,392
d) 0,165
e) 0,150
Question 3 (4 marks) – MCQ
In a state where cars have to be tested for the emission of pollutants, 25% of all cars emit excessive amounts of pollutants. When tested, 99% of all cars that emit excessive amounts of pollutants will fail, but 17% of cars that do not emit excessive amounts of pollutants will also fail. Calculate the probability that a car that fails the test actually emits excessive amounts of pollutants.
a) 0.660
b) 0.625
c) 0.375
d) 0.253
e) 0.248
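Questions 2 and 3 both reduce to short probability calculations: a complement rule for independent events and Bayes' theorem. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the exam.

```python
# Question 2: P(at least one falsified) = 1 - P(none falsified),
# with 5 independent applications and P(falsify) = 0.15.
p_falsify = 0.15
n_applications = 5
p_at_least_one = 1 - (1 - p_falsify) ** n_applications

# Question 3: Bayes' theorem for P(emits | fails).
p_emit = 0.25               # P(car emits excessive pollutants)
p_fail_given_emit = 0.99    # P(fail | emits)
p_fail_given_clean = 0.17   # P(fail | does not emit)

# Total probability of failing the test.
p_fail = p_emit * p_fail_given_emit + (1 - p_emit) * p_fail_given_clean
p_emit_given_fail = p_emit * p_fail_given_emit / p_fail

print(round(p_at_least_one, 3))     # 0.556
print(round(p_emit_given_fail, 3))  # 0.66
```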
Question 4 refers to the following situation: The number of customers that purchase hotdogs from a vendor on campus, recorded over a four-week period, is shown in Table 2. The vendor does not work over weekends. Figure 1 is a plot of the data. Table 2 gives relevant figures for decomposition.
Figure 1: Plot of number of customers versus time worked (in days).
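Week | Day | Time | Number of customers | Isolated trend | De-trended data | Adjusted seasonal indices |
---|---|---|---|---|---|---|
1 | Mon | 1 | 30 | | | –9,75 |
1 | Tue | 2 | 35 | | | –3,88 |
1 | Wed | 3 | 39 | 39,60 | –0,60 | –0,41 |
1 | Thur | 4 | 44 | A | B | C |
1 | Fri | 5 | 50 | 40,40 | 9,60 | 9,79 |
2 | Mon | 6 | 32 | 40,60 | –8,60 | –9,75 |
2 | Tue | 7 | 37 | 41,00 | –4,00 | –3,88 |
2 | Wed | 8 | 40 | 41,40 | –1,40 | –0,41 |
2 | Thur | 9 | 46 | 41,60 | 4,40 | |
2 | Fri | 10 | 52 | 42,20 | 9,80 | 9,79 |
3 | Mon | 11 | 33 | 43,00 | –10,00 | –9,75 |
3 | Tue | 12 | 40 | 43,60 | –3,60 | –3,88 |
3 | Wed | 13 | 44 | 44,20 | –0,20 | –0,41 |
3 | Thur | 14 | 49 | 44,60 | 4,40 | |
3 | Fri | 15 | 55 | 45,00 | 10,00 | 9,79 |
4 | Mon | 16 | 35 | 45,60 | –10,60 | –9,75 |
4 | Tue | 17 | 42 | 46,00 | –4,00 | –3,88 |
4 | Wed | 18 | 47 | 46,40 | 0,60 | –0,41 |
4 | Thur | 19 | 51 | | | |
4 | Fri | 20 | 57 | | | 9,79 |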
Table 2: Decomposition analysis figures of the number of customers
Question 4 (4+2+2+2 = 10 marks)
4.1 (4 marks) Calculate the missing values labelled A, B, C in the table. Show all working.
4.2 (2 marks) Suppose that the estimated trend line is $\hat{T}_t = 38{,}082 + 0{,}459\,t$. Forecast the number of customers that will arrive on Tuesday of week 6.
4.3 (2 marks) Use Brown’s exponential smoothing method (with smoothing constant 0,3) to forecast the number of customers that will arrive on Tuesday of week 6. Use the fact that the predicted value for Friday of week 4 is 46,27.
4.4 (2 marks) Which of the forecasts from 4.2 and 4.3 do you think is more appropriate to use? Please state a reason for your choice.
Question 5 (2+2+2 = 6 marks)
The number of policies sold by a Life Insurance Company in each 5 year period from 1980 to 2004 appears in Table 3.
5 year period | Number sold (in millions) | Period (in coded form) |
---|---|---|
Jan 1980 – Dec 1984 | 1,0 | 1 |
Jan 1985 – Dec 1989 | 1,3 | 2 |
Jan 1990 – Dec 1994 | 1,7 | 3 |
Jan 1995 – Dec 1999 | 1,9 | 4 |
Jan 2000 – Dec 2004 | 2,1 | 5 |
Table 3: Number of policies sold per period by a Life Insurance company.
5.1 (2 marks) Give the estimated regression line by regressing the number of policies sold (in millions) on Period (in coded form).
5.2 (2 marks) Calculate and interpret the correlation coefficient.
5.3 (2 marks) Does the interpretation of the slope of the estimated regression line correspond with the interpretation of the correlation coefficient in 5.2 above? Substantiate your answer.
Question 6 (4 marks)
The car sales at a car dealership follow a Poisson distribution with the average number of cars sold per day being two cars. At the start of business on Monday morning, the dealership has 3 cars for sale and expects a consignment of a further car on the Friday morning, before the start of business. Calculate the probability that at most 2 cars are sold by the end of the business day on Wednesday.
Question 7 (5 marks)
James travels by minibus taxi every morning. He catches a taxi at a bus stop on a busy road. He has noticed that, on average, three taxis approach the bus stop every five minutes, but only 20% of them stop as they are not full. Find the probability that James has to wait at least 10 minutes for a taxi to stop for him. (Assume that any taxi that is not full will stop.)
Question 8 (5+5 = 10 marks)
A manufacturer of model airplanes knows that battery lifetime on their remote control is approximately normally distributed with a mean of 2000 hours and a standard deviation of 100 hours.
8.1 (5 marks) What is the probability that a randomly chosen remote control battery has a lifetime between 2000 and 2075 hours?
8.2 (5 marks) In a sample of 50 model airplanes, what is the probability that the average battery lifetime on their remote controls exceeds 1980 hours?
Question 9 (7 marks)
Ten years ago an advertising agency took a random sample of 6 personal computer owners who used brands such as Dell, Gateway, Hewlett Packard, IBM and Apple. At the time, the agency recorded a satisfaction score, with a maximum of 100 possible points. Recently the same people were contacted and scored again. The scores are represented in Table 4 below. Using a 5% level of significance, test whether on average, the satisfaction score has changed over time. Assume that the satisfaction scores are normally distributed.
Past | 60 | 68 | 76 | 69 | 89 | 85 |
---|---|---|---|---|---|---|
Present | 80 | 90 | 88 | 60 | 80 | 66 |
Table 4: Customer satisfaction scores.
Question 10 (3 marks)
An executive of a new telephone company collects a sample of 25 evening long-distance calls and finds that the average duration of the calls is 17,2 minutes with a sample standard deviation of 4 minutes. Assuming that the duration of calls is well approximated by a normal distribution, calculate the 95% confidence interval for the average duration of evening long-distance calls.
Question 11 (6 marks)
An executive believes that at least 50% of shoppers entering a department store recognise the company’s brand name, as was true in the past. In a random sample of 25 shoppers, 10 shoppers recognise the brand name. Formulate the appropriate hypothesis to test if this percentage has decreased. Use a p-value to arrive at your conclusion.
Question 12 (7 marks)
An advertising agency wants to know whether there is an association between the sex of consumers and their coffee brand preference. The answer will determine whether different advertisements must be created for men’s and women’s magazines. Carry out the appropriate test on the data summarised in Table 5 using a p-value.
Sex | Brand A | Brand B | Brand C | Total |
---|---|---|---|---|
Male | 18 (30) | 25 | 17 | 60 |
Female | 32 (20) | 5 | 3 | 40 |
Total | 50 | 30 | 20 | 100 |
Table 5: Preference indicated by coffee drinkers.
Note: Only some of the expected values have been calculated and they are indicated by the values in brackets.
MEMO MAY 2013
Question 1 – MCQ (4 marks) E
Question 2 – MCQ (4 marks) B
Question 3 – MCQ (4 marks) A
Question 4 (4+2+2+2=10 marks)
4.1 (2+1+1=4 marks)
$A = MAV_4 = \frac{Y_2 + Y_3 + Y_4 + Y_5 + Y_6}{5} = \frac{35 + 39 + 44 + 50 + 32}{5} = 40$ (2 marks)
$B = 44 - 40 = 4$ (1 mark)
$C = 0 - [-9.75 + (-3.88) + (-0.41) + 9.79] = 4.25$ (1 mark)
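The three missing table entries in 4.1 follow from a 5-point moving average, a subtraction, and the requirement that the adjusted seasonal indices sum to zero. A minimal sketch of that working, with illustrative variable names:

```python
# Number of customers for t = 1..6 (Table 2).
customers = [30, 35, 39, 44, 50, 32]

# A: 5-point moving average centred on t = 4, i.e. the mean of Y2..Y6.
A = sum(customers[1:6]) / 5

# B: de-trended value for t = 4 (observation minus isolated trend).
B = customers[3] - A

# C: the adjusted seasonal indices for Mon..Fri must sum to zero,
# so the Thursday index is minus the sum of the other four.
known_indices = [-9.75, -3.88, -0.41, 9.79]  # Mon, Tue, Wed, Fri
C = 0 - sum(known_indices)

print(A, B, round(C, 2))
```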
4.2 (2 marks)
$\hat{F}_{27} = \hat{T}_{27} + \hat{S}_{27} = [38.082 + 0.459(27)] + (-3.88) = 46.595$
(1 mark for t = 27, and 1 mark for correct use of the forecasting equation and answer.)
4.3 (2 marks)
$F_{n+k} = A_n = A_{20} = 0.3(57) + 0.7(46.27) = 49.489$
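The two forecasts for Tuesday of week 6 can be reproduced as follows. This is a sketch under the memo's setup: an additive decomposition model (trend plus the Tuesday seasonal index) and single exponential smoothing with constant 0.3, whose forecast is the same for every future period. The function name `time_index` is hypothetical.

```python
def time_index(week, day):
    """Time period t for a 5-day working week (day: Mon = 1, ..., Fri = 5)."""
    return 5 * (week - 1) + day


# 4.2: decomposition forecast = trend value + seasonal index for Tuesday.
t = time_index(week=6, day=2)
trend = 38.082 + 0.459 * t
seasonal_tuesday = -3.88
forecast_decomposition = trend + seasonal_tuesday

# 4.3: exponential smoothing, updated with the last observation (t = 20).
alpha = 0.3
last_observation = 57       # Friday of week 4
last_prediction = 46.27     # predicted value for Friday of week 4 (given)
forecast_smoothing = alpha * last_observation + (1 - alpha) * last_prediction

print(t, round(forecast_decomposition, 3), round(forecast_smoothing, 3))
```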
4.4 (2 marks) The decomposition analysis forecast is more appropriate (1 mark) as it takes the trend and seasonality into account in the forecast, whereas Brown’s method does not. (1 mark)
Question 5 (2+2+2 = 6 marks)
5.1 (2 marks) $\hat{y} = 0.76 + 0.28x$
5.2 (2 marks) $r = 0.989949493$. Therefore, a strong positive linear relationship exists between sales and the period.
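The estimates in 5.1 and 5.2 can be reproduced from Table 3 with the usual least-squares formulas. The sketch below uses only the standard library; the variable names are illustrative.

```python
from math import sqrt

# Table 3: coded period (x) and number of policies sold in millions (y).
x = [1, 2, 3, 4, 5]
y = [1.0, 1.3, 1.7, 1.9, 2.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

slope = s_xy / s_xx                  # b
intercept = y_bar - slope * x_bar    # a
r = s_xy / sqrt(s_xx * s_yy)         # correlation coefficient

print(round(intercept, 2), round(slope, 2), round(r, 4))
```

The slope and the correlation coefficient share the numerator `s_xy`, so they always have the same sign.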
5.3 (2 marks) Yes (1 mark). The regression line has a positive slope, $b = 0.28$, and $r \approx 0.99$ is also positive. Therefore, both the slope and the correlation coefficient are positive. (1 mark)
Question 6 (4 marks)
Let X = number of cars sold by the dealership on a random day. $X \sim POI(2)$
Let Y = number of cars sold by the dealership from the start of Monday until the end of business on Wednesday. $Y \sim POI(6)$ (1 mark for correct lambda)
P(at most 2 cars sold by Wednesday) (1 mark)
$= P(Y \le 2) = \frac{e^{-6} 6^0}{0!} + \frac{e^{-6} 6^1}{1!} + \frac{e^{-6} 6^2}{2!}$ (1 mark for equation)
$= 0.062$ (1 mark for answer)
Note: wrong lambda but correct method = 3 marks.
Question 7 (5 marks)
3 taxis in 5 minutes, so $\lambda$ taxis in 1 minute. (1 mark)
Thus $\lambda = 0.6$, but only 20% of taxis stop, hence $\lambda = 0.12$. (1 mark)
$P(\text{wait for at least 10 minutes}) = P(X \ge 10) = e^{-0.12(10)} = 0.301$ (1 mark) (1 mark) (1 mark)
Note: if a student uses lambda = 0.6 and does not calculate the correct lambda = 0.2 × 0.6, give 4 marks (for correct method and calculations).
OR a student could solve using Poisson:
3 taxis in 5 minutes, so $\lambda$ taxis in 10 minutes. (1 mark)
Thus $\lambda = 6$, but only 20% of taxis stop, hence $\lambda = 1.2$.
$P(\text{wait for at least 10 minutes}) = P(\text{0 taxis arrive in 10 minutes}) = \frac{e^{-1.2}\, 1.2^0}{0!} = 0.301$ (1 mark) (1 mark) (1 mark)
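The Poisson and exponential calculations in Questions 6 and 7 can be checked with a few lines of Python. This is a minimal sketch; `poisson_cdf` is a hypothetical helper written for the illustration.

```python
from math import exp, factorial


def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ Poisson(lam)."""
    return sum(exp(-lam) * lam ** i / factorial(i) for i in range(k + 1))


# Question 6: 2 cars per day over 3 days (Mon to Wed) gives lambda = 6.
p_at_most_two = poisson_cdf(2, 2 * 3)

# Question 7: 3 taxis per 5 minutes, of which 20% stop.
rate_per_minute = (3 / 5) * 0.2   # 0.12 stopping taxis per minute

# Exponential waiting time: P(wait at least 10 minutes).
p_wait_exponential = exp(-rate_per_minute * 10)

# Equivalent Poisson view: no stopping taxi arrives in 10 minutes.
p_wait_poisson = poisson_cdf(0, rate_per_minute * 10)

print(round(p_at_most_two, 3))        # 0.062
print(round(p_wait_exponential, 3))   # 0.301
```

Both routes in Question 7 give the same probability because an exponential waiting time longer than 10 minutes is the same event as zero Poisson arrivals in 10 minutes.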
Question 8 (5 + 5 = 10 marks)
8.1 (5 marks)
$X \sim N(2000, 100^2)$
$P(2000 < X < 2075)$ (1 mark)
$= P\left(\frac{2000 - 2000}{100} < Z < \frac{2075 - 2000}{100}\right)$ (1 mark)
$= P(0 < Z < 0.75)$ (1 mark)
$= 0.7734 - 0.5$ (1 mark)
$= 0.2734$ (1 mark) (1 mark)
8.2 (5 marks)
$P(\bar{X} > 1980)$ (1 mark)
$= P\left(Z > \frac{1980 - 2000}{100/\sqrt{50}}\right)$ (1 mark)
$= P(Z > -1.41)$ (1 mark)
$= P(Z < 1.41)$ (1 mark)
$= 0.9207$ (1 mark)
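Both normal probabilities in Question 8 can be verified with `statistics.NormalDist` from the Python standard library. The sketch below is an illustration; note that software works with the unrounded z-value, so the result for 8.2 differs slightly from the table-based answer, which rounds z to 1.41.

```python
from math import sqrt
from statistics import NormalDist

# 8.1: lifetime of a single battery, X ~ N(2000, 100^2).
lifetime = NormalDist(mu=2000, sigma=100)
p_between = lifetime.cdf(2075) - lifetime.cdf(2000)

# 8.2: mean lifetime of a sample of 50, standard error = 100 / sqrt(50).
sample_mean = NormalDist(mu=2000, sigma=100 / sqrt(50))
p_mean_exceeds = 1 - sample_mean.cdf(1980)

# Table-based version of 8.2, with z rounded to two decimals as in the memo.
p_table = NormalDist().cdf(1.41)

print(round(p_between, 4))       # 0.2734
print(round(p_mean_exceeds, 4))  # unrounded z, close to 0.921
print(round(p_table, 4))         # 0.9207
```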
Question 9 (7 marks)
1. $H_0: \mu_{past} - \mu_{present} = 0$ versus $H_1: \mu_{past} - \mu_{present} \ne 0$ (1 mark)
2. $\alpha = 0.05$
3. Value of test statistic: $t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{-2.833}{17.337/\sqrt{6}} = -0.400$ (1 mark for $\bar{d}$ and $s_d$) (1 mark)
4. Critical value = 2.5706. (1 mark) Thus reject the null hypothesis if $t < -2.5706$ or if $t > 2.5706$. (1 mark)
5. Since -0.4003 does not fall into the rejection region, we fail to reject the null hypothesis. (1 mark)
6. At a 5% level of significance, we conclude that insufficient evidence exists to support $H_1$ that the average satisfaction index has changed. (1 mark)
Note: If they do it as 2 independent samples, give 3.5 method marks.
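The paired t-test in Question 9 can be reproduced from Table 4 as sketched below. The differences are taken as past minus present, matching the hypotheses above. The critical value 2.5706 is copied from the memo (t-table, 5 degrees of freedom) rather than computed, since the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

past = [60, 68, 76, 69, 89, 85]
present = [80, 90, 88, 60, 80, 66]

# Paired differences: past minus present for each owner.
d = [p - q for p, q in zip(past, present)]

d_bar = mean(d)    # mean difference
s_d = stdev(d)     # sample standard deviation of the differences
n = len(d)

t_stat = (d_bar - 0) / (s_d / sqrt(n))

# Two-sided test at the 5% level with n - 1 = 5 degrees of freedom.
t_critical = 2.5706  # table value quoted in the memo
reject_h0 = abs(t_stat) > t_critical

print(round(d_bar, 3), round(s_d, 3), round(t_stat, 4), reject_h0)
```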
Question 10 (3 marks)
$\bar{x} \pm t_{24,\,0.025} \frac{s}{\sqrt{n}}$ (1 mark)
$= 17.2 \pm 2.0639 \left(\frac{4}{\sqrt{25}}\right)$ (1 mark)
$= (15.549,\ 18.851)$ minutes (1 mark)
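The interval in Question 10 is simple arithmetic once the t-value is known. In this sketch the value 2.0639 (24 degrees of freedom, upper 2.5% point) is taken from the memo rather than computed.

```python
from math import sqrt

x_bar = 17.2         # sample mean duration in minutes
s = 4                # sample standard deviation in minutes
n = 25               # sample size
t_critical = 2.0639  # t-table value for 24 d.o.f., taken from the memo

margin = t_critical * s / sqrt(n)
confidence_interval = (x_bar - margin, x_bar + margin)

print(tuple(round(limit, 3) for limit in confidence_interval))  # (15.549, 18.851)
```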
Question 11 (6 marks)
1. $H_0: p \ge 0.5$ vs $H_1: p < 0.5$ (1 mark)
2. Value of the test statistic: $z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \frac{10/25 - 0.5}{\sqrt{0.5(1 - 0.5)/25}} = -1$ (1 mark) (1 mark)
3. p-value $= P(Z < -1) = 0.1587$ (1 mark)
4. Since the p-value lies between 0.1 and 0.2, we have very weak evidence in favour of the alternative hypothesis (1 mark)
5. and conclude that weak evidence exists to support that the percentage has decreased. (1 mark)
Question 12 (7 marks)
$H_0$: There is no association between sex and brand preference of coffee.
$H_1$: There is an association between sex and brand preference of coffee. [1 mark]
Table of expected frequencies [2]
Sex | Brand A | Brand B | Brand C |
---|---|---|---|
Male | 30 | 18 | 12 |
Female | 20 | 12 | 8 |
Total | 50 | 30 | 20 |
$\chi^2 = \frac{(18 - 30)^2}{30} + \dots + \frac{(3 - 8)^2}{8} = 4.8 + 2.72 + 2.08 + 7.2 + 4.08 + 3.13 = 24.01$ [1]
p-value $= P(\chi^2_{\nu} > 24.01)$, with $\nu = (2 - 1)(3 - 1) = 2$ [1]
At 2 d.o.f., since 24.01 > 10.597, the p-value will be < 0.005, so we reject the null hypothesis in favour of the alternative. [1] Hence we conclude that sufficient evidence exists to support the hypothesis that there is an association between sex and coffee brand preference. [1]
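The p-value calculations in Questions 11 and 12 can be checked with the standard library. This is a sketch, not the memo's method: the memo reads both p-values from tables. For the chi-square test the code relies on the fact that, for exactly 2 degrees of freedom, the upper-tail probability is exp(−x/2); that shortcut does not apply to other degrees of freedom.

```python
from math import exp, sqrt
from statistics import NormalDist

# Question 11: one-sided z-test for a proportion, H1: p < 0.5.
p0, n, recognised = 0.5, 25, 10
p_hat = recognised / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value_q11 = NormalDist().cdf(z)   # lower-tail probability

# Question 12: chi-square test of association on Table 5.
observed = [[18, 25, 17],   # Male: brands A, B, C
            [32, 5, 3]]     # Female: brands A, B, C
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency = row total * column total / grand total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this closed form is valid only because dof == 2.
p_value_q12 = exp(-chi_square / 2)

print(round(z, 2), round(p_value_q11, 4))
print(expected, round(chi_square, 2), dof, p_value_q12 < 0.005)
```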