Title | DAT121 Sample Final Exam |
---|---|
Course | Managerial Statistics |
Institution | Washington University in St. Louis |
Pages | 18 |
DAT 121: Managerial Statistics II
Sample Final Exam Questions

PLEASE READ INSTRUCTIONS FIRST

Instructions: Check to see that you have many problems. You have 2 hours to complete the following exam. You may not discuss this exam with anyone (except the proctor). By printing or signing your name on this exam, you reaffirm your pledge to uphold the Washington University honor code. Show all of your work to receive full credit, but be succinct. Partial credit is given when deserved. Answers should be in well-written English sentences. If you think you need more space than what I have provided, please think again. You are allowed to use the other side of the pages to complete your answers and make calculations. Please keep the exam stapled together. The only material you are allowed is one sheet of 8.5” x 11” (both sides). Use only a non-programmable calculator.

If you cannot answer part of a question, and a subsequent question requires the previous result, make some assumptions so that you can answer the subsequent question. Make sure to read all questions carefully and answer everything I ask! ALL OF YOUR WORK MUST BE TRANSCRIBED TO THIS EXAM. THIS PAPER IS WHAT I WILL GRADE. GIVE ME A ROADMAP OF WHAT YOU DID! I DO NOT RETURN EXAMS!

GOOD LUCK!

NAME __________________________
Section:
(As it appears on official Wash U records)
SIGNATURE ____________________________

A formula sheet is included at the end of this exam, as are relevant tables. In some cases part of the regression output is not included. It is safe to assume that the omitted information is not important.

Note: This only covers material after the midterm. The material from the beginning of the semester is also relevant for the final exam. The point allocation is intended to give you insight into the weight of each question. The points on your exam will add up to 100.
DAT 121 – Sample Final
In addition to these problems, go over homework questions and practice homework questions! Some of those are from past exams, and ARE NOT included here. Answers to the homework assignments are posted on Blackboard.

1. 10 Points - Based upon data across different franchises, you have developed the following linear regression between Sales (measured in millions of dollars), Advertising (measured in millions of dollars) and Price (measured in dollars):

Sales = 25.3 – 27.0 Advertising + 3.65 Price

Predictor | Coef | StError | T | P |
---|---|---|---|---|
Constant | 25.332 | 6.678 | 3.79 | 0.008 |
Advert | -26.932 | 12.345 | -2.18 | 0.125 |
Price | 3.649 | 1.134 | 3.22 | 0.113 |

S = 3.354, R-Sq = 99.2%, R-Sq(adj) = 97.8%

(a) Diagnose this regression. What problem might this regression be suffering from? (3 Points)
(b) How would you determine if your guess is correct? Explain. (4 Points)
(c) Suggest several ways to fix this model. (3 Points)
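2. 5 Points: (a) When you plot your data, you get the following picture:

[Figure: scatter plot of the data; not reproduced]

What regression model would fit this data the best?
(i) $y = \beta_0 + \beta_1 \frac{1}{x}$
(ii) $y = \beta_0 + \beta_1 x + \beta_2 x^2$
(iii) $y = \beta_0 + \beta_1 e^{x}$
(iv) $y = \beta_0 + \beta_1 x$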
3. 10 Points – You decide on the following form for modeling economic data:

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

After you perform the regression, you plot the residuals vs. x to see if this model suffers from mis-specification. The plot looks like:

[Figure: plot of residuals vs. x; not reproduced]

On the basis of this information alone, write down the form for a new polynomial regression equation that might be a “better model”.
4. 16 Points - Assume that you’ve been supplied with the following data about salesperson performance in your company:

Sales ($K) | Advertising ($K) | Price ($) | R&D ($K) | Gender |
---|---|---|---|---|
480 | 60 | 35 | 20 | 0 |
507 | 37.5 | 59 | 15 | 0 |
210 | 27.5 | 40 | 4.6 | 1 |
246 | 8 | 43 | 2.8 | 0 |
148 | 8.5 | 45 | 4.4 | 1 |
50 | 13.5 | 40 | 5 | 1 |
72 | 12.5 | 85 | 5 | 1 |
99 | 15 | 43 | 13 | 0 |
160 | 8.5 | 29 | 10 | 1 |
243 | 15 | 38 | 5 | 0 |
200 | 22.5 | 42 | 6.4 | 1 |
1000 | 150 | 20 | 24 | 0 |
350 | 65 | 31 | 11.1 | 0 |
550 | 41 | 50 | 12.5 | 0 |
500 | 100 | 50 | 12 | 1 |
1100 | 90 | 52 | 11.1 | 0 |
416 | 40 | 40 | 9.2 | 0 |
650 | 34.5 | 63 | 9 | 1 |
292 | 42.5 | 42 | 15 | 0 |
You decide to develop the following multiple linear regression model:

$\text{Sales} = \beta_1 + \beta_2\,\text{Advertising} + \beta_3\,\text{Price} + \beta_4\,\text{R\&D} + \beta_5\,\text{Gender}$

Before you actually do the regression, you decide to create a correlation matrix between all of the independent variables. The result:

 | Adv | Price | R&D |
---|---|---|---|
Price | 0.761 | | |
R&D | -0.937 | -0.321 | |
Gender | -0.295 | 0.322 | -0.483 |
(a) Based upon the output above, would you expect to see multicollinearity in this model? Why? (2 Points)
(b) Which of the independent variables, if any, are probably the ones that will cause multicollinearity to occur? Why? (2 Points)
(c) List the practical consequence(s) of multicollinearity in a multiple regression model. (5 Points)
(d) Suggest at least one way to “fix” apparent multicollinearity in a multi-variable regression model. (4 Points)
(e) In general, once a regression has been run, what “clues” are present in the computer output that would suggest that your model may be suffering from multicollinearity? (3 Points)
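The mechanics of building a correlation matrix like the one in this question can be sketched as follows. This is a minimal illustration, assuming the Pearson correlation coefficient and the four independent-variable columns from the data table; the helper name `pearson` and the dictionary layout are illustrative choices, not part of the exam. The printed matrix in the question remains the basis for answering parts (a) to (e), and the output of this sketch need not match the printed figures.

```python
from math import sqrt

# Independent variables from the data table (19 observations each)
variables = {
    "Adv": [60, 37.5, 27.5, 8, 8.5, 13.5, 12.5, 15, 8.5, 15,
            22.5, 150, 65, 41, 100, 90, 40, 34.5, 42.5],
    "Price": [35, 59, 40, 43, 45, 40, 85, 43, 29, 38,
              42, 20, 31, 50, 50, 52, 40, 63, 42],
    "R&D": [20, 15, 4.6, 2.8, 4.4, 5, 5, 13, 10, 5,
            6.4, 24, 11.1, 12.5, 12, 11.1, 9.2, 9, 15],
    "Gender": [0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

names = list(variables)
corr = {(p, q): pearson(variables[p], variables[q]) for p in names for q in names}

# Print the lower triangle, one row per variable after the first
for i, p in enumerate(names[1:], start=1):
    row = "  ".join(f"{corr[(p, q)]:7.3f}" for q in names[:i])
    print(f"{p:>6}  {row}")
```

Pairs with a correlation close to +1 or -1 are the usual candidates to inspect when looking for multicollinearity.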
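5. 6 Points – You are trying to develop a regression model to relate output to hours worked. When you plot your data, you get the following picture:

[Figure: scatter plot of output vs. hours worked; not reproduced]

What potential regression model would fit this data the best?
(i) $y = \beta_0 + \beta_1 x + \beta_2 x^2$
(ii) $y = \beta_0 + \beta_1 e^{x}$
(iii) $y = \beta_0 + \beta_1 x$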
6. 12 Points – You were given the following data:

Advertising ($M) | Sales ($M) |
---|---|
0.5 | 4.47 |
0.8 | 4.81 |
0.7 | 4.95 |
0.6 | 3.60 |
0.9 | 4.34 |
1.1 | 5.96 |
0.4 | 3.06 |
0.1 | 0.87 |
0.3 | 2.73 |
1.0 | 5.83 |
0.2 | 2.00 |

You then created the following two models for forecasting Sales:

#1: $\text{Sales} = \beta_0 + \beta_1\,\text{Adv}$

Sales = 1.17 + 4.51 Adv

ANOVA | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 1 | 22.34706 | 22.34706 | 66.62335 | 1.88E-05 |
Residual | 9 | 3.018815 | 0.335424 | | |
Total | 10 | 25.36587 | | | |

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 1.1702 | 0.3745 | 3.12 | 0.012 |
Adv | 4.5073 | 0.5522 | 8.16 | 0.000 |

S = 0.5792, R-Sq = 64.1%, R-Sq(adj) = 66.8%

#2: $\ln \text{Sales} = \beta_0 + \beta_1\,\text{Adv}$

LnSales = 0.351 + 1.48 Adv

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 0.3511 | 0.1911 | 1.84 | 0.099 |
Adv | 1.4817 | 0.2817 | 5.26 | 0.001 |

S = 0.2955, R-Sq = 85.4%, R-Sq(adj) = 82.7%
Perform the necessary calculations to determine which model is more accurate.
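One possible way to organize such a comparison is to put both models' predictions on the same scale (Sales in $M) and compare their sums of squared errors. The sketch below assumes that criterion; it back-transforms the log model with `exp` and uses the coefficients from the two regression outputs. The function and variable names are illustrative and do not come from the exam.

```python
from math import exp

# Data from the question: Advertising ($M) and Sales ($M)
adv = [0.5, 0.8, 0.7, 0.6, 0.9, 1.1, 0.4, 0.1, 0.3, 1.0, 0.2]
sales = [4.47, 4.81, 4.95, 3.60, 4.34, 5.96, 3.06, 0.87, 2.73, 5.83, 2.00]

def predict_linear(x):
    """Model #1: Sales = 1.1702 + 4.5073 * Adv."""
    return 1.1702 + 4.5073 * x

def predict_log(x):
    """Model #2: ln(Sales) = 0.3511 + 1.4817 * Adv, back-transformed to $M."""
    return exp(0.3511 + 1.4817 * x)

def sse(predict):
    """Sum of squared errors measured in the original Sales units."""
    return sum((y - predict(x)) ** 2 for x, y in zip(adv, sales))

sse_linear = sse(predict_linear)
sse_log = sse(predict_log)
print(f"SSE model #1: {sse_linear:.4f}")
print(f"SSE model #2: {sse_log:.4f}")
```

Comparing errors in the original units matters because S and R-Sq for model #2 are measured on the log scale, so they are not directly comparable with the same statistics for model #1.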
7. 5 Points – Discuss, in detail, the best reason why the dummy (indicator) variable $I_2$ might be in the following model:

$y = \beta_0 + \beta_1 x_1 + \beta_2 I_2 + \beta_3 x_1 I_2$
8. 6 Points – Given the following non-linear relationship:

$y = A x_1^{a} x_2^{b} x_3^{c}$

(a) Transform this model so that a multiple linear regression could be performed. (3 Points)
(b) Interpret the exponents “a”, “b” and “c”. (3 Points)
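A sketch of the usual log-linearization for a multiplicative model of this kind, assuming y, A and all three x variables are strictly positive so that the logarithms are defined:

$$\ln y = \ln A + a \ln x_1 + b \ln x_2 + c \ln x_3$$

Under that assumption, a multiple linear regression of $\ln y$ on $\ln x_1$, $\ln x_2$ and $\ln x_3$ has an intercept that estimates $\ln A$ and slopes that estimate a, b and c.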
9. 5 Points – What would you do, in detail, to determine if the following simple linear regression model suffered from “model mis-specification”:

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
10. 20 Points – Using the data in the table below, and the models developed from it, demonstrate via direct calculation which of the two models forecasts Profit most accurately:

Profit | R&D |
---|---|
13.41 | 1.5 |
14.43 | 2.4 |
14.85 | 2.1 |
10.80 | 1.8 |
13.02 | 2.7 |
13.41 | 1.5 |

(#1): $y = \beta_0 + \beta_1 x_1$

Profit = 11.3 + 0.95 R&D

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 11.307 | 3.956 | 2.86 | 0.065 |
R&D | 0.950 | 0.271 | 3.51 | 0.001 |

S = 1.752, R-Sq = 68.1%, R-Sq(adj) = 73.1%

(#2): $\ln y = \beta_0 + \beta_1 \ln x_1$

LnProfit = 2.47 + 0.159 LnR&D

Predictor | Coef | SE Coef | T | P |
---|---|---|---|---|
Constant | 2.4675 | 0.2227 | 11.08 | 0.002 |
LnR&D | 0.1587 | 0.0745 | 2.13 | 0.031 |

S = 0.1379, R-Sq = 48.7%, R-Sq(adj) = 53.1%
11. 15 Points – Investment analysis has discovered that a statistically significant relationship exists between the expected rate of return on a specific stock or security ($R$, measured in percent) and the rate of return on a market index such as the S&P 500 ($R_m$, measured in percent):

$R = \beta_0 + \beta_1 R_m$

The coefficient ($\beta_1$) is known as the beta coefficient of the stock and is used as a measure of the market risk of the stock, that is, how changes in the market affect the stock price of an individual company. Based upon 24 monthly rates of return for the time period 2000 - 2001, the following regression was obtained for a particular common stock:

R = 6.80 + 0.76 Rm

Predictor | Coef | StError | T | P |
---|---|---|---|---|
Constant | 6.8013 | 1.1932 | 5.700 | 0.000 |
Rm | 0.7589 | 0.1664 | 4.561 | 0.000 |

S = 1.8247, R-Sq = 85.1%, R-Sq(adj) = 83.7%
A stock whose beta coefficient is less than 1.0 is called a non-aggressive growth stock. Test, at a 5% level of significance, whether this common stock can be considered a non-aggressive growth stock. Assume that prior to 2000 the stock was an aggressive growth stock.
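The arithmetic of such a test can be sketched as follows. This is one reading of the question, not an official solution: it assumes a one-tailed test of H0: beta >= 1 against Ha: beta < 1 (because the stock was previously aggressive), n - 2 = 22 degrees of freedom, and a critical value of 1.717 read from a standard t-table. The variable names are illustrative.

```python
# Values taken from the regression output for the slope on Rm
beta_hat = 0.7589   # estimated beta coefficient
se_beta = 0.1664    # standard error of the estimate
n_obs = 24          # monthly observations

df = n_obs - 2      # simple regression: intercept plus one slope

# Test statistic for H0: beta >= 1 versus Ha: beta < 1 (assumed setup)
t_stat = (beta_hat - 1.0) / se_beta

# One-tailed 5% critical value for 22 degrees of freedom, from a t-table
T_CRIT_05_DF22 = 1.717

reject_h0 = t_stat < -T_CRIT_05_DF22
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

Note that the T value of 4.561 printed in the output tests the slope against 0, not against 1, which is why the statistic is recomputed here.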
12. 20 points – Our friend, Silly Statistician, just finished learning statistics and decided that Multiple Linear Regression was the solution to all his problems. To validate the concepts of MLR he decided to evaluate the Pythagorean Theorem of right triangles. He collected the following data relating the length of the hypotenuse to the length of the legs of a right triangle. Note these measurements are the results of correctly applying the theorem.

 | Leg 1 | Leg 2 | Hypotenuse |
---|---|---|---|
 | 3 | 5 | 5.830951895 |
 | 6 | 9 | 10.81665383 |
 | 9 | 9 | 12.72792206 |
 | 12 | 12 | 16.97056275 |
 | 15 | 15 | 21.21320344 |
 | 18 | 18 | 25.45584412 |
 | 21 | 21 | 29.69848481 |
 | 24 | 24 | 33.9411255 |
 | 27 | 27 | 38.18376618 |
 | 30 | 30 | 42.42640687 |
 | 3 | 4 | 5 |
 | 6 | 8 | 10 |
 | 9 | 12 | 15 |
 | 12 | 16 | 20 |
 | 15 | 20 | 25 |
 | 18 | 24 | 30 |
 | 21 | 28 | 35 |
 | 24 | 32 | 40 |
 | 27 | 36 | 45 |
 | 30 | 40 | 50 |
Average | 16.5 | 19.5 | 25.613 |
Standard Deviation | 8.8407 | 10.4655 | 13.5632 |
Variance | 78.1579 | 109.5263 | 183.9596 |
a) What is the correct functional form for this data?

Instead of applying the correct model, Silly decided to perform and evaluate a linear model for this data:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$

This generated the following output.

Regression Statistics | |
---|---|
Multiple R | 0.999999113 |
R Square | 0.999998226 |
Adjusted R Square | 0.999998017 |
Standard Error | 0.019097331 |
Observations | 20 |

ANOVA | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 2 | 3495.226312 | 1747.613156 | 4791814.217 | 1.30558E-49 |
Residual | 17 | 0.006200037 | 0.000364708 | | |
Total | 19 | 3495.232513 | | | |

 | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
---|---|---|---|---|---|---|
Intercept | 0.023987966 | 0.009307079 | 2.57738926 | 0.019573342 | 0.004351745 | 0.043624186 |
Leg 1 | 0.654998212 | 0.001623326 | 403.4915434 | 2.75168E-35 | 0.651573294 | 0.65842313 |
Leg 2 | 0.75804039 | 0.001371301 | 552.7891148 | 1.3045E-37 | 0.755147198 | 0.760933583 |

RESIDUAL OUTPUT

Observation | Predicted Hypotenuse | Residuals | Standardized Residuals |
---|---|---|---|
1 | 5.779184553 | 0.051767342 | 2.710710885 |
2 | 10.77634075 | 0.040313077 | 2.110927304 |
3 | 12.74133539 | -0.013413324 | -0.702366421 |
4 | 16.98045119 | -0.009888443 | -0.517791903 |
5 | 21.219567 | -0.006363563 | -0.333217385 |
6 | 25.4586828 | -0.002838682 | -0.148642866 |
7 | 29.69779861 | 0.000686199 | 0.035931652 |
8 | 33.93691442 | 0.004211079 | 0.22050617 |
9 | 38.17603022 | 0.00773596 | 0.405080688 |
10 | 42.41514603 | 0.01126084 | 0.589655207 |
11 | 5.021144162 | -0.021144162 | -1.10717895 |
12 | 10.01830036 | -0.018300359 | -0.958267924 |
13 | 15.01545656 | -0.015456556 | -0.809356898 |
14 | 20.01261275 | -0.012612753 | -0.660445872 |
15 | 25.00976895 | -0.00976895 | -0.511534846 |
16 | 30.00692515 | -0.006925147 | -0.36262382 |
17 | 35.00408134 | -0.004081344 | -0.213712794 |
18 | 40.00123754 | -0.001237541 | -0.064801768 |
19 | 44.99839374 | 0.001606262 | 0.084109258 |
20 | 49.99554993 | 0.004450065 | 0.233020284 |
b) From a statistical perspective, discuss the quality of this model.
c) Why is this model not a valid test to validate the Pythagorean Theorem? Give an out-of-sample example that is inconsistent with the results of the regression model.
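A quick numerical check of the kind part c) asks for might look like the sketch below. It assumes the fitted coefficients from the regression output and a hypothetical triangle with legs 5 and 12, which is my own choice and does not appear in the data; the function name `predicted_hypotenuse` is illustrative.

```python
from math import hypot

# Fitted coefficients from the regression output
INTERCEPT = 0.023987966
B_LEG1 = 0.654998212
B_LEG2 = 0.75804039

def predicted_hypotenuse(leg1, leg2):
    """Prediction from the fitted linear model y = b0 + b1*Leg1 + b2*Leg2."""
    return INTERCEPT + B_LEG1 * leg1 + B_LEG2 * leg2

# Hypothetical out-of-sample triangle (not among the 20 observations)
leg1, leg2 = 5, 12
true_value = hypot(leg1, leg2)                  # sqrt(5^2 + 12^2) = 13.0
model_value = predicted_hypotenuse(leg1, leg2)  # about 12.40

print(f"Pythagorean theorem: {true_value:.4f}")
print(f"Linear model:        {model_value:.4f}")
```

For this triangle the linear model misses the true hypotenuse by roughly 0.6, far more than any residual in the sample, where every triangle has legs in a 1:1 ratio (apart from the first two rows) or a 3:4 ratio.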
13. 20 points – A group of students developed a regression model using time series of various macro-economic factors to predict the performance of the stock market. Their hypothesis was that GDP, Inflation (as measured by the CPI), and Consumer Confidence (CCI) are all important in determining the value of the stock market. The data they collected is:

Column key: A = Year; B = S&P 500 Index (last day of October*); C = US GDP (current dollars, billions); D = Consumer Price Index (October*); E = Consumer Confidence Index (October*); F to I = Change in S&P, GDP, CPI and CCI.

Year (A) | S&P (B) | GDP (C) | CPI (D) | CCI (E) | Change in S&P (F) | Change in GDP (G) | Change in CPI (H) | Change in CCI (I) |
---|---|---|---|---|---|---|---|---|
1982 | 132.7 | $3,255.00 | 98.2 | 54.3 | | | | |
1983 | 167.7 | $3,536.70 | 101 | 92.1 | 88.08 | 1,583.70 | 42.08 | 59.52 |
1984 | 164.8 | $3,933.20 | 105.3 | 99.1 | 64.18 | 1,811.18 | 44.7 | 43.84 |
1985 | 186.2 | $4,220.30 | 108.7 | 96.1 | 87.32 | 1,860.38 | 45.52 | 36.64 |
1986 | 237.4 | $4,462.80 | 110.3 | 85.8 | 125.68 | 1,930.62 | 45.08 | 28.14 |
1987 | 280.2 | $4,739.50 | 115.3 | 115.1 | 137.76 | 2,061.82 | 49.12 | 63.62 |
1988 | 277.4 | $5,103.80 | 120.2 | 116.9 | 109.28 | 2,260.10 | 51.02 | 47.84 |
1989 | 347.4 | $5,484.40 | 125.6 | 117 | 180.96 | 2,422.12 | 53.48 | 46.86 |
1990 | 307.1 | $5,803.10 | 133.5 | 62.6 | 98.66 | 2,512.46 | 58.14 | -7.6 |
1991 | 386.9 | $5,995.90 | 137.4 | 60.1 | 202.64 | 2,514.04 | 57.3 | 22.54 |
1992 | 412.5 | $6,337.70 | 141.8 | 54.6 | 180.36 | 2,740.16 | 59.36 | 18.54 |
1993 | 463.9 | $6,657.40 | 145.7 | 60.5 | 216.4 | 2,854.78 | 60.62 | 27.74 |
1994 | 463.8 | $7,072.20 | 149.5 | 89.1 | 185.46 | 3,077.76 | 62.08 | 52.8 |
1995 | 582.9 | $7,397.70 | 153.7 | 96.3 | 304.62 | 3,154.38 | 64 | 42.84 |
1996 | 701.5 | $7,816.90 | 158.3 | 107.3 | 351.76 | 3,378.28 | 66.08 | 49.52 |
1997 | 951.2 | $8,304.30 | 161.6 | 123.4 | 530.3 | 3,614.16 | 66.62 | 59.02 |
1998 | 1032.5 | $8,747.00 | 164 | 119.3 | 461.78 | 3,764.42 | 67.04 | 45.26 |
1999 | 1300 | $9,268.40 | 168.2 | 130.5 | 680.5 | 4,020.20 | 69.8 | 58.92 |
2000 | 1390.1 | $9,817.00 | 174 | 135.8 | 610.1 | 4,255.96 | 73.08 | 57.5 |
2001 | 1076.6 | $10,128.00 | 177.7 | 85.3 | 242.54 | 4,237.80 | 73.3 | 3.82 |
Based on the tools they learned in class, they developed two regression models.

Model #1 is a model of the nominal values:

$Y_{S\&P,t} = \beta_0 + \beta_1 X_{GDP,t} + \beta_2 X_{CPI,t} + \beta_3 X_{CCI,t} + \varepsilon_t$

Regression Statistics | |
---|---|
Multiple R | 0.974988118 |
R Square | 0.950601831 |
Adjusted R Square | 0.941339674 |
Standard Error | 95.73639205 |
Observations | 20 |

ANOVA | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 3 | 2822031.72 | 940677.2399 | 102.6328817 | 1.1574E-10 |
Residual | 16 | 146647.3082 | 9165.456762 | | |
Total | 19 | 2968679.028 | | | |

 | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
---|---|---|---|---|---|---|
Intercept | 601.2331439 | 510.7174917 | 1.177232332 | 0.256318772 | -481.4395732 | 1683.905861 |
GDP | 0.423201919 | 0.095649098 | 4.424526003 | 0.000425047 | 0.22043489 | 0.625968948 |
CPI | -20.99271886 | 7.627331892 | -2.752301743 | 0.014166869 | -37.16194016 | -4.823497564 |
CCI | 1.243352632 | 1.129463041 | 1.100835164 | 0.287254706 | -1.151002054 | 3.637707318 |
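Model #2 is a model of the differences across years, using $\rho = 0.6$:

$\Delta Y_{S\&P,t} = \beta_0 + \beta_1 \Delta X_{GDP,t} + \beta_2 \Delta X_{CPI,t} + \beta_3 \Delta X_{CCI,t} + \varepsilon_t$

Here $\Delta$ represents the difference between adjacent years, using $\rho = 0.6$. For example: $\Delta Y_{S\&P,t} = Y_{S\&P,t} - 0.6\,Y_{S\&P,t-1}$ (shown in Columns F – I).

Also written as:

$y_t - \rho y_{t-1} = \beta_0 + \beta_1 (x_{1,t} - \rho x_{1,t-1}) + \beta_2 (x_{2,t} - \rho x_{2,t-1}) + \dots + \beta_k (x_{k,t} - \rho x_{k,t-1}) + u_t$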
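The transformation described above can be checked against the data table. The following is a minimal sketch, assuming the 0.6 weight stated in the text (called `RHO` here) and the 1982 and 1983 rows of the table; the function name `quasi_difference` is illustrative and not part of the exam.

```python
RHO = 0.6  # weight on the previous year's value, as stated in the exam

def quasi_difference(current, previous, rho=RHO):
    """Return current - rho * previous, i.e. the 'Change in' columns F to I."""
    return current - rho * previous

# 1982 and 1983 rows from the data table
row_1982 = {"S&P": 132.7, "GDP": 3255.00, "CPI": 98.2, "CCI": 54.3}
row_1983 = {"S&P": 167.7, "GDP": 3536.70, "CPI": 101.0, "CCI": 92.1}

changes_1983 = {
    name: round(quasi_difference(row_1983[name], row_1982[name]), 2)
    for name in row_1983
}
print(changes_1983)
# {'S&P': 88.08, 'GDP': 1583.7, 'CPI': 42.08, 'CCI': 59.52}
```

These values agree with the 1983 entries in Columns F to I. Because each transformed value needs the previous year, the first year (1982) has no transformed row, which is why Model #2 is fitted on 19 observations rather than 20.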
Regression Statistics | |
---|---|
Multiple R | 0.919152748 |
R Square | 0.844841774 |
Adjusted R Square | 0.813810128 |
Standard Error | 80.48973712 |
Observations | 19 |

ANOVA | df | SS | MS | F | Significance F |
---|---|---|---|---|---|
Regression | 3 | 529142.7505 | 176380.9168 | 27.22516859 | 2.56805E-06 |
Residual | 15 | 97178.96673 | 6478.597782 | | |
Total | 18 | 626321.7173 | | | |

 | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
---|---|---|---|---|---|---|
Intercept | -274.1074603 | 316.1389637 | -0.86704738 | 0.399574058 | -947.9417109 | 399.7267902 |
Change GDP | 0.236494709 | 0.130899256 | 1.806692537 | 0.090909333 | -0.042510451 | 0.515499868 |
Change CPI | -4.625242252 | 11.20250006 | -0.412875896 | 0.685538818 | -28.50280592 | 19.25232141 |
Change CCI | 3.181541279 | 1.08875366 | 2.922186529 | 0.010510737 | 0.860917785 | 5.502164773 |
Question: Using the appropriate model, from the 2 alternatives above, predict the S&P index in October 2002, if the values of the independent variables are GDP = $10,400; CPI = 185; CCI = 87, still using a weight of 0.6 on the previous year's values.