Title | Statistics Final Exam |
---|---|
Author | charlie O |
Course | Managerial Statistics |
Institution | Wilfrid Laurier University |
Pages | 14 |
File Size | 1013.6 KB |
File Type | |
Total Downloads | 39 |
Total Views | 156 |
Statistics Final Exam
The content of the examination is as follows:
• Weeks 9-12
• Chapter 11-14
• Check the course outline for textbook sections and topics
• PPT files within the weeks 9-12 folders in My LS
• Wiley Plus quizzes 8-10

Week 9 Chapter 11: Analysis of Variance
One-Way Analysis of Variance (ANOVA)
Evaluate the difference among the means of three or more populations.
Examples
- Accident rates for 1st, 2nd, and 3rd shift
- Expected mileage for five brands of tires
Key Assumptions
- Populations are normally distributed
- Populations have equal variance
- Samples are randomly and independently drawn
* Uses the F table: "are they statistically the same or statistically different?"
Hypothesis of One-Way ANOVA
H0: All population means are equal
- No variation in means among populations
H1: At least one population mean is different
- Does not mean that all population means are different
- Some population pairs may be the same
One-Way ANOVA
[Figures: H0 not rejected; H0 rejected]
Partitioning the Variation
*SS = sum of squares
SST = Total sum of squares: aggregate dispersion of the individual data values across the various populations
SSC = Treatment sum of squares (between samples): dispersion among the sample means
SSE = Error sum of squares (within samples): dispersion that exists among the data values within a particular population
SST = SSC + SSE
One-Way ANOVA Table
C = Number of populations (how many groups there are)
nT = Sum of the sample sizes from all the populations
df = Degrees of freedom
* Use a one-tailed test (use alpha, not alpha/2)
Mean Square (MS) calculations = "average" of the SS calculations (each SS divided by its degrees of freedom)
One-Way ANOVA F Test Statistic
Test Statistic: F = MSC / MSE
Degrees of freedom:
V1 = dfC = C − 1
V2 = dfE = nT − C
Note: The chart given will be incomplete. Remember things like rearranging SST = SSC + SSE to find the unknown.
One-Way ANOVA Interpretation
The F statistic is the ratio of the between estimate of variance to the within estimate of variance.
The ratio must always be positive.
V1 = dfC = C − 1 will typically be small
V2 = dfE = nT − C will typically be large
If the ratio (the observed F value) is close to 1, you usually do not reject.
If the ratio (the observed F value) is much larger than 1, you usually reject.
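The F test described above can be sketched in a few lines of Python. This is a minimal illustration, not a full ANOVA routine: the function name `one_way_anova_f` is made up for this sketch, the sums of squares and sample sizes are the ones used in Example #1 of these notes, and the critical value is read from the F table rather than computed.

```python
def one_way_anova_f(ssc, sse, c, n_total):
    """Return (F, df_between, df_within) from the sums of squares."""
    df_c = c - 1              # V1 = C - 1
    df_e = n_total - c        # V2 = nT - C
    msc = ssc / df_c          # mean square between (treatment)
    mse = sse / df_e          # mean square within (error)
    return msc / mse, df_c, df_e


# Values taken from Example #1: SSC = 4716.4, SSE = 1119.6, C = 3, nT = 15
f_calc, df_c, df_e = one_way_anova_f(4716.4, 1119.6, c=3, n_total=15)

f_crit = 3.89                 # F table value for alpha = 0.05, df = (2, 12)
reject_h0 = f_calc > f_crit   # one-tailed decision rule

print(round(f_calc, 3), df_c, df_e, reject_h0)
```

Because the table lookup is hard-coded, a different alpha or different degrees of freedom needs a new critical value from the F table.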
EXAMPLE #1
Sample means: 249.2, 226, 205.8 (grand mean = 227)
n1 = 5, n2 = 5, n3 = 5, nT = 15, C = 3
$$SSC = 5\left[(249.2 - 227)^2 + (226 - 227)^2 + (205.8 - 227)^2\right] = 4716.4$$
$$SSE = (254 - 249.2)^2 + (236 - 249.2)^2 + \dots + (204 - 205.8)^2 = 1119.6$$
(Each value in SSE is compared with its own sample mean. We do not need to calculate SSC and SSE; they will be given.)
SST = SSC + SSE = 4716.4 + 1119.6 = 5836
MSC = SSC / (C − 1) = 4716.4 / (3 − 1) = 2358.2
MSE = SSE / (nT − C) = 1119.6 / (15 − 3) = 93.3
F = MSC / MSE = 2358.2 / 93.3 = 25.275 (calculated F)
Critical F: using the F table
Alpha = 0.05
V1 = (C − 1) = 2
V2 = (nT − C) = 12
ANS = 3.89
* Because the calculated F is far greater than 1, we will most likely reject. Here the calculated F (25.275) is larger than the critical F (3.89), so we reject H0.
EXAMPLE #2
ANSWER:
EXAMPLE #3
Fill in the ?...
- MSC = SSC / (C − 1) = 0.2366 / (4 − 1) = 0.0789
- F = MSC / MSE = 0.0789 / 0.0077 = 10.2468
- Critical F = 3.10
What is the statistical conclusion?
- Reject H0, because the calculated F (10.2468) is larger than the critical F (3.10)

Week 9 Chapter 16: Goodness-of-Fit Test & Test of Independence
χ² Goodness-of-Fit Test
We are testing whether the data "came from a uniform distribution".
df = k − 1 − c
fo = frequency of observed values (given in the table)
fe = frequency of expected values (needs to be calculated)
k = number of categories (e.g., if months, k = 12)
c = number of parameters estimated from the data (always 0 for a uniform distribution)
* Products that do not have seasonality are uniformly distributed (they sell the same no matter the time of year). Ex: toilet paper, milk, toothpaste, etc.
* Always one-tailed
* Alpha is always the critical probability (e.g., 0.01); because the test is one-tailed, do not use alpha/2
fe = 18,447 / 12 = 1,537.25 per month
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
First term: (1,610 − 1,537.25)² / 1,537.25 = 3.44 … continue for the remaining months and add the terms up
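As a quick check of the first term above, here is a small Python sketch. Only the total (18,447), the number of months (12), and the first observed count (1,610) come from the notes; the helper name `chi_square_terms` is made up for illustration, and the other eleven monthly counts are not reproduced here.

```python
def chi_square_terms(observed, expected):
    """Return each category's contribution (fo - fe)^2 / fe."""
    return [(fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)]


total_sales = 18_447      # total from the notes
k = 12                    # number of categories (months)
c = 0                     # no parameters estimated for a uniform distribution

fe = total_sales / k      # expected frequency per month under uniformity
df = k - 1 - c            # degrees of freedom for the chi-square table

# Contribution of the first month only (observed 1,610)
first_term = chi_square_terms([1_610], [fe])[0]

print(fe, df, round(first_term, 2))
```

Summing `chi_square_terms` over all twelve months would give the calculated χ² that is compared with the table value.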
Use this on the χ² chart:
alpha = 0.01
df = 11 (k − 1 − c = 12 − 1 − 0)
On table = 24.725
Critical χ² = 24.725
Calculated χ² = 74.37
ANS: We reject because the calculated χ² (74.37) is bigger than the critical χ² (24.725).

χ² Test of Independence
"Will tell you whether the variables are independent or dependent"
fo = frequency of observed values (given in the table)
fe = frequency of expected values that would make the variables independent (needs to be calculated)
df = (r − 1)(c − 1)
r = number of rows
c = number of columns
EXAMPLE #1
• r = 4
• c = 3
• df = (4 − 1)(3 − 1) = 6
* Expected values are the values we would see if the variables were independent
You then total each column and each row, as well as a grand total.
Then use the formula to solve for each expected value:
e = (total of row × total of column) / grand total = 66.15
Once you have done the expected value for each cell, you use the formula and calculate χ² for each cell:
$$\frac{(85 - 66.15)^2}{66.15}$$
You do this for all cells and then add up the answers to get χ².
FINAL CALCULATED VALUE = 70.78
Do the test:
alpha = 0.01
df = 6
Use the χ² chart
FINAL CRITICAL VALUE = 16.81
* We reject because the calculated value (70.78) is bigger than the critical value (16.81), meaning the variables are NOT statistically independent (they are dependent).
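The expected-frequency and χ² steps above can be sketched in Python as follows. This is an illustration under assumptions: the function names are made up, and the 2×2 table is hypothetical because the notes' 4×3 table did not survive. Only the single cell check (observed 85, expected 66.15) uses numbers from the notes.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom)."""
    expected = expected_frequencies(table)
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical 2x2 table of observed counts (not the table from the notes)
observed = [[20, 30],
            [30, 20]]
stat, df = chi_square_independence(observed)

# Single-cell contribution using the numbers quoted in the notes
cell_term = (85 - 66.15) ** 2 / 66.15

print(stat, df, round(cell_term, 2))
```

The decision rule is the same as in the notes: reject independence when the calculated statistic exceeds the χ² table value for the chosen alpha and df.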
Week 10 Chapter 12: Correlation and Simple Regression Analysis
Relationship Between Variables
• Ex: relationship between cost and price, or between the profit of a hotel and the number of motels nearby, etc.
y = Dependent
x = Independent
Linear Relationship
Positive :
Negative:
Coefficient of Correlation "r"
• Measures strength and direction of a linear relationship between two variables
• It ranges from −1 to 1; the closer to −1 or 1, the stronger the correlation, and the closer to 0, the weaker the correlation
$$r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}}$$
Coefficient of Determination "r²"
• Measures strength of a linear relationship between two variables
$$r^2 = (r)^2$$
Examples of Approximate Values:
Example 1
r² = 1
• Unlikely to happen
• Perfect linear relationship between x and y
Example 2
Example 3
Correlation and Causation:
• Correlation between variables does not imply that a change in one variable is the cause of a change in another variable
• Causation indicates that one event is the result of the occurrence of another event. There is a causal relationship between the two events, also referred to as cause and effect
Simple Linear Relationship
Predicted equation:
$$\hat{y} = b_0 + b_1 x$$
ȳ = average of the dependent variable
x̄ = average of the independent variable
Slope:
$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{SS_{xy}}{SS_{xx}}$$
Intercept:
$$b_0 = \bar{y} - b_1 \bar{x}$$
Residual (difference between observed and predicted):
$$e = y - \hat{y}$$
* In a question the numbers may be under "Coefficients" in tables
Example 1
* 98.24833 is the intercept (b0) and 0.10977 is the slope (b1). The 98.24833 is like a fixed cost, whereas the 0.10977 is like a variable cost, as it depends on how big the house is.
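The slope, intercept, correlation, and residual formulas above can be tried out with a short Python sketch. The function name `simple_regression` and the five data points are hypothetical and chosen only so the arithmetic is easy to follow; they are not the house-price data from the example.

```python
import math


def simple_regression(x, y):
    """Return (b0, b1, r) using the SSxy, SSxx, SSyy formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    b1 = ss_xy / ss_xx                       # slope
    b0 = y_bar - b1 * x_bar                  # intercept
    r = ss_xy / math.sqrt(ss_xx * ss_yy)     # coefficient of correlation
    return b0, b1, r


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1, r = simple_regression(x, y)
r_squared = r ** 2                           # coefficient of determination

# Residuals e = y - y_hat for every observation
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(round(b0, 4), round(b1, 4), round(r, 4), round(r_squared, 4))
```

With these numbers SSxy = 6, SSxx = 10, and SSyy = 6, so the slope is 0.6 and the intercept is 2.2; the residuals of a least-squares line always sum to zero, which is a handy check.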
Example 2
We need to find
$$r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}}$$
We know r = 0.75
* We want to prove that the regression model is useful (it IS useful when we reject the null hypothesis)
• You use the ANOVA part of the table.
* If we have the "Significance F" given in the table, USE THAT
- If the Significance F is inside the rejection area, i.e., smaller than alpha (the critical probability), you reject the null hypothesis, proving that the model is useful.
* If we do not have the Significance F…
- MSR / MSE = calculated F
- Go to the F table to find the critical F
* If the calculated F is bigger than the critical F, you reject
F table: always alpha
t table: alpha or alpha/2
* alpha/2 = when the hypothesis is that something does or doesn't equal a number (two-tailed)

Testing the Slope, b1
* We don't ever test the value of the intercept.
* Use the bottom part of the table -> slope = b1, standard error of the slope = Sb1
• We are testing that the slope is statistically not 0 (it can be statistically zero even if the estimate appears to be a number close to 0)
• We use this hypothesis to figure it out: H0: β1 = 0 vs. H1: β1 ≠ 0
* This is two-tailed, so you must use alpha/2
• Calculated t = (b1 − β1) / Sb1, with β1 = 0 under H0 (* sometimes the "t Stat" will be given)
• Critical t = use the t chart (using alpha/2 and df = n − 2)
• If |calculated t| > critical t, you REJECT
* Watch for negatives
* "Confidence interval"
- This is a way to work out alpha
- Ex: a 95% confidence interval corresponds to 5% alpha, so 0.05
- Don't forget that most times it is alpha/2, so it would actually be 0.025
Margin of Error / Confidence Interval for the slope ("Lower 95%" and "Upper 95%")
• Margin of error = t(α/2, n−2) × Sb1
• Upper: b1 + t(α/2, n−2) × Sb1
• Lower: b1 − t(α/2, n−2) × Sb1

Week 11 Chapter 12C: Correlation and Simple Regression Analysis
Point Estimate of y for x0
$$\hat{y} = b_0 + b_1 x_0$$
$$b_1 = \frac{SS_{xy}}{SS_{xx}}$$
$$b_0 = \bar{y} - b_1 \bar{x}$$
(x̄ = average of all x values)
Confidence Interval for the Mean (the average of a group)
$$\hat{y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$
Prediction Interval for an Individual
$$\hat{y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$
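To see how the two intervals above differ in practice, here is a minimal Python sketch. The data, the function name `interval_half_widths`, and the choice x0 = 4 are hypothetical; the critical value 3.182 is the t table entry for alpha/2 = 0.025 with n − 2 = 3 degrees of freedom, hard-coded rather than computed.

```python
import math


def interval_half_widths(x, y, x0, t_crit):
    """Return (y_hat, CI half-width for the mean, PI half-width for an individual)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))            # standard error of the estimate
    y_hat = b0 + b1 * x0                      # point estimate at x0
    core = 1 / n + (x0 - x_bar) ** 2 / ss_xx  # term shared by both intervals
    ci_half = t_crit * s_e * math.sqrt(core)
    pi_half = t_crit * s_e * math.sqrt(1 + core)
    return y_hat, ci_half, pi_half


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

y_hat, ci_half, pi_half = interval_half_widths(x, y, x0=4, t_crit=3.182)

print(round(y_hat, 2), round(ci_half, 3), round(pi_half, 3))
```

The prediction interval is always wider than the confidence interval at the same x0, because of the extra "1 +" under the square root.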
Week 11 Chapter 13A: Multiple Regression
What is Multiple Regression
* This means there are multiple different slopes
* For example, rather than just comparing house size vs market price, we can compare house size and age vs market price
• Things to consider when predicting energy consumption
- Past history
- Weather
- Day of the week
- Holidays
- Daylight hours
- Time of day
- Customer type
- Daylight savings time
$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$
The difference between what was observed and what was predicted is called the residual or error (e):
$$e = y - \hat{y}$$
Example 1
ŷ = b0 + b1(x1) + b2(x2)
ŷ = 59.0848 + 0.0173(x1) − 0.7713(x2)
x1 = 2,500
x2 = 12
ŷ = 59.0848 + 0.0173(2,500) − 0.7713(12) = 93.1363 thousand $
(The rounded coefficients shown give 93.0792; the 93.1363 figure presumably comes from unrounded coefficients.)

Testing the Overall Model
* Same as what we did prior
i = 1, 2, …, k, where k = number of slopes
F Table
• You use the ANOVA part of the table.
* If we have the "Significance F" given in the table, USE THAT
- If the Significance F is inside the rejection area, i.e., smaller than alpha (the critical probability), you reject the null hypothesis, proving that the model is useful.
Degrees of freedom: regression = k, error = n − k − 1, total = n − 1
* If k = 1 (as it would be in a simple regression), the df values would be the same as before: 1, n − 2, and n − 1
Calculated F = MSR / MSE
Critical F = F(α, k, n − k − 1)
MSR = SSR / k
MSE = SSE / (n − k − 1)
*We are testing… Size of house -> Age of house ->
Test the Slope
• Calculated t = (b1 − β1) / Sb1 (given under "t Stat")
• Critical t = use the t chart (using alpha/2 and df = n − k − 1)
• If |calculated t| > critical t, you REJECT
SSE and Standard Error of the Estimate
Se = standard error of the estimate
$$S_e = \sqrt{\frac{SSE}{n - k - 1}}$$
(This is n − 2 in simple regression, where k = 1.)
R² Calculation
R² = SSR / SSyy
R² = 1 − (SSE / SSyy)
$$\text{Adjusted } R^2 = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right)$$
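The overall-model F statistic, the standard error of the estimate, and the two R² formulas above all come from the same ANOVA quantities, which the following Python sketch makes explicit. The function name and the input values (SSR = 80, SSE = 20, n = 23, k = 2) are hypothetical, picked only so the results are round numbers.

```python
import math


def regression_summary(ssr, sse, n, k):
    """Return F, Se, R^2 and adjusted R^2 from the ANOVA sums of squares."""
    ss_yy = ssr + sse                     # total sum of squares
    msr = ssr / k                         # df regression = k
    mse = sse / (n - k - 1)               # df error = n - k - 1
    f_calc = msr / mse
    s_e = math.sqrt(mse)                  # standard error of the estimate
    r2 = 1 - sse / ss_yy                  # same as SSR / SSyy
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return f_calc, s_e, r2, adj_r2


# Hypothetical values, for illustration only
f_calc, s_e, r2, adj_r2 = regression_summary(ssr=80, sse=20, n=23, k=2)

print(f_calc, s_e, round(r2, 2), round(adj_r2, 2))
```

Adjusted R² is never larger than R², since it penalizes each extra slope through the (n − 1)/(n − k − 1) factor.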
Week 12 Chapter 14: Multiple Regression cont.
Qualitative (dummy) Variables in Multiple Regression
• Yes or no variables
- Ex: on or off, wet or dry, etc.
- Coded as 1, 0
Example 1
y = weekly sales
x1 = price
x2 = holiday, yes or no
- This is not a real numeric variable; it is either 1 or 0
* It is like having two regression lines...
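The "two regression lines" idea can be illustrated with a small Python sketch. The coefficients below are hypothetical (the notes do not give fitted values for this example), and `predict_sales` is a made-up helper name: with the holiday dummy set to 0 or 1, the same equation gives two parallel lines that differ only by the dummy's coefficient.

```python
# Hypothetical coefficients for y_hat = b0 + b1 * price + b2 * holiday
B0 = 100.0    # intercept
B1 = -5.0     # slope for price
B2 = 20.0     # shift in weekly sales when holiday = 1


def predict_sales(price, holiday):
    """Predicted weekly sales; holiday must be coded as 0 (no) or 1 (yes)."""
    if holiday not in (0, 1):
        raise ValueError("dummy variable must be 0 or 1")
    return B0 + B1 * price + B2 * holiday


prices = [8, 10, 12]
regular_line = [predict_sales(p, 0) for p in prices]   # holiday = 0
holiday_line = [predict_sales(p, 1) for p in prices]   # holiday = 1

# The gap between the two lines is B2 at every price (same slope, new intercept)
gaps = [h - r for h, r in zip(holiday_line, regular_line)]

print(regular_line, holiday_line, gaps)
```

In other words, the holiday line has intercept b0 + b2 and the non-holiday line has intercept b0, while both share the price slope b1.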