Statistics Final Exam PDF

Title	Statistics Final Exam
Author	charlie O
Course	Managerial Statistics
Institution	Wilfrid Laurier University
Pages	14
File Size	1013.6 KB
File Type	PDF
Total Downloads	39
Total Views	156


Description

Statistics Final Exam

The content of the examination is as follows:
• Weeks 9-12
• Chapters 11-14
• Check the course outline for textbook sections and topics
• PPT files within the weeks 9-12 folders in My LS
• Wiley Plus quizzes 8-10

Week 9 Chapter 11: Analysis of Variance

One-Way Analysis of Variance (ANOVA)

Evaluate the difference among the means of three or more populations.

Examples

- Accident rates for 1st, 2nd, and 3rd shift
- Expected mileage for five brands of tires

Key Assumptions

- Populations are normally distributed
- Populations have equal variance
- Samples are randomly and independently drawn

* Uses the F table: "are the means statistically the same or statistically different?"

Hypotheses of One-Way ANOVA

H0: All population means are equal
- No variation in means among populations

H1: At least one population mean is different
- Does not mean that all population means are different
- Some pairs of populations may have the same mean

One-Way ANOVA

[Figure panels: H0 Not Rejected; H0 Rejected]

Partitioning the Variation

*SS = sum of squares

SST = Total sum of squares: aggregate dispersion of the individual data values across the various populations
SSC = Treatment sum of squares (between samples): dispersion among the sample means
SSE = Error sum of squares (within samples): dispersion that exists among the data values within a particular population

SST = SSC + SSE

One-Way ANOVA Table

C = Number of populations (how many groups there are)
nT = Sum of the sample sizes from all the populations
df = Degrees of freedom

*Use a one-tailed test (use alpha, not alpha/2)

Mean Square (MS) calculations = "average" of the SS calculations (each SS divided by its degrees of freedom)

One-Way ANOVA F Test Statistic

Test statistic: F = MSC / MSE
Degrees of freedom:
- V1 = dfC = C – 1
- V2 = dfE = nT – C

Note: The chart given will be incomplete. Remember that you can rearrange SST = SSC + SSE to find the unknown value.

One-Way ANOVA Interpretation

The F statistic is the ratio of the between-sample estimate of variance to the within-sample estimate of variance.
- The ratio must always be positive
- V1 = dfC = C – 1 will typically be small
- V2 = dfE = nT – C will typically be large
- If the ratio (the observed F value) is close to 1, you usually do not reject H0
- If the ratio (the observed F value) is much larger than 1, you usually reject H0
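The partition SST = SSC + SSE and the F ratio can be sketched in a few lines of Python. This is only an illustration: the function name `one_way_anova` and the small data set are made up, not taken from the course material.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples."""
    c = len(groups)                       # C: number of populations
    n_t = sum(len(g) for g in groups)     # nT: total sample size
    grand_mean = sum(sum(g) for g in groups) / n_t
    means = [sum(g) / len(g) for g in groups]

    # SSC: dispersion of the sample means around the grand mean (between)
    ssc = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # SSE: dispersion of each value around its own sample mean (within)
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    msc = ssc / (c - 1)                   # mean square, df = C - 1
    mse = sse / (n_t - c)                 # mean square, df = nT - C
    return {"SSC": ssc, "SSE": sse, "SST": ssc + sse,
            "dfC": c - 1, "dfE": n_t - c,
            "MSC": msc, "MSE": mse, "F": msc / mse}


# Hypothetical data: three groups of three observations each
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The calculated F still has to be compared with the critical F from the table (alpha, V1 = C – 1, V2 = nT – C) before deciding whether to reject.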

EXAMPLE #1

C = 3, n1 = 5, n2 = 5, n3 = 5, nT = 15

Means: 249.2, 226, 205.8 (grand mean = 227)

SSC = 5[(249.2 – 227)² + (226 – 227)² + (205.8 – 227)²] = 4716.4
*Do not need to calculate this; we will be given it

SSE = (254 – 249.2)² + (236 – 249.2)² + … + (204 – 205.8)² = 1119.6

SST = 4716.4 + 1119.6 = 5836
MSC = SSC / (C – 1) = 4716.4 / (3 – 1) = 2358.2
MSE = SSE / (nT – C) = 1119.6 / (15 – 3) = 93.3
F = MSC / MSE = 2358.2 / 93.3 = 25.275 (calculated F)
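The arithmetic in Example #1 can be checked with a short Python sketch. It assumes the three sample means are 249.2, 226 and 205.8 with 5 observations each, and takes SSE = 1119.6 as given. The variable names are illustrative only.

```python
means = [249.2, 226.0, 205.8]   # sample means read from the example
n_per_group = 5                 # each sample has 5 observations
sse = 1119.6                    # given in the example

c = len(means)                  # C = 3
n_t = c * n_per_group           # nT = 15

# With equal sample sizes the grand mean is the average of the sample means
grand_mean = sum(means) / c

ssc = n_per_group * sum((m - grand_mean) ** 2 for m in means)
msc = ssc / (c - 1)
mse = sse / (n_t - c)
f_calc = msc / mse

print(round(ssc, 1), round(msc, 1), round(mse, 1), round(f_calc, 3))
```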

Critical F: Using the F value chart…
Alpha = 0.05
V1 = (C – 1) = 2
V2 = (nT – C) = 12
ANS = 3.89

*Because the calculated F (25.275) is far above 1 and larger than the critical F (3.89), we reject H0.

EXAMPLE #2

ANSWER:

EXAMPLE #3

Fill in the ?...
- MSC = SSC / (C – 1) = 0.2366 / (4 – 1) = 0.0789
- F = MSC / MSE = 0.0789 / 0.0077 = 10.2468
- Critical F = 3.10

What is the statistical conclusion?
- Reject H0, because the calculated F (10.2468) is larger than the critical F (3.10)

Week 9 Chapter 16: Goodness-of-Fit Test & Test of Independence

χ² Goodness-of-Fit Test

We are testing if "it came from a uniform distribution".

χ² = Σ (fo – fe)² / fe
df = k – 1 – c

fo = frequency of observed values (given in the table)
fe = frequency of expected values (needs to be calculated)
k = number of categories (ex: if months, k = 12)
c = number of parameters estimated (always 0 for uniform)

*Products that do not have seasonality are uniformly distributed (they sell the same no matter the time of year). Ex: toilet paper, milk, toothpaste, etc.
*Always one-tailed
*Use alpha itself as the critical probability (ex: 0.01), not alpha / 2

fe = 18,447 / 12 = 1,537.25 per month

χ² = (1,610 – 1,537.25)² / 1,537.25 + … (continue for every month)
First term = 3.44

Use this on the χ² chart:
a = 0.01
df = 11
On table = 24.725

Critical χ² = 24.725
Calculated χ² = 74.37
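The goodness-of-fit calculation against a uniform distribution can be sketched as below. The function name `chi_square_uniform` and the three-category data are hypothetical. Only the single term (1,610 vs 1,537.25) comes from the notes.

```python
def chi_square_uniform(observed):
    """Chi-square statistic for H0: all categories are equally likely."""
    k = len(observed)                 # number of categories
    fe = sum(observed) / k            # same expected frequency everywhere
    chi2 = sum((fo - fe) ** 2 / fe for fo in observed)
    df = k - 1 - 0                    # c = 0 parameters estimated (uniform)
    return chi2, df, fe


# Hypothetical observed counts for three categories
chi2, df, fe = chi_square_uniform([10, 20, 30])
print(chi2, df, fe)

# One term from the notes: observed 1,610 against expected 1,537.25
first_term = (1610 - 1537.25) ** 2 / 1537.25
print(round(first_term, 2))
```

The calculated value is then compared with the critical χ² from the chart for the chosen alpha and df.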

ANS: We reject because the calculated χ² (74.37) is bigger than the critical χ² (24.725).

χ² Test of Independence

"Will tell you if the statement is independent or dependent"

fo = frequency of observed values (given in the table)
fe = frequency of expected values that would make the variables independent (needs to be calculated)
df = (r – 1)(c – 1)
r = number of rows
c = number of columns

EXAMPLE #1

• r = 4
• c = 3
• df = (4 – 1)(3 – 1) = 6

*The expected values are what the counts would be if the variables were independent.

You then total each column and each row, as well as a grand total.

Then use the formula to solve for fe:
fe = (Total of row × Total of column) / Grand total = 66.15

Once you have done the expected value for each cell, it should look like this: [table of expected values]

Then you use the formula and calculate χ² for each cell:
(85 – 66.15)² / 66.15

You do this for all cells and then add up the answers to get χ².

FINAL CALCULATED VALUE = 70.78

Do the test:
a = 0.01
df = 6
Use the χ² chart

FINAL CRITICAL VALUE = 16.81

*We reject because the calculated value (70.78) is bigger than the critical value (16.81), meaning it is NOT statistically independent (it is dependent).
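The row-total × column-total / grand-total step and the cell-by-cell sum can be sketched as follows. The function name `chi_square_independence` and the 2×2 table are made up for illustration. The example's own 4×3 table is not reproduced in these notes.

```python
def chi_square_independence(table):
    """Chi-square statistic, df and expected counts for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, fo in enumerate(row):
            # Expected count if rows and columns were independent
            fe = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(fe)
            chi2 += (fo - fe) ** 2 / fe
        expected.append(expected_row)

    df = (len(table) - 1) * (len(table[0]) - 1)   # (r - 1)(c - 1)
    return chi2, df, expected


# Hypothetical 2x2 table of observed counts
chi2, df, expected = chi_square_independence([[10, 20], [20, 10]])
print(chi2, df, expected)
```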

Week 10 Chapter 12: Correlation and Simple Regression Analysis

Relationship Between Variables
• Ex: relationship between cost and price, or between the profit of a hotel and the number of motels nearby, etc.

Y = Dependent
X = Independent

Linear Relationship

Positive: y tends to increase as x increases
Negative: y tends to decrease as x increases

Coefficient of Correlation "r"
• Measures the strength and direction of a linear relationship between two variables
• It ranges from -1 to 1; the closer to -1 or 1, the stronger the correlation, and the closer to 0, the weaker the correlation

$$r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}}$$

Coefficient of Determination "r²"
• Measures the strength of a linear relationship between two variables

$$r^2 = (r)^2$$

Examples of Approximate Values:

Example 1: r² = 1
• Unlikely to happen
• Perfect linear relationship between x and y

Examples 2 and 3: [figures]
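A minimal sketch of the r and r² formulas, using made-up data (the function name `correlation` is not from the course). A perfectly linear data set gives r = 1 or r = -1, which matches the "perfect linear relationship" case.

```python
from math import sqrt


def correlation(xs, ys):
    """Return (r, r squared) computed from SSxy, SSxx and SSyy."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    r = ss_xy / sqrt(ss_xx * ss_yy)
    return r, r ** 2


# Hypothetical data: y rises (or falls) exactly in step with x
r_pos, r2_pos = correlation([1, 2, 3, 4], [2, 4, 6, 8])
r_neg, r2_neg = correlation([1, 2, 3, 4], [8, 6, 4, 2])
print(r_pos, r2_pos, r_neg, r2_neg)
```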

Correlation and Causation:
• Correlation between variables does not imply that a change in one variable is the cause of a change in another variable
• Causation indicates that one event is the result of the occurrence of another event. There is a causal relationship between the two events, also referred to as cause and effect

Simple Linear Relationship

$$\hat{y} = b_0 + b_1 x \quad \text{(prediction equation)}$$

$\bar{y}$ = average of the dependent variable
$\bar{x}$ = average of the independent variable

$$b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{SS_{xy}}{SS_{xx}} \quad \text{(slope)}$$

$$b_0 = \bar{y} - b_1 \bar{x} \quad \text{(intercept)}$$

$$e = y - \hat{y} \quad \text{(residual: difference between observed and predicted)}$$

*In a question, the numbers may be under "Coefficients" in the tables.

Example 1

*** The intercept (b0 = 98.24833) is like a fixed cost, whereas the slope (b1 = 0.10977) is like a variable cost, as it depends on how big the house is
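The slope and intercept formulas, and the fixed-cost / variable-cost reading of the two coefficients, can be sketched like this. The helper names `fit_line` and `predict`, the small data set, and the house size of 1,500 are assumptions for illustration. Only the two coefficients come from the example.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0, slope b1 and residuals e = y - y_hat."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx               # slope
    b0 = y_bar - b1 * x_bar          # intercept
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    return b0, b1, residuals


def predict(b0, b1, x):
    """Predicted value: fixed part b0 plus variable part b1 * x."""
    return b0 + b1 * x


# Hypothetical data to exercise the formulas
b0, b1, residuals = fit_line([1, 2, 3], [2, 4, 7])
print(b0, b1, residuals)

# Coefficients from the example, applied to an assumed house size of 1,500
y_hat = predict(98.24833, 0.10977, 1500)
print(round(y_hat, 5))
```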

Example 2

We need to find $r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}}$

We know r = 0.75

*We want to prove the model is useful: it IS

• You use the ANOVA part of the table.
* If we have the "Significance F" given in the table, USE THAT
- If the Significance F is inside the rejection area (AKA smaller than the alpha / critical probability), you reject the hypothesis, proving that the model is useful.
* If we do not have the Significance F…
- MSR / MSE = calculated F
- Go to the F table to find the critical F
* If calculated is bigger than critical, you reject

*ALPHA
- F table: always alpha
- t table: alpha or alpha/2
*alpha/2 = when the hypothesis is that something does or doesn't equal a number (two-tailed)

Testing the Slope, b1
* We don't ever test the value of the intercept.
* Use the bottom part of the table -> slope = b1, standard error of the slope = Sb1
• We are testing that the slope is statistically not 0 (it can be statistically zero even if it appears to be a number close to 0)
• We use these hypotheses: H0: β1 = 0, H1: β1 ≠ 0
*This is two-tailed, so you must use alpha/2
• Calculated t = (b1 – β1) / Sb1 *sometimes the "t stat" will be given
• Critical t = use the t chart (using alpha/2 and n – 2)
• If the calculated |t| is > the critical t, you REJECT
* Watch for negatives

* "Confidence interval"
- This is a way to work out the alpha
- Ex: a 95% confidence interval means a 5% alpha, so 0.05
- Don't forget most times it is alpha/2, so it would actually be 0.025

Margin of Error / Confidence Interval ("Lower 95%" and "Upper 95%")
• Upper: b1 + (critical t value)(Sb1), with critical t = t a/2, n-2
• Lower: b1 – (critical t value)(Sb1)
*The term (critical t value)(Sb1) is the margin of error

Week 11 Chapter 12C: Correlation and Simple Regression Analysis

Point Estimate of y for x0

$$\hat{y} = b_0 + b_1 x_0$$

$$b_1 = \frac{SS_{xy}}{SS_{xx}}$$

$$b_0 = \bar{y} - b_1 \bar{x} \quad (\bar{x} = \text{average of all x values})$$

Confidence Interval for the Mean (the average of a group)

$$\hat{y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$

Prediction Interval for an Individual

$$\hat{y} \pm t_{\alpha/2,\,n-2}\, S_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$
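The two interval formulas differ only by the extra "1 +" under the square root, which is why the prediction interval for an individual is always wider. A minimal sketch, assuming the critical t value is read from the t table and passed in (the function name `regression_intervals` and all input numbers are hypothetical):

```python
from math import sqrt


def regression_intervals(y_hat, t_crit, s_e, n, x0, x_bar, ss_xx):
    """Confidence interval for the mean and prediction interval at x0."""
    core = 1 / n + (x0 - x_bar) ** 2 / ss_xx
    ci_margin = t_crit * s_e * sqrt(core)        # mean of a group
    pi_margin = t_crit * s_e * sqrt(1 + core)    # single individual
    ci = (y_hat - ci_margin, y_hat + ci_margin)
    pi = (y_hat - pi_margin, y_hat + pi_margin)
    return ci, pi


# Hypothetical inputs; t_crit would come from the t table (alpha/2, n - 2)
ci, pi = regression_intervals(y_hat=10.0, t_crit=2.0, s_e=1.0,
                              n=4, x0=5.0, x_bar=5.0, ss_xx=20.0)
print(ci, pi)
```

Both intervals get wider as x0 moves away from the average of the x values, because the (x0 – x̄)² term grows.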

Week 11 Chapter 13A: Multiple Regression

What is Multiple Regression
*This means there are multiple different slopes
*For example, rather than just comparing house size vs market price, we can compare house size vs age vs market price
• Things to consider when predicting energy consumption
- Past history
- Weather
- Day of the week
- Holidays
- Daylight hours
- Time of day
- Customer type
- Daylight savings time

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

The difference between what was expected and what was observed is called the residual or error (e):

$$e = y - \hat{y}$$

Example 1

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2$$

$$\hat{y} = 59.0848 + 0.0173 x_1 - 0.7713 x_2$$

x1 = 2,500
x2 = 12

ŷ = 59.0848 + 0.0173(2,500) – 0.7713(12) = 93.1363 thousand $
*With the rounded coefficients shown, the arithmetic gives 93.0792; the 93.1363 given likely comes from unrounded coefficients.

Testing the Overall Model
*Same as what we did prior

i = 1, 2, …, k, where k = number of slopes

F Table
• You use the ANOVA part of the table.
* If we have the "Significance F" given in the table, USE THAT
- If the Significance F is inside the rejection area (AKA smaller than the alpha / critical probability), you reject the hypothesis, proving that the model is useful.

Degrees of freedom:
- Regression: k
- Residual (error): n – k – 1
- Total: n – 1
*If k = 1 (like it would in a simple regression), the df values would be the same as before (1, n – 2, n – 1)!!!

Calculated F = MSR / MSE
Critical F = F a, k, n-k-1

MSR = SSR / k
MSE = SSE / (n – k – 1)
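The overall F calculation from the ANOVA part of a regression table can be sketched as below. The function name `overall_f` and the input numbers are made up. The critical F would still be looked up in the F table with alpha, k and n – k – 1.

```python
def overall_f(ssr, sse, n, k):
    """Calculated F and its degrees of freedom for the overall model test."""
    df_regression = k
    df_error = n - k - 1
    msr = ssr / df_regression
    mse = sse / df_error
    return msr / mse, df_regression, df_error


# Hypothetical sums of squares with n = 13 observations and k = 2 slopes
f_calc, df1, df2 = overall_f(ssr=100.0, sse=50.0, n=13, k=2)
print(f_calc, df1, df2)
```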

*We are testing… Size of house -> Age of house ->

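Test the Slope
• Calculated t = (b1 – β1) / Sb1 (given under "t Stat")
• Critical t = use the t chart (using alpha/2 and n – k – 1)
• If the calculated |t| is > the critical t, you REJECT

SSE and Standard Error of the Estimate

Se = standard error of the estimate

$$S_e = \sqrt{\frac{SSE}{n - k - 1}}$$

*With one slope (k = 1) this is the simple regression form, $\sqrt{SSE / (n - 2)}$.

R² Calculation

R² = SSR / SSyy
R² = 1 – (SSE / SSyy)

$$\text{Adjusted } R^2 = 1 - (1 - R^2)\left(\frac{n - 1}{n - k - 1}\right)$$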
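A short sketch of the three fit measures above. It assumes SSyy = SSR + SSE (which is what makes the two R² formulas agree) and uses invented numbers. The function name `fit_summary` is not from the course.

```python
from math import sqrt


def fit_summary(ssr, sse, n, k):
    """Standard error of the estimate, R squared and adjusted R squared."""
    ss_yy = ssr + sse                       # total variation (assumed identity)
    s_e = sqrt(sse / (n - k - 1))
    r2 = 1 - sse / ss_yy                    # same as ssr / ss_yy
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return s_e, r2, adj_r2


# Hypothetical sums of squares with n = 23 observations and k = 2 slopes
s_e, r2, adj_r2 = fit_summary(ssr=80.0, sse=20.0, n=23, k=2)
print(s_e, r2, adj_r2)
```

Adjusted R² is never larger than R², because (n – 1) / (n – k – 1) is at least 1.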

Week 12 Chapter 14: Multiple Regression cont.

Qualitative (dummy) Variables in Multiple Regression
• Yes or no variables
- Ex: on or off, wet or dry, etc.
- Coded as 1, 0

Example 1
Y = weekly sales
X1 = price
X2 = holiday, yes or no
- This is not a real (numeric) variable; it is either 1 or 0

*It is like having two regression lines...
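The "two regression lines" idea can be sketched with a dummy variable: the holiday coefficient shifts the intercept while the price slope stays the same. The coefficients below are invented purely for illustration and are not from the course example.

```python
def weekly_sales(price, holiday, b0, b1, b2):
    """Predicted weekly sales; holiday is coded 1 (yes) or 0 (no)."""
    return b0 + b1 * price + b2 * holiday


# Hypothetical coefficients: intercept, price slope, holiday shift
b0, b1, b2 = 100.0, -2.0, 15.0

for price in (10.0, 20.0):
    no_holiday = weekly_sales(price, 0, b0, b1, b2)   # line 1: intercept b0
    holiday = weekly_sales(price, 1, b0, b1, b2)      # line 2: intercept b0 + b2
    print(price, no_holiday, holiday, holiday - no_holiday)
```

At every price the gap between the two lines equals b2, so the lines are parallel.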