CH1 Anova notes - PPT PDF

Title CH1 Anova notes - PPT
Author Andrea Billings
Course Statistical Analysis
Institution Fashion Institute of Technology
Pages 31
File Size 1.1 MB
File Type PDF
Chapter 10

10-1

Statistics for Managers Using Microsoft® Excel

Chapter 1 Analysis of Variance

Statistics for Managers Using Microsoft Excel, 7e © 2014 Pearson Education, Inc.

Chap 10-1

Chapter Goals

After completing this chapter, you should be able to:

- Recognize situations in which to use analysis of variance (ANOVA)
- Understand different analysis of variance designs
- Evaluate assumptions of the model
- Perform a single-factor ANOVA and interpret the results
- Conduct and interpret a Tukey-Kramer post-analysis to determine which means are different
- Analyze two-factor analysis of variance tests
- Conduct and interpret a Tukey-Kramer post-analysis procedure to determine which factors are different

Statistics for Managers Using Microsoft Excel, 4/e

© 2004 Prentice Hall, Inc.

General ANOVA Analysis

- Investigator controls one or more factors of interest
  - Each factor contains two or more levels
  - Levels can be numerical or categorical
  - Different levels produce different groups
  - Think of each group as a sample from a different population
- Observe effects on the dependent variable
  - Are the groups the same?
- Experimental design: the plan used to collect the data


One-Factor ANOVA

- Also known as Completely Randomized Design and One-way ANOVA
- Experimental units (subjects) are assigned randomly to treatments
  - Subjects are assumed homogeneous
- Only one factor or independent variable
  - With three or more treatment levels
- Analyzed by one-factor analysis of variance


One-Factor Analysis of Variance

- Evaluates the difference among the means of three or more groups
- Examples:
  - Accident rates for 1st, 2nd, and 3rd shift
  - Expected mileage for five brands of tires

Assumptions

- Populations are normally distributed (test with box plot or normal probability plot)
- Populations have equal variances (use Levene's Test for Homogeneity of Variance)
- Samples are randomly and independently drawn


Why Analysis of Variance?

- We could compare the means in pairs using a t test for the difference of means
- Each t test carries its own Type I error
- With k pairwise tests, each at level $\alpha$, the total Type I error is $1 - (1 - \alpha)^k$ (treating the tests as independent)
- If there are 5 means and you use $\alpha = .05$:
  - Must perform 10 comparisons
  - Type I error is $1 - (.95)^{10} = .40$
  - 40% of the time you will reject the null hypothesis of equal means in favor of the alternative even when the null is true!
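The arithmetic on this slide can be checked with a short sketch. The function name `familywise_error` is hypothetical, and the formula treats the pairwise tests as independent, which is a simplifying assumption rather than an exact property of t tests that share data.

```python
from math import comb


def familywise_error(num_means: int, alpha: float) -> tuple[int, float]:
    """Return (number of pairwise tests, overall Type I error).

    Assumes every pair of means is tested once at level alpha and that
    the tests are independent (an approximation).
    """
    k = comb(num_means, 2)          # number of distinct pairs of means
    return k, 1 - (1 - alpha) ** k  # 1 - (1 - alpha)^k


k, fwe = familywise_error(5, 0.05)
print(k, round(fwe, 2))  # 10 comparisons, error rate of about 0.40
```

With only 3 means the same function gives 3 comparisons and an overall error of about 0.14, which shows how quickly the error grows as groups are added.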


Hypotheses of One-Factor ANOVA

- $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
  - All population means are equal
  - i.e., no treatment effect
- $H_1$: Not all of the population means are the same
  - At least one population mean is different
  - i.e., there is a treatment effect
  - Does not mean that all population means are different (some pairs may be the same)


One-Factor ANOVA

$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_c$
$H_1$: Not all $\mu_i$ are the same

All means are the same: the null hypothesis is true (no treatment effect)

[Figure: three group distributions sharing one center, $\mu_1 = \mu_2 = \mu_3$]

One-Factor ANOVA (continued)

At least one mean is different: the null hypothesis is NOT true (treatment effect is present)

[Figure: two example cases in which the distributions of groups 1, 2, and 3 do not all share the same mean]

Partitioning the Variation

Total variation can be split into two parts:

SST = SSA + SSW

- SST = Total Sum of Squares (Total variation)
- SSA = Sum of Squares Among Groups (Among-group variation)
- SSW = Sum of Squares Within Groups (Within-group variation)

Partitioning the Variation (continued)

SST = SSA + SSW

- Total Variation = the aggregate variation of the individual data values across the various factor levels (SST)
- Among-Group Variation = variation among the factor sample means (SSA)
- Within-Group Variation = variation that exists among the data values within a particular factor level (SSW)
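To make the partition concrete, here is a minimal sketch that computes the three sums of squares directly from their verbal definitions and checks that they add up. The `toy` data set and the function name `partition` are made up for illustration and are not part of the course material.

```python
from statistics import mean


def partition(groups):
    """Return (SST, SSA, SSW) for a list of samples, one list per group."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)  # grand mean of all data values
    # Total variation: every value against the grand mean
    sst = sum((x - grand) ** 2 for x in all_values)
    # Among-group variation: group means against the grand mean, weighted by n_j
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group variation: every value against its own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return sst, ssa, ssw


# Hypothetical data: three groups of three observations each
toy = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 6.0, 7.0]]
sst, ssa, ssw = partition(toy)
print(sst, ssa, ssw)  # SST should equal SSA + SSW
```

For this toy data the group means are 2, 4, and 6 around a grand mean of 4, so most of the total variation sits among the groups rather than within them.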


Partition of Total Variation

Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Error (SSW)

Total Sum of Squares

SST = SSA + SSW

$$SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$$

Where:

- SST = Total sum of squares
- c = number of groups or levels
- $n_j$ = number of observations in group j
- $X_{ij}$ = ith observation from group j
- $\bar{\bar{X}}$ = grand mean (mean of all data values)

Total Variation (continued)

$$SST = (X_{11} - \bar{\bar{X}})^2 + (X_{12} - \bar{\bar{X}})^2 + \cdots + (X_{n_c c} - \bar{\bar{X}})^2$$

[Figure: response X plotted for Group 1, Group 2, and Group 3, with the grand mean $\bar{\bar{X}}$ marked]

Among-Group Variation

SST = SSA + SSW

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Where:

- SSA = Sum of squares among groups
- c = number of groups
- $n_j$ = sample size from group j
- $\bar{X}_j$ = sample mean from group j
- $\bar{\bar{X}}$ = grand mean (mean of all data values)

Among-Group Variation (continued)

$$SSA = \sum_{j=1}^{c} n_j (\bar{X}_j - \bar{\bar{X}})^2$$

Variation Due to Differences Among Groups

$$MSA = \frac{SSA}{c - 1}$$

Mean Square Among = SSA/degrees of freedom

[Figure: distributions of two groups, i and j]

Among-Group Variation (continued)

$$SSA = n_1 (\bar{X}_1 - \bar{\bar{X}})^2 + n_2 (\bar{X}_2 - \bar{\bar{X}})^2 + \cdots + n_c (\bar{X}_c - \bar{\bar{X}})^2$$

[Figure: response X plotted for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]

Within-Group Variation

SST = SSA + SSW

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Where:

- SSW = Sum of squares within groups
- c = number of groups
- $n_j$ = sample size from group j
- $\bar{X}_j$ = sample mean from group j
- $X_{ij}$ = ith observation in group j

Within-Group Variation (continued)

$$SSW = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$$

Summing the variation within each group and then adding over all groups

$$MSW = \frac{SSW}{n - c}$$

Mean Square Within = SSW/degrees of freedom

[Figure: distribution of group j centered at $\mu_j$]

Within-Group Variation (continued)

$$SSW = (X_{11} - \bar{X}_1)^2 + (X_{12} - \bar{X}_2)^2 + \cdots + (X_{n_c c} - \bar{X}_c)^2$$

[Figure: response X plotted for Group 1, Group 2, and Group 3, with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ marked]

Obtaining the Mean Squares

The mean squares are obtained by dividing the various sums of squares by their associated degrees of freedom:

$$MSA = \frac{SSA}{c - 1}$$

Mean Square Among (d.f. = c - 1)

$$MSW = \frac{SSW}{n - c}$$

Mean Square Within (d.f. = n - c)

$$MST = \frac{SST}{n - 1}$$

Mean Square Total (d.f. = n - 1)

One-Way ANOVA Table

Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square (Variance) | F
Among Groups        | c - 1              | SSA            | MSA = SSA / (c - 1)    | FSTAT = MSA / MSW
Within Groups       | n - c              | SSW            | MSW = SSW / (n - c)    |
Total               | n - 1              | SST            |                        |

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

One-Way ANOVA F Test Statistic

$H_0: \mu_1 = \mu_2 = \cdots = \mu_c$
$H_1$: At least two population means are different

- Test statistic

  $$F_{STAT} = \frac{MSA}{MSW}$$

  - MSA is mean squares among groups
  - MSW is mean squares within groups
- Degrees of freedom
  - df1 = c - 1 (c = number of groups)
  - df2 = n - c (n = sum of sample sizes from all populations)

Interpreting One-Way ANOVA F Statistic

- The F statistic is the ratio of the among estimate of variance and the within estimate of variance
  - The ratio must always be positive
  - df1 = c - 1 will typically be small
  - df2 = n - c will typically be large

Decision Rule:

- Reject H0 if FSTAT > Fα, otherwise do not reject H0

[Figure: F distribution starting at 0; "Do not reject H0" region to the left of Fα, "Reject H0" region of area α to the right]

One-Factor ANOVA F Test Example

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean driving distance?

Club 1 | Club 2 | Club 3
254    | 234    | 200
263    | 218    | 222
241    | 235    | 197
237    | 227    | 206
251    | 216    | 204

One-Factor ANOVA Example: Scatter Diagram

[Figure: scatter diagram of Distance (190 to 270) against Club (1, 2, 3), with the group means $\bar{X}_1$, $\bar{X}_2$, $\bar{X}_3$ and the grand mean $\bar{\bar{X}}$ marked]

$\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$, $\bar{\bar{x}} = 227.0$

ANOVA -- Single Factor: Excel Output

EXCEL: Tools | Data Analysis | ANOVA: Single Factor

SUMMARY
Groups | Count | Sum  | Average | Variance
Club 1 | 5     | 1246 | 249.2   | 108.2
Club 2 | 5     | 1130 | 226     | 77.5
Club 3 | 5     | 1029 | 205.8   | 94.2

ANOVA
Source of Variation | SS     | df | MS     | F      | P-value  | F crit
Between Groups      | 4716.4 | 2  | 2358.2 | 25.275 | 4.99E-05 | 3.89
Within Groups       | 1119.6 | 12 | 93.3   |        |          |
Total               | 5836.0 | 14 |        |        |          |
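The Excel figures can be reproduced from the raw golf club data with a few lines of Python. This is a sketch using only the standard library; the helper name `one_way_anova` is hypothetical, and the critical value 3.89 is copied from the Excel output rather than computed, since the standard library has no F distribution.

```python
from statistics import mean

# Driving distances from the example, one list per club
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}


def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    values = [x for g in groups for x in g]
    n, c = len(values), len(groups)
    grand = mean(values)
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)  # among groups
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)    # within groups
    msa, msw = ssa / (c - 1), ssw / (n - c)
    return {"SSA": ssa, "SSW": ssw, "SST": ssa + ssw,
            "df1": c - 1, "df2": n - c,
            "MSA": msa, "MSW": msw, "F": msa / msw}


result = one_way_anova(list(clubs.values()))
F_CRIT = 3.89  # taken from the Excel output (alpha = .05, df = 2 and 12)
reject_h0 = result["F"] > F_CRIT

for name, value in result.items():
    print(f"{name}: {value:.3f}")
print("Reject H0:", reject_h0)
```

Excel labels the among-group row "Between Groups"; it corresponds to SSA and MSA in the notation of these notes.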


One-Factor ANOVA Example Solution

H0: μ1 = μ2 = μ3
H1: μi not all equal
α = .05
df1 = 2, df2 = 12

Test Statistic:

$$F_{STAT} = \frac{MSA}{MSW} = \frac{2358.2}{93.3} = 25.275$$

p-value: 4.99E-05

Decision: Reject H0 at α = 0.05

Conclusion: There is evidence that at least one μi differs from the rest

[Figure: F distribution starting at 0 with rejection region α = .05; F = 25.275 falls in the "Reject H0" region]

The Tukey-Kramer Procedure

- Tells which population means are significantly different
  - e.g.: μ1 = μ2 ≠ μ3
  - Done after rejection of equal means in ANOVA
- Allows pair-wise comparisons
  - Compare absolute mean differences with critical range

[Figure: distributions along x, with μ1 = μ2 sharing one center and μ3 at another]

Tukey-Kramer Critical Range

$$\text{Critical Range} = Q_\alpha \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)}$$

where:

- $Q_\alpha$ = Upper Tail Critical Value from Studentized Range Distribution with c and n - c degrees of freedom (see appendix E.7 table)
- MSW = Mean Square Within
- $n_j$ and $n_{j'}$ = Sample sizes from groups j and j'

The Tukey-Kramer Procedure: Example

Club 1 | Club 2 | Club 3
254    | 234    | 200
263    | 218    | 222
241    | 235    | 197
237    | 227    | 206
251    | 216    | 204

1. Compute absolute mean differences:
   $|\bar{x}_1 - \bar{x}_2| = |249.2 - 226.0| = 23.2$
   $|\bar{x}_1 - \bar{x}_3| = |249.2 - 205.8| = 43.4$
   $|\bar{x}_2 - \bar{x}_3| = |226.0 - 205.8| = 20.2$

2. Find a $Q_U$ value from the table in appendix E.7 with c = 3 (across the table) and n - c = 15 - 3 = 12 degrees of freedom (down the table) for the desired level of α (α = .05 used here): $Q_U = 3.77$

Get MSW from the ANOVA output.

The Tukey-Kramer Procedure: Example (continued)

3. Compute Critical Range:

$$\text{Critical Range} = Q_U \sqrt{\frac{MSW}{2} \left( \frac{1}{n_j} + \frac{1}{n_{j'}} \right)} = 3.77 \sqrt{\frac{93.3}{2} \left( \frac{1}{5} + \frac{1}{5} \right)} = 16.285$$

4. Compare:
   $|\bar{x}_1 - \bar{x}_2| = 23.2$
   $|\bar{x}_1 - \bar{x}_3| = 43.4$
   $|\bar{x}_2 - \bar{x}_3| = 20.2$

5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at 5% level of significance.

Thus, with 95% confidence we can conclude that the mean distance for club 1 is greater than club 2 and 3, and club 2 is greater than club 3.
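The comparison in steps 1 to 5 can be sketched in Python. The values of Q_U (3.77) and MSW (93.3) are taken from the example as given, not computed; the function names `critical_range` and `tukey_kramer` are hypothetical. Writing the critical range per pair keeps the sketch usable when group sizes differ, although here every group has 5 observations.

```python
from itertools import combinations
from math import sqrt

# Values taken from the worked example
means = {"Club 1": 249.2, "Club 2": 226.0, "Club 3": 205.8}
sizes = {"Club 1": 5, "Club 2": 5, "Club 3": 5}
MSW = 93.3   # mean square within, from the ANOVA output
Q_U = 3.77   # studentized range critical value for c = 3, n - c = 12, alpha = .05


def critical_range(q, msw, n_j, n_k):
    """Tukey-Kramer critical range for one pair of groups."""
    return q * sqrt((msw / 2) * (1 / n_j + 1 / n_k))


def tukey_kramer(means, sizes, msw, q):
    """Map each pair of groups to (|mean difference|, critical range, significant?)."""
    results = {}
    for a, b in combinations(means, 2):
        diff = abs(means[a] - means[b])
        cr = critical_range(q, msw, sizes[a], sizes[b])
        results[(a, b)] = (diff, cr, diff > cr)
    return results


comparisons = tukey_kramer(means, sizes, MSW, Q_U)
for (a, b), (diff, cr, significant) in comparisons.items():
    print(f"{a} vs {b}: |diff| = {diff:.1f}, critical range = {cr:.3f}, "
          f"significant = {significant}")
```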


ANOVA Assumptions

- Randomness and Independence
  - Select random samples from the c groups (or randomly assign the levels)
- Normality
  - The sample values for each group are from a normal population
- Homogeneity of Variance
  - All populations sampled from have the same variance
  - Can be tested with Levene's Test

ANOVA Assumptions: Levene's Test

- Tests the assumption that the variances of each population are equal.
- First, define the null and alternative hypotheses:
  - $H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_c^2$
  - $H_1$: Not all $\sigma_j^2$ are equal
- Second, compute the absolute value of the difference between each value and the median of each group.
- Third, perform a one-way ANOVA on these absolute differences.

Levene Homogeneity Of Variance Test Example

$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: Not all $\sigma_j^2$ are equal

Calculate Medians                  | Calculate Absolute Differences
Club 1 | Club 2 | Club 3           | Club 1 | Club 2 | Club 3
237    | 216    | 197              | 14     | 11     | 7
241    | 218    | 200              | 10     | 9      | 4
251    | 227    | 204   (Median)   | 0      | 0      | 0
254    | 234    | 206              | 3      | 7      | 2
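The three steps of Levene's test can be sketched on the golf club data as follows. This is a minimal standard-library illustration, not the textbook's Excel procedure; the helper names are hypothetical, and the critical value 3.89 is reused from the earlier Excel output because the degrees of freedom (2 and 12) are the same.

```python
from statistics import mean, median

# Driving distances from the golf club example, one list per club
clubs = [
    [254, 263, 241, 237, 251],
    [234, 218, 235, 227, 216],
    [200, 222, 197, 206, 204],
]


def abs_dev_from_median(group):
    """Absolute difference between each value and its group median."""
    m = median(group)
    return [abs(x - m) for x in group]


def f_statistic(groups):
    """One-way ANOVA F statistic (MSA / MSW) for a list of samples."""
    values = [x for g in groups for x in g]
    n, c = len(values), len(groups)
    grand = mean(values)
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssa / (c - 1)) / (ssw / (n - c))


# Step 2: absolute differences from each group's median
deviations = [abs_dev_from_median(g) for g in clubs]
# Step 3: one-way ANOVA on those absolute differences
f_levene = f_statistic(deviations)

F_CRIT = 3.89  # from the earlier Excel output (alpha = .05, df = 2 and 12)
print(deviations)
print(f"F = {f_levene:.4f}, reject equal variances: {f_levene > F_CRIT}")
```

Under these assumptions the F statistic is far below the critical value, so the sketch gives no evidence against equal variances for the three clubs.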

