
EDPY 605 – Section 2

Section 2: Multivariate Analysis of Variance
2.1 Introduction
2.2 Review of One-way ANOVA
2.3 One-Factor MANOVA: The Use of Four Multivariate Test Statistics
2.4 Assumptions and Data Considerations Underlying MANOVA
2.5 Robustness of the Multivariate Test Statistics
2.6 Following a Significant Multivariate Effect
2.7 MANCOVA


2.1 Introduction
• MANOVA: Multivariate Analysis of Variance
• MANOVA, like ANOVA, has one (or possibly several) categorical predictor variables, usually called factors.
• ANOVA includes only one quantitative outcome or dependent variable (Y).
• In MANOVA there are multiple dependent variables (Y1, Y2, …, Yp).


• What are the benefits of using a multivariate procedure compared with conducting several ANOVAs?
 - The overall Type I error rate can be controlled.
 - Univariate statistical tests ignore the intercorrelations among the dependent variables. As we will see, the multivariate procedure takes these intercorrelations into account by examining variance-covariance matrices.
 - MANOVA works best with highly negatively correlated dependent variables and acceptably well with moderately correlated dependent variables in either direction (about |.6|).


• There are some circumstances under which the multivariate procedure should not be used.
 - First, it should not be used if the dependent variables are uncorrelated.
 - When the dependent variables are uncorrelated, the only advantage of MANOVA over separate ANOVA tests on each dependent variable is its ability to control the family-wise Type I error rate.
 - However, this can easily be achieved by using more stringent alpha levels when conducting univariate tests on each dependent variable.
 - Given that the univariate procedure is less complicated and more interpretable, it is preferred.


- A multivariate test is also inappropriate when the dependent variables are very highly positively correlated.
 - Statistically, such correlations create a risk of multicollinearity, which can lead to improper results.
 - Conceptually, to the extent that variables are highly correlated, they can be said to measure the same construct and are therefore redundant.
 - Separate univariate tests are also misleading because they suggest effects on different behaviors when in fact only one behavior is being measured multiple times.
 - A better way to deal with this situation is to pick one dependent variable (the most reliable one) or create a composite score for use in a univariate analysis. A quick screen of the dependent-variable correlations (see the sketch below) can help with this decision.
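One practical way to act on these guidelines is to inspect the dependent-variable correlation matrix before choosing an analysis. The sketch below is a minimal illustration with hypothetical data; the array and the comments' suggestions are illustrative, not prescriptive:

```python
# Informal screen of DV intercorrelations (hypothetical data) before deciding
# between MANOVA, separate ANOVAs, or a single composite DV.
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(100, 3))          # 100 cases, 3 dependent variables

R = np.corrcoef(Y, rowvar=False)       # correlation matrix of the DVs
print(np.round(R, 2))
# Correlations near zero: separate univariate tests at a stricter alpha may be
# simpler and just as effective as MANOVA.
# Very high positive correlations: multicollinearity/redundancy risk; consider
# keeping the most reliable DV or forming a composite score instead.
```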


Activities: Design a MANOVA study. Your design should clearly describe the nature of the study (experimental vs. nonexperimental), the factors (what they are and how many levels each factor has), and the dependent measures (including the expected pattern of relationships among the dependent variables). What pattern of results would you predict, and what follow-up analyses do you think you would need?


2.2 Review of One-way ANOVA
• The statistical technique, analysis of variance (ANOVA), simultaneously tests the equality of any number (J) of means.

• The ANOVA answers the question: "Does at least one of the J means differ from one of the other means by more than what would be expected from sampling error?"
• In ANOVA, the null hypothesis is as follows: H0: μ1 = μ2 = μ3 = ⋯ = μJ


• One obvious characteristic of the data on the previous page is that scores are different.
• The total variability for the entire set of data can be found by combining all the scores from all the separate samples and calculating the variability of all the scores.
• The total variability can be divided into two basic components:
 1. Between-groups variance: measuring the differences between sample means. This variance is due to both treatment effect and chance.
 2. Within-groups variance: measuring variability within each sample. This variance is due to chance alone.


• After we have analyzed the total variability into its two basic components, we simply compare them by computing an F-ratio:

 F = (between-groups variance) / (within-groups variance)
   = (treatment effect + differences due to chance) / (differences due to chance)
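As a concrete illustration of this partition, the sketch below computes the between- and within-groups sums of squares by hand for three small hypothetical groups and checks the resulting F-ratio against scipy.stats.f_oneway (all data values are made up):

```python
# One-way ANOVA by hand (hypothetical data), checked against scipy.stats.f_oneway.
import numpy as np
from scipy import stats

groups = [np.array([3.0, 5.0, 4.0, 6.0]),    # three hypothetical treatment groups
          np.array([7.0, 8.0, 6.0, 9.0]),
          np.array([2.0, 3.0, 4.0, 3.0])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Between-groups SS: deviations of group means from the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-groups SS: deviations of scores from their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

F = (ss_between / df_between) / (ss_within / df_within)
print(F, stats.f.sf(F, df_between, df_within))   # F and its p-value
print(stats.f_oneway(*groups))                   # should agree
```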


• When the treatment has no effect, then the treatment variance is entirely due to chance.


- The numerator and the denominator of the F-ratio should be roughly the same size.
 - The F-ratio should have a value around 1.

• When the treatment does have an effect, then
 - The numerator of the F-ratio should be noticeably larger than the denominator.
 - The F-ratio should have a value significantly larger than 1.


2.3 One-Factor MANOVA: The Use of Four Multivariate Test Statistics
• The hypotheses to be tested in a one-way MANOVA are:
 H0: μ1 = μ2 = … = μJ
 H1: at least two population mean vectors are not identical
 where J is the number of groups.
• The logic of one-way MANOVA:
 1. Combine the dependent variables into a linear composite.
 2. Use the composite scores as the new dependent variable.
 3. Perform a one-way ANOVA on this new dependent variable.
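A minimal sketch of this logic with hypothetical data: it assumes that the composite weights which maximize the resulting ANOVA F are the leading eigenvector of W⁻¹B, where W and B are the within- and between-group SSCP matrices defined on the following slides, and that the resulting F equals the largest eigenvalue rescaled by the degrees of freedom:

```python
# Sketch (hypothetical data): the MANOVA composite that maximizes the ANOVA F.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
groups = [rng.normal(loc=m, size=(15, 2)) for m in (0.0, 0.5, 1.0)]
grand_mean = np.vstack(groups).mean(axis=0)

# Within- and between-group SSCP matrices (defined formally on the next slides)
W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)
B = sum(len(g) * np.outer(g.mean(axis=0) - grand_mean, g.mean(axis=0) - grand_mean)
        for g in groups)

# Assumed relationship: the maximizing weights are the leading eigenvector of W^-1 B
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(W, B))
a = eigvecs[:, np.argmax(eigvals.real)].real        # composite weights

F, _ = stats.f_oneway(*(g @ a for g in groups))     # ANOVA on the composite scores
df_between = len(groups) - 1
df_within = sum(len(g) for g in groups) - len(groups)
print(F, eigvals.real.max() * df_within / df_between)   # the two values should match
```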


• The weights used to combine the dependent variables are selected so as to produce the maximum possible F value from the ANOVA analysis.
• In one-way ANOVA, the partition of variance in the Y scores was as follows:
 SStotal = SSbetween + SSwithin
 SS = Sum of Squared deviations from a mean for the Y scores

• In one-way MANOVA, there are multiple dependent variables. The multivariate counterpart of SS is a matrix, called the sums-of-squares and cross-products (SSCP) matrix:

 SSCP = | SS1   CP12  ⋯  CP1p |
        | CP21  SS2   ⋯  CP2p |
        |  ⋮     ⋮    ⋱   ⋮   |
        | CPp1  CPp2  ⋯  SSp  |

 CPij = Sum of Cross-Products of the deviations of Yi and Yj from their means

 SSCPtotal = SSCPbetween + SSCPwithin
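The sketch below builds the total, between-, and within-group SSCP matrices for hypothetical two-variable data and verifies the partition above (group means and sample sizes are made up):

```python
# SSCP partition check: SSCP_total = SSCP_between + SSCP_within (hypothetical data).
import numpy as np

rng = np.random.default_rng(1)
# Three hypothetical groups, 10 subjects each, two dependent variables
groups = [rng.normal(loc=m, size=(10, 2)) for m in (0.0, 0.5, 1.0)]
grand_mean = np.vstack(groups).mean(axis=0)

def sscp(dev):
    """Sums of squares on the diagonal, cross-products off the diagonal."""
    return dev.T @ dev

sscp_total = sscp(np.vstack(groups) - grand_mean)
sscp_within = sum(sscp(g - g.mean(axis=0)) for g in groups)
sscp_between = sum(len(g) * np.outer(g.mean(axis=0) - grand_mean,
                                     g.mean(axis=0) - grand_mean) for g in groups)

print(np.allclose(sscp_total, sscp_between + sscp_within))   # True
```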


Discussion question: What information is provided by the elements of a within-group SSCP matrix and a between-group SSCP matrix?


• In ANOVA, sums of squares are divided by degrees of freedom to produce variances, which in turn form F-ratios. In MANOVA, the multivariate counterpart of a variance is a determinant. A determinant is found for each sums-of-squares and cross-products matrix, denoted by |SSCP|.

• To test for significance in MANOVA, ratios of determinants are formed when using Wilks' lambda:

 Λ = |SSCPwithin| / |SSCPbetween + SSCPwithin|
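A minimal numeric sketch of this ratio, using small hypothetical within- and between-group SSCP matrices W and B (the values are made up just to show the computation):

```python
# Wilks' lambda as a ratio of determinants (W, B are small made-up SSCP matrices).
import numpy as np

W = np.array([[48.0, 10.0],     # within-groups SSCP
              [10.0, 36.0]])
B = np.array([[20.0,  8.0],     # between-groups SSCP
              [ 8.0, 14.0]])

wilks_lambda = np.linalg.det(W) / np.linalg.det(W + B)
print(wilks_lambda)   # values near 1 favour H0; small values signal group differences
```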


• There are three other statistics that can also be used in MANOVA:
 1) Roy's largest root criterion
 2) Lawley-Hotelling's trace criterion
 3) Pillai's trace criterion

Once these statistics are calculated, they are converted into F statistics in order to test the null hypothesis. The reason for this conversion is the easy availability of published tables of the F distribution.

The important issue to recognize is that in some cases the F statistic is exact and in other cases it is approximate. Many statistical packages will inform you whether the F is exact or approximate.
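Under the hood, all four statistics are functions of the eigenvalues of W⁻¹B. The sketch below computes them from the same made-up SSCP matrices as above; the F conversions themselves are left to software (for example, statsmodels' MANOVA.mv_test reports each statistic with its exact or approximate F):

```python
# The four multivariate statistics as functions of the eigenvalues of W^-1 B
# (same made-up SSCP matrices as in the previous sketch).
import numpy as np

W = np.array([[48.0, 10.0],
              [10.0, 36.0]])
B = np.array([[20.0,  8.0],
              [ 8.0, 14.0]])

eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real

wilks     = np.prod(1.0 / (1.0 + eigvals))      # Wilks' lambda
pillai    = np.sum(eigvals / (1.0 + eigvals))   # Pillai's trace
hotelling = np.sum(eigvals)                     # Lawley-Hotelling trace
roy       = eigvals.max()                       # Roy's largest root (some software
                                                # reports roy / (1 + roy) instead)
print(wilks, pillai, hotelling, roy)
```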


Activity: Read the Ortner and Vormittag (2011) article and identify the following:
 - Research design
 - IVs and DVs
 - Analytic procedures
 - Results and interpretation
 - Strengths and weaknesses (if any)


2.4 Assumptions and Data Considerations Underlying MANOVA
• Independence
 - Subjects must be independent within each group.
 - Experimentally, independence of subjects is assumed if subjects have been randomly assigned to treatments or conditions of the study.
 - Problems sometimes occur when convenience samples are used, such as placing a class of students in one treatment level.
• Multivariate normality
 - Multivariate normality implies that the means of the various dependent variables for each group, and all linear combinations of them, are normally distributed.
 - The multivariate normality assumption has been found to have little or only mild influence on the actual Type I and Type II error rates. Therefore, MANOVA tests are fairly robust to violation of the normality assumption. An informal check is sketched below.
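One informal way to screen for gross violations of multivariate normality within a group is to compare squared Mahalanobis distances with chi-square quantiles. The sketch below uses hypothetical data; it is a rough diagnostic, not a formal test such as Mardia's:

```python
# Rough screen for multivariate normality within one group (hypothetical data):
# squared Mahalanobis distances should roughly follow a chi-square distribution
# with df = number of dependent variables.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
Y = rng.normal(size=(40, 3))                    # one group, three dependent variables

centered = Y - Y.mean(axis=0)
S_inv = np.linalg.inv(np.cov(Y, rowvar=False))  # inverse sample covariance
d2 = np.einsum("ij,jk,ik->i", centered, S_inv, centered)   # squared distances

# Compare sorted distances with chi-square quantiles (a Q-Q-plot style check)
q = stats.chi2.ppf((np.arange(1, len(d2) + 1) - 0.5) / len(d2), df=Y.shape[1])
print(np.corrcoef(np.sort(d2), q)[0, 1])        # close to 1.0 suggests no gross violation
```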


• Homogeneity of variance-covariance matrices
 - In MANOVA, the sums-of-squares and cross-products matrices are pooled across groups (cells), which reflects an underlying assumption that the within-group population variance-covariance matrices are identical.
 - The assumption of homogeneity of variance-covariance matrices among dependent variables across groups is the multivariate analog of the homogeneity of variance assumption in ANOVA.
 - In the univariate situation, σ1² = σ2² = ⋯ = σJ².

 - In the multivariate situation, Σ1 = Σ2 = ⋯ = ΣJ (the population variance-covariance matrices).


- Violation of the homogeneity of variance-covariance assumption produces only minor consequences when sample sizes are fairly equal.
 - However, when sample sizes are unequal, MANOVA tends to be problematic.
 - The actual probability of making a Type I error will be larger than the nominal level specified by the researcher.
 - The more dependent variables in the design and the greater the discrepancy in cell sample sizes, the greater the potential distortion of alpha levels.
 - Therefore, when sample sizes are unequal, the homogeneity of variance-covariance assumption should be tested. The standard vehicle for assessing this assumption is Box's M test, where statistical significance (e.g., p < .05) indicates violation of the assumption.
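For reference, the sketch below computes Box's M and its chi-square approximation by hand for hypothetical data. SPSS reports an F approximation instead, and some statistical packages provide the test directly; the formula and data here are only meant to show the mechanics:

```python
# Box's M with its chi-square approximation (hypothetical data; for real analyses
# use the test reported by your statistics package).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
groups = [rng.normal(size=(n, 2)) for n in (25, 30, 20)]   # 3 groups, 2 DVs

k, p = len(groups), groups[0].shape[1]
ns = np.array([len(g) for g in groups])
N = ns.sum()

covs = [np.cov(g, rowvar=False) for g in groups]           # unbiased group covariances
S_pooled = sum((n - 1) * S for n, S in zip(ns, covs)) / (N - k)

M = (N - k) * np.log(np.linalg.det(S_pooled)) \
    - sum((n - 1) * np.log(np.linalg.det(S)) for n, S in zip(ns, covs))

# Box's chi-square approximation
c = ((2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (k - 1))) \
    * (np.sum(1.0 / (ns - 1)) - 1.0 / (N - k))
chi2_stat = (1 - c) * M
df = p * (p + 1) * (k - 1) / 2
print(chi2_stat, stats.chi2.sf(chi2_stat, df))   # small p-value: heterogeneous matrices
```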


- However, using the Box's M test statistic to test for homogeneity of variance-covariance is somewhat problematic, for two reasons.
 - First, the test is sensitive to non-normality: if the normality assumption is violated, Box's M may be significant because of the non-normality and not because of a lack of homogeneity of variance-covariance.
 - Further, the degrees of freedom associated with this test are often quite large, leading to a significant finding even when the departures from homogeneity of variance-covariance are only minor or inconsequential.


- What should one do?
 - First, look at the value of F for the Box's M test statistic.
 - If it is nonsignificant, then it is okay to move to the next step.
 - If the test is significant but the value of the F-test statistic is small, then the departures from homogeneity of variance-covariance are only minor or inconsequential. In this case, proceed.
 - If the value of the F-test statistic is very large, the usual remedies are transformations of the dependent variables or random deletion of cases to equalize sample sizes.


2.5 Robustness of the Multivariate Test Statistics
• When the assumptions are not met, we need to consider the robustness of the four multivariate statistical tests with respect to the violations.
 - A statistical test is said to be robust if departures from the underlying assumptions of the test do not greatly affect its significance level or power.
 - The assumption of independence within groups is crucial. If this assumption is not met, it will be necessary to change the design to a nested design.


- Non-normality
 a. Skewness: little effect upon the four multivariate statistics. The nominal and actual levels of significance and of power will be close.
 b. Kurtosis: effects are generally mild unless the distribution of one or more of the dependent variables becomes markedly leptokurtic.
  - The effect is most severe for Roy's largest root; the actual probability of committing a Type I error is less than the nominal α.
  - Of the remaining three test statistics, Pillai's trace appears to be less affected; the actual probability of committing a Type I error using Pillai's trace is closer to the nominal level than for Wilks' lambda or Hotelling's trace.


 c. Regarding power, the power of all four test statistics is reduced, with the effect greater for Roy's largest root, especially when the eigenvalues are comparable in size. However, for the majority of variables in educational and social science research, markedly leptokurtic distributions are rarely found. Indeed, the degree of non-normality of most variables in education and the social sciences is such that the deleterious effects on α and power are minimal.


- Lack of homogeneity of variance-covariance
 a. When sample sizes are similar across groups, the four multivariate statistics are not much affected by a lack of homogeneity of variance-covariance.
 b. However, when sample sizes are unequal, failure to meet the homogeneity of variance-covariance assumption is more troublesome than failure to meet the normality assumption. As before, Roy's largest root seems to be the most affected, followed in turn by Wilks' lambda, Hotelling's trace, and Pillai's trace.


- The basic parameters of a study have some bearing on the robustness of the four MANOVA test statistics. These parameters include:
 a. The dimensionality of the dependent variables
  - Reducing the number of dependent variables somewhat reduces the effect of failing to meet the assumptions.
  - One way of reducing the number of dependent variables is to examine the full set for overlapping or redundant information. This can be done logically, and empirically using factor analysis.
  - By reducing the number of dependent variables, the interpretation of results also becomes simpler and more straightforward.


 b. The number of levels of the factor
  - The effect of failing to meet the assumptions is reduced somewhat if the number of levels of the factor is reduced.
  - The researcher should consider whether all of the groups to be compared are necessary, or whether some can be dropped or, if the treatments are not that different, combined.
 c. The size of the samples
  - A large sample size is also advantageous.
  - An exception to this generalization is when one or two of the groups have a much greater number of subjects than the other groups to be compared. In this case, smaller samples yield better results, with Pillai's trace the test statistic of choice.


 Interestingly and importantly, when the sample sizes are very large, and the condition of multivariate normality is essentially met, the four multivariate statistics become more and more equivalent as the sample sizes increase.


2.6 Following a Significant Multivariate Effect
• In one-factor MANOVA, a statistically significant multivariate effect tells us that the independent variable is associated with differences among the mean vectors of the dependent variables.
• The next step in the process is to discover which specific dependent variables are affected by the independent variable.
• There are several procedures to identify the dependent variables that contribute to the group differences:
 - Separate univariate tests
 - Roy-Bargmann stepdown analysis
 - Discriminant function analysis


• If a dependent variable is identified as significantly contributing to the group differences, a multiple comparison test can then be used to assess which of the groups differ significantly on that dependent variable.
• In EDPY 505, we introduced different multiple comparison procedures, such as the planned orthogonal method and Tukey's method of multiple comparison. These procedures can be used here.


• Separate univariate F tests
 - Perhaps the most popular procedure for following up multivariate significance is to conduct separate ANOVAs with a more stringent alpha level.
 - The common procedure for deciding the alpha level is to use a Bonferroni correction, where the new alpha level equals α/p (i.e., the omnibus alpha level divided by the number of dependent variables), as sketched below.

 - If this approach is taken, you might wonder why we bother to use MANOVA in the first place.
 - The idea is that the multivariate test controls inflated Type I error because, when the omnibus test is not significant, no further analysis is required.
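A minimal sketch of this follow-up step with hypothetical data and group labels: each dependent variable gets its own one-way ANOVA, judged against α/p rather than α:

```python
# Follow-up: separate one-way ANOVAs, each judged at alpha / p (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
alpha, p_dvs = 0.05, 3
# Three hypothetical groups, 20 subjects each, three dependent variables
data = {g: rng.normal(loc=shift, size=(20, p_dvs))
        for g, shift in zip(("A", "B", "C"), (0.0, 0.3, 0.8))}

adjusted_alpha = alpha / p_dvs                    # Bonferroni-adjusted alpha
for j in range(p_dvs):
    F, pval = stats.f_oneway(*(data[g][:, j] for g in data))
    print(f"DV{j + 1}: F = {F:.2f}, p = {pval:.4f}, "
          f"significant at {adjusted_alpha:.4f}: {pval < adjusted_alpha}")
```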


- However, an implicit assumption of using separate ANOVAs is that the dependent variables are uncorrelated, which contradicts the rationale for using MANOVA in the first place.
 - According to Tabachnick and Fidell (2013), if separate ANOVAs are reported after a significant MANOVA, the correlations among the dependent variables should also be reported so the reader can make the necessary interpretive adjustments.


• Roy-Bargmann stepdown analysis (Bock, 1966; Bock & Haggard, 1968)
 - The problem that separate univariate tests pose for correlated dependent variables can be resolved by stepdown analysis.
 - The procedure of stepdown analysis is as follows:
  - Priorities among the dependent variables are first assigned according to theoretical or practical considerations.
  - An ANOVA is then performed on the highest-priority dependent variable, with an appropriate adjustment of alpha.
  - Next, an ANCOVA is used to test the significance of the second highest-priority dependent variable after controlling for the effect of the highest-priority dependent variable.


  - Similarly, each of the remaining DVs is tested in an ANCOVA with all higher-priority dependent variables as covariates, as illustrated in the sketch below.

 - In each stepdown analysis, a nonsignificant F does not mean that the dependent variable is unaffected by the treatment.

 - Rather, it tells us that the dependent variable has no unique variability shared with the treatment after the variance overlapping with one or more higher-priority variables has been removed.
 - The use of stepdown analysis requires a compelling priority ordering among the dependent variables.
 - If there is no basis, or only a weak basis, for ordering the dependent variables, the interpretation of a stepdown analysis will be more difficult.
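A sketch of the first two stepdown tests using statsmodels, with hypothetical data and illustrative column names (y1 is assumed to be the highest-priority DV, y2 the next; in practice alpha would also be apportioned across the stepdown tests):

```python
# Roy-Bargmann stepdown sketch with statsmodels (hypothetical data; y1 is the
# highest-priority DV, y2 the next).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(5)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 20),
    "y1": rng.normal(size=60) + np.repeat([0.0, 0.4, 0.8], 20),
})
df["y2"] = 0.5 * df["y1"] + rng.normal(scale=0.8, size=60)

# Step 1: ANOVA on the highest-priority DV
step1 = smf.ols("y1 ~ C(group)", data=df).fit()
print(anova_lm(step1, typ=2))

# Step 2: ANCOVA on the next DV with the higher-priority DV as covariate;
# the C(group) row is the stepdown test of y2's unique contribution
step2 = smf.ols("y2 ~ y1 + C(group)", data=df).fit()
print(anova_lm(step2, typ=2))
```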


2.7 MANCOVA
• ANCOVA: evaluates whether means on the Y variable differ significantly across levels of the independent variable when the parts of the Y scores that are linearly predictable from one or more covariates have been partialled out, or statistically removed.
• The covariate variables must be measured before treatments are administered (to ensure that scores on the covariate are not influenced by the type or amount of treatment).
• Additional assumptions: the covariate should be linearly related to the dependent variable Y, and there should not be a treatment-by-covariate (A × Xc) interaction (see the sketch below).
• Partition of SS in one-way ANCOVA (the primes denote sums of squares adjusted for the covariate):

 SS′total = SS′between + SS′within
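A minimal one-way ANCOVA sketch with statsmodels, using hypothetical data and illustrative column names: it first checks the treatment-by-covariate interaction (homogeneity of regression slopes) and then fits the ANCOVA model whose group effect reflects the adjusted partition above:

```python
# One-way ANCOVA sketch with statsmodels (hypothetical data and column names).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(6)
df = pd.DataFrame({
    "group": np.repeat(["A", "B", "C"], 20),
    "xc": rng.normal(size=60),               # covariate measured before treatment
})
df["y"] = 0.6 * df["xc"] + np.repeat([0.0, 0.5, 1.0], 20) + rng.normal(size=60)

# 1) Check the no treatment-by-covariate interaction assumption:
#    the C(group):xc term should be nonsignificant
slopes = smf.ols("y ~ C(group) * xc", data=df).fit()
print(anova_lm(slopes, typ=2))

# 2) Fit the ANCOVA; the C(group) row tests adjusted (covariate-removed) group differences
ancova = smf.ols("y ~ C(group) + xc", data=df).fit()
print(anova_lm(ancova, typ=2))
```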
