Lecture 4 - Paired and Independent T Tests PDF

Title Lecture 4 - Paired and Independent T Tests
Author Sara Aasa
Course Biostatistics
Institution University of Sydney

PUBH5018 Introductory Biostatistics – Lecture 4


Lecture 4: Paired samples t-tests and Independent samples t-tests

Objectives

By the end of this topic, you should be able to:
1. Identify the appropriate statistical test to use for analysing a continuous variable when data come from either paired samples or two independent samples
2. Compare the means of paired samples of continuous data using the paired t-test
3. Compare the means of two independent samples of continuous data using the independent samples t-test
4. Calculate confidence intervals for the difference between two means

Background

In Lecture 3, we introduced the principles of hypothesis testing, and provided examples of a test for comparing the mean of a continuous variable from one sample to some hypothesised value. In this lecture, we extend the concept of hypothesis testing to situations where we have two sets of observations of a continuous variable, obtained either from repeated (paired) observations of the same individuals, or from observations of two independent (non-related) samples. In future lectures, we will demonstrate how the equivalent analyses can be conducted for binary data (proportions) and with distribution-free methods (non-parametric tests).

Because we will be developing an analysis framework that depends on i) the type of data to be analysed, and ii) the number of samples (one or two groups) to be compared, it is important that you are able to correctly identify these aspects from the study design and/or research question. Table 4.1 provides a brief overview of the appropriate statistical tests given the sample, the comparison to be made, and the type of outcome variable.

Table 4.1: Overview of statistical tests depending on the type of outcome variable, and type of comparison to be made.

Comparison: One sample compared to a fixed value
  Continuous variable:
    Population SD known: z-test
    Population SD unknown: One-sample t-test
    Non-parametric: One-sample Wilcoxon signed-rank test
  Binomial variable:
    Binomial test, or Normal approximation to the Binomial

Comparison: Two samples, paired samples
  Continuous variable:
    Paired samples t-test
    Non-parametric: Wilcoxon signed-rank test
  Binomial variable:
    McNemar's χ² test

Comparison: Two samples, independent samples
  Continuous variable:
    Independent (two) samples t-test
    Non-parametric: Mann-Whitney U test
  Binomial variable:
    Independent samples χ² test

Sydney School of Public Health

Semester 1, 2021
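The test-selection logic of Table 4.1 can also be written as a small lookup. The sketch below is only an illustration of how the table is read; the dictionary name, the key labels and the function `suggest_tests` are hypothetical and are not part of jamovi or the course materials.

```python
# Table 4.1 as a lookup: (comparison, outcome type) -> tests named in the table.
# The key labels and all names here are illustrative assumptions.
TABLE_4_1 = {
    ("one sample", "continuous"): [
        "z-test (population SD known)",
        "One-sample t-test (population SD unknown)",
        "One-sample Wilcoxon signed-rank test (non-parametric)",
    ],
    ("one sample", "binomial"): [
        "Binomial test",
        "Normal approximation to the Binomial",
    ],
    ("paired samples", "continuous"): [
        "Paired samples t-test",
        "Wilcoxon signed-rank test (non-parametric)",
    ],
    ("paired samples", "binomial"): ["McNemar's chi-squared test"],
    ("independent samples", "continuous"): [
        "Independent (two) samples t-test",
        "Mann-Whitney U test (non-parametric)",
    ],
    ("independent samples", "binomial"): ["Independent samples chi-squared test"],
}


def suggest_tests(comparison, outcome):
    """Return the tests that Table 4.1 lists for this design and outcome type."""
    return TABLE_4_1[(comparison, outcome)]


print(suggest_tests("paired samples", "continuous"))
```

Identifying the comparison and the outcome type first, and only then looking up the test, mirrors the order of reasoning recommended in the Background section.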


Comparison of Paired Means (Paired samples t-test)

A commonly used study design in public health and medical research involves taking pairs of observations on individuals at two distinct time points, e.g., before and after some intervention. To test the effectiveness of the intervention on a continuous variable $x$, say serum cholesterol, we measure cholesterol before the intervention, $x_1$, and then again afterwards, $x_2$, on the same individual. Another setting involving paired observations could be, for example, an ophthalmological trial where each eye of the patient is randomised to treatment or control and we then compare the effects.

The effect of the intervention is then measured by the difference, $d$, between the pairs of observations. As we have calculated a single difference value, $d$, for each pair of observations, we effectively now have just one sample of $n$ differences. The analysis can therefore be approached as a one-sample t-test (see Lecture 3), where the population standard deviation is estimated by the sample standard deviation of the differences, $s_d = SD(d)$, testing the null hypothesis that there is no effect of the intervention, i.e., the test value is $\mu_d = 0$. Let’s explore this with a simple example.

Note: when using statistical software, there is no need to calculate the difference between pairs. However, this is what the statistical software is doing in the background when performing calculations.

Example 4.1: A small study was conducted to assess the impact of short-term carcinogenic exposure on lung capacity in non-smokers. Ten males who did not usually smoke cigarettes were tested for maximum voluntary ventilation (MVV), measured in L/min, before and after a period of 15 minutes during which each smoked 3 cigarettes. The study design is visually represented in Figure 4.1 below.

[Figure 4.1 schematic: MVV (L/min) measured [MVV before] → Smoked 3 cigarettes in 15 minutes ["exposure"] → MVV (L/min) measured [MVV after]]

Figure 4.1: Schematic representation of the study design in Example 4.1. The same sample of participants has their MVV measured before, and then again after, smoking three cigarettes.

The results from the study are contained in the jamovi file MVV.omv and are reproduced below.

Maximum voluntary ventilation (MVV); L/min

Participant number (i)    MVV before (x_i1)    MVV after (x_i2)
 1                        151                  110
 2                        102                   89
 3                        144                  113
 4                        130                  100
 5                        107                  107
 6                        153                  113
 7                        149                  168
 8                        138                  112
 9                        131                  123
10                         96                  113

We are interested in finding out whether exposure to three cigarettes in a short period of time in non-smokers will impact their maximum voluntary ventilation (a measure of the maximum amount of air that can be inhaled and exhaled within one minute, where higher values are considered better). If smoking had no effect on MVV, then we would expect the differences to be zero on average. We can formulate a null hypothesis to this effect, then calculate the probability of the observed data, or more extreme data, occurring if this null hypothesis were true (the p-value).

Null hypothesis: There is no difference in the mean maximum voluntary ventilation before and after smoking three cigarettes.

(Equivalently, the null hypothesis could be stated as: the mean difference between MVV before and after smoking three cigarettes is equal to zero; or that smoking three cigarettes will have no effect on mean MVV.)

We begin by calculating the difference in MVV for each participant by taking the ‘MVV after’ measurement away from the ‘MVV before’ measurement ($d_i = x_{i1} - x_{i2}$). This is shown in the results table below:

Maximum voluntary ventilation (MVV); L/min

Participant number (i)    MVV before (x_i1)    MVV after (x_i2)    Difference (d_i = x_i1 − x_i2)
 1                        151                  110                  41
 2                        102                   89                  13
 3                        144                  113                  31
 4                        130                  100                  30
 5                        107                  107                   0
 6                        153                  113                  40
 7                        149                  168                 -19
 8                        138                  112                  26
 9                        131                  123                   8
10                         96                  113                 -17
Mean                      130.1                114.8                15.3
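The difference column and the row of means in the table above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic that statistical software performs in the background; the variable names are illustrative and do not come from the MVV.omv file.

```python
# MVV (L/min) for the ten participants in Example 4.1
mvv_before = [151, 102, 144, 130, 107, 153, 149, 138, 131, 96]
mvv_after = [110, 89, 113, 100, 107, 113, 168, 112, 123, 113]

# d_i = x_i1 - x_i2 (before minus after), one difference per participant
differences = [before - after for before, after in zip(mvv_before, mvv_after)]

# Column means, as in the final row of the table
mean_before = sum(mvv_before) / len(mvv_before)
mean_after = sum(mvv_after) / len(mvv_after)
mean_difference = sum(differences) / len(differences)

print(differences)
print(round(mean_before, 1), round(mean_after, 1), round(mean_difference, 1))
```

Note that the mean of the differences (15.3) equals the difference of the two column means (130.1 − 114.8), which is always the case for paired data.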

We then calculate the mean of the differences ($\bar{d}$), which serves as our point estimate, and the standard deviation of the differences ($s_d$). Using a calculator or statistical software, we can obtain these values:

$\bar{d} = 15.3$ L/min, $s_d = 22.01$ L/min
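As one way of obtaining these two values, the sketch below uses Python's standard `statistics` module. `statistics.stdev` computes the sample standard deviation (dividing by n − 1), which is the quantity needed here; the variable names are illustrative.

```python
import statistics

# Differences (MVV before minus MVV after) from Example 4.1
differences = [41, 13, 31, 30, 0, 40, -19, 26, 8, -17]

mean_d = statistics.mean(differences)   # point estimate of the mean difference
sd_d = statistics.stdev(differences)    # sample SD, with n - 1 in the denominator

print(round(mean_d, 1), round(sd_d, 2))
```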

To construct our test statistic, we also need to calculate the standard error of the mean of the differences, $SE(\bar{d})$, which is given by:

$$SE(\bar{d}) = \frac{s_d}{\sqrt{n}} = \frac{22.01}{\sqrt{10}} = 6.96$$

The test statistic for the paired samples t-test is given by:

Paired samples t-test:

$$t = \frac{\bar{d}}{SE(\bar{d})} \quad \text{with } n - 1 \text{ df}$$

Substituting the values in this example, we have:

$$t = \frac{15.3}{6.96} = 2.198 \quad \text{with 9 df}$$
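The standard error and the test statistic can be checked from the summary values quoted in the text. The function below is a minimal sketch; its name and signature are hypothetical and chosen only for this illustration.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n):
    """Return (standard error, t statistic, degrees of freedom) for a paired t-test."""
    se = sd_diff / math.sqrt(n)   # SE of the mean difference
    t_stat = mean_diff / se       # test value under the null hypothesis is 0
    return se, t_stat, n - 1


# Summary values from Example 4.1
se_d, t_paired, df_paired = paired_t_from_summary(15.3, 22.01, 10)
print(round(se_d, 2), round(t_paired, 3), df_paired)
```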

This quantity is referred to the t-distribution with 9 degrees of freedom, which has a two-tailed p-value of 0.056.
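To see where a two-tailed p-value comes from, the sketch below integrates the density of the t-distribution numerically using only the Python standard library. This is an illustrative calculation under the stated assumptions (Simpson's rule with an even number of steps), not the algorithm that jamovi or any other package uses, and the function names are hypothetical.

```python
import math


def t_density(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Area in both tails beyond |t_stat|, using Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    central_area = 2 * (h / 3) * total   # area between -|t| and +|t|, by symmetry
    return 1 - central_area


p_paired = two_tailed_p(2.198, 9)
print(round(p_paired, 3))
```

The result lies just above 0.05, in line with the p-value quoted for Example 4.1; the third decimal place may differ slightly from software output because the t statistic has been rounded.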


Finally, we can also calculate a confidence interval for the point estimate (i.e., the mean of the differences). Recall from Lecture 2 that the formula for the confidence interval depends on the point estimate, a multiplication constant (or critical value) for the desired level of confidence, and the standard error of the point estimate. For a 95% confidence interval using the t-distribution with $n - 1 = 9$ degrees of freedom, the multiplication constant is 2.26. Using the values for this example, the 95% confidence interval (CI) for the difference is:

$$95\%\ \text{CI} = \bar{d} \pm t_{n-1,\,0.05} \times SE(\bar{d}) = 15.3 \pm 2.26 \times 6.96 = -0.4 \text{ to } 31.0 \text{ L/min}$$

That is, while smoking may affect MVV, this small study does not provide strong evidence to make such a statement. Our best estimate of the effect is the point estimate of a reduction in MVV of 15.3 L/min. However, the 95% confidence interval indicates that smoking may actually increase MVV by 0.4 L/min or it may decrease MVV by as much as 31.0 L/min. Any value within this interval for the true change in MVV is consistent with the data.

Note: The 95% confidence interval also contains the value zero, consistent with the effect not being statistically significant at a 5% level.

Conclusion: This study of ten male non-smokers provides only weak statistical evidence of an effect of smoking on MVV (t(9)=2.20, p=0.056). Smoking three cigarettes in 15 minutes was found to reduce MVV by an average of 15.3 L/min. The mean MVV before smoking was 130.1 L/min, compared to 114.8 L/min afterwards. The 95% confidence interval of the change in mean MVV covered an increase of 0.4 L/min, to a decrease of 31.0 L/min.

Example 4.1 illustrates the importance of reporting significance levels (p-values) as precisely as possible. If the result had been reported simply as "P > 0.05" this would have obscured the fact that the p-value was only slightly greater than 0.05. When using statistical software to carry out such calculations, the exact value should be given (to no more than 2 significant figures).

This study design, using pairs of measurements on individuals, allows us to compare each participant with themselves. Any between-individual variability does not affect the results; only the within-individual variability matters. Thus, when there is much more variation between individuals' measurements than within an individual, this form of pairing provides a more powerful design than simply comparing two different groups.

Another way of achieving a paired design is to match participants for at least one variable which is thought to influence the response. Each member of the matched pair is then randomly allocated to a different treatment group and the differences between the responses for each matched pair are analysed. If the matching is effective, the effect of between-individual variability can be substantially reduced. Therefore, when matching is possible, it can improve the efficiency of a study. However, sometimes it may be difficult to match some individuals, especially if many variables are used for matching. More advanced approaches to matching do exist; however, these are beyond the scope of this introductory course.

Further reading

Kirkwood & Sterne (§7.6); Armitage, Berry & Matthews (§4.3)
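The 95% confidence interval for Example 4.1 can be reproduced from the three ingredients named in the text: the point estimate, the critical value and the standard error. The helper below is a hypothetical illustration; the critical value 2.26 is taken from the lecture rather than computed.

```python
def confidence_interval(estimate, critical_value, standard_error):
    """Return (lower, upper) limits: estimate ± critical value × standard error."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin


# Example 4.1: mean difference 15.3 L/min, t critical value 2.26 (9 df), SE 6.96
lower, upper = confidence_interval(15.3, 2.26, 6.96)
print(f"{lower:.1f} to {upper:.1f} L/min")   # prints "-0.4 to 31.0 L/min"

# The interval contains 0, consistent with p > 0.05
print(lower < 0 < upper)
```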


Comparison of Two Independent Means (Independent samples t-test)

Another common type of comparison made in health and medical research is between the means obtained from two independent (i.e., non-related) groups. These groups can occur by design, such as in a randomised controlled trial (RCT) where individuals are randomised to either an intervention or control group; or the groups may already exist without any experimental manipulation, such as in a cross-sectional study comparing individuals from two different occupational groups (e.g., office workers compared to labourers). Regardless of the study design, the main consideration here is that individuals appear in one and only one group and are therefore independent of each other.

Even when pairs of observations are being taken on the same individuals to measure change in a variable, it is usually necessary to have a control group for comparison. This can help in separating out the effect of an intervention from changes that occur due to the study design itself (e.g., placebo or practice effects). In this situation, we have two independent samples, this time with the outcome being change in the variable, which is measured in both the treatment and control groups and then compared.

In this section, we will consider the comparison of means from two independent samples when the outcome variable is continuous and assumed to be approximately Normally distributed.

Theoretical considerations

Unlike in the one sample t-test and paired samples t-test, we are now dealing with two unrelated means for a variable. In an independent samples t-test, we are interested in testing the equality of the means (estimated from the sample means), under the assumption that the variability in the two groups is the same (i.e., $\sigma_1 = \sigma_2 (= \sigma)$) and both groups are Normally distributed.

As with the one sample and paired samples t-tests, the population standard deviation ($\sigma$) is not usually known, and so needs to be estimated from the data. However, since we have two (independent) samples, we also now have two available estimates for the common standard deviation: the sample standard deviation from group 1 ($s_1$) and the sample standard deviation from group 2 ($s_2$). We can obtain a pooled (or combined) estimate of the standard deviation, $s$, as the square root of a weighted mean of the variances within the two samples:

$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

where $n_1$ is the sample size in group 1, and $n_2$ is the sample size in group 2.

From this, we can derive the standard error of the difference in the means, $SE(\bar{x}_1 - \bar{x}_2)$, which can be used in the construction of our test statistic:

$$SE(\bar{x}_1 - \bar{x}_2) = s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
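As a quick check on how the pooled standard deviation and the standard error behave, consider the special case of equal group sizes, $n_1 = n_2 = n$. This case is added here purely as an illustration and is not part of the original notes. The weights in the pooled estimate become equal, so the pooled variance is the simple average of the two sample variances:

```latex
s = \sqrt{\frac{(n-1)s_1^2 + (n-1)s_2^2}{2n - 2}}
  = \sqrt{\frac{s_1^2 + s_2^2}{2}},
\qquad
SE(\bar{x}_1 - \bar{x}_2) = s\sqrt{\frac{1}{n} + \frac{1}{n}}
  = \sqrt{\frac{s_1^2 + s_2^2}{n}}
```

With unequal group sizes, the larger group contributes more weight to the pooled estimate.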

While these equations can look quite intimidating, the good news is that they are calculated for you by statistical software packages. Let’s demonstrate this with an example.

Student’s Independent samples t-test

Example 4.2: As part of an investigation of factors underlying capacity for exercise, a group of elite athletes in training was compared with a group of factory workers of the same age range. Heart rate (beats per minute; bpm) at a given level of oxygen consumption was obtained during an exercise challenge. The research question was whether capacity for exercise (as indexed by heart rate) differed between factory workers and elite athletes. The study data have been provided in the jamovi dataset exercise.omv.

Null hypothesis: There is no difference in the mean heart rate between athletes and factory workers.

After stating our null hypothesis, we can calculate the sample statistics from the data:

Group                       Sample size    Mean (bpm)               SD (bpm)
Group 1: Factory Workers    $n_1 = 22$     $\bar{x}_1 = 122.455$    $s_1 = 16.186$
Group 2: Athletes           $n_2 = 18$     $\bar{x}_2 = 106.667$    $s_2 = 12.132$

Thus, $\bar{x}_1 - \bar{x}_2 = 122.455 - 106.667 = 15.788$ bpm.

Note: these values do not need to be derived separately from the test procedure in jamovi but are provided to demonstrate how the test statistics are constructed by the software package.

We can then calculate the pooled estimate of the standard deviation:

$$s = \sqrt{\frac{(22 - 1)\,16.19^2 + (18 - 1)\,12.13^2}{22 + 18 - 2}} = 14.515 \text{ bpm}$$

and the standard error of the difference in the means:

$$SE(\bar{x}_1 - \bar{x}_2) = 14.515\sqrt{\frac{1}{22} + \frac{1}{18}} = 4.613$$
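The pooled standard deviation and the standard error of the difference can be computed directly from the group sizes and standard deviations. The sketch below uses the rounded standard deviations from the worked substitution (16.19 and 12.13); the function names are hypothetical and chosen for this illustration only.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Square root of the variance average, weighted by each group's degrees of freedom."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance)


def se_difference(s, n1, n2):
    """Standard error of the difference between two independent means."""
    return s * math.sqrt(1 / n1 + 1 / n2)


# Example 4.2: 22 factory workers and 18 athletes
s_pooled = pooled_sd(22, 16.19, 18, 12.13)
se_diff = se_difference(s_pooled, 22, 18)
print(round(s_pooled, 3), round(se_diff, 3))
```

Using the unrounded standard deviations (16.186 and 12.132) changes the results only in the third decimal place.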

We now have everything we need to construct our test statistic. For an independent samples t-test, the test statistic is given by:

Independent samples t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)} \quad \text{with } n_1 + n_2 - 2 \text{ df}$$

Substituting the values from this example gives:

$$t = \frac{15.788}{4.613} = 3.42 \quad \text{with 38 df}$$

This quantity is referred to the t-distribution with $n_1 + n_2 - 2 = 38$ degrees of freedom, which has a two-tailed p-value of 0.001.

Finally, we can also calculate a confidence interval for the difference in means in the same way as before, by using the point estimate (the difference in sample means; 15.78), a multiplication constant (for a 95% confidence interval based on the t-distribution with $n_1 + n_2 - 2 = 38$ degrees of freedom, the critical value is 2.024), and the standard error of the point estimate (4.613). Using the values in this example, the 95% confidence interval is given by:

$$95\%\ \text{CI} = 15.78 \pm 2.024 \times 4.6131 = 6.4 \text{ to } 25.1 \text{ bpm}$$

Conclusion: There is strong evidence that athletes have a slower mean heart rate during exercise of 107 bpm compared with 123 bpm for factory workers (t(38)=3.42, p=0.001). The mean difference is estimated as 16 bpm, and we are 95% confident that the true (population) difference is between 6 and 25 bpm slower for athletes compared to factory workers.
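The test statistic and confidence interval for Example 4.2 can be checked with the sketch below. It assumes the difference in means (15.788 bpm), a standard error of 4.613 and the critical value 2.024 quoted in the text, and rounds the interval to whole beats as in the conclusion; the variable names are illustrative.

```python
# Summary values from Example 4.2 (taken from the text, not recomputed from raw data)
n1, n2 = 22, 18
difference = 15.788        # mean heart rate: factory workers minus athletes (bpm)
se_difference = 4.613      # standard error of the difference
critical_value = 2.024     # t critical value for a 95% CI with 38 df

df = n1 + n2 - 2
t_value = difference / se_difference
margin = critical_value * se_difference
ci_lower, ci_upper = difference - margin, difference + margin

print(f"t({df}) = {t_value:.2f}")                         # prints "t(38) = 3.42"
print(f"95% CI: {ci_lower:.0f} to {ci_upper:.0f} bpm")    # prints "95% CI: 6 to 25 bpm"
```

The first decimal place of the interval limits is sensitive to how the inputs were rounded, which is one reason for reporting heart rate in whole beats.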


Even though the conclusion for this hypothesis test is concise (only 2 sentences), it contains everything that we would expect to see based on the recommendations provided in Lecture 3:
• Strength of evidence statement (i.e., “strong evidence”)
• Statistics supporting this statement (i.e., t-value, degrees of freedom, and p-value)
• A point estimate of the effect (the difference in means of 16 bpm)
• A confidence interval for the point estimate (95% CI for the difference of 6 to 25 bpm)
• The direction of the effect is clear (i.e., “...athletes have a slower mean heart rate...”)
• The group means have been provided (107 bpm for athletes; 123 bpm for factory workers)
• Units of measurement have been included (heart rate, measured as beats per minute)
• Results are displayed with an appropriate degree of precision (heart rate is typically measured in “full” beats, so having integer values is appropriate, as would values to 1 decimal place)

Further reading

Bland 3rd & 4th ed. (§8.5, 10.3); Kirkwood & Sterne (§7.4); Armitage, Berry & Matthews (§4.3)


jamovi: Paired samples t-tests

Paired data in jamovi should be entered in two separate columns, as each participant should have a pair of observations. For Example 4.1 (data available in MVV.omv), the data should appear as follows:

[Screenshot of the jamovi spreadsheet view not reproduced]

To perform a paired samples t-test, select: Analyses ► T-Tests ► Paired Samples T-Test

Select the two variables to be analysed and click the arrow to move each variable to the Paired Variables area. The first variable moved across will appear on the left side of the Paired Variables box. If the variables are moved together, the variable which was clicked first appears on the left. This is important because jamovi will test whether the mean of the differences between the variables is different from 0, by subtracting the second variable from the first (e.g., MVV_before - MVV_after in this example). If the variables were switched around, the direction of the difference would be reversed (i.e., MVV_after - MVV_before).

You will also need to request Mean difference, Confidence interval and Descriptives under Additional Statistics.
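The effect of the variable order can be illustrated outside jamovi. The sketch below, using only the Python standard library, computes the paired t statistic for both orderings of the Example 4.1 data. The function `paired_t` is a hypothetical helper for this illustration and is not a jamovi function.

```python
import math
import statistics

# MVV (L/min) for the ten participants in Example 4.1
mvv_before = [151, 102, 144, 130, 107, 153, 149, 138, 131, 96]
mvv_after = [110, 89, 113, 100, 107, 113, 168, 112, 123, 113]


def paired_t(first, second):
    """t statistic for the mean of the paired differences (first minus second)."""
    d = [a - b for a, b in zip(first, second)]
    se = statistics.stdev(d) / math.sqrt(len(d))
    return statistics.mean(d) / se


t_before_minus_after = paired_t(mvv_before, mvv_after)
t_after_minus_before = paired_t(mvv_after, mvv_before)
print(round(t_before_minus_after, 3), round(t_after_minus_before, 3))
```

Swapping the order changes only the sign of the mean difference and of the t statistic; the size of the test statistic, and therefore the two-tailed p-value, is unchanged.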


[Screenshot of the Paired Samples T-Test dialog, with two annotations: "Select each variable and click the arrow to move them into the Paired Variables area" and "Select these checkboxes to generate descriptive statistics, mean difference, and a confidence interval for the mean difference."]

Paired Samples T-Test Output

[Values may differ slightly from the hand calculations due to loss of precision with rounding]
