Biostatistics 2nd Committee-Afrah s Notes PDF

Title Biostatistics 2nd Committee-Afrah s Notes
Course Medicine
Institution Hacettepe Üniversitesi
Pages 17


Biostatistics Lab 1/2017

Phase 1

COMMITTEE 2

Biostatistics Lab 1 Paired Samples t Test

We are going to learn how to use the “Paired Samples t Test” and the “Wilcoxon signed rank test”. There is NO grouping variable in these two tests. We have just two measures that are obtained from the same individuals, related units or objects. These measures are “dependent” measures.

What are these tests and when can we use them?

We use these two tests to compare paired samples. Let us take an example to understand what we mean by “paired samples”. Suppose that there is a cholesterol medicine and we want to know whether it has an effect on the Diastolic Blood Pressure (DBP) of the patients. We take 30 patients and measure their DBP before they use the cholesterol medicine. The patients then use the medicine for 6 months, and after those 6 months we measure the DBP of the same patients again. Thus, each of the 30 patients has two DBP values: one before taking the cholesterol medicine and one after taking it.

Generally, to see the effect of a treatment, the outcome of interest is measured before and after the treatment. Thus, the data consist of paired measurements recorded at two different times from the same individuals. Such data are called paired samples.

We use the “Paired Samples t Test” and the “Wilcoxon signed rank test” when we:
1. Measure a specific feature of the same individuals at two different times.
2. Measure a specific feature of the same individuals by two different measurement tools.
3. Measure a specific feature of the same individuals by two different observers.

In this lab, we are going to talk about the Paired Samples t Test. The Paired Samples t Test compares two means that are from the same individuals, objects, or related units.

Now, let us solve an example and understand more.

Afrah AL Balushi

Sultanate of Oman


Open “Data Sets” > 13-14-1.sav
Open “13-14-Hypothesis testing for two paired samples” PDF worksheet.

Go through example 1, which is: A clinician investigates whether the cholesterol medicine affects the diastolic blood pressure (DBP). He gathers the DBP values of 30 patients right before they begin to use the medicine and 6 months later.

Firstly, we must calculate the difference between the two values of DBP (Pre-DBP and Post-DBP). To do this, follow the 2nd step on page 1 of the worksheet. Note that the order of the two measures in the subtraction is not important for the test result, i.e. it is OK to do [(Pre_DBP) - (Post_DBP)] OR [(Post_DBP) - (Pre_DBP)]; only the sign of the difference changes.

Secondly, get the distribution of the difference and see whether it is normally distributed or not. (Follow the 3rd step in the worksheet.)

Note: When the “difference of the two values” is normally distributed (i.e. the p-value of the normality test is higher than 0.05), we use the Paired Samples t Test. On the other hand, when the “difference of the two values” is NOT normally distributed (i.e. the p-value is less than 0.05), we use the Wilcoxon signed rank test. Thus, the Paired Samples t Test, which deals with the mean, is a parametric test, and the Wilcoxon signed rank test, which deals with the median, is a non-parametric test.

Now, you will see that the p-value of the difference is 0.252, which is higher than 0.05. Therefore, we do not reject the null hypothesis of the normality test and say that: There is no difference between the distribution of “difference” and the normal distribution.

Important note: We do not need to examine the Homogeneity of Variances because we do not have different groups in this study. In other words, the two values (Pre-DBP and Post-DBP) are NOT independent; they are related to the same patients. Thus, we will not do the Homogeneity of Variances test.

Finally, as the difference is normally distributed, we can start our final step and use the Paired Samples t Test. Before we do the test, we have to construct the null hypothesis and the alternative hypothesis, which are:
H0: There is no difference between the mean values of Pre-DBP and Post-DBP.
H1: There is a difference between the mean values of Pre-DBP and Post-DBP.

Now, follow the 4th step in the worksheet to do the Paired Samples t Test. Note that the order of the two measures in the table of the test is not important, i.e. you can put the Post-DBP in the variable 1 box or in the variable 2 box and vice versa.

As you can see from the results, the p-value = 0.003, which is less than 0.05. Therefore, we reject the null hypothesis and say: There is a difference between the mean values of Pre-DBP and Post-DBP. Namely (as it is written in the worksheet): Since the p-value is less than 0.05 (p=0.003), there is a statistically significant difference between the mean values of Pre-DBP and Post-DBP.

We can see the mean values in the “Paired Samples Statistics” table. We can conclude that the mean Post-DBP is significantly greater than the mean Pre-DBP. After 6 months of using the cholesterol medicine, the mean DBP increases.

Now, practise with the exercises on the last page of your worksheet.
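To see what the Paired Samples t Test does with the “difference” variable, here is a minimal Python sketch. It is only an illustration, not the SPSS procedure: the DBP values are made up (they are NOT the worksheet data), and the function name paired_t is a hypothetical helper. It computes the t statistic from the differences and shows that swapping the order of subtraction only flips the sign of t.

```python
import math
import statistics

# Hypothetical DBP values (mmHg) for 6 patients; NOT the worksheet data.
pre_dbp = [80, 85, 78, 90, 82, 88]
post_dbp = [84, 86, 83, 93, 85, 95]


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


t_post_minus_pre, df = paired_t(post_dbp, pre_dbp)
t_pre_minus_post, _ = paired_t(pre_dbp, post_dbp)

print("t (Post - Pre):", t_post_minus_pre)
print("t (Pre - Post):", t_pre_minus_post)
print("degrees of freedom:", df)
```

The two t values have the same size and opposite signs, so the two-sided p-value is the same whichever order is used.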

Thank you and Good Luck ☺


Biostatistics Lab 2/2017


Biostatistics Lab 2 Wilcoxon signed rank test

Important Note: To understand this lesson, you MUST go through the first lab note, which is the one before this note.

In this lab we are going to learn how to do the Wilcoxon signed rank test, which is a non-parametric test that is used to compare the medians of paired samples.

Open “Data Sets” > 13-14-2.sav
Open “13-14-Hypothesis testing for two paired samples” PDF worksheet.

Go to page 3 of your worksheet > example 2, which is: A clinician wants to find out whether the postprandial blood glucose (PBG) changes after a 2-week diet in diabetic patients. PBG values of 12 patients are gathered.

Firstly, we must calculate the difference between the two values of PBG (Pre-PBG and Post-PBG). To do this, follow the 2nd step on page 1 of the worksheet. Note that the order of the two measures in the subtraction is not important, i.e. it is OK to do [(Pre_PBG) - (Post_PBG)] OR [(Post_PBG) - (Pre_PBG)].

Secondly, examine whether the “Difference” variable is normally distributed or not. (Follow the 3rd step of the first example on page 1 of the worksheet.)

Now, you will see that the p-value of the normality test for the difference is 0.004, which is less than 0.05. Therefore, we reject the null hypothesis of the normality test and say that: There is a difference between the distribution of “difference” and the normal distribution.


Finally, as the difference is not normally distributed, we cannot use the Paired Samples t Test. Instead, we are going to use the Wilcoxon signed rank test to compare Pre-PBG and Post-PBG values. Before we do the test, we have to construct the null hypothesis and the alternative hypothesis, which are:
H0: There is no difference between the median values of Pre_PBG and Post_PBG.
H1: There is a difference between the median values of Pre_PBG and Post_PBG.

Now, follow the 4th step in the worksheet (page 3) to do the Wilcoxon signed rank test.

As you can see from the results, the p-value = 0.017, which is less than 0.05. Therefore, we reject the null hypothesis and say: There is a statistically significant difference between the median values of Pre_PBG and Post_PBG.

Note that the output of the Wilcoxon signed rank test does not give us the descriptive statistics, such as the median, maximum and minimum. Therefore, we should know how to get them in order to make a conclusion about our data. (To do that, follow the steps mentioned on page 4 of the worksheet.)

From the table that we get, we can interpret (as it is written in the worksheet): The median PBG values are 234.5 mg/dl (min-max: 201-326) before the diet and 216 mg/dl (min-max: 132-256) after the diet. We can conclude that the median of Post-PBG is significantly less than the median of Pre-PBG. In other words, there is a difference between the median values of Pre_PBG and Post_PBG.
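The idea behind the Wilcoxon signed rank test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the PBG values are made up (NOT the worksheet data), signed_rank_sums is a hypothetical helper name, and the sketch stops at the rank sums; it does not compute the p-value that SPSS reports.

```python
# Hypothetical PBG values (mg/dl) for 8 patients; NOT the worksheet data.
pre_pbg = [240, 231, 260, 225, 310, 228, 245, 236]
post_pbg = [220, 234, 215, 218, 250, 229, 210, 226]


def signed_rank_sums(first, second):
    """Return (sum of positive ranks, sum of negative ranks) of first - second."""
    # Pairs with a zero difference carry no information and are dropped.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(diffs, key=abs)  # rank by absolute size of the difference
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Find the run of tied absolute differences starting at position i.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # tied values share the average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


w_plus, w_minus = signed_rank_sums(pre_pbg, post_pbg)
print("sum of positive ranks (Pre > Post):", w_plus)
print("sum of negative ranks (Pre < Post):", w_minus)
```

When one rank sum is much larger than the other, as here, most patients changed in the same direction; the test turns that imbalance into a p-value.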

Thank you and Good Luck ☺


Biostatistics Lab 3/2017


Biostatistics Lab 3 One-way ANOVA

As we learnt in the first committee, to compare two independent groups according to a continuous variable, we have to use one of two tests:
1. Independent Samples t Test (parametric test).
2. Mann-Whitney U test (non-parametric test).
Refer to LAB 8 and LAB 9 (first committee notes) for more details.

However, when we have more than two independent groups, we cannot use the tests mentioned above. This is because we would have to repeat the test for every pair of groups, and each extra test increases the chance of making a Type 1 error. Therefore, to control the Type 1 error, we use other tests to compare more than two independent groups according to a continuous variable. These tests are One-Way ANOVA and Kruskal-Wallis Analysis.

In this lab, we are going to talk about One-Way ANOVA, which is a parametric test. One-way ANOVA is used to compare more than two independent groups with respect to the means of a continuous variable. In other words, by using the One-way ANOVA test we can determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups.

Now, let us start to practise and understand more.

Open “Data Sets” > 15-16-ANOVA.sav
Worksheets > 15-16-Hypothesis testing for more than 2 independent samples.

Read the first example on page 1 of your worksheet. As you can see, in this example we are asked to analyze the difference between the ways of drug administration with respect to drug concentration.

Firstly, we have to check the assumptions of Normal Distribution and the Homogeneity of Variances.
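The point about Type 1 error increasing can be made concrete with a small calculation. The Python sketch below is an illustration only: familywise_error is a hypothetical helper name, and it assumes that the pairwise tests are independent of each other, which is a simplification (tests that share a group are not fully independent), so the numbers are approximate.

```python
from math import comb

ALPHA = 0.05  # significance level used for each single test


def familywise_error(groups, alpha=ALPHA):
    """Return (number of pairwise tests, chance of at least one Type 1 error).

    Assumes the pairwise tests are independent, which is only an approximation.
    """
    comparisons = comb(groups, 2)  # number of different pairs of groups
    return comparisons, 1 - (1 - alpha) ** comparisons


for k in (2, 3, 4):
    pairs, error = familywise_error(k)
    print(f"{k} groups: {pairs} pairwise tests, Type 1 error about {error:.3f}")
```

With two groups there is one test and the error stays at 0.05, but with more groups the number of pairwise tests grows quickly and so does the overall chance of a false positive. That is why a single overall test is done first.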


I am not going to go through the steps of these assumptions, as all of them are mentioned in the worksheet and in the previous lab notes as well. As you can see from the results, the assumptions of Normality and Homogeneity of Variances are supported. In other words, the p-value for the Normality assumption is higher than (0.05) and the p-value for the Homogeneity of Variances (Based on mean) is also higher than (0.05). Thus, according to those results, we can use the One-Way ANOVA test, which is a parametric test.

Before we start this test we have to construct the null and alternative hypotheses.
H0: There is no difference between the four groups with respect to the means of drug concentrations in blood.
H1: There is a difference between the four groups with respect to the means of drug concentrations in blood.

Now, follow the steps below:
Analyze > Compare Means > One-Way ANOVA > Dependent List: Drug_concentration > Factor: Group > OK

As you can see, the p-value that we get from the table is (0.00), which means that it is very small, not exactly zero, and less than (0.05). Thus, we reject the null hypothesis and say: There is a difference between the four groups with respect to the means of drug concentrations in blood.

However, the step above gives us only a general result. Thus, we can do another step to get a detailed comparison between all the groups. To do that, follow the steps below:
Analyze > Compare Means > One-Way ANOVA > Dependent List: Drug_concentration > Factor: Group > Click on: “Post Hoc..” > Select: “Tukey” > Continue > OK

As you can see from the table that we get from this step, there are differences between all pairs of groups (all p values ...

Biostatistics Lab 4 Kruskal-Wallis Analysis

Open “Data Sets” > 15-16-KW.sav
Worksheets > 15-16-Hypothesis testing for more than 2 independent samples.

Read the example on page 4 of your worksheet. As you can see, in this example we are asked to find out whether there is a difference between the stages of cancer with respect to the AgNor values.

Firstly, we have to check the assumptions of Normal Distribution and the Homogeneity of Variances. I am not going to go through the steps of these assumptions, as all of them are mentioned in the worksheet and in the previous lab notes as well. As you can see from the results, the normality assumption is not supported for stage I and stage III. On the other hand, the normality assumption is supported for stage II. In addition, the Homogeneity of Variances assumption (Based on mean) is supported. Note that “supported” means that the p-value is higher than (0.05) and “not supported” means that the p-value is less than (0.05).

As you can see, not ALL of the relevant p-values are higher than (0.05). Therefore, we cannot use the One-Way ANOVA test. Instead, we are going to use the non-parametric test, which is Kruskal-Wallis Analysis.

Before we start this test we have to construct the null hypothesis as well as the alternative one.
H0: There is no difference between the three groups with respect to the median AgNor value.
H1: At least one group is different from the others with respect to the median AgNor value.

Now follow the steps (shown in the last box) on page 5 of the worksheet. As you can see from the table that you get, the p-value (0.019) is less than (0.05).
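To show what the Kruskal-Wallis Analysis calculates, here is a minimal Python sketch. It rests on several assumptions: the AgNor values are made up (NOT the worksheet data), kruskal_wallis_h is a hypothetical helper name, the formula is the simple version without a correction for tied values, and the p-value uses the large-sample chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2).

```python
import math

# Hypothetical AgNor values for three stages; NOT the worksheet data.
agnor_by_stage = {
    "stage 1": [2.1, 2.4, 1.9, 2.8],
    "stage 2": [3.9, 4.2, 3.5, 4.8],
    "stage 3": [2.6, 3.1, 2.2, 3.3],
}


def kruskal_wallis_h(samples):
    """H statistic without tie correction (assumes all values are different)."""
    pooled = sorted(value for values in samples for value in values)
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}  # valid only without ties
    n_total = len(pooled)
    total = 0.0
    for values in samples:
        rank_sum = sum(rank_of[value] for value in values)
        total += rank_sum ** 2 / len(values)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


h_statistic = kruskal_wallis_h(list(agnor_by_stage.values()))
# Chi-square approximation with 2 degrees of freedom (three groups only).
p_value = math.exp(-h_statistic / 2)

print("H =", round(h_statistic, 3))
print("approximate p-value =", round(p_value, 4))
```

The test works on the ranks of the pooled values, not on the values themselves, which is why it does not need the normality assumption.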


Biostatistics Lab 4/2017


In this case we reject the null hypothesis and say: At least one group is different from the others with respect to the median AgNor value.

Now, we need to determine exactly which groups are different. To do that we need the “post-hoc test results”. To get them, please watch the video attached to this LAB note or follow the steps in the worksheet, page 6. You are going to get a table titled “Pairwise Comparisons of Stage”. From this table you can interpret, as it is written in the worksheet: There is no significant difference between the median AgNor values of stage 1 and stage 3 (p=1.000), or between the median AgNor values of stage 2 and stage 3 (p=0.061). The median AgNor value of stage 1 differs from that of stage 2 (p=0.032). It is important that you know how to interpret the data provided in the table.

You should also know how to get the median, minimum and maximum AgNor values of each group. To do that, follow the last step on page 6 of your worksheet. Actually, we have already learnt how to do this in the previous labs. As you can see from the results, the median AgNor value of stage 2 is greater than that of stage 1. That is consistent with the significant p-value (0.032) for the stage 1 - stage 2 comparison.

Important note: I have covered the important points in this lab. However, you need to refer to the worksheet as it has more details. Now, go through the examples and start to practise.
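The descriptive statistics mentioned above (median, minimum and maximum of each group) are simple to compute by hand or in code. The Python sketch below is only an illustration: the AgNor values are made up (NOT the worksheet data) and describe is a hypothetical helper name.

```python
import statistics

# Hypothetical AgNor values for three stages; NOT the worksheet data.
agnor_by_stage = {
    "stage 1": [2.1, 2.4, 1.9, 2.8],
    "stage 2": [3.9, 4.2, 3.5, 4.8],
    "stage 3": [2.6, 3.1, 2.2, 3.3],
}


def describe(values):
    """Return (median, minimum, maximum) of one group."""
    return statistics.median(values), min(values), max(values)


summary = {stage: describe(values) for stage, values in agnor_by_stage.items()}

for stage, (median, lowest, highest) in summary.items():
    print(f"{stage}: median {median:.2f} (min-max: {lowest}-{highest})")
```

Reporting the medians next to the p-values tells the reader the direction of a difference, which the pairwise p-values alone do not show.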

Thank you and Good Luck ☺


Biostatistics Lab 5/2017


Biostatistics Lab 5 Chi-Square Test

The Chi-Square Test is used to determine whether there is a significant relationship between two categorical variables. In other words, the main aim of the Chi-Square Test is to check the relationship between two categorical variables.

Now, let us do an example:

Open “Data Sets” > 17-.sav
Worksheets > 17-Chi Square.

Read the first example on page 1 of your worksheet. As you can see, in this example we are asked to identify whether there is a relationship (association) between taenia presence and the environmental health conditions. You can also see that there is a table provided in this question. It is important that you know that the name of this table is “Crosstabs”.

Be careful: we cannot give a conclusion about the relationship between taenia presence and the environmental health conditions just by looking at the data in the crosstabs. Instead, we use the crosstabs as the data from which we test the relationship between these two categorical variables.

A crosstabs has a size. What is the size of this crosstabs? It is 2x2, with 4 cells. We have 2 variables and each one of them has 2 categories (sub-variables). The first variable is the “environmental health conditions”, which has 2 categories: good and bad. The second variable is the “presence of taenia”, which has 2 categories: present and absent. 2x2 = 4 cells.


Note that we are going to be asked about the 2x2 crosstabs in the exam. There will not be any other size.

Now, let us go back to our example. You have to know that in these kinds of questions you will not get the data ready in SPSS. You have to enter them yourself using the data from the crosstabs provided in the question. You must know how to enter them correctly. To do that:

Firstly, enter the data as shown in the picture in step number one of the worksheet, page 1.

Secondly, make the labels as explained in the same step (step number one). We have already learnt how to do the labels in the last committee.

After you have entered your data correctly, you must do an important step, which is introducing the cell frequency. That can be done by the steps below:
Data > Weight Cases > Activate “Weight cases by” and put the “frequency” variable into the Frequency Variable box > OK

Now, you should perform a checking step to be sure that you have done everything correctly so far. This step is as follows:
Analyze > Descriptive Statistics > Crosstabs > Put the “environmental_health” variable into the Row(s) and the “presence_taenia” variable into the Column(s) > OK

Note that it is not important whether you put the “environmental_health” variable in the Row(s) or the Column(s), and of course the same is true for the “presence_taenia” variable. However, it is better to place them according to their locations in the crosstabs provided in the question, so that you get a table that is identical to the crosstabs of the question. In this example, the “environmental_health” categories are in the rows and the “presence_taenia” categories are in the columns.

After this step, you are going to get a crosstabs that is identical to the one given in the question. In this way, you can be sure that the steps you have done so far are accurate. Therefore, we can now start to do the Chi-Square Test. Follow the third step on the first page of your worksheet.
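The data entry, the “Weight Cases” step and the Chi-Square Test itself can be sketched in Python as follows. This is only an illustration under stated assumptions: the cell frequencies are made up (NOT the crosstabs of the worksheet), the variable names are hypothetical, and the statistic is the plain Pearson chi-square without continuity correction. For a 2x2 table (1 degree of freedom) the p-value can be written with the complementary error function.

```python
import math
from collections import Counter

# Hypothetical cell frequencies; NOT the crosstabs given in the worksheet.
# Each row mirrors one line of the data sheet: two categories plus a frequency.
rows = [
    ("good", "present", 10),
    ("good", "absent", 40),
    ("bad", "present", 30),
    ("bad", "absent", 20),
]

# "Weight cases": every row counts as `frequency` individuals, not as one.
crosstab = Counter()
for env_health, taenia, frequency in rows:
    crosstab[(env_health, taenia)] += frequency

# Row totals, column totals and the grand total of the crosstabs.
row_totals = Counter()
col_totals = Counter()
for (env_health, taenia), count in crosstab.items():
    row_totals[env_health] += count
    col_totals[taenia] += count
n = sum(crosstab.values())

# Pearson chi-square: compare observed counts with the counts expected
# if the two variables were not related.
chi_square = 0.0
for env_health in row_totals:
    for taenia in col_totals:
        expected = row_totals[env_health] * col_totals[taenia] / n
        observed = crosstab[(env_health, taenia)]
        chi_square += (observed - expected) ** 2 / expected

# Upper tail of the chi-square distribution with 1 degree of freedom (2x2 only).
p_value = math.erfc(math.sqrt(chi_square / 2))

print("total number of cases:", n)
print("chi-square =", round(chi_square, 3))
print("p-value =", p_value)
```

Without the weighting step, the four rows would be treated as four individuals, so the check against the original crosstabs is worth doing every time.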


Now, our hypotheses will be:
H0: There is no significant relationship between the presence of taenia and the environmental health conditions.
H1: There is a significant relationship between the presence of taenia and the environmental health conditions.

Or, as it is written in the worksheet:
H0: The association between the environmental health conditions and the presence of taenia is not significant.
H1: The association between the environmental health conditions and the presence of taenia is significant.

Now, after you have done the last step, you are going to get more than one table. One of them is the table titled “Chi-Square Te...

