Title | Independent Project |
---|---|
Course | Intro to Biomedical Statistics |
Institution | National University (US) |
Pages | 8 |
Graded independent project - got a 16/20...
16 of 20
Eric Sweeten
BST 322
5/31/2015
Independent Project
DUE DATE: End of Course. 20 points.
Using the data set supplied in Doc Sharing and StatCrunch, provide the following for the variable(s) of your choice:
1. Frequency distribution of a variable and bar graph of the same variable
Frequency table results for Age: Count = 972

Age | Frequency | Relative Frequency |
---|---|---|
15 to 20 | 1 | 0.0010288066 |
20 to 25 | 38 | 0.03909465 |
25 to 30 | 109 | 0.11213992 |
30 to 35 | 165 | 0.16975309 |
35 to 40 | 315 | 0.32407407 |
40 to 45 | 243 | 0.25 |
45 to 50 | 101 | 0.10390947 |
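The relative frequencies in the table are each bin's count divided by the total count of 972. A minimal Python sketch of that arithmetic, using the counts from the frequency table (the variable names are illustrative and not part of the StatCrunch output):

```python
# Age-bin counts copied from the frequency table for Age.
age_counts = {
    "15 to 20": 1,
    "20 to 25": 38,
    "25 to 30": 109,
    "30 to 35": 165,
    "35 to 40": 315,
    "40 to 45": 243,
    "45 to 50": 101,
}

# Relative frequency = count in the bin / total count.
total = sum(age_counts.values())
relative = {age_bin: count / total for age_bin, count in age_counts.items()}

for age_bin, count in age_counts.items():
    print(f"{age_bin:>8}  {count:>4}  {relative[age_bin]:.8f}")
print("Total count:", total)
```

The relative frequencies should sum to 1, which is a quick way to check that no bin was dropped.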
StatCrunch gave a small image of the bar graph, so I enlarged it to twice its size using Paint.net.
2. Descriptives of a continuous variable: mean, median, skewness, kurtosis, standard deviation, and a graph of that variable
Summary statistics:

Column | Mean | Median | Skewness | Kurtosis | Std. dev. |
---|---|---|---|---|---|
BMI | 29.224268 | 28.03 | 0.97395123 | 1.0616977 | 7.397582 |
I chose BMI because it is a continuous variable, which makes a histogram the appropriate graph. The mean, median, skewness, kurtosis, and standard deviation are all highlighted in the table above the histogram.
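For BMI the mean (29.22) is above the median (28.03) and the skewness is positive, which is the usual pattern for a right-skewed variable. A minimal Python sketch of how skewness and kurtosis can be computed from central moments follows. It makes two assumptions: the moment-based (unadjusted) formulas are used, which may differ slightly from the formulas StatCrunch applies, and `hypothetical_bmi` is made-up illustration data, not the course data set.

```python
import statistics


def central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def moment_skewness(values):
    """Skewness as m3 / m2^1.5 (positive for a longer right tail)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Kurtosis as m4 / m2^2 - 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical right-skewed sample, for illustration only.
hypothetical_bmi = [20.1, 22.4, 24.0, 25.3, 27.8, 30.2, 45.6]

print("mean    :", round(statistics.mean(hypothetical_bmi), 2))
print("median  :", statistics.median(hypothetical_bmi))
print("skewness:", round(moment_skewness(hypothetical_bmi), 3))
print("kurtosis:", round(excess_kurtosis(hypothetical_bmi), 3))
```

In the made-up sample, one large value pulls the mean above the median and makes the skewness positive, the same direction as in the BMI summary.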
3. Cross tabulation of two variables
Grader's comment: You should not run a cross tabulation with 2 ratio variables. It is very hard to interpret these results. A scatterplot of these two variables would be more appropriate.
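The expected counts and the chi-square statistic for this task's Age by Kids cross tabulation can be recomputed from the observed counts alone: each expected count is (row total × column total) / 972, and the statistic sums (observed − expected)² / expected over all cells. A minimal Python sketch, assuming the cell counts as read from the StatCrunch contingency table in this section (the function name `chi_square` and the sparse-cell check are illustrative additions, not StatCrunch output):

```python
# Observed counts: rows are Age groups, columns are Kids = 0..10.
observed = [
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],         # 15 to 20
    [2, 14, 14, 7, 1, 0, 0, 0, 0, 0, 0],       # 20 to 25
    [2, 17, 30, 30, 19, 9, 1, 0, 0, 0, 1],     # 25 to 30
    [7, 29, 36, 33, 35, 13, 4, 6, 2, 0, 0],    # 30 to 35
    [20, 47, 82, 79, 46, 23, 11, 3, 3, 1, 0],  # 35 to 40
    [17, 59, 88, 33, 31, 8, 4, 2, 1, 0, 0],    # 40 to 45
    [13, 40, 33, 10, 4, 0, 1, 0, 0, 0, 0],     # 45 to 50
]


def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


stat, df, expected = chi_square(observed)
# Cells with small expected counts make the test harder to trust.
sparse_cells = sum(1 for row in expected for e in row if e < 5)

print(f"chi-square = {stat:.5f}, df = {df}")
print(f"cells with expected count below 5: {sparse_cells} of {7 * 11}")
```

Many cells in an 11-column table like this have very small expected counts, which is one concrete reason a cross tabulation of two ratio variables is hard to interpret.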
Contingency table results:
Rows: Age
Columns: Kids
Cell format: Count (Percent of total) (Expected count)

Age | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|
15 to 20 | 0 (0%) (0.06) | 0 (0%) (0.21) | 0 (0%) (0.29) | 1 (0.1%) (0.2) | 0 (0%) (0.14) | 0 (0%) (0.05) | 0 (0%) (0.02) | 0 (0%) (0.01) | 0 (0%) (0.01) | 0 (0%) (0) | 0 (0%) (0) | 1 (0.1%) |
20 to 25 | 2 (0.21%) (2.38) | 14 (1.44%) (8.05) | 14 (1.44%) (11.06) | 7 (0.72%) (7.55) | 1 (0.1%) (5.32) | 0 (0%) (2.07) | 0 (0%) (0.82) | 0 (0%) (0.43) | 0 (0%) (0.23) | 0 (0%) (0.04) | 0 (0%) (0.04) | 38 (3.91%) |
25 to 30 | 2 (0.21%) (6.84) | 17 (1.75%) (23.1) | 30 (3.09%) (31.74) | 30 (3.09%) (21.64) | 19 (1.95%) (15.25) | 9 (0.93%) (5.94) | 1 (0.1%) (2.35) | 0 (0%) (1.23) | 0 (0%) (0.67) | 0 (0%) (0.11) | 1 (0.1%) (0.11) | 109 (11.21%) |
30 to 35 | 7 (0.72%) (10.35) | 29 (2.98%) (34.97) | 36 (3.7%) (48.04) | 33 (3.4%) (32.76) | 35 (3.6%) (23.09) | 13 (1.34%) (9) | 4 (0.41%) (3.56) | 6 (0.62%) (1.87) | 2 (0.21%) (1.02) | 0 (0%) (0.17) | 0 (0%) (0.17) | 165 (16.98%) |
35 to 40 | 20 (2.06%) (19.77) | 47 (4.84%) (66.76) | 82 (8.44%) (91.71) | 79 (8.13%) (62.55) | 46 (4.73%) (44.07) | 23 (2.37%) (17.18) | 11 (1.13%) (6.81) | 3 (0.31%) (3.56) | 3 (0.31%) (1.94) | 1 (0.1%) (0.32) | 0 (0%) (0.32) | 315 (32.41%) |
40 to 45 | 17 (1.75%) (15.25) | 59 (6.07%) (51.5) | 88 (9.05%) (70.75) | 33 (3.4%) (48.25) | 31 (3.19%) (34) | 8 (0.82%) (13.25) | 4 (0.41%) (5.25) | 2 (0.21%) (2.75) | 1 (0.1%) (1.5) | 0 (0%) (0.25) | 0 (0%) (0.25) | 243 (25%) |
45 to 50 | 13 (1.34%) (6.34) | 40 (4.12%) (21.41) | 33 (3.4%) (29.41) | 10 (1.03%) (20.05) | 4 (0.41%) (14.13) | 0 (0%) (5.51) | 1 (0.1%) (2.18) | 0 (0%) (1.14) | 0 (0%) (0.62) | 0 (0%) (0.1) | 0 (0%) (0.1) | 101 (10.39%) |
Total | 61 (6.28%) | 206 (21.19%) | 283 (29.12%) | 193 (19.86%) | 136 (13.99%) | 53 (5.45%) | 21 (2.16%) | 11 (1.13%) | 6 (0.62%) | 1 (0.1%) | 1 (0.1%) | 972 (100%) |

Chi-Square test:

Statistic | DF | Value | P-value |
---|---|---|---|
Chi-square | 60 | 136.76828 | |

[...] 0.05. This means the number of kids one has had does not significantly affect BMI.
0.0022 < 0.05. This means the number of doc visits one has is significantly affected by BMI.
Grader's comment: This is the correct interpretation; however, you should avoid the word “affected”. Also, BMI is your outcome variable, so you want to say that the only significant predictor of BMI is Doc visits.
5. Scatterplot of two continuous variables
This is the scatterplot of BMI (x) by Physical Health (y). I expected a significant correlation between the two, which is why I picked these variables. Judging from the graph alone, before calculating the correlation coefficient in the next problem, there doesn’t seem to be a significant correlation, but you cannot always judge by looks, which is why the correlation coefficient is handy to calculate.
Grader's comment: A regression line would show you that there is a weak negative correlation.
6. Correlation between the two continuous variables from #5 above
Simple linear regression results:
Dependent Variable: Physical Health
Independent Variable: BMI
Physical Health = 51.36503 - 0.21017009 BMI
Sample size: 849
R (correlation coefficient) = -0.14536102
R-sq = 0.021129827
Estimate of error standard deviation: 10.626462
There is a weak correlation between physical health and BMI. With physical health on the y-axis and BMI on the x-axis, it is a weak negative correlation: as BMI increases, physical health decreases, though only slightly.
Grader's comment: Is this correlation statistically significant? You should indicate that.
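Whether a correlation is statistically significant can be checked from r and the sample size alone, using t = r·√(n − 2) / √(1 − r²) with n − 2 degrees of freedom. A minimal Python sketch using the reported r = -0.14536102 and n = 849; the function name is illustrative, and the p-value uses a normal approximation to the t distribution (an assumption that is reasonable only because the degrees of freedom are large), so it will not match StatCrunch's exact p-value digit for digit:

```python
import math


def correlation_t_test(r, n):
    """t statistic, degrees of freedom, and an approximate two-sided p-value."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    # Normal approximation to the t distribution (assumption: large df).
    p_approx = math.erfc(abs(t) / math.sqrt(2.0))
    return t, df, p_approx


# Values taken from the simple linear regression output.
r = -0.14536102
n = 849

t_stat, df, p_approx = correlation_t_test(r, n)
print(f"r-squared = {r * r:.6f}")
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
print(f"approximate two-sided p-value = {p_approx:.2e}")
```

With these inputs t is about -4.28 on 847 degrees of freedom, which puts the p-value well below 0.05. So the correlation is weak (R-sq of about 0.02) but would still count as statistically significant with a sample this large.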
Think carefully about what kind of variables to choose for the given tasks. A short descriptive statement should accompany each of the above, including a description of the variables used and any meaning that may be attached to the results. The student must show that she or he is able to synthesize and apply the materials learned in class. Part of the class computer time is expected to be spent on this project. Submit to the appropriate Drop Box in a Word document.
Grading on this project is as follows:
3 points for each task 1-6: 1 point each for variable choice, appropriate display/test, description of result.
2 points for overall format/readability/construct (so make it neat and tidy)...