STA108 - Descriptive Statistic Project PDF

Title STA108 - Descriptive Statistic Project
Author Ibrahim Alli
Course Statistics & Probability
Institution Universiti Teknologi MARA
Pages 27


UNIVERSITI TEKNOLOGI MARA KAMPUS

COURSE: STATISTICS & PROBABILITY
COURSE CODE: STA108
SEMESTER: OCTOBER 2020 – FEBRUARY 2021
PROJECT TITLE: “A study on weight and running speed: The descriptive measures and simple linear regression”

GROUP MEMBERS
NO.    NAME    STUDENT ID    SIGNATURE

Marked: 16.67 / 20

TABLE OF CONTENTS

ACKNOWLEDGEMENT

CHAPTER 1: INTRODUCTION
1.1 Background of Study
1.2 Objectives of Study
1.3 Significance of Study
1.4 Limitation of Study

CHAPTER 2: METHODOLOGY
2.1 Data Description
2.2 Graphical Description
2.3 Numerical Technique

CHAPTER 3: RESULTS AND INTERPRETATION
3.1 Data Representation
3.2 Descriptive Statistics Analysis
3.3 Correlation and Regression

CHAPTER 4: CONCLUSION
4.1 Report Summary
4.2 Appendix

REFERENCES

ACKNOWLEDGMENT

We are very grateful that we managed to complete our analytical and statistical report on the relationship between age, gender, and weight and the speed of running. We are also blessed to have our lecturer, Madam X, who helped and guided us with this assignment. This assignment was a success because of the effort and co-operation of our group members, A, B and C. This semester has been very challenging as we are attending online classes due to Covid-19, but we managed to pull through with good communication and teamwork in completing every assignment given.


CHAPTER 1: INTRODUCTION

1.1 Background of Study

This study analyzes the relationship between a person’s weight and his or her running speed. Forty (40) volunteers of both genders participated in the study. The data was taken from StatCrunch.com¹ and is therefore a secondary data set. The source does not mention where the study took place.

The study was carried out by semester 4 students to satisfy the syllabus requirements of STA108: Basic Statistics and Probability. We chose to evaluate the relationship between weight and running speed. Although gender is included in the data, the focus of the study is on weight and running speed.

Scientifically, a person’s weight could significantly influence running speed. A heavier person needs more force to overcome gravity, so his or her running speed would be adversely affected. In this study we therefore investigate the relationship between these two continuous quantitative variables, weight and running speed, through a statistical approach. A qualitative variable, the gender of the volunteers, is also included as another possible factor that could affect running speed.

1.2 Objectives of Study

The objectives of this study are as follows:
1. To describe the gender and weight of the volunteers.
2. To investigate the relationship between the weight of the volunteers and their running speed.

¹ (n.d.). Retrieved January 01, 2021, from https://www.statcrunch.com/app/index.php?dataid=2862328


1.3 Significance of Study

It is very important to know how our body works. Learning about and understanding our body would eventually lead us to a healthier lifestyle. This study helps readers understand how a lighter weight could improve stamina, as reflected in a faster running speed. As gender is also included in this study as another possible factor, readers would better understand the physiological differences between men and women in terms of metabolic rate. They could therefore learn their biological limits and avoid pushing their bodies beyond them, which could lead to unwanted adverse effects. Moreover, experts note that you would be able to run about two seconds faster per mile for every pound that you lose. This means that if you lose 15 pounds, you would run about 30 seconds per mile faster, cutting a 5k time by about a minute and a half, or a marathon time by about 13 minutes, from the weight loss alone². In general, this study would motivate overweight men and women to lose weight and others to maintain their ideal weight.
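The arithmetic behind the quoted figures can be checked with a short sketch. It assumes only the rule of thumb cited in the text (about two seconds per mile per pound) and the standard race distances; the function name is chosen here for illustration.

```python
# Rule of thumb quoted in the text: about 2 seconds per mile faster per pound lost.
SECONDS_PER_MILE_PER_POUND = 2.0
MILES_5K = 5.0 / 1.609344      # 5 km expressed in miles
MILES_MARATHON = 26.21875      # 26 miles 385 yards


def time_saved_seconds(pounds_lost, distance_miles):
    """Estimated time saved over a distance under the quoted rule of thumb."""
    return pounds_lost * SECONDS_PER_MILE_PER_POUND * distance_miles


saved_5k = time_saved_seconds(15, MILES_5K)
saved_marathon = time_saved_seconds(15, MILES_MARATHON)
print(f"5k: {saved_5k:.0f} s, marathon: {saved_marathon / 60:.1f} min")
```

For a loss of 15 pounds this gives roughly 93 seconds over a 5k and roughly 13.1 minutes over a marathon, in line with the figures quoted above.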

1.4 Limitation of Study

The main limitation of this study is the way the data set was procured. The data was obtained from an open online database, so it is secondary data. Furthermore, the data set does not describe the background of the study or how the data was collected. We therefore assume that the data was collected through a scientific experiment in which a specific number of participants had their weight recorded before running a designated distance. The data set also does not record any existing health issues that might affect running speed.

² Basinger, R. (2019, August 12). Does Weight Loss Make You Run Faster? Retrieved January 27, 2021, from https://thewiredrunner.com/does-weight-loss-make-you-run-faster/


CHAPTER 2: METHODOLOGY

2.1 Data Description

This assignment was a study to analyze how gender and weight relate to the speed of running. Due to the pandemic, the data set was taken from the internet. Forty (40) people of different ages, genders, and weights participated in the study.

The participants were aged 18 to 30 years old, a range that runs from late teens to adulthood. There were 19 females and 21 males, and their weights ranged from about 120 lbs. to about 190 lbs. This case study helps us learn how the speed of running is affected by the gender and weight of a person. Speed was recorded in miles per hour.

In this study, gender is the qualitative variable, while weight and running speed are the quantitative variables.

I. Population
All members of the public in the unknown city or town where the study took place.

II. Sample
40 members of the public of both genders, with their ages recorded.

III. Sampling Technique
Simple random sampling. The sample was chosen randomly from the sampling frame, which is the researcher’s list of volunteers.

IV. Data Collection Method
Direct observation from the experiment conducted.


V. Descriptions of Variable
The variables of this study are gender, weight, and speed of running (in mph).

Variable    Type of Variable           Level of Measurement
Gender      Qualitative                Nominal Scale
Weight      Quantitative Continuous    Ratio Scale
Speed       Quantitative Continuous    Ratio Scale


2.2 Graphical Description

Firstly, a pie chart will be constructed from the table of the distribution of volunteers by gender. The pie chart will show the percentage of male and female volunteers in the study.

Secondly, the weights of the volunteers and the number of volunteers at each weight will be tabulated. From this table, a histogram will be constructed with the weight of the volunteers on the x-axis and the number of volunteers on the y-axis.

Thirdly, the running speeds of the volunteers and the number of volunteers at each speed will be tabulated. From this table, another histogram will be constructed with the running speed on the x-axis and the number of volunteers on the y-axis.

Fourthly, box and whisker plots will be used to examine the skewness of the weight of the volunteers by gender and of the running speed by gender.

In addition, a scatter plot will be constructed from the speed and the weight of the volunteers, with running speed on the y-axis and weight on the x-axis.


2.3 Numerical Technique

For this study, we report the mean, median, mode, variance, standard deviation, skewness, range, minimum, maximum, first quartile and third quartile.

The relationship between weight and running speed is also examined through correlation and regression analysis. Correlation analysis measures the strength and direction of the linear relationship between the two variables. The value of the correlation coefficient obtained for weight and running speed determines whether the correlation is strong or weak, and positive or negative.

Furthermore, simple linear regression describes a linear relationship involving only two variables. One is the dependent variable (y) and the other is the independent variable (x). The dependent variable (running speed) is the variable in regression that cannot be controlled or manipulated; the independent variable here is weight. The fitted line also helps to determine the type of relationship between the two variables.

Simple Linear Regression Equation:
y = a + bx
where
x = independent variable
y = dependent variable
a = y-intercept
b = slope of the line
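As a rough illustration of how a, b and the correlation coefficient r could be obtained, the sketch below applies the usual least-squares formulas in Python. The five (weight, speed) pairs are hypothetical values made up for the example; they are not the study data, and the function name fit_line is likewise chosen here for illustration.

```python
from math import sqrt


def fit_line(x, y):
    """Return (a, b, r) for the least-squares line y = a + b*x and Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                 # slope of the line
    a = mean_y - b * mean_x       # y-intercept
    r = sxy / sqrt(sxx * syy)     # correlation coefficient
    return a, b, r


# Hypothetical (weight in lbs., speed in mph) pairs, NOT the study data.
weights = [120, 140, 160, 180, 200]
speeds = [14.0, 12.5, 11.0, 9.0, 8.0]

a, b, r = fit_line(weights, speeds)
print(f"y = {a:.2f} + ({b:.4f})x, r = {r:.3f}")
```

With these made-up values the slope is negative and r is close to -1, which would be read as a strong negative correlation: heavier hypothetical runners are slower.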


CHAPTER 3: RESULTS AND INTERPRETATION

3.1 Data Presentation

Based on the data we obtained, we have identified the qualitative and quantitative variables according to their characteristics. The quantitative data comprise the weight in lbs. and the running speed in miles per hour, while gender is classified as qualitative data.

3.1.1 Gender

The data includes a total of 19 (47.5%) female and 21 (52.5%) male volunteers, as shown in Table 3.1.1 below.

Table 3.1.1 Distribution of volunteers based on gender

Gender    Number of Volunteers    Percentage (%)
Male      21                      52.5
Female    19                      47.5
TOTAL     40                      100

Figure 3.1.1 Pie Chart for the number of volunteers based on gender (chart title: Distribution of Volunteers based on Gender; slices labelled Male 52% and Female 48%)
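The percentages in Table 3.1.1 follow directly from the counts. A minimal sketch, with variable names chosen here for illustration:

```python
# Counts of volunteers by gender, as given in Table 3.1.1.
counts = {"Male": 21, "Female": 19}

total = sum(counts.values())
percentages = {gender: 100 * n / total for gender, n in counts.items()}

for gender, pct in percentages.items():
    print(f"{gender}: {counts[gender]} volunteers ({pct:.1f}%)")
```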


3.1.2 Weight in lbs.

Table 3.1.2 Data distribution of volunteers based on weight (lbs.)

Weights (lbs.)    Number of volunteers
120               1
123               1
124               1
126               1
128               1
134               1
137               1
138               1
140               1
142               2
146               1
147               2
148               1
149               2
159               1
163               1
164               1
165               1
166               1
167               1
169               1
170               1
172               1
174               1
175               1
176               1
178               1
179               1
180               1
183               1
185               1
192               1
193               1
194               1
196               2
197               1
TOTAL             40

Figure 3.1.2.1 Histogram of distribution of number of volunteers based on their weight (lbs.)

With a sample size of 40 volunteers, we can construct the histogram shown in the figure above. From the histogram we can observe that the distribution is skewed to the left, which shows that the volunteers in this study tend towards the heavier weights overall. The heaviest weight recorded was 197 lbs. (1 volunteer) and the lightest weight recorded was 120 lbs. (1 volunteer). From the graph we can also identify the mode, reported as 142 lbs. From the data we can calculate and construct a box & whisker plot to further explain the skewness of this variable.
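The tabulation behind such a histogram can be sketched as follows, using the weights from Table 3.1.2. The class width of 10 lbs. is an assumption made for this example, since the class boundaries used in Figure 3.1.2.1 are not stated.

```python
from collections import Counter

# Weights (lbs.) expanded from the frequency table in Table 3.1.2.
weights = [
    120, 123, 124, 126, 128, 134, 137, 138, 140, 142,
    142, 146, 147, 147, 148, 149, 149, 159, 163, 164,
    165, 166, 167, 169, 170, 172, 174, 175, 176, 178,
    179, 180, 183, 185, 192, 193, 194, 196, 196, 197,
]

CLASS_WIDTH = 10  # assumption: the source does not state its class width

# Map each weight to the lower limit of its class, e.g. 147 -> 140.
class_counts = Counter((w // CLASS_WIDTH) * CLASS_WIDTH for w in weights)

# Print a simple text histogram, one row per class.
for lower in sorted(class_counts):
    upper = lower + CLASS_WIDTH - 1
    bar = "#" * class_counts[lower]
    print(f"{lower}-{upper} lbs.: {bar} ({class_counts[lower]})")
```

Under this assumed class width, the fullest class is 140 to 149 lbs. with 9 volunteers; a different class width would change the shape of the bars.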


Figure 3.1.2.2 Box & Whiskers Plot for Skewness

Based on the data for this variable, we can calculate Q1, Q2, and Q3 to construct the box & whiskers plot. The overall skewness of the weight of the volunteers is -0.091. Since this value is negative, the overall distribution of the variable is left skewed, although a value this close to zero indicates that the skew is slight.

From the box & whiskers plot, we can see that the weights of the volunteers are skewed to the left (negatively skewed). The plot shows that the median line, Q2, is located more towards the right of the box, and the left whisker is longer than the right whisker.
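The skewness value can be recomputed from Table 3.1.2. The report does not say which skewness formula was used; the sketch below assumes the adjusted Fisher-Pearson sample coefficient, which is the form many statistics packages report. The function name is chosen here for illustration.

```python
from statistics import mean, stdev


def sample_skewness(data):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


# Weights (lbs.) expanded from the frequency table in Table 3.1.2.
weights = [
    120, 123, 124, 126, 128, 134, 137, 138, 140, 142,
    142, 146, 147, 147, 148, 149, 149, 159, 163, 164,
    165, 166, 167, 169, 170, 172, 174, 175, 176, 178,
    179, 180, 183, 185, 192, 193, 194, 196, 196, 197,
]

skew = sample_skewness(weights)
print(f"skewness = {skew:.3f}")
```

Under that assumption the result rounds to -0.091, matching the reported value.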


3.1.3 Running Speed

Table 3.1.3 Data distribution of volunteers based on running speed

Running Speed (miles per hour)    Number of volunteers
5                                 2
6                                 5
7                                 1
8                                 4
9                                 3
10                                4
11                                1
12                                4
13                                7
14                                7
15                                2
TOTAL                             40

Figure 3.1.3.1 Histogram of distribution of number of volunteers based on their running speed


With the same sample of 40 volunteers, we can construct a histogram as shown in the figure above. From the histogram, we can observe that the distribution is skewed to the left. This indicates that the running speeds of the volunteers are generally quite high, with 15 mph being the highest recorded speed (2 volunteers). From the graph we can also identify the modes for this variable, which are 13 mph and 14 mph, each recorded for 7 volunteers. From this variable’s data, we can also construct a box & whisker plot.

Figure 3.1.3.2 Box & Whiskers Plot for Skewness

Based on the data for this variable, we can calculate Q1, Q2, and Q3 to construct the box & whiskers plot. The skewness of the running speed of the volunteers in mph is -0.360. This indicates that the overall distribution of the variable is left skewed, since the skewness is negative.

From the box & whiskers plot, we can see that the running speeds of the volunteers are skewed to the left (negatively skewed). The plot shows that the median line, Q2, is located more towards the right of the box. In addition, the left whisker of the plot is much longer than the right whisker. These properties appear in the box & whiskers plots of both variables.


3.2 Descriptive Statistics Analysis

Having identified and graphed the quantitative variables, we now look at the descriptive analysis of the weight in lbs. and the running speed in mph of the volunteers. This involves the measures of central tendency, measures of dispersion, measures of position, and the skewness of each variable.

3.2.1 The Weight of the Volunteers in lbs.

Table 3.2.1 Descriptive statistics for weight of volunteers in lbs.

Descriptive Statistics       Weight (lbs.)
Mean, x̄                      160.83
Median, x̃                    164.5
Mode, x̂                      142
Standard deviation, s        23.0289
Variance, s²                 530.302
Skewness                     -0.091
Range                        77.00
Minimum                      120
Maximum                      197
First Quartile, Q1 (25%)     142
Second Quartile, Q2 (50%)    164.5
Third Quartile, Q3 (75%)     178.5

The first quantitative variable is the weight of the volunteers in lbs. The mean, x̄, for this variable is 160.83, which means that on average the volunteers weighed 160.83 lbs. From the median, x̃, we can interpret that 50% of the volunteers weigh less than 164.5 lbs. while 50% weigh more than 164.5 lbs. The mode, x̂, is reported as 142 lbs., the most frequently occurring weight; Table 3.1.2 shows that it occurs only twice, as do 147, 149 and 196 lbs., so no single weight is typical of most volunteers. The standard deviation, s, for this variable is 23.0289, while the variance, s², is 530.302. From these values we can infer that the weights are widely dispersed: a volunteer’s weight typically differs from the mean by about 23 lbs.
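The measures of central tendency and dispersion above can be cross-checked against Table 3.1.2 with Python’s standard statistics module. This is only a sketch of one way to recompute them; the report does not state which software produced its figures.

```python
from statistics import mean, median, multimode, stdev, variance

# Weights (lbs.) expanded from the frequency table in Table 3.1.2.
weights = [
    120, 123, 124, 126, 128, 134, 137, 138, 140, 142,
    142, 146, 147, 147, 148, 149, 149, 159, 163, 164,
    165, 166, 167, 169, 170, 172, 174, 175, 176, 178,
    179, 180, 183, 185, 192, 193, 194, 196, 196, 197,
]

weight_mean = mean(weights)
weight_median = median(weights)
weight_modes = multimode(weights)      # every value tied for the highest count
weight_variance = variance(weights)    # sample variance (divides by n - 1)
weight_stdev = stdev(weights)          # sample standard deviation
weight_range = max(weights) - min(weights)

print(f"mean = {weight_mean:.2f}, median = {weight_median}")
print(f"modes = {weight_modes}")
print(f"variance = {weight_variance:.3f}, stdev = {weight_stdev:.4f}")
print(f"range = {weight_range}")
```

Recomputed this way, the mean, median, variance and range agree with Table 3.2.1. Four weights tie for the mode (142, 147, 149 and 196 lbs.), and the standard deviation comes out at about 23.028, marginally different from the printed 23.0289.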


Figure 3.2.1 Box & Whisker for the weight of both Female and Male volunteers

The overall skewness of the variable is -0.091. Since this value is negative, the distribution of the variable is left skewed. From the box & whiskers plot above, we can see that the weights of both males and females are skewed to the left (negatively skewed). Consistent with a left skew, the mean (160.83 lbs.) is smaller than the median (164.5 lbs.). The range of the variable is 77.0 lbs., which is the difference between the largest and the smallest weights in the data set. The minimum, or smallest weight, is 120 lbs., while the maximum, or largest weight, is 197 lbs. The first, second and third quartiles are 142 lbs., 164.5 lbs. and 178.5 lbs. respectively. This implies that about 25% of the volunteers weigh less than 142 lbs. and about 25% weigh more than 178.5 lbs. The second quartile is the same as the median, x̃, since it is the value at the 50% position of the overall data set.
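The quartiles can also be recomputed from Table 3.1.2. Several quartile conventions exist and the report does not say which one was used; the sketch below assumes the median-of-halves convention (Q1 and Q3 are the medians of the lower and upper halves of the sorted data), which reproduces the reported values.

```python
from statistics import median

# Weights (lbs.) expanded from the frequency table in Table 3.1.2, in sorted order.
weights = sorted([
    120, 123, 124, 126, 128, 134, 137, 138, 140, 142,
    142, 146, 147, 147, 148, 149, 149, 159, 163, 164,
    165, 166, 167, 169, 170, 172, 174, 175, 176, 178,
    179, 180, 183, 185, 192, 193, 194, 196, 196, 197,
])

n = len(weights)
half = n // 2
lower_half = weights[:half]        # smallest half of the data
upper_half = weights[n - half:]    # largest half (skips the middle value if n is odd)

q1 = median(lower_half)
q2 = median(weights)
q3 = median(upper_half)

# Share of volunteers in each tail, to check the interpretation of Q1 and Q3.
share_below_q1 = sum(w < q1 for w in weights) / n
share_above_q3 = sum(w > q3 for w in weights) / n

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
print(f"below Q1: {share_below_q1:.1%}, above Q3: {share_above_q3:.1%}")
```

The share below Q1 is slightly under 25% (22.5%) because two volunteers weigh exactly 142 lbs., while exactly 25% weigh more than 178.5 lbs.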


3.2.2 The Running Speed of the Volunteers in Miles per Hour

Table 3.2.2 Descriptive statistics for running speed of volunteers in mph

Descriptive Statistics       Running Speed (mph)
Mean, x̄                      10.6
Median, x̃                    11.2
Mode, x̂                      13
Standard deviation, s        3.144
Variance, s²                 9.857
Skewness                     -0.360
Range                        10
Minimum                      5
Maximum                      15
First Quartile, Q1 (25%)     8
Second Quartile, Q2 (50%)    11.2
Third Quartile, Q3 (75%)     13.57

The second quantitative variable is the running speed of the volunteers in mph. The mean, x̄, for this variable is 10.6, which means that on average the volunteers ran at 10.6 mph. From the median, x̃, we can interpret that 50% of the volunteers ran slower than 11.2 mph while 50% ran faster than 11.2 mph. The mode, x̂, is reported as 13 mph, the most frequently recorded speed; Table 3.1.3 shows that 14 mph was recorded equally often. The standard deviation, s, for this variable is 3.144, while the variance, s², is 9.857. From these values we can infer that a volunteer’s running speed typically differs from the mean by about 3 mph. The variance is numerically small, but because it is expressed in mph² it cannot be compared directly with the variance of the weights.
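Statistics for a variable given as a frequency table can be recomputed by expanding the table back into individual observations. The sketch below does this for Table 3.1.3; it assumes the whole-number speeds in that table are the values to analyse, which may not hold if the table rounds the recorded speeds.

```python
from statistics import mean, multimode, stdev, variance

# Running speed (mph) -> number of volunteers, as given in Table 3.1.3.
speed_counts = {5: 2, 6: 5, 7: 1, 8: 4, 9: 3, 10: 4, 11: 1, 12: 4, 13: 7, 14: 7, 15: 2}

# Expand the frequency table into one entry per volunteer.
speeds = [speed for speed, count in speed_counts.items() for _ in range(count)]

speed_mean = mean(speeds)
speed_modes = multimode(speeds)      # every value tied for the highest count
speed_variance = variance(speeds)    # sample variance (divides by n - 1)
speed_stdev = stdev(speeds)          # sample standard deviation

print(f"n = {len(speeds)}, mean = {speed_mean}")
print(f"modes = {speed_modes}")
print(f"variance = {speed_variance:.3f}, stdev = {speed_stdev:.3f}")
```

Recomputed this way, the mean (10.6), the standard deviation (about 3.144) and the two modes (13 and 14 mph) agree with the text, while the variance comes out at about 9.887 rather than the 9.857 printed in Table 3.2.2. Since the table may round the recorded speeds, this is best treated as a cross-check rather than a correction.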


Figure 3.2.2 Box & Whisker plot for R...

