Math SL Internal assessment PDF

Title Math SL Internal assessment
Course  Elements of Modern Mathematics
Institution Syracuse University



IB Math SL IA

Investigating correlations between height, weight and average lifetime of different dog breeds. Does the weight of dogs follow a normal distribution?


Table of contents
Introduction
Theories
Section A
Data collection and process
Section B
Normal distribution
Conclusion
Appendix
References


Introduction
Rationale
The aim of my applied research is to find out whether dogs' weights follow the normal distribution and whether there is a relationship between different characteristics of dogs. More specifically, are the life expectancy, height and weight of dogs correlated? I intend to focus on the maximum height and weight of different breeds of different origins. My interest in dogs began years ago. For as long as I can remember, I have had good experiences with dogs. Having kept both bigger and smaller dogs as pets, I noticed as the years went by that the smaller dogs lived a bit longer than the bigger ones. Moreover, a recent study stating that the behavioural responses of dogs depend on their weight drew my attention. All of the above caught my interest, and in this IA I will attempt to investigate different characteristics of dogs using Mathematics.

Background
The mathematical background of the investigation is Probability and Statistics, as taught in the Math SL class I follow. First, using the normal distribution, I will examine whether the weight of dogs follows a normal distribution. Another aim of my investigation is to find out whether different characteristics of dog breeds are correlated and, if they are, how strongly positively or negatively. While researching correlation theories I came across a measure devised by Karl Pearson, an English mathematician and biostatistician. It is called the Pearson correlation coefficient. Pearson's correlation coefficient, also referred to as Pearson's r, measures the linear correlation between two variables X and Y. r can take a value between +1 and −1, where +1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation. In this case the pairs X and Y will be maximum height and maximum weight, maximum height and average lifetime, and maximum weight and average lifetime.


Theories
Pearson's correlation coefficient (Statistics.laerd.com, 2018)
Pearson's correlation coefficient, when applied to a data set, is commonly symbolized by the letter r, as discussed above. If we have one dataset of the variable X, {x_1, ..., x_n}, containing n values and another dataset of the variable Y, {y_1, ..., y_n}, containing n values, then the formula for r is:

Equation 1:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

Where:

● n is the number of samples
● x_i and y_i are the single samples indexed with i
● $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ is the formula from which the mean value of X is obtained
● $\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$ is the formula from which the mean value of Y is obtained
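As an illustration, Equation 1 can be sketched in a few lines of Python. The function name `pearson_r` and the toy data are assumptions made for this example; they are not part of the original investigation, which used an Excel sheet.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed term by term as in Equation 1 (illustrative sketch)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Numerator: sum of the products of the paired deviations
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Denominator: product of the square roots of the summed squared deviations
    root_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs))
    root_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys))
    if root_x == 0 or root_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / (root_x * root_y)

# Hypothetical toy data (not the breed data): y is exactly 2x, so r should be 1
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 6, 8, 10]
r_toy = pearson_r(toy_x, toy_y)
print(round(r_toy, 2))
```

Reversing the order of one list while keeping the other fixed gives a perfectly decreasing relationship, for which the same function returns a value of −1.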


Normal distribution (Stevegallik.org, 2018)

In statistics, a very efficient way to study populations is by using the normal distribution. When all the data are collected and the probability of obtaining the different values is plotted, the normal distribution curve has a bell shape. Most of the population is arranged around the mean value, and therefore obtaining a value close to the mean is more likely. When dealing with a normal distribution it is possible to calculate the mean and the standard deviation, as illustrated in Figure 1. The mean value and the standard deviation are given by:

Mean value (Equation 2):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Standard deviation (Equation 3):

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} |x_i - \bar{x}|^2}{n}}$$
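Equations 2 and 3 can be checked with a short Python sketch. The function names and the sample values below are hypothetical and serve only to show the arithmetic; note that Equation 3 divides by n, which corresponds to the population standard deviation (`statistics.pstdev` in the Python standard library).

```python
import math
import statistics

def mean(values):
    """Equation 2: sum of the values divided by how many there are."""
    return sum(values) / len(values)

def population_std(values):
    """Equation 3: square root of the mean squared deviation from the mean."""
    m = mean(values)
    return math.sqrt(sum(abs(v - m) ** 2 for v in values) / len(values))

# Hypothetical sample, chosen so the results are whole numbers
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sample_mean = mean(sample)
sample_std = population_std(sample)
print(sample_mean, sample_std)

# Cross-check against the standard library's population standard deviation
library_std = statistics.pstdev(sample)
```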

This study examines whether the weights of dogs follow a normal distribution; this is analyzed in the following sections.

Figure 1: An example of a normally distributed quantity, illustrated together with its standard deviation and mean value.

The normal distribution curve, also known as the "bell-shaped" curve, has a concentration at its center which then decreases on both sides, as shown in the curve above. The curve allows us to see whether the data may be symmetric. The table then helps us to visualise the symmetry through the ranges of the center, left and right.
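One common way to compare data with the bell curve is the empirical rule: for a normal distribution, roughly 68% of the values lie within one standard deviation of the mean and roughly 95% within two. The sketch below counts these shares for a list of values. The function name `share_within` and the weights are hypothetical; they are not the poll data.

```python
import math

def share_within(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return sum(1 for v in values if abs(v - m) <= k * sd) / n

# Hypothetical weights in kg, used only to illustrate the check
hypothetical_weights_kg = [2, 4, 4, 4, 5, 5, 7, 9]
within_one_sd = share_within(hypothetical_weights_kg, 1)
within_two_sd = share_within(hypothetical_weights_kg, 2)
print(within_one_sd, within_two_sd)
```

Shares far from 68% and 95% would suggest that the data are not well described by a normal distribution, although with a small sample such a comparison is only indicative.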


Section A
Data collection and process
All the dogs' data were collected from a website called PetBreeds, a specialised pet breed research site that uses Graphiq's semantic technology to deliver deep insights via data-driven articles, visualizations and research tools. The data were entered manually into an Excel sheet in order to make further calculations. The data collected from the website were the breeds, their maximum weight, maximum height, average lifetime, and popularity.

1: Mean Weight and Height
The first calculations using the data were the mean weight and the mean height. More specifically, the weight corresponded to the variable X and the height to the variable Y. The mean was therefore calculated by adding all the different weights and dividing by the total number of breeds:

$$\bar{X} = \frac{80 + 95 + 75 + \dots + 200 + 40}{50} \approx 63.68 \text{ kg}$$

The same procedure was followed for the mean height:

$$\bar{Y} = \frac{25 + 26 + 24 + \dots + 35 + 19}{50} \approx 20.60 \text{ cm}$$

The next calculation was the deviations $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$.
For X: 80 − 63.68 = 16.32, 95 − 63.68 = 31.32, etc. for all 50 values.
For Y: 25 − 20.60 = 4.4, 26 − 20.60 = 5.4, etc. for all 50 values.

Next, each pair of deviations was multiplied and the products were summed:

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) \approx 16459.6$$

The numerator of Pearson's correlation coefficient is therefore 16459.6. For the denominator, the following two quantities were calculated:

$$\sqrt{\sum (X_i - \bar{X})^2} \approx 347.1$$

$$\sqrt{\sum (Y_i - \bar{Y})^2} \approx 53.59$$

Finally,

$$r = \frac{16459.6}{347.1 \times 53.59} \approx 0.88$$
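The final division can be verified from the three rounded intermediate values quoted in the text. The helper name `r_from_summaries` is an assumption for this sketch; the numbers are the ones given above.

```python
def r_from_summaries(numerator, root_sum_sq_x, root_sum_sq_y):
    """Pearson's r from the summed products and the two root sums of squares."""
    return numerator / (root_sum_sq_x * root_sum_sq_y)

# Rounded intermediate values quoted in the text for weight (X) and height (Y)
r_weight_height = r_from_summaries(16459.6, 347.1, 53.59)
print(round(r_weight_height, 2))  # 0.88
```

Because the inputs are already rounded, the result can differ slightly in later decimal places from a calculation on the full data set.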


The correlation coefficient r is close to 1, so the correlation between weight and height can be characterized as a strong positive correlation. This was partly expected, as taller dogs weigh more most of the time. This case also supports the calculation of r as a reliable way to determine whether two quantities are correlated. The graph below illustrates the data points, together with a fitted straight line that shows the strong positive correlation.

Figure 2
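The text does not state how the fitted line in Figure 2 was obtained. Assuming it is an ordinary least-squares line, its slope and intercept can be computed from the same deviations used for r, as in the following sketch. The function name `fit_line` and the toy points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (assumed fitting method)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x, and sum of products of paired deviations
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are equal")
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The slope has the same sign as r, which is why a strong positive correlation appears as a rising line through the data points.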

Maximum Height and Average lifetime
The next correlation investigated uses the data for maximum height and average lifetime. More specifically, the height corresponded to the variable X and the average lifetime to the variable Y. The mean was therefore calculated by adding all the different heights and dividing by the total number of breeds:

$$\bar{X} = \frac{25 + 26 + 24 + \dots + 35 + 19}{50} \approx 20.60 \text{ cm}$$

The same procedure was followed for the mean of the average lifetimes:

$$\bar{Y} = \frac{11 + 11 + 12 + \dots + 9 + 12}{50} \approx 11.72 \text{ years}$$

The next calculation was the deviations $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$.
For X: 25 − 20.60 = 4.4, 26 − 20.60 = 5.4, etc. for all 50 values.
For Y: 11 − 11.72 = −0.72, 12 − 11.72 = 0.28, etc. for all 50 values.

Next, each pair of deviations was multiplied and the products were summed:

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) \approx -404.6$$

The numerator of Pearson's correlation coefficient is therefore −404.6.


For the denominator, the following two quantities were calculated:

$$\sqrt{\sum (X_i - \bar{X})^2} \approx 53.59$$

$$\sqrt{\sum (Y_i - \bar{Y})^2} \approx 13.4$$

Finally,

$$r = \frac{-404.6}{53.59 \times 13.4} \approx -0.55$$

The correlation coefficient r is negative and not very close to −1. We can characterize the correlation between average lifetime and height as a moderate negative correlation. This means that the taller the dog, the shorter its expected average lifetime. The graph below illustrates the results.

Figure 3

In order to automate the calculations, which are complex and time-consuming, we created an Excel tool that produces exactly the same results simply by importing the data. The tool can be used not only for this IA but can also be developed further and used for other purposes or with other data.

Maximum weight and Average lifetime
The last correlation investigated uses the data for maximum weight and average lifetime. More specifically, the weight corresponded to the variable X and the average lifetime to the variable Y. The mean was therefore calculated by adding all the different weights and dividing by the total number of breeds:

$$\bar{X} = \frac{80 + 95 + 75 + \dots + 200 + 40}{50} \approx 63.68 \text{ kg}$$

The same procedure was followed for the mean of the average lifetimes:

$$\bar{Y} = \frac{11 + 11 + 12 + \dots + 9 + 12}{50} \approx 11.72 \text{ years}$$


The next calculation was the deviations $(X_i - \bar{X})$ and $(Y_i - \bar{Y})$.
For X: 80 − 63.68 = 16.32, 95 − 63.68 = 31.32, etc. for all 50 values.
For Y: 11 − 11.72 = −0.72, 12 − 11.72 = 0.28, etc. for all 50 values.

Next, each pair of deviations was multiplied and the products were summed:

$$\sum (X_i - \bar{X})(Y_i - \bar{Y}) \approx -3276.48$$

The numerator of Pearson's correlation coefficient is therefore −3276.48. For the denominator, the following two quantities were calculated:

$$\sqrt{\sum (X_i - \bar{X})^2} \approx 347.1$$

$$\sqrt{\sum (Y_i - \bar{Y})^2} \approx 13.4$$

Finally,

$$r = \frac{-3276.48}{347.1 \times 13.4} \approx -0.70$$

The correlation coefficient r is negative and closer to −1 than the value found for height and average lifetime. We can characterize the correlation between average lifetime and weight as a moderate to strong negative correlation. This means that the heavier the dog, the shorter its expected average lifetime. The graph below illustrates the results.

Figure 4
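The three correlations in Section A follow the same steps, so they lend themselves to automation, as was done with the Excel tool mentioned earlier. The sketch below shows one possible way to do the same in Python for every pair of columns. The column names and the five rows of data are hypothetical and stand in for the 50 breeds; they are not values from the investigation.

```python
import math
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson's r from paired deviations about the two means."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    root_x = math.sqrt(sum((x - x_mean) ** 2 for x in xs))
    root_y = math.sqrt(sum((y - y_mean) ** 2 for y in ys))
    return numerator / (root_x * root_y)

def pairwise_correlations(columns):
    """Return {(name_a, name_b): r} for every pair of columns."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }

# Hypothetical data for five breeds (illustration only)
hypothetical_breeds = {
    "max_weight": [10, 20, 30, 40, 50],
    "max_height": [12, 15, 21, 24, 28],
    "avg_lifetime": [14, 13, 12, 11, 10],
}

results = pairwise_correlations(hypothetical_breeds)
for pair, r in results.items():
    print(pair, round(r, 2))
```

Adding a further column, such as popularity, would automatically produce its correlation with each of the existing columns.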


Section B
Investigating a sample of 49 dog owners who measured their dog's weight
For this section, I will use the data that I collected through an online poll (Easypolls.net, 2018).

Question asked: 'For dog owners: What is the weight of your dog in kg?' By sharing this through social media and groups of pet lovers that I have joined, I managed to collect 49 answers. The participants were chosen randomly, as anybody could give an answer online. The results therefore include different dog breeds, so the standard deviation is expected to be relatively large. The question that was distributed through the poll is illustrated below.

Figure 5

Table 1 Weight of dogs

Frequency

0 ≤l...

