Worksheet-12-Descriptive statistics Studocu PDF

Title	Worksheet-12-Descriptive statistics Studocu
Author	Linderson Johns
Course	Finite Mathematics
Institution	Central Washington University
Pages	7

Summary

Weekly worksheet assignment...

Description

Math 130

WORKSHEET 12

Descriptive Statistics

Answers are provided, so please show your work for credit.

CENTRAL LOCATION/TENDENCY:

1. Mean (Average)

$$\bar{x} = \frac{\sum x_i}{n}$$

(∑ is summation; n is the data set size.)

e.g. Find the mean of the data set: 75, 81, 88, 92.

$$\bar{x} = \frac{75 + 81 + 88 + 92}{4} = \frac{336}{4} = 84$$

Mean = 84
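The mean example can be checked with a few lines of Python. This is only a minimal sketch; the function name `mean` and the variable `scores` are illustrative choices, not part of the worksheet.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by the data set size n."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


scores = [75, 81, 88, 92]
print(mean(scores))  # 84.0
```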

Ans. 84

2. Median (page 30)

The median is the middle value when the data are sorted in ascending order. When the data set has an even number of values, the median is the average of the two middle values.

e.g.1 Find the median of the data set: 5, 3, 4, 7, 8, 8, 1.
Ascending order: 1, 3, 4, 5, 7, 8, 8.
Counting three numbers in from each side, the median = 5.

e.g.2 Find the median of the data set: 5, 3, 4, 7, 8, 9, 11, 2.
Ascending order: 2, 3, 4, 5, 7, 8, 9, 11.
Counting three numbers in from each side leaves the two middle values, 5 and 7.

$$\text{Median} = \frac{5 + 7}{2} = 6$$

The median = 6
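Both median examples follow the same recipe: sort, then take the middle value (odd n) or average the two middle values (even n). A short Python sketch of that recipe, with an illustrative function name `median`:

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair


print(median([5, 3, 4, 7, 8, 8, 1]))      # 5
print(median([5, 3, 4, 7, 8, 9, 11, 2]))  # 6.0
```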

Ans. 5; 6

3. Mode

If a data set has a value that occurs more often than any of the others, then that value is called the mode.

e.g. Find the mode of the data set: 1, 2, 3, 4, 5, 5, 6, 2, 2.
Two appears three times, five appears twice, and every other number appears once.
The mode = 2

Ans. 2
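Counting occurrences by hand can be mirrored with `collections.Counter` from the Python standard library. In this sketch the helper `modes` is a hypothetical name; it returns every value tied for the highest count, so a result with more than one entry means there is no single mode in the worksheet's sense.

```python
from collections import Counter


def modes(values):
    """Return the value(s) with the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


data = [1, 2, 3, 4, 5, 5, 6, 2, 2]
print(Counter(data)[2], Counter(data)[5])  # 3 2
print(modes(data))                         # [2]
```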

SPREAD (or VARIABILITY):

1. Range

Range is the difference between the largest and the smallest values.

Range = largest - smallest = Max - Min

e.g. Find the range of the data set: 4, 5.3, 6.1, 4, 3, 6, 12.
Range = Max - Min = 12 - 3 = 9

The range = 9

Ans. 9

2. Variance

Variance measures how dispersed the data are around the mean.

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$$

where n is the data set size.

e.g. Find the variance of the data set: 1, 3, 5, 7, 9, 11.

The mean is $\bar{x} = 36/6 = 6$, so

$$s^2 = \frac{(1-6)^2 + (3-6)^2 + (5-6)^2 + (7-6)^2 + (9-6)^2 + (11-6)^2}{6 - 1} = \frac{70}{5} = 14$$

The variance = 14
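The sample variance can be computed either from the squared deviations or from the shortcut form that uses the sum of x and the sum of x squared. The sketch below implements both so they can be compared on the example data; the function names are illustrative, and both divide by n - 1.

```python
def sample_variance(values):
    """Definition form: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


data = [1, 3, 5, 7, 9, 11]
print(sample_variance(data))           # 14.0
print(sample_variance_shortcut(data))  # 14.0
```

The shortcut form avoids computing the mean first, which is convenient by hand, but with floating-point data it can lose precision when the values are large relative to their spread.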

Ans. 14

3. Standard Deviation

The standard deviation is the square root of the variance.

$$s = \sqrt{\text{Variance}} = \sqrt{s^2}$$

e.g. Find the standard deviation of the data set: 11, 13, 15, 17, 19, 25.

The mean is $\bar{x} = 100/6 \approx 16.6667$, so

$$s^2 = \frac{(11-16.6667)^2 + (13-16.6667)^2 + (15-16.6667)^2 + (17-16.6667)^2 + (19-16.6667)^2 + (25-16.6667)^2}{6 - 1} = \frac{123.3333}{5} = 24.6667$$

$$s = \sqrt{24.6667} = 4.9666$$
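Python's standard `statistics` module provides `variance` and `stdev`, both of which use the n - 1 (sample) divisor, so they can serve as a quick check on this example. The variable names below are illustrative.

```python
import math
import statistics

data = [11, 13, 15, 17, 19, 25]

variance = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)         # standard deviation = square root of variance

print(round(variance, 4))  # 24.6667
print(round(std_dev, 4))   # 4.9666

# statistics.stdev computes the same quantity in one call.
assert math.isclose(std_dev, statistics.stdev(data))
```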

The standard deviation = 4.9666

Ans. Variance 24.67; standard deviation 4.9666

RELATIVE POSITION:

1. Z-score

The z-score measures how far a data point is from the central location of the data set. It indicates the number of standard deviations a data point is away from the mean.

$$z = \frac{x - \bar{x}}{s}$$

e.g. Given a data set: 51, 62, 73, 74, 87, 95, find (a) the z-score of the data point 87; (b) the z-score of 51.

$$\bar{x} = \frac{51 + 62 + 73 + 74 + 87 + 95}{6} = 73.66667$$

$$s^2 = \frac{(51-73.6667)^2 + (62-73.6667)^2 + (73-73.6667)^2 + (74-73.6667)^2 + (87-73.6667)^2 + (95-73.6667)^2}{6 - 1} = \frac{1283.3333}{5} = 256.66667$$

$$s = \sqrt{256.66667} = 16.02082$$

(a) The z-score of 87:

$$z = \frac{x - \bar{x}}{s} = \frac{87 - 73.66667}{16.02082} = 0.83225$$

The z-score of 87 = 0.83225

(b) The z-score of 51:

$$z = \frac{x - \bar{x}}{s} = \frac{51 - 73.66667}{16.02082} = -1.41483$$
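The two z-score calculations can be reproduced with a small helper built on the standard `statistics` module. The name `z_score` is a hypothetical choice for this sketch; it uses the sample standard deviation (n - 1 divisor), matching the worksheet.

```python
import statistics


def z_score(x, data):
    """Number of sample standard deviations that x lies from the mean of data."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 divisor)
    return (x - x_bar) / s


data = [51, 62, 73, 74, 87, 95]
print(round(z_score(87, data), 5))  # 0.83225
print(round(z_score(51, data), 5))  # -1.41483
```

A positive z-score means the data point lies above the mean; a negative one means it lies below.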

The z-score of 51 = -1.41483

Ans. 0.83; -1.42 (mean 73.67, std. dev. 16.02)

2. Percentile

This measure is used only when the data set is large. The k-th percentile Pk is the value such that k% of the data in the data set are smaller than Pk. The 25th percentile P25 is called the first quartile Q1, the 50th percentile P50 is called the second quartile Q2, and the 75th percentile P75 is called the third quartile Q3.

e.g. What is the relationship between these three measures: median, Q2, P50?

Use the data set below to answer questions 3. and 4.

3. Tukey’s 5-Number Summary (page 199-200)

The 5-Number Summary includes Min, Q1, Q2, Q3, Max.

Ans. 14, 30, 35, 40, 50

4. Boxplot (Numerical and Graphical)

A boxplot displays a data set's 5-Number Summary.

To measure Spread (Variability) of a data set, we may also use the Interquartile Range IQR.

IQR = Q3 - Q1

Ans. 10

What is an outlier? (page 203)

5. Are there any outliers in the following data set?

Ans. Lower fence 15, upper fence 55; 2 outliers: 14, 14

Use a TI-83 calculator or Minitab to analyze the following data.

Data Display (original)

Test 3 score
40 39 45 14 40 35 40 50 30 25 36 43 49 30 38
38 29 44 30 14 43 32 40 30 15 40 37 28 32 49
29 33 41 33 34 31 31

Data Display (sorted)

Test 3 score
14 14 15 25 28 29 29 30 30 30 30 31 31 32 32
33 33 34 35 36 37 38 38 39 40 40 40 40 40 41
43 43 44 45 49 49 50
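The answers to questions 3 to 5 can be cross-checked in Python. This sketch makes two assumptions that are not stated in the worksheet text: quartiles are taken with `statistics.quantiles(..., method="exclusive")` (Python 3.8+), which interpolates at position (n + 1)p, and the fences follow the common 1.5 × IQR rule. Both choices reproduce the answer key for this data set. The list `test3` holds the 37 Test 3 scores.

```python
import statistics

test3 = [
    14, 14, 15, 25, 28, 29, 29, 30, 30, 30, 30, 31, 31, 32, 32,
    33, 33, 34, 35, 36, 37, 38, 38, 39, 40, 40, 40, 40, 40, 41,
    43, 43, 44, 45, 49, 49, 50,
]

# Quartiles by the (n + 1)p position rule (assumed to match the Minitab output).
q1, q2, q3 = statistics.quantiles(test3, n=4, method="exclusive")
five_number = (min(test3), q1, q2, q3, max(test3))

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr  # assumed 1.5 * IQR rule
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in test3 if x < lower_fence or x > upper_fence]

print(five_number)               # (14, 30.0, 35.0, 40.0, 50)
print(iqr)                       # 10.0
print(lower_fence, upper_fence)  # 15.0 55.0
print(outliers)                  # [14, 14]
```

Note that the score 15 sits exactly on the lower fence, so under this rule it is not counted as an outlier.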

[Figure: Individual Value Plot of Test 3. Axis: Test 3, 10 to 50.]

[Figure 1: Histogram of Test 3. x-axis: Test 3, 15 to 50; y-axis: Frequency, 0 to 12.]

[Figure: Boxplot of Test 3. Axis: Test 3, 10 to 50.]

Summary for Test 3

Anderson-Darling Normality Test
A-Squared	0.61
P-Value	0.103

Mean	34.784
StDev	8.798
Variance	77.396
Skewness	-0.629842
Kurtosis	0.627270
N	37

Minimum	14.000
1st Quartile	30.000
Median	35.000
3rd Quartile	40.000
Maximum	50.000

95% Confidence Interval for Mean	31.851 to 37.717
95% Confidence Interval for Median	31.101 to 39.899
95% Confidence Interval for StDev	7.154 to 11.428

[Figure: 95% Confidence Intervals for Mean and Median. Axis: 30 to 40.]

Chebyshev’s inequality says that at least $1 - 1/K^2$ of the data from a sample must fall within K standard deviations of the mean, where K is any real number greater than one. We can also state the inequality by replacing the phrase “data from a sample” with “probability distribution”. This is because Chebyshev’s inequality is a result from probability, which can then be applied to statistics.

To illustrate the inequality, we will look at it for a few values of K:

For K = 2 we have $1 - 1/K^2 = 1 - 1/4 = 3/4 = 75\%$. So Chebyshev’s inequality says that at least 75% of the data values of any distribution must be within two standard deviations of the mean.
For K = 3 we have $1 - 1/K^2 = 1 - 1/9 = 8/9 \approx 89\%$. So Chebyshev’s inequality says that at least 89% of the data values of any distribution must be within three standard deviations of the mean.
For K = 4 we have $1 - 1/K^2 = 1 - 1/16 = 15/16 = 93.75\%$. So Chebyshev’s inequality says that at least 93.75% of the data values of any distribution must be within four standard deviations of the mean.

Applying Chebyshev’s Theorem (Inequality):

Q1. According to Chebyshev’s theorem, at least what percentage of the data fall within 2 standard deviations on either side of the mean?

From Chebyshev’s theorem, the minimum percentage of values within k standard deviations of the mean is $(1 - 1/k^2) \times 100 = (1 - 1/2^2) \times 100 = (3/4) \times 100 = 75.0\%$.

Q2. What is the actual percentage of the data that fall within 2 standard deviations on either side of the mean?

Two standard deviations around the mean is 34.784 ± 2(8.798), the interval from 17.188 to 52.380. Only the scores 14, 14 and 15 fall outside it, so 34/37 of the data are within two standard deviations: (34/37) × 100 = 0.91892 × 100 = 91.892%

Q3. According to Chebyshev’s theorem, at least what percentage of the data fall within 3 standard deviations on either side of the mean?

From Chebyshev’s theorem, the minimum percentage of values within k standard deviations of the mean is $(1 - 1/k^2) \times 100 = (1 - 1/3^2) \times 100 = 0.888889 \times 100 = 88.89\%$, or 8/9.

Q4. What is the actual percentage of the data that fall within 3 standard deviations on either side of the mean?

Three standard deviations around the mean is 34.784 ± 3(8.798), the interval from 8.390 to 61.178. All the data are within 3 standard deviations: (37/37) × 100% = 100%...
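The comparison between Chebyshev's guaranteed minimum and the actual share of the Test 3 scores can be sketched in Python as follows. The function names `chebyshev_bound` and `fraction_within` are hypothetical, and the sample standard deviation (n - 1 divisor) is assumed, in line with the StDev of 8.798 reported in the summary.

```python
import statistics

test3 = [
    14, 14, 15, 25, 28, 29, 29, 30, 30, 30, 30, 31, 31, 32, 32,
    33, 33, 34, 35, 36, 37, 38, 38, 39, 40, 40, 40, 40, 40, 41,
    43, 43, 44, 45, 49, 49, 50,
]


def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's inequality is only informative for k > 1")
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Actual fraction of data lying within k sample standard deviations of the mean."""
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)
    inside = sum(1 for x in data if abs(x - x_bar) <= k * s)
    return inside / len(data)


for k in (2, 3):
    guaranteed = round(100 * chebyshev_bound(k), 2)
    actual = round(100 * fraction_within(test3, k), 3)
    print(k, guaranteed, actual)
# 2 75.0 91.892
# 3 88.89 100.0
```

The actual percentages exceed the Chebyshev minimums, as they must: the inequality is a lower bound that holds for any data set, so it is usually far from tight.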