Module 3 - Descriptive Statistics and Data Presentation PDF

Title Module 3 - Descriptive Statistics and Data Presentation
Author Amethyst Lee
Course Statistic
Institution Polytechnic University of the Philippines



Description

ACTIVITIES/ASSESSMENTS:

1. Which one do you think is more informative? Why?

I think that both graphs are informative, but if I were to choose one, it would be the graph on the left. It is easy to compare the different parts and understand them because it uses only two legend entries, whereas the graph on the right uses four, which could cause some confusion.

2. What feature of the ‘Good Presentation’ makes it better than the ‘Bad Presentation’?

The feature that makes the ‘Good Presentation’ better than the ‘Bad Presentation’ is its use of graphical presentation. A graph is an efficient visual tool: it displays data at a glance, facilitates comparison, and can reveal trends and relationships within the data, such as changes over time, correlation, or relative share of a whole.

3. Review the table and consider questions such as the following.

Origin / Rating   Poor   Needs Improvement   Satisfactory   Very Good   Excellent   Total
External          0%     2%                  12%            19%         8%          41%
Internal          4%     8%                  15%            23%         9%          59%
Grand Total       4%     10%                 27%            42%         17%         100%

A. What percentage of the employees originated from within the organization? 59%
B. What percentage of the employees are both internal and rated ‘Very Good’? 23%
C. What percentage of the employees received ‘Needs Improvement’ or ‘Poor’? 14% (10% ‘Needs Improvement’ plus 4% ‘Poor’)
D. What category contains the greatest number of employees? Internal employees rated ‘Very Good’ (23%).
E. Do you see any notable differences in the percentage by category? Even though there are more internal employees than external ones, the external employees are rated better: only 2% of all employees are external with one of the two lowest ratings, compared with 12% who are internal with one of those ratings.
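The answers to questions A to D follow from row totals, column totals, and single cells of the table. The sketch below shows one way to check them; the variable names are illustrative, and it assumes every cell is a share of all employees (which the 100% grand total suggests).

```python
# Joint percentages from the Origin / Rating table.
# Assumption: each cell is a percentage of ALL employees.
ratings = ["Poor", "Needs Improvement", "Satisfactory", "Very Good", "Excellent"]
table = {
    "External": [0, 2, 12, 19, 8],
    "Internal": [4, 8, 15, 23, 9],
}

# Row totals: share of employees by origin (question A).
row_totals = {origin: sum(cells) for origin, cells in table.items()}

# Column totals: share of employees by rating (the Grand Total row).
col_totals = {
    rating: sum(table[origin][i] for origin in table)
    for i, rating in enumerate(ratings)
}

# Share rated 'Needs Improvement' or 'Poor' (question C).
low_two = col_totals["Poor"] + col_totals["Needs Improvement"]

# Largest single cell as (origin, rating, percent) (question D).
largest = max(
    ((origin, ratings[i], pct)
     for origin, cells in table.items()
     for i, pct in enumerate(cells)),
    key=lambda cell: cell[2],
)

print(row_totals)
print(col_totals)
print(low_two)
print(largest)
```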

4. Consider the following Frequency Distribution of Salaries.

Salary               Frequency   Percentage
41,000 – 50,000      1           1%
51,000 – 60,000      20          13%
61,000 – 70,000      53          35%
71,000 – 80,000      43          29%
81,000 – 90,000      26          17%
91,000 – 100,000     6           4%
101,000 – 110,000    1           1%
Total                150         100%
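Questions about a frequency table like this one usually reduce to cumulative sums, the extremes of the class limits, and a search over the percentage column. A minimal sketch, assuming the class limits and frequencies exactly as tabulated (the variable names are illustrative):

```python
from itertools import accumulate

# (lower limit, upper limit) of each salary class, in ascending order.
classes = [
    (41_000, 50_000), (51_000, 60_000), (61_000, 70_000), (71_000, 80_000),
    (81_000, 90_000), (91_000, 100_000), (101_000, 110_000),
]
freqs = [1, 20, 53, 43, 26, 6, 1]
n = sum(freqs)

# Relative and cumulative percentages.
pct = [100 * f / n for f in freqs]
cum_pct = [100 * c / n for c in accumulate(freqs)]

# Share earning at most 80,000: cumulative percentage through the fourth class.
share_le_80k = cum_pct[3]

# Range: highest upper limit minus lowest lower limit.
salary_range = classes[-1][1] - classes[0][0]

# Classes holding less than 5% of employees, and the class with the most employees.
below_5 = [c for c, p in zip(classes, pct) if p < 5]
modal_class = classes[freqs.index(max(freqs))]

print(share_le_80k, salary_range, below_5, modal_class)
```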

A. What percentage of the employees earn less than or equal to 80,000? 78%
B. What is the range of the salary values? R = Xmax − Xmin = 110,000 − 41,000 = 69,000
C. What salary categories have a percentage less than 5? The categories 41,000 – 50,000, 91,000 – 100,000, and 101,000 – 110,000 each have a percentage less than 5.
D. What salary category includes the most employees? The 61,000 – 70,000 category includes the most employees.

5. The length of life of an instrument produced by a machine has a normal distribution with a mean of 12 months and a standard deviation of 2 months. Find the probability that an instrument produced by this machine will last
A. less than 7 months.

[...]

Given: X > 270, μ = 266, σ = 16
z = (X − μ) / σ = (270 − 266) / 16 = 0.25
P(X > 270) = P(Z > 0.25) = 0.4013
The proportion of pregnancies that last more than 270 days is 0.4013 or 40.13%.

B. What proportion of pregnancies lasts less than 250 days?

Given: X < 250, μ = 266, σ = 16
z = (X − μ) / σ = (250 − 266) / 16 = −1
P(X < 250) = P(Z < −1) = 0.1587
The proportion of pregnancies that last less than 250 days is 0.1587 or 15.87%.

C. What proportion of pregnancies lasts between 240 and 280 days?
Given: 240 ≤ X ≤ 280, μ = 266, σ = 16
z1 = (240 − 266) / 16 ≈ −1.63
z2 = (280 − 266) / 16 ≈ 0.88
P(240 ≤ X ≤ 280) = P(−1.63 ≤ Z ≤ 0.88) = 0.4484 + 0.3106 = 0.7590
The proportion of pregnancies that last between 240 and 280 days is 0.7590 or 75.9%.

D. What is the probability that a randomly selected pregnancy lasts more than 280 days? Be sure to draw a normal curve with the area corresponding to the probability shaded.
Given: X > 280, μ = 266, σ = 16
z = (280 − 266) / 16 ≈ 0.88
P(X > 280) = P(Z > 0.88) = 0.1894
The probability that a randomly selected pregnancy lasts more than 280 days is 0.1894 or 18.94%.
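The four pregnancy-length probabilities can be reproduced without a printed z-table. The sketch below is one possible check, not the method used in the answers: it builds the standard normal CDF from `math.erf` and rounds each z-score half-up to two decimals to mimic a table lookup (that rounding step is an assumption made so the results line up with the table-based values). The helper names `z_score` and `phi` are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP
from math import erf, sqrt

MU, SIGMA = 266, 16  # mean and standard deviation used in the worked answers


def z_score(x, mu=MU, sigma=SIGMA):
    """Standardize x, rounding half-up to two decimals like a z-table lookup."""
    z = Decimal(x - mu) / Decimal(sigma)
    return float(z.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


def phi(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))


p_more_270 = 1 - phi(z_score(270))                 # P(X > 270)
p_less_250 = phi(z_score(250))                     # P(X < 250)
p_240_280 = phi(z_score(280)) - phi(z_score(240))  # P(240 <= X <= 280)
p_more_280 = 1 - phi(z_score(280))                 # P(X > 280)

for label, p in [("P(X > 270)", p_more_270), ("P(X < 250)", p_less_250),
                 ("P(240 <= X <= 280)", p_240_280), ("P(X > 280)", p_more_280)]:
    print(f"{label} = {p:.4f}")
```

Without the two-decimal rounding of z, the last two probabilities shift slightly in the third decimal place, which is the usual gap between table lookups and exact evaluation.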

7. Construct a frequency distribution table based on the scores of 75 randomly selected students.

Scores      Frequency   Percentage
26 to 30    13          17.33%
31 to 35    10          13.33%
36 to 40    16          21.33%
41 to 45    18          24.00%
46 to 50    18          24.00%
Total       75          100.00%

A. Based on the frequency distribution, compute measures of central tendency, measures of variation, Q1, D9, P10, skewness, and kurtosis.

Measure of Central Tendency: MEAN

Class Interval   Frequency (f)   x     fx
46 – 50          18              48    864
41 – 45          18              43    774
36 – 40          16              38    608
31 – 35          10              33    330
26 – 30          13              28    364
Total            n = 75                2,940

x̄ = Σfx / n = 2,940 / 75 = 39.2

Measure of Central Tendency: MEDIAN AND MODE

Class Interval   Frequency (f)   LB      < cf
46 – 50          18              45.5    75
41 – 45          18              40.5    57
36 – 40          16              35.5    39
31 – 35          10              30.5    23
26 – 30          13              25.5    13
Total            n = 75

n / 2 = 75 / 2 = 37.5, so the median class is 36 – 40.
Median = 35.5 + ((37.5 − 23) / 16)(5) = 40.03
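The grouped mean and median above can be checked with a few lines of code. This is a minimal sketch under the same conventions as the worked answer: class midpoints stand in for the raw scores, the lower boundary is the lower limit minus 0.5, and the class width is 5. The function name `grouped_median` is illustrative.

```python
# Classes in ascending order: (lower limit, upper limit, frequency).
classes = [(26, 30, 13), (31, 35, 10), (36, 40, 16), (41, 45, 18), (46, 50, 18)]
WIDTH = 5

n = sum(f for _, _, f in classes)

# Grouped mean: sum of f * midpoint, divided by n.
grouped_mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_median(classes, width):
    """Median = LB + ((n/2 - cf_before) / f_median) * width."""
    total = sum(f for _, _, f in classes)
    target = total / 2
    cum_before = 0
    for lo, _, f in classes:
        if cum_before + f >= target:
            lower_boundary = lo - 0.5
            return lower_boundary + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("empty frequency table")


median = grouped_median(classes, WIDTH)
print(round(grouped_mean, 2), round(median, 2))  # 39.2 40.03
```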

Class Interval   Frequency (f)   LB      < cf
46 – 50          18              45.5    75
41 – 45          18              40.5    57
36 – 40          16              35.5    39
31 – 35          10              30.5    23
26 – 30          13              25.5    13
Total            n = 75

The modal class is 41 – 45.
d1 = 18 − 16 = 2
d2 = 18 − 18 = 0
Mode = 40.5 + (2 / (2 + 0))(5) = 45.5

Measure of Variation:
R = Xmax − Xmin = 50 − 26 = 24
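The grouped mode formula, Mode = LB + (d1 / (d1 + d2)) × width, can be sketched as below. Two classes tie for the highest frequency (18), so the sketch assumes the lower of the tied classes is taken as the modal class, which matches the lower boundary of 40.5 used in the worked answer. The function name `grouped_mode` is illustrative.

```python
# Classes in ascending order: (lower limit, upper limit, frequency).
classes = [(26, 30, 13), (31, 35, 10), (36, 40, 16), (41, 45, 18), (46, 50, 18)]
WIDTH = 5


def grouped_mode(classes, width):
    """Mode = LB + d1 / (d1 + d2) * width for the modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # assumption: the first (lowest) tied class wins
    prev_f = freqs[i - 1] if i > 0 else 0
    next_f = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1 = freqs[i] - prev_f  # modal frequency minus the class below it
    d2 = freqs[i] - next_f  # modal frequency minus the class above it
    if d1 + d2 == 0:
        raise ValueError("mode is undefined when d1 + d2 is zero")
    lower_boundary = classes[i][0] - 0.5
    return lower_boundary + d1 / (d1 + d2) * width


mode = grouped_mode(classes, WIDTH)

# Range: highest upper limit minus lowest lower limit.
score_range = classes[-1][1] - classes[0][0]

print(mode, score_range)  # 45.5 24
```

Because d2 is 0 here, the formula places the mode at the upper boundary of the modal class, which is a known side effect of a tie between adjacent classes.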

Class Interval   Frequency (f)   x     fx       (x − x̄)²   f(x − x̄)²
46 – 50          18              48    864      77.44       1,393.92
41 – 45          18              43    774      14.44       259.92
36 – 40          16              38    608      1.44        23.04
31 – 35          10              33    330      38.44       384.40
26 – 30          13              28    364      125.44      1,630.72
Total            n = 75                2,940                3,692.00

Q1, D9, P10

Class Interval   Frequency (f)   LB      < cf
46 – 50          18              45.5    75
41 – 45          18              40.5    57
36 – 40          16              35.5    39
31 – 35          10              30.5    23
26 – 30          13              25.5    13
Total            n = 75

n / 4 = 18.75, so the quartile class for Q1 is 31 to 35.
Q1 = 30.5 + ((18.75 − 13) / 10)(5) = 33.38

The decile class for D9 is 46 to 50.

10n / 100 = 7.5, so the percentile class for P10 is 26 to 30.
P10 = 25.5 + ((7.5 − 0) / 13)(5) = 28.38

The quartile class for Q3 is 41 to 45.

The percentile class for P90 is 46 to 50.
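Quartiles, deciles, and percentiles of grouped data all use the same interpolation as the median, with n/2 replaced by k·n/parts. The sketch below applies that single formula to the table; the function name `grouped_quantile` is illustrative. The values it produces for D9, Q3, and P90 are computed by this sketch and are not taken from the worked answers, which give only Q1 and P10. It also recomputes the sum of f(x − x̄)² from the deviations table.

```python
# Classes in ascending order: (lower limit, upper limit, frequency).
classes = [(26, 30, 13), (31, 35, 10), (36, 40, 16), (41, 45, 18), (46, 50, 18)]
WIDTH = 5


def grouped_quantile(classes, width, k, parts):
    """k-th of `parts` divisions: parts=4 quartiles, 10 deciles, 100 percentiles."""
    n = sum(f for _, _, f in classes)
    target = k * n / parts
    cum_before = 0
    for lo, _, f in classes:
        if cum_before + f >= target:
            lower_boundary = lo - 0.5
            return lower_boundary + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("k / parts must lie between 0 and 1")


q1 = grouped_quantile(classes, WIDTH, 1, 4)
p10 = grouped_quantile(classes, WIDTH, 10, 100)
d9 = grouped_quantile(classes, WIDTH, 9, 10)     # not given in the worked answers
q3 = grouped_quantile(classes, WIDTH, 3, 4)      # not given in the worked answers
p90 = grouped_quantile(classes, WIDTH, 90, 100)  # not given in the worked answers

# Sum of f * (midpoint - mean)^2, the total of the last column of the table.
n = sum(f for _, _, f in classes)
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n
sum_sq_dev = sum(f * ((lo + hi) / 2 - mean) ** 2 for lo, hi, f in classes)

print(round(q1, 2), round(p10, 2))  # 33.38 28.38
print(round(d9, 2), round(q3, 2), round(p90, 2))
print(round(sum_sq_dev, 2))
```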

B. Based on the raw data, compute measures of central tendency, measures of variation, skewness, and kurtosis using Excel.

Scores
Mean                  38.94666667
Standard Error        0.82306685
Median                39
Mode                  46
Standard Deviation    7.127968008
Sample Variance       50.80792793
Kurtosis              -1.047697133
Skewness              -0.304175895
Range                 24
Minimum               26
Maximum               50
Sum                   2921
Count                 75

C. Compute skewness and kurtosis of grouped and ungrouped data. Make sure to describe the shape of the distribution.

UNGROUPED DATA (using Excel):
Skewness (Sk): -0.304175895
Kurtosis (k): -1.047697133

The distribution is negatively skewed (skewed left) because the mean and the median are less than the mode. It is platykurtic: the kurtosis that Excel reports is excess kurtosis (measured relative to the normal distribution's value of 3), and it is negative, so the kurtosis is less than 3. Its tails are shorter and thinner, and its central peak is often lower and broader, compared to a normal distribution.

GROUPED DATA:
Q1 = 33.38
P10 = 28.38
The quartile class for Q3 is 41 to 45.
The percentile class for P90 is 46 to 50.

The distribution is negatively skewed (skewed left) because the mean and the median are less than the mode. It is platykurtic since the computed kurtosis is less than 3. Its tails are shorter and thinner, and its central peak is often lower and broader, compared to a normal distribution.

D. Do you think that the computed values for grouped and ungrouped data are the same?
For the measures of central tendency, yes: the values computed from the grouped and ungrouped data are approximately the same (mean 39.2 vs. 38.95, median 40.03 vs. 39, mode 45.5 vs. 46). For the measures of variation, no: there is a slight difference between the grouped and ungrouped results. For skewness and kurtosis, no: there is a big difference based on what I have computed.

8. Begin with the following set of data, call it Data Set I: 5, −2, 6, 14, −3, 0, 1, 4, 3, 2, 5

Data Set I       5   -2    6   14   -3    0    1    4    3    2    5
Data Set II      8    1    9   17    0    3    4    7    6    5    8
Data Set III    -1   -8    0    8   -9   -6   -5   -2   -3   -4   -1

A. Compute the sample standard deviation and sample mean of Data Set I.

Data Set I
Standard Deviation    4.622081389
Mean                  3.181818182

B. Form a new data set, Data Set II, by adding 3 to each number in Data Set I. Calculate the sample standard deviation and sample mean of Data Set II.

Data Set II
Standard Deviation    4.622081389
Mean                  6.181818182

C. Form a new data set, Data Set III, by subtracting 6 from each number in Data Set I. Calculate the sample standard deviation and sample mean of Data Set III.

Data Set III
Standard Deviation    4.622081389
Mean                  -2.818181818

D. Comparing the answers to parts (a), (b), and (c), can you guess the pattern? State the general principle that you expect to be true.
The standard deviation remains the same, because the same amount is added to or subtracted from every value in the data set, and the standard deviation measures how far each value is from the mean. The mean changes by the same amount that has been added to or subtracted from the set....
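The pattern in part D can be checked numerically. The sketch below uses Python's standard `statistics` module in place of Excel (a substitution, not the tool used in the answers) and builds Data Sets II and III by shifting Data Set I by +3 and −6.

```python
from statistics import mean, stdev

data_1 = [5, -2, 6, 14, -3, 0, 1, 4, 3, 2, 5]
data_2 = [x + 3 for x in data_1]  # Data Set II: add 3 to each value
data_3 = [x - 6 for x in data_1]  # Data Set III: subtract 6 from each value

for name, data in [("I", data_1), ("II", data_2), ("III", data_3)]:
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"Data Set {name}: mean = {mean(data):.9f}, stdev = {stdev(data):.9f}")
```

Adding a constant c to every value shifts the mean by c but leaves every deviation from the mean unchanged, so the sample standard deviation is the same for all three sets.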

