Chapter 6Chapter Complete - Mathematic GEC4 PDF

Title Chapter 6Chapter Complete - Mathematic GEC4
Course Mathematics in the Modern World
Institution President Ramon Magsaysay State University
Pages 38

Mathematics in the Modern World

Chapter 6: THE STATISTICAL TOOLS

Introduction

Statistics involves the collection, organization, summarization, presentation, and interpretation of data. It has two branches: descriptive statistics and inferential statistics.

Descriptive statistics is the analysis of data that helps describe, show, or summarize data in a meaningful way. When using descriptive statistics, it is useful to summarize a group of data using a combination of tabulated description (i.e., tables), graphical description (i.e., graphs and charts), and statistical commentary (i.e., a discussion of the results).

The branch that allows us to make predictions ("inferences") from the data is called inferential statistics. Inferential statistics takes data from samples and makes generalizations about a population. For instance, you might stand in a mall and ask a sample of 100 people if they like shopping at SM. You could make a bar chart of the yes or no answers (that would be descriptive statistics), or you could use your research (and inferential statistics) to reason that around 75-80% of the population (all shoppers in all malls) like shopping at SM.

Testing the significance of the difference between two means, two standard deviations, two proportions, or two percentages is an important area of inferential statistics. Comparisons between two or more variables often arise in research or experiments, and to make valid conclusions regarding the results of a study, one has to apply an appropriate test statistic. This chapter discusses the different test statistics that are commonly used in research studies under inferential statistics.

Learning Objectives

At the end of this chapter, the student is expected to:
• apply a variety of statistical tools to process and manage numerical data;
• use the methods of linear regression and correlation to predict the value of a variable given certain conditions; and
• recognize the importance of testing hypotheses in making decisions.
Duration

Topic 1: Testing of Hypothesis = 6 hours
Topic 2: Correlation and Regression Analysis = 3 hours

Lesson Proper

6.1 Hypothesis Testing

In Statistics, decision-making starts with a concern about a population regarding its characteristics, denoted by parameter values. We might be interested in a population parameter like the mean or the proportion. For instance, suppose you are deciding to put up a business selling cars. Your first step before spending money on the business is to know which car sells the most these days. Before you open a business selling Toyota, Mitsubishi, Hyundai, Honda, Nissan, or Suzuki, you need to gather information on which among these gets the most sales. How many existing distributors of these cars are out there? Do you want to compete? To answer these questions, you need to gather data. What type of data? And where will you get them? You simply need to do a survey. These concerns can be addressed by a procedure in Statistics called hypothesis testing.

Hypothesis

A hypothesis is a conjecture or statement which aims to explain certain phenomena in the real world. Many hypotheses, statistical or not, are products of man's curiosity. To seek answers to his questions, he tries to find and present evidence, then tests the resulting hypothesis using statistical tools and analysis. In statistical analysis, the truth of a hypothesis will be either accepted or rejected within a certain critical interval.

The hypothesis that is subjected to testing to determine whether its truth can be accepted or rejected is the null hypothesis, denoted by Ho. This hypothesis states that there is no significant relationship or no significant difference between two or more variables, or that one variable does not affect another variable. In statistical research, the hypotheses should be written in null form. For example, suppose you want to know whether method A is more effective than method B in teaching high school mathematics. The null hypothesis for this study will be: "There is no significant difference between the effectiveness of method A and method B."

Another type of hypothesis is the alternative hypothesis, denoted by Ha. This is the hypothesis that challenges the null hypothesis. The alternative hypothesis for the example above can be: "There is a significant difference between the effectiveness of method A and method B," or "Method A is more effective than method B," or "Method A is less effective than method B," depending on whether the test is one-tailed or two-tailed. These will be discussed in the succeeding lessons.
Significance Level

To test the null hypothesis of no significant difference between the two methods in the above example, one must first set the level of significance. This is the probability of committing a Type I error and is denoted by the symbol $\alpha$. A Type I error is committed when the null hypothesis is rejected (and the alternative hypothesis accepted) when, in fact, the null hypothesis is true. Accepting the null hypothesis when, in fact, it is false is called a Type II error, and its probability is denoted by the symbol $\beta$. The most common level of significance is 5%.

Table 1. Four Possible Outcomes in Decision-Making

                                      Reality
Decision about Ho                     Ho is true          Ho is false
Reject Ho                             Type I error        Correct decision
Do not reject Ho (or accept Ho)       Correct decision    Type II error

If the null hypothesis is true and accepted, or if it is false and rejected, the decision is correct. If the null hypothesis is true and rejected, the decision is incorrect and this is a Type I error. If the null hypothesis is false and accepted, the decision is incorrect and this is a Type II error.

For instance, Sarah insists that she is 31 years old when, in fact, she is 35 years old. What error is Sarah committing? Sarah is rejecting the truth. She is committing a Type I error. As another example, a man plans to go hunting the Philippine monkey-eating eagle, believing that it is proof of his mettle. What type of error is this? Hunting the Philippine eagle is prohibited by law. Thus, it is not a good sport. The man is accepting a belief that is false, so it is a Type II error. Since hunting the Philippine monkey-eating eagle is against the law, the man may find himself in jail if he goes out of his way to hunt an endangered species.

In the decisions that we make, we form conclusions, and these conclusions are the bases of our actions. But this is not always the case in Statistics, because we make decisions based on sample information. The best that we can do is to control the probability with which an error occurs. This is the reason why we assign small probability values to each type of error.

One-Tailed and Two-Tailed Tests

A test is called a one-tailed test if the rejection region lies on one extreme side of the distribution, and a two-tailed test if the rejection region is located on both ends of the distribution.

Figure 1. Two-tailed (A) and One-tailed (B & C) tests

In Figure 1.A (two-tailed), the rejection region consists of the areas to the extreme left and right of the curve, marked by the two vertical lines. In Figures 1.B and 1.C (both one-tailed), the rejection region is the area to the left (left tail) and to the right (right tail) of the vertical line under the bell curve, respectively.

Steps in Testing Hypothesis

Below are the steps when testing the truth of a hypothesis.
1. Formulate the null hypothesis. Denote it as Ho and the alternative hypothesis as Ha.
2. Set the desired level of significance ($\alpha$).
3. Determine the appropriate test statistic to be used in testing the null hypothesis.
4. Compute the value of the statistic to be used.
5. Compute the degrees of freedom.
6. Find the tabular value using the table of values for the different tests in the appendix tables.
7. State the Decision Rule: If the absolute value of the computed value is less than the absolute value of the tabular value, accept the null hypothesis. If it is greater than or equal to the absolute value of the tabular value, reject the null hypothesis.
8. Compare the computed value to the tabular value. Make a conclusion using the result of the comparison.

Degree of Freedom (df)

The degree of freedom gives the number of pieces of independent information available for computing variability. For any statistical tool used in testing a hypothesis, the number of degrees of freedom will vary depending on the sample size. For a single group, the number of degrees of freedom is N – 1, where N is the number of observations in the group. For two groups, the formula for df is N1 + N2 – 2 for the t-test and N – 2 for Pearson r. These test statistics will be discussed later in this chapter.

6.1.1 Tests Concerning Means

6.1.1.1 z-test on the Comparison between the Population Mean and the Sample Mean

If the population mean ($\mu$) and the population standard deviation ($\sigma$) are known, and $\mu$ will be compared to a sample mean ($\bar{x}$), use the formula below.

$$z = \frac{(\bar{x} - \mu)}{\sigma} \cdot \sqrt{n}$$

where $n$ is the sample size.

The tabular values of $z$ can be obtained from the following table:

Table 2. Summary Table of Critical Values

                     Level of Significance
Test Type            0.10       0.05       0.025      0.01
One-tailed Test      ±1.28      ±1.645     ±1.96      ±2.33
Two-tailed Test      ±1.645     ±1.96      ±2.24      ±2.58

Decision Rule: Reject Ho if |𝑧| ≥ |𝑧𝑡𝑎𝑏𝑢𝑙𝑎𝑟|.
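The critical values of z can also be computed instead of read from a table. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical and not part of the chapter.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed):
    """Return the positive critical z value for significance level alpha.

    Hypothetical helper for illustration only.
    """
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    # inv_cdf gives the z value with the stated area to its left.
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.025, 0.01):
    one = z_critical(alpha, two_tailed=False)
    two = z_critical(alpha, two_tailed=True)
    print(f"alpha={alpha:<5} one-tailed=±{one:.3f} two-tailed=±{two:.3f}")
```

Printed tables usually round these values to two or three decimal places.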

Example 1

A company that makes a battery-operated toy car claims that its products have a mean life span of 5 years with a standard deviation of 2 years. Test the null hypothesis that $\mu = 5$ years against the alternative hypothesis that $\mu \neq 5$ years if a random sample of 40 toy cars was tested and found to have a mean life span of only 3 years. Use a 5% level of significance.

Solution:
1. Ho: The mean lifespan of battery-operated toy cars is 5 years. ($\mu = 5$)
   Ha: The mean lifespan of battery-operated toy cars is not 5 years. ($\mu \neq 5$)
2. $\alpha = 0.05$, two-tailed
3. Use the z-test as test statistic.
4. Computation: Given $\bar{x} = 3$, $\mu = 5$, $n = 40$, $\sigma = 2$

   $$z = \frac{(\bar{x} - \mu)}{\sigma} \cdot \sqrt{n} = \frac{(3 - 5)}{2} \cdot \sqrt{40} = -6.32$$

5. Critical Value: $z < -1.96$ or $z > 1.96$
6. Decision Rule: Reject Ho if $|z| \geq |z_{tabular}|$.
7. Since the computed $|z|$, which is 6.32, is greater than $|z_{tabular}|$, which is 1.96, reject Ho. Hence, there is a significant difference between the population and sample mean lifespan of battery-operated toy cars.

Example 2

A manufacturer of bicycle tires has developed a new design which he claims has an average lifespan of 5 years with a standard deviation of 1.2 years. A dealer of the product claims that the average lifespan of a sample of 150 tires is only 3.5 years. Test the difference between the population and sample means at the 5% level of significance.

Solution:
1. Ho: There is no significant difference between the population and sample mean of bicycle tires' lifespan. ($\bar{x} = \mu$)
   Ha: There is a significant difference between the population and sample mean of bicycle tires' lifespan. ($\bar{x} < \mu$)
2. $\alpha = 0.05$, one-tailed, left tail
3. Use the z-test as test statistic.
4. Computation: Given $\bar{x} = 3.5$, $\mu = 5$, $n = 150$, $\sigma = 1.2$

   $$z = \frac{(\bar{x} - \mu)}{\sigma} \cdot \sqrt{n} = \frac{(3.5 - 5)}{1.2} \cdot \sqrt{150} = -15.31$$

5. Critical Value: $z < -1.645$
6. Decision Rule: Reject Ho if $|z| \geq |z_{tabular}|$.
7. Since the computed $|z|$, which is 15.31, is greater than $|z_{tabular}|$, which is 1.645, reject Ho. Hence, there is a significant difference between the population and sample mean of bicycle tires' lifespan.
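The two z-test computations above can be checked with a short script. This is only a sketch using Python's standard library; the function names `one_sample_z` and `reject_h0` are hypothetical, and the critical values 1.96 and 1.645 are the ones used in the examples.

```python
from math import sqrt


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z = (x_bar - mu) / sigma * sqrt(n), as in the formula of this section."""
    return (sample_mean - pop_mean) / pop_sd * sqrt(n)


def reject_h0(statistic, critical):
    """Decision rule: reject Ho if |computed| >= |tabular|."""
    return abs(statistic) >= abs(critical)


# Example 1: toy cars, two-tailed test at the 5% level (critical value 1.96)
z_toy = one_sample_z(sample_mean=3, pop_mean=5, pop_sd=2, n=40)
print(round(z_toy, 2), reject_h0(z_toy, 1.96))

# Example 2: bicycle tires, one-tailed test at the 5% level (critical value 1.645)
z_tire = one_sample_z(sample_mean=3.5, pop_mean=5, pop_sd=1.2, n=150)
print(round(z_tire, 2), reject_h0(z_tire, 1.645))
```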

6.1.1.2 t-test on the Comparison between the Population Mean and the Sample Mean

The t-test can be used to compare the means when the population mean ($\mu$) is known but the population standard deviation ($\sigma$) is unknown. When the population standard deviation is unknown but the sample standard deviation ($s$) can be computed, the t-test is used instead of the z-test. The formula is given below:

$$t = \frac{(\bar{x} - \mu)}{s} \cdot \sqrt{n}$$

The quantity $s / \sqrt{n}$, which is $s$ divided by $\sqrt{n}$, is called the standard error of the statistic. It is the standard deviation of the sampling distribution of the statistic for random samples of size $n$.

Decision Rule: Reject Ho if $|t| \geq |t_{tabular}|$.

Example 1

The average length of time for people to vote using the old procedure during a presidential election period in precinct A is 55 minutes. Using computerization as a new election method, a random sample of 20 registrants was used and found to have a mean length of voting time of 30 minutes with a standard deviation of 1.5 minutes. Test the significance of the difference between the population mean and the sample mean.

Solution:
1. Ho: There is no significant difference between the population and sample mean length of time for people to vote using the old and new procedures. ($\bar{x} = \mu$)
   Ha: There is a significant difference between the population and sample mean length of time for people to vote using the old and new procedures. ($\bar{x} < \mu$)
2. $\alpha = 0.05$, one-tailed, left tail
3. Use the t-test as test statistic.
4. Computation: Given $\bar{x} = 30$, $\mu = 55$, $n = 20$, $s = 1.5$

   $$t = \frac{(\bar{x} - \mu)}{s} \cdot \sqrt{n} = \frac{(30 - 55)}{1.5} \cdot \sqrt{20} = -74.54$$

5. df = n – 1 = 20 – 1 = 19
6. Tabular Value: t = 1.729 (from Appendix)
7. Decision Rule: Reject Ho if $|t| \geq |t_{tabular}|$.
8. Since the computed $|t|$, which is 74.54, is greater than $|t_{tabular}|$, which is 1.729, reject Ho. Hence, there is a significant difference between the population and sample mean length of time for people to vote using the old and new procedures. It implies that the computerized method of election gives a shorter voting time compared to the old procedure.

Example 2

An experimental study was conducted by a researcher to determine if a new time slot has an effect on the performance of pupils in Mathematics. Fifteen randomly selected learners participated in the study. Toward the end of the investigation, a standardized assessment was conducted. The sample mean was 85 and the standard deviation was 3. In the standardization of the test, the mean was 75 and the standard deviation was 10. Based on the evidence at hand, is the new time slot effective? Use a 5% level of significance.

Solution:
1. Ho: There is no significant difference between the population and sample mean of performance in Mathematics in a new time slot. ($\bar{x} = \mu$)
   Ha: There is a significant difference between the population and sample mean of performance in Mathematics in a new time slot. ($\bar{x} > \mu$)
2. $\alpha = 0.05$, one-tailed, right tail
3. Use the t-test as test statistic.
4. Computation: Given $\bar{x} = 85$, $\mu = 75$, $n = 15$, $s = 3$

   $$t = \frac{(\bar{x} - \mu)}{s} \cdot \sqrt{n} = \frac{(85 - 75)}{3} \cdot \sqrt{15} = 12.91$$

5. df = n – 1 = 15 – 1 = 14
6. Tabular Value: t = 1.761 (from Appendix)
7. Decision Rule: Reject Ho if $|t| \geq |t_{tabular}|$.
8. Since the computed $|t|$, which is 12.91, is greater than $|t_{tabular}|$, which is 1.761, reject Ho. Hence, there is a significant difference between the population and sample mean of performance in Mathematics in a new time slot. It implies that changing the time slot has an effect on the students' performance in Mathematics.
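The one-sample t computations in the two examples above can be reproduced as follows. This is a sketch only: the function name `one_sample_t` is hypothetical, and because Python's standard library has no t-distribution, the tabular values are taken from the examples rather than computed.

```python
from math import sqrt


def one_sample_t(sample_mean, pop_mean, sample_sd, n):
    """Return (t, df) where t = (x_bar - mu) / (s / sqrt(n)) and df = n - 1."""
    standard_error = sample_sd / sqrt(n)
    t = (sample_mean - pop_mean) / standard_error
    return t, n - 1


# Example 1: voting time (tabular t = 1.729 at df = 19, from the example)
t_vote, df_vote = one_sample_t(sample_mean=30, pop_mean=55, sample_sd=1.5, n=20)
print(round(t_vote, 2), df_vote, abs(t_vote) >= 1.729)

# Example 2: new time slot (tabular t = 1.761 at df = 14, from the example)
t_slot, df_slot = one_sample_t(sample_mean=85, pop_mean=75, sample_sd=3, n=15)
print(round(t_slot, 2), df_slot, abs(t_slot) >= 1.761)
```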

6.1.1.3 t-test Concerning Means of Independent Samples

When two samples are drawn from normally distributed populations with the assumption that their variances are equal, the t-test with the given formula should be used.

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]\left[\frac{n_1 + n_2}{n_1 n_2}\right]}}$$

where
$\bar{x}_1, \bar{x}_2$ = means
$n_1, n_2$ = sample sizes
$s_1^2, s_2^2$ = variances ($s_1, s_2$ are the standard deviations)

Decision Rule: Reject Ho if $|t| \geq |t_{tabular}|$.
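The formula above can be sketched in code as shown below. The function name `pooled_t` is hypothetical, and the inputs are the summary figures of the Physics example that follows in this section (means 82 and 78, standard deviations 5 and 6, sample sizes 10 and 11).

```python
from math import sqrt


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """Return (t, df) for two independent samples, assuming equal variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    std_err = sqrt(pooled_var * (n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / std_err, df


t_physics, df_physics = pooled_t(mean1=82, mean2=78, sd1=5, sd2=6, n1=10, n2=11)
print(round(t_physics, 2), df_physics)
```

The function takes standard deviations and squares them internally, which matches how the example states its data.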

Example 1

A course in Physics was taught to 10 students using the traditional method (method A). Another group of 11 students went through the same course using another method (method B). At the end of the semester, the same test was administered to each group. The 10 students under method A got an average of 82 with a standard deviation of 5, while the 11 students under method B got an average of 78 with a standard deviation of 6. Test the null hypothesis of no significant difference in the performance of the two groups of students at the 5% level of significance.

Solution:
1. Ho: There is no significant difference between the average scores of the two groups of students. ($\bar{x}_1 = \bar{x}_2$)
   Ha: There is a significant difference between the average scores of the two groups of students. ($\bar{x}_1 > \bar{x}_2$)
2. $\alpha = 0.05$, one-tailed, right tail
3. Use the t-test as test statistic.
4. Computation: Given $\bar{x}_1 = 82$, $\bar{x}_2 = 78$, $n_1 = 10$, $n_2 = 11$, $s_1 = 5$, $s_2 = 6$

   $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]\left[\frac{n_1 + n_2}{n_1 n_2}\right]}}$$

   $$= \frac{82 - 78}{\sqrt{\left[\frac{(10 - 1)(5)^2 + (11 - 1)(6)^2}{10 + 11 - 2}\right]\left[\frac{10 + 11}{(10)(11)}\right]}}$$

   $$= \frac{4}{\sqrt{\left[\frac{(9)(25) + (10)(36)}{19}\right]\left[\frac{21}{110}\right]}} = \frac{4}{2.4245} = 1.65$$

5. df = $n_1 + n_2 - 2$ = 10 + 11 – 2 = 19
6. Tabular Value: t = 1.729 (from Appendix)
7. Decision Rule: Reject Ho if $|t| \geq |t_{tabular}|$.
8. Since the computed $|t|$, which is 1.65, is less than $|t_{tabular}|$, which is 1.729, accept Ho. Hence, there is no significant difference between the average scores of the two groups of students. It implies that method A and method B do not differ significantly in their effect on the students' performance in Physics.

6.1.1.4 t-test on the Significance of the Difference Between Two Correlated Means

When comparing two correlated means, the t-test is the appropriate statistic. A typical example is comparing the results of a pre-test and a post-test administered to the same group of individuals. The two tests must be the same, and the given formula should be used.

$$t = \frac{\sum d}{\sqrt{\frac{n \sum d^2 - \left(\sum d\right)^2}{n - 1}}}$$

where
$d$ = difference between the pre-test and post-test scores
$n$ = number of samples

Example 1

To determine whether the students' performance in College Algebra improved after enrolling in the subject for one term, a 60-item pre-test and post-test were administered to them on the first and the last days of classes, respectively. The same test was given as pre-test and post-test. The results are as follows:

Student    Pre-Test Score    Post-Test Score    d       d²
A          34                45                 -11     121
B          23                32                 -9      81
C          40                46                 -6      36
D          31                57                 -26     676
E          24                39                 -15     225
F          45                48                 -3      9
G          27                27                 0       0
H          32                33                 -1      1
I          12                18                 -6      36
J          45                45                 0       0

$\sum d = -77$
$\sum d^2 = 1{,}185$

Solution:
1. Ho: There is no significant difference between the pre-test and post-test of the students' performance in College Algebra. ($\mu_1 = \mu_2$)
   Ha: There is a significant difference between the pre-test and post-test of the students' performance in College Algebra. ($\mu_1 < \mu_2$)
2. $\alpha = 0.05$, one-tailed, left tail
3. Use the t-test as test statistic.
4. Computation:

   $$t = \frac{\sum d}{\sqrt{\frac{n \sum d^2 - \left(\sum d\right)^2}{n - 1}}} = \frac{-77}{\sqrt{\frac{10(1{,}185) - (-77)^2}{10 - 1}}} = \frac{-77}{\sqrt{\frac{5{,}921}{9}}} = \frac{-77}{25.65} = -3.002$$

5. df = n – 1 = 10 – 1 = 9
6. Tabular Value: t = 1.833 (from Appendix)
7. Decision Rule: Reject Ho if $|t| \geq |t_{tabular}|$.
8. Since the computed $|t|$, which is 3.002, is greater than $|t_{tabular}|$, which is 1.833, reject Ho. Hence, there is a significant difference between the pre-test and post-test of the students' performance in College Algebra. It implies that the performance of the students in Algebra significantly improved.
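The paired computation above can be verified directly from the pre-test and post-test scores in the table. This is a minimal sketch; the function name `paired_t` is hypothetical, and the differences are taken as pre-test minus post-test, following the table.

```python
from math import sqrt

# Scores of students A to J, copied from the table of results
pre = [34, 23, 40, 31, 24, 45, 27, 32, 12, 45]
post = [45, 32, 46, 57, 39, 48, 27, 33, 18, 45]


def paired_t(before, after):
    """Return (t, df, sum_d, sum_d2) with d = before - after for each pair."""
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    sum_d = sum(d)
    sum_d2 = sum(x * x for x in d)
    # t = sum(d) / sqrt((n * sum(d^2) - (sum d)^2) / (n - 1))
    t = sum_d / sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    return t, n - 1, sum_d, sum_d2


t_algebra, df_algebra, sum_d, sum_d2 = paired_t(pre, post)
print(sum_d, sum_d2, round(t_algebra, 3), df_algebra)
```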

6.1.1.5 z-test on the Significance of the Difference Between Two Independent Proportions

There are certain situations when the data to be analyzed involve population proportions or percentages. For instance, a shoe company may want to know the proportion of defective shoes to be delivered to other countries. To determine if there is a significant difference between the proportions of two variables, the z-test will be used.

$$z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$

where
$p_1$ = proportion of first sample
$p_2$ = proportion of second sample
$q_1 = 1 - p_1$
$q_2 = 1 - p_2$
$n_1$ = number of cases in the first sample
$n_2$ = number of cases in the ...

Example 1

