1743 Chapter 5 Estimation and Hypothesis Testing PDF

Title 1743 Chapter 5 Estimation and Hypothesis Testing
Course Mathematics
Institution Tunku Abdul Rahman University College
Pages 25



Description

BAMS1743 QUANTITATIVE METHODS
CHAPTER 5: ESTIMATION AND HYPOTHESIS TESTING

Estimation of parameters
• The statistical technique of estimating unknown population parameters based on the value of the corresponding sample statistic.

The estimation procedure involves the following steps:
1. Select a sample.
2. Collect the required information from the members of the sample.
3. Calculate the value of the sample statistic.
4. Assign value(s) to the corresponding population parameter.

Estimate
• The value(s) assigned to a population parameter based on the value of a sample statistic.

Estimator
• The sample statistic that is used to estimate a population parameter.

Two types of estimates
1. Point estimate
• The value (a single number) of a sample statistic that is used to estimate a population parameter.
• Example: $\hat{\mu} = \bar{x} = 77$, $\hat{\sigma}^2 = s^2 = 6$
2. Interval estimate / confidence interval
• An estimate of a population parameter given by two numbers between which the parameter may be considered to lie.
• An interval that is constructed with a given confidence level.
• Example: $66 < \mu < 88$

The confidence level associated with a confidence interval states how much confidence we have that the interval contains the true population parameter. The confidence level is denoted by $(1 - \alpha)100\%$.

Consider a population with unknown parameter θ. If we can find an interval (a, b) such that P(a < θ < b) = 0.95, we say that (a, b) is a 95% confidence interval for θ. In this case, 0.95 is the probability that the interval includes θ.

Point estimate
An estimate of a population parameter given by a single value and calculated from sample data is called a point estimate of the population parameter.

(a) $\bar{x}$ is a point estimate for $\mu$.

(b) $s$ is a point estimate for $\sigma$, where

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \quad\text{or}\quad s = \sqrt{\frac{n}{n - 1}\left[\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2\right]} \quad\text{(for ungrouped data)}$$

$$s = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f - 1}} \quad\text{or}\quad s = \sqrt{\frac{\sum f}{\sum f - 1}\left[\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2\right]} \quad\text{(for grouped data)}$$

Note: In a question on statistical inference, the standard deviation given is taken to be $s$.

(c) Given 2 samples from the same population:
Sample 1 of size $n_1$, sample mean $\bar{X}_1$ and sample standard deviation $s_1$
Sample 2 of size $n_2$, sample mean $\bar{X}_2$ and sample standard deviation $s_2$

Then the point estimate for the population mean $\mu$ is

$$\bar{X} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$$

and the point estimate for the population standard deviation $\sigma$ is

$$S_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$
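The two combined-sample formulas in (c) can be sketched in code as follows. The function name `pooled_estimates` and the input numbers are hypothetical and chosen only for illustration; they do not come from the notes.

```python
import math


def pooled_estimates(n1, mean1, s1, n2, mean2, s2):
    """Return (pooled mean, pooled standard deviation) for two samples
    assumed to come from the same population."""
    # Weighted average of the two sample means.
    pooled_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return pooled_mean, math.sqrt(pooled_var)


# Hypothetical illustration values, not taken from the notes.
mean, sp = pooled_estimates(10, 5.0, 2.0, 20, 8.0, 3.0)
print(f"pooled mean = {mean:.3f}, S_p = {sp:.3f}")
```

The larger sample pulls both estimates towards its own mean and standard deviation, which is what the weights $n_i$ and $n_i - 1$ are for.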

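E.g. A sample of 5 measurements of the diameter of a sphere recorded by a scientist is as follows: 6.36 mm, 6.32 mm, 6.37 mm, 6.33 mm, 6.37 mm. Determine a point estimate for
(i) the population mean, $\mu$.
(ii) the population standard deviation, $\sigma$.

Solution:

Diameter X      X^2
6.36            40.4496
6.32            39.9424
6.37            40.5769
6.33            40.0689
6.37            40.5769
-------------------------
31.75           201.6147

(i) The point estimate for $\mu$ is $\bar{x} = \frac{\sum X}{n} = \frac{31.75}{5} = 6.35$ mm.

(ii) The point estimate for $\sigma$ is $s = \sqrt{\frac{5}{4}\left[\frac{201.6147}{5} - \left(\frac{31.75}{5}\right)^2\right]} \approx 0.0235$ mm.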
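As a cross-check of the sphere-diameter example, the sketch below computes both point estimates with Python's standard `statistics` module and again from the column totals of the table. The variable names are illustrative assumptions.

```python
import math
import statistics

# Diameters (mm) from the worked example.
diameters = [6.36, 6.32, 6.37, 6.33, 6.37]

# Point estimate for the population mean: the sample mean.
x_bar = statistics.mean(diameters)

# Point estimate for the population standard deviation: the sample
# standard deviation, which uses the divisor n - 1.
s = statistics.stdev(diameters)

# The same value from the computational formula built on sum(X) and sum(X^2).
n = len(diameters)
sum_x = sum(diameters)
sum_x2 = sum(x * x for x in diameters)
s_table = math.sqrt((n / (n - 1)) * (sum_x2 / n - (sum_x / n) ** 2))

print(f"x_bar = {x_bar:.2f} mm")
print(f"s = {s:.4f} mm (from totals: {s_table:.4f} mm)")
```

Both routes give the same standard deviation; the table-based formula is simply easier to evaluate by hand.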

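CONFIDENCE INTERVAL FOR THE POPULATION MEAN

• The $100(1 - \alpha)\%$ confidence interval for the population mean $\mu$, when the population variance $\sigma^2$ is known, is given by

$$\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \quad\text{or}\quad \bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

• The maximum error of estimate for $\mu$ is $Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$.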

Example: To determine the mean waiting time for his customers, a bank manager took a random sample of 50 customers and found that the mean waiting time was 7.2 minutes. Assuming that the population standard deviation is known to be 5 minutes, find the 90% confidence interval of the mean waiting time for all of the bank’s customers.

Solution:
Let $\mu$ = population mean waiting time of customers in minutes.
$n = 50$, $\bar{x} = 7.2$, $\sigma = 5$, $\alpha = 0.10$, $\alpha/2 = 0.05$, $Z_{\alpha/2} = Z_{0.05} = 1.6449$

The 90% confidence interval for $\mu$ is

$$7.2 \pm 1.6449 \cdot \frac{5}{\sqrt{50}} = 7.2 \pm 1.163, \quad\text{i.e.}\quad 6.037 < \mu < 8.363$$
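The bank example can be reproduced with a short sketch. It assumes Python's standard `statistics.NormalDist` for the critical value instead of a printed table; the function name `z_confidence_interval` is hypothetical.

```python
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a population mean with known sigma."""
    alpha = 1 - confidence
    # Z_{alpha/2}: the value with area alpha/2 in the upper tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Maximum error of estimate.
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin, margin


# Bank example: n = 50, x_bar = 7.2, sigma = 5, 90% confidence.
lower, upper, margin = z_confidence_interval(7.2, 5, 50, 0.90)
print(f"margin = {margin:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

The same function applies to the large-sample case with unknown variance by passing the sample standard deviation in place of `sigma`.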


Example: In a random sample of 70 students in a large university, a dean found that the mean weekly time spent doing homework was 14.3 hours. If we assume that homework time is normally distributed with a standard deviation of 4.0 hours, find the 99% confidence interval estimate of the weekly time spent doing homework for all the university’s students. Solution:

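• The $100(1 - \alpha)\%$ confidence interval for the population mean $\mu$, when the population variance $\sigma^2$ is unknown and the sample size $n$ is large ($n \ge 30$), is given by

$$\bar{x} - Z_{\alpha/2} \frac{S}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2} \frac{S}{\sqrt{n}} \quad\text{or}\quad \bar{x} \pm Z_{\alpha/2} \frac{S}{\sqrt{n}}$$

where $S$ is the sample standard deviation.

• The maximum error of estimate for $\mu$ is $Z_{\alpha/2} \frac{S}{\sqrt{n}}$.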

Example: Measurements of the diameters of a random sample of 200 ball bearings made by a certain machine during 1 week showed a mean of 8.24 mm and a standard deviation of 0.42 mm. Find the 95% confidence interval for the mean diameter of all the ball bearings.

Solution:
Let $\mu$ = population mean diameter of ball bearings in mm.
$n = 200$, $\bar{x} = 8.24$, $s = 0.42$, $\alpha = 0.05$, $\alpha/2 = 0.025$, $Z_{0.025} = 1.96$

The 95% confidence interval for $\mu$ is

$$8.24 \pm 1.96 \cdot \frac{0.42}{\sqrt{200}} = 8.24 \pm 0.058, \quad\text{i.e.}\quad 8.182 < \mu < 8.298$$

Example: A random sample of 35 drums of a wax-base floor cleaner, has a standard deviation of 12 pounds and a mean weight of 240 pounds. Construct a 95% confidence interval for the actual mean weight of all these drums. Solution:

CONFIDENCE INTERVAL FOR THE POPULATION PROPORTION

Let $X$ = the number of ‘successes’ in $n$ trials
$p$ = the probability of success at each trial
$X \sim \text{Bin}(n, p)$

Sample proportion, $\hat{p} = \frac{x}{n}$

• $X \sim N(np, np(1 - p))$ approximately for large $n$

• $\hat{P} \sim N\left(p, \frac{p(1 - p)}{n}\right)$ approximately for large $n$

• The $100(1 - \alpha)\%$ confidence interval for the population proportion $p$ with large sample size ($n \ge 30$) is given by

$$\hat{p} - Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \quad\text{or}\quad \hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

• The maximum error of estimate for $p$ is $Z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.

Example: A manufacturer wants to assess the proportion of defective items in a large batch produced by a particular machine. He tests a random sample of 300 items and finds that 45 are defective. Calculate a 98% confidence interval for the proportion of defective items in the complete batch.

Solution:
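One way to check a hand calculation for the defective-items example is the sketch below. It assumes the large-sample normal approximation stated above and takes the critical value from `statistics.NormalDist`; the function name is hypothetical.

```python
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    # Z_{alpha/2} from the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Maximum error of estimate for p.
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


# Defective-items example: 45 defectives out of 300, 98% confidence.
lower, upper = proportion_confidence_interval(45, 300, 0.98)
print(f"98% CI for p: ({lower:.4f}, {upper:.4f})")
```

A table value such as $Z_{0.01} = 2.3263$ gives the same interval to three decimal places.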

Example: In an opinion poll conducted with a sample of 1000 people chosen at random, 30% said that they support a certain political party. Find a 95% confidence interval for the actual proportion of the population who support this party.

Solution:

HYPOTHESIS TESTING

In hypothesis testing, we test a given theory or belief about a population parameter. Using some sample information, we may want to know whether a given claim (or statement) about a population parameter is true or not. Then, we can either accept or reject the given theory or belief. This chapter discusses how to make such tests of hypothesis about the population mean, $\mu$, and the population proportion, $p$.

Statistical Hypothesis
• A statement, assumption or belief about parameter(s) of one or more populations.
• Experimental/sample evidence is required to verify the statement.

Null Hypothesis ($H_0$)
• A claim (or statement) about a population parameter that is assumed to be true until it is declared false
• Is the hypothesis that we hope to reject
• Specifies the value of the population parameter to be tested

Alternative Hypothesis ($H_1$)
• A claim (or statement) about a population parameter that will be true if the null hypothesis is false
• The rejection of $H_0$ means accepting $H_1$

There are only four possible results when we test a given hypothesis.
1. We accept a true hypothesis → a correct decision
2. We reject a false hypothesis → a correct decision
3. We reject a true hypothesis → an incorrect decision, known as a Type I error (its probability is denoted by α, “alpha”)
4. We accept a false hypothesis → an incorrect decision, known as a Type II error (its probability is denoted by β, “beta”)

Significance Level ($\alpha$)
• The maximum probability of making a Type I error in hypothesis testing
• Usually specified before a hypothesis test is made
• The value of 5% ($\alpha = 0.05$) or 1% ($\alpha = 0.01$) is frequently used

e.g. If we select a 5% significance level, the probability of rejecting the hypothesis when it is true is 5%. In other words, when the hypothesis is true we are about 95% confident of making a correct decision, although we could be wrong with a probability of 5%.

A test statistic is a number calculated from the sample data that determines the acceptance or rejection of $H_0$.

Critical Region / Rejection Region
• The region which corresponds to a predetermined level of significance
• If the test statistic falls in the acceptance region, $H_0$ is accepted. Otherwise, $H_0$ is rejected.

Critical value(s)
• Value(s) that separate the rejection region from the acceptance region
• Examples: $Z_\alpha$, $Z_{\alpha/2}$, etc.

The critical region may be represented by a portion of the area under the normal curve in two ways:
1. Two tails under the curve
2. One tail under the curve, which is either the right tail or the left tail

Two-tailed test
• A test of hypothesis based on critical regions represented by both tails under the normal curve, t-curve, etc.

[Figure: two-tailed test. A rejection region of area $\alpha/2$ in each tail, the acceptance region in the middle, and a critical value (*) at each boundary.]

One-tailed test
• A test of hypothesis based on a critical region represented by only one tail under the normal curve.

[Figure: right-tailed test and left-tailed test. A single rejection region of area $\alpha$ in the right or left tail, separated from the acceptance region by the critical value (*).]

Sign of H1      Type of test
>               Right-tailed test
≠               Two-tailed test
<               Left-tailed test

The key problem in a hypothesis test is to decide when to use a one-sided test and when to use a two-sided test. In deciding which test to use, we first have to know two key characteristics of a hypothesis test.

1. In conducting a formal statistical hypothesis test, we always test the null hypothesis, whether it corresponds to the original claim or not. Sometimes the null hypothesis corresponds to the original claim and sometimes it corresponds to the opposite of the original claim. Since we always test the null hypothesis, we will be testing the original claim in some cases and the opposite of the original claim in other cases.
2. The null hypothesis always has a statement of equality in it.

Hence, the statement of a hypothesis can be of three types:
(i) $H_0: \mu = 123$
(ii) $H_0: \mu \ge 123$
(iii) $H_0: \mu \le 123$

The corresponding alternative hypotheses would be:
(i) $H_1: \mu \ne 123$ (Two-sided test)
(ii) $H_1: \mu < 123$ (One-sided test to the left)
(iii) $H_1: \mu > 123$ (One-sided test to the right)

EXAMPLE: For each of the statements given below, identify $H_0$ and $H_1$.
(a) The mean height of females in a country is 156 cm.
(b) The mean annual household income is at least $12,000.
(c) The mean life of a car battery is not more than 40 months.
(d) The mean life of a car battery is above 40 months.
(e) A television executive claims that the majority of teenagers are in favor of sport shows on television.
(f) A maximum of 3% of mailings handled by mail order companies will be returned as “address unknown” or “not known at this address”.

Steps to perform a hypothesis test
1. Identify the specific claim or hypothesis to be tested. State the null and alternative hypotheses.
2. Determine the significance level $\alpha$ and the critical value.
3. Select the distribution (test statistic) to use.
4. Determine the rejection and non-rejection regions. Set up a decision rule based on the critical value. Draw a distribution curve if necessary.
5. Calculate the value of the test statistic.
6. Make a decision (reject $H_0$ or fail to reject $H_0$).

Test Statistic
• Can be defined as a rule/criterion to determine acceptance/rejection of $H_0$.

Hypothesis Testing about a Population Mean: Large Sample

• The null and alternative hypothesis

        H0                                   H1                 Type of test
(i)     $\mu = \mu_0$                        $\mu \ne \mu_0$    Two-tailed test
(ii)    $\mu = \mu_0$ or $\mu \ge \mu_0$     $\mu < \mu_0$      Left-tailed test
(iii)   $\mu = \mu_0$ or $\mu \le \mu_0$     $\mu > \mu_0$      Right-tailed test

• Test statistic

(i) $Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$ if $\sigma$ is known

(ii) $Z = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$ if $\sigma$ is unknown and $n \ge 30$

• Critical value and rejection region

        H0                  H1                 Critical Value       Critical Region
(i)     $\mu = \mu_0$       $\mu \ne \mu_0$    $\pm Z_{\alpha/2}$   $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
(ii)    $\mu \ge \mu_0$     $\mu < \mu_0$      $-Z_\alpha$          $Z < -Z_\alpha$
(iii)   $\mu \le \mu_0$     $\mu > \mu_0$      $Z_\alpha$           $Z > Z_\alpha$
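The critical-value table can be sketched as a small lookup. This is an illustration only: the tail labels `"two"`, `"left"` and `"right"` and both function names are assumptions, and the critical values come from `statistics.NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return (lower, upper) bounds of the acceptance region for a Z test."""
    std_normal = NormalDist()
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # Z_{alpha/2}
        return -z, z
    z = std_normal.inv_cdf(1 - alpha)  # Z_alpha
    if tail == "left":
        return -z, math.inf
    if tail == "right":
        return -math.inf, z
    raise ValueError("tail must be 'two', 'left' or 'right'")


def reject_h0(z_stat, alpha, tail):
    """True when the test statistic falls in the critical region."""
    lower, upper = critical_values(alpha, tail)
    return z_stat < lower or z_stat > upper


for tail in ("two", "left", "right"):
    print(tail, critical_values(0.05, tail))
```

At $\alpha = 0.05$ this gives the familiar $\pm 1.96$ for a two-tailed test and $1.6449$ in magnitude for a one-tailed test.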

EXAMPLE: A company markets car tires. Their lives are normally distributed with a mean of 40,000 km and a standard deviation of 3,000 km. A change in the production process is believed to result in a better product. A test sample of 64 new tires has a mean life of 41,200 km. Can you conclude that the new product is significantly better than the current one? ($\alpha = 0.05$)

Solution:
Given $\mu_0 = 40{,}000$; $\sigma = 3{,}000$; $n = 64$; $\bar{X} = 41{,}200$
Let $\mu$ = true mean life of the new tires.
$H_0: \mu = 40{,}000$ (Mean life of the new tires and old tires is the same)
$H_1: \mu > 40{,}000$ (Mean life of the new tires is better than that of the old tires)
At $\alpha = 0.05$, critical value = $Z_\alpha = Z_{0.05} = 1.6449$
Rejection region: $Z > 1.6449$

$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{41{,}200 - 40{,}000}{3{,}000 / \sqrt{64}} = 3.2$$

Since $Z = 3.2 > 1.6449$, we reject $H_0$ and accept $H_1$ at the 5% significance level and conclude that the new tires are better than the old tires.

EXAMPLE: The expected lifetime of electric light bulbs produced by a given process was 1500 hours. To test a new batch, a sample of 40 was taken which showed a mean lifetime of 1410 hours. The standard deviation is 90 hours. Test the hypothesis that the mean lifetime of the electric light bulbs has not changed, using a level of significance of 0.05.

Solution:
Given $\mu_0 = 1500$; $n = 40$; $\bar{X} = 1410$; $s = 90$
Let $\mu$ = the true mean lifetime of the electric light bulbs.
$H_0: \mu = 1500$ (Mean lifetime of the electric light bulbs has not changed)
$H_1: \mu \ne 1500$ (Mean lifetime of the electric light bulbs has changed)
At $\alpha = 0.05$, critical values = $\pm Z_{\alpha/2} = \pm Z_{0.025} = \pm 1.96$
Rejection regions: $Z < -1.96$ or $Z > 1.96$

$$Z = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} = \frac{1410 - 1500}{90 / \sqrt{40}} = -6.325$$

Since $Z = -6.325 < -1.96$, we reject $H_0$ and conclude that, at the 5% level of significance, there is evidence to suggest that the mean lifetime of the electric light bulbs has changed.
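Both worked examples on the population mean (the tires with known $\sigma$, the light bulbs with $s$ from a large sample) use the same arithmetic, sketched below. The function name `z_statistic_mean` is hypothetical, and the critical values are the table values quoted in the solutions.

```python
import math


def z_statistic_mean(x_bar, mu_0, sd, n):
    """Z test statistic for a population mean.

    sd is sigma when it is known, or the sample standard deviation
    when sigma is unknown and n >= 30.
    """
    return (x_bar - mu_0) / (sd / math.sqrt(n))


# Light bulb example: two-tailed test, critical values -1.96 and 1.96.
z_bulbs = z_statistic_mean(1410, 1500, 90, 40)
reject_bulbs = abs(z_bulbs) > 1.96

# Tire example: right-tailed test, critical value 1.6449.
z_tires = z_statistic_mean(41200, 40000, 3000, 64)
reject_tires = z_tires > 1.6449

print(f"bulbs: Z = {z_bulbs:.3f}, reject H0: {reject_bulbs}")
print(f"tires: Z = {z_tires:.3f}, reject H0: {reject_tires}")
```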

EXAMPLE: A large retailer wants to determine whether the mean income of families living within two miles of a proposed building site exceeds RM14400. What can he conclude at the 0.05 level of significance, if the mean income of a random sample of 60 families living within two miles of the proposed site is RM14524 and the standard deviation is RM763?

EXAMPLE: It is thought that a certain Normal population has a mean of 1.6. A sample of 50 gives a mean of 1.51 and a standard deviation of 0.3. Does this provide evidence, at the 5% level, that the population mean is less than 1.6?

Hypothesis Testing on a Population Proportion: Large Sample

• The null and alternative hypothesis

        H0                           H1             Type of test
(i)     $P = p_0$                    $P \ne p_0$    Two-tailed test
(ii)    $P = p_0$ or $P \ge p_0$     $P < p_0$      Left-tailed test
(iii)   $P = p_0$ or $P \le p_0$     $P > p_0$      Right-tailed test

• Test statistic

$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$$

• Critical value and rejection region

        H0              H1             Critical Value       Critical Region
(i)     $P = p_0$       $P \ne p_0$    $\pm Z_{\alpha/2}$   $Z < -Z_{\alpha/2}$ or $Z > Z_{\alpha/2}$
(ii)    $P \ge p_0$     $P < p_0$      $-Z_\alpha$          $Z < -Z_\alpha$
(iii)   $P \le p_0$     $P > p_0$      $Z_\alpha$           $Z > Z_\alpha$

Note:
1. $p_0$ = the population proportion (predetermined constant)
2. $\hat{p}$ = the sample proportion
3. The population proportion, rather than the sample proportion, is used in the denominator of the test statistic because the population proportion is known (it is specified by $H_0$).

EXAMPLE: In an investigation into ownership of calculators, 200 randomly chosen school students were interviewed, and 163 of them owned a calculator. Using the evidence of this sample, test at the 5% level of significance the hypothesis that the proportion of school students owning a calculator is more than 80%.

Solution:
Given $n = 200$; $x = 163$; $p_0 = 0.8$; $\hat{p} = \frac{x}{n} = \frac{163}{200} = 0.815$
Let $p$ = true proportion of students owning a calculator.
$H_0: p = 0.8$ (Proportion of school students owning a calculator is 80%)
$H_1: p > 0.8$ (Proportion of school students owning a calculator is more than 80%)
At $\alpha = 0.05$, critical value = $Z_\alpha = Z_{0.05} = 1.6449$
Rejection region: $Z > 1.6449$

$$Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}} = \frac{0.815 - 0.8}{\sqrt{\frac{0.8 (1 - 0.8)}{200}}} = 0.53$$

Since $Z = 0.53 < 1.6449$, we do not reject $H_0$ and conclude that there is not sufficient evidence to suggest that the proportion of school students owning a calculator is more than 80%.

EXAMPLE: An election candidate claims that 60 percent of the voters support him. A random sample of 2500 voters shows that 1400 support him. Test his claim at the 0.10 level of significance.
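As a cross-check of the calculator-ownership example above, the proportion test statistic can be sketched as follows. The function name `z_statistic_proportion` is hypothetical; the critical value 1.6449 is the one quoted in that solution.

```python
import math


def z_statistic_proportion(successes, n, p_0):
    """Z test statistic for a population proportion (large sample)."""
    p_hat = successes / n
    # The standard error uses p_0, the value specified by H0.
    standard_error = math.sqrt(p_0 * (1 - p_0) / n)
    return (p_hat - p_0) / standard_error


# Calculator example: 163 owners out of 200, H0: p = 0.8, H1: p > 0.8.
z = z_statistic_proportion(163, 200, 0.8)
reject_h0 = z > 1.6449  # right-tailed test at alpha = 0.05

print(f"Z = {z:.2f}, reject H0: {reject_h0}")
```

The same function can be reused for a two-tailed claim such as the election example by comparing $|Z|$ with $Z_{\alpha/2}$ instead.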


EXAMPLE: A coin is tossed 100 times and 38 heads are obtained. Is there evidence, at the 2% level that the coin is biased in favour of tails?

CHI-SQUARE TEST
• ‘Chi’ is the Greek letter $\chi$, pronounced ‘kye’.
• The chi-square distribution is a continuous distribution and it has a positive integer parameter $v$, which determines its shape.
• As its name implies, $\chi^2$ cannot take a negative value.
• The parameter $v$ is known as the degrees of freedom of the distribution and we refer to a ‘chi-square distribution with $v$ degrees of freedom’. For simplicity, we write this as $\chi^2_v$.
• There are many $\chi^2$ distributions; one for each number of degrees of freedom. As the degrees of freedom become fewer, the distribution becomes more positively skewed. Conversely, as the number of degrees of freedom is increased, the distribution becomes approximately normal.

Chi-square distribution curve
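The effect of the degrees of freedom on the shape of the curve can be illustrated numerically. The sketch below uses a standard result that is not stated in these notes, namely that a chi-square distribution with $v$ degrees of freedom has skewness $\sqrt{8/v}$; the function name is hypothetical.

```python
import math


def chi_square_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom.

    Uses the standard result sqrt(8 / df), which is an addition to the notes.
    """
    return math.sqrt(8 / df)


# Skewness shrinks towards 0 (the value for a normal curve) as v grows.
for df in (1, 2, 5, 10, 30, 100):
    print(f"v = {df:3d}: skewness = {chi_square_skewness(df):.3f}")
```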

• The $\chi^2$ statistic plays an important role in many business problems dealing with count data, where information is obtained by counting rather than by measuring.

Example:
1. In market research, we count the number of people who prefer a particular brand of detergent powder.
2. In quality control, we count the number of defectives produced by a machine during a certain period.

• There are many situations of this type where measurements are made by counting the numbers or frequency in each category.
• The $\chi^2$ test is applied to such frequencies of occurrence as against the expected ones.

The $\chi^2$ test is used broadly for:
1. Test of goodness-of-fit
• For one-way classification or for one variable only
• Test whether a given set of data actually follows an assumed distribution or not
2. Test of independence
• For more than one row or column in the form of a contingency table concerning several attributes
• Test for dependence between two variables

CONTINGENCY TABLE ANALYSIS (Test of independence)
• The chi-square test can be used in more ...

