2019- Statun 1101 Introduction to Statistics-lecture notes Sampling Distribution PDF

Course Introduction To Statistics
Institution Columbia University in the City of New York

Sampling Distribution (Ch.17)

We will focus on the sampling distribution of the sample proportion in this section.

1 Sampling Distribution

Statistic and sampling distribution

• Example: you flip a coin 1000 times and let $X$ be the number of heads in the 1000 trials. Then $X \sim \mathrm{Bin}(1000, p)$, where $p$ is the probability of getting a head.
• The sample proportion is $\hat{p}_n = X/n$.
• Definition: A statistic is a function of the sample (your data), e.g. $\hat{p}_n = X/n$.
• The sampling distribution of a statistic is simply the distribution of the statistic. For example, the sampling distribution of $\hat{p}_n$ is the distribution of $\hat{p}_n$.
• Since $X \sim \mathrm{Bin}(1000, p)$, we of course know the distribution of $X/n$.
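Since the distribution of $X/n$ is known exactly, it can be tabulated directly. The following is a minimal R sketch, assuming $p = 0.5$ purely for illustration (the notes leave $p$ unspecified); the variable names are not from the notes. It lists the possible values of $\hat{p}_n$ with their probabilities and computes the mean and standard deviation of $\hat{p}_n$ from that exact distribution.

```r
# Exact sampling distribution of the sample proportion p_hat = X / n,
# where X ~ Bin(n, p). The value p = 0.5 is an assumption for illustration.
n <- 1000
p <- 0.5

k <- 0:n                   # possible values of X
support <- k / n           # possible values of p_hat
prob <- dbinom(k, n, p)    # P(p_hat = k / n) = P(X = k)

# Mean and standard deviation of p_hat computed from its exact distribution
mean_phat <- sum(support * prob)
sd_phat <- sqrt(sum((support - mean_phat)^2 * prob))

# Compare with the closed forms p and sqrt(p * (1 - p) / n)
print(c(mean = mean_phat, sd = sd_phat, theory_sd = sqrt(p * (1 - p) / n)))
```

The computed mean and standard deviation should agree with $p$ and $\sqrt{p(1-p)/n}$ up to floating-point error.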

Approximating the sampling distribution

• In many situations, we may not know (or it is difficult to obtain) an exact sampling distribution.
• Therefore, we may use an approximation instead, which is usually justified by the central limit theorem (either implicitly or explicitly).
• We know that if $X \sim \mathrm{Bin}(1000, p)$, then $X = Y_1 + \dots + Y_{1000}$, where $Y_i \overset{\text{iid}}{\sim} \mathrm{Bernoulli}(p)$ and iid means independent and identically distributed.
• Hence $\hat{p}_n = \frac{1}{1000} \sum_{i=1}^{1000} Y_i$, which is just a sample mean. From the central limit theorem, we would expect $\hat{p}_n$ to be close to a normal distribution:

$$\frac{\hat{p}_n - p}{\sqrt{p(1-p)/n}} \approx N(0, 1) \quad\Longleftrightarrow\quad \hat{p}_n \approx N\left(p, \frac{p(1-p)}{n}\right).$$

Therefore, we have an approximation of the sampling distribution of $\hat{p}_n$ when $n$ is large.
• Conditions for using the normal approximation for the sample proportion:
(i) The samples are independent.
(ii) You do not sample more than 10% of the population.
(iii) $n\hat{p}_n \ge 10$ and $n(1 - \hat{p}_n) \ge 10$.
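Condition (iii) is the only one of the three that can be checked from the counts alone. Below is a small R sketch of such a check; the helper name `success_failure_ok` and its `threshold` argument are hypothetical, not from the notes. It uses the fact that $n\hat{p}_n$ is the number of successes and $n(1-\hat{p}_n)$ is the number of failures.

```r
# Hypothetical helper (not from the notes): checks condition (iii),
# n * p_hat >= 10 and n * (1 - p_hat) >= 10, from the observed counts.
# Since n * p_hat = x and n * (1 - p_hat) = n - x, the counts are compared
# directly. Conditions (i) and (ii) concern the sampling design and cannot
# be verified from the counts.
success_failure_ok <- function(x, n, threshold = 10) {
  (x >= threshold) && ((n - x) >= threshold)
}

print(success_failure_ok(520, 1000))  # 520 successes, 480 failures
print(success_failure_ok(4, 200))     # only 4 successes
```

With 520 successes out of 1000 both counts clear the threshold, whereas 4 successes out of 200 do not, so the normal approximation would not be recommended in the second case.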

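Illustration

• We simulate $X_1, \dots, X_B$ independently from $\mathrm{Bin}(1000, 0.5)$ and compute $X_b/1000$ for each $b$.
• R code (optional):

    B = 10000              # number of simulated samples
    n = 1000               # number of trials in each sample
    p = 0.5                # probability of a head
    X = rbinom(B, n, p)    # B independent draws of X ~ Bin(n, p)
    hist(X/n, breaks = 100, freq = FALSE)   # histogram of X/n on the density scale
    y = seq(0, 1, length.out = 1000)
    # overlay the density of N(p, p(1-p)/n) as a red line
    points(y, dnorm(y, p, sqrt(p*(1-p)/n)), col = 2, type = "l")

• The normal approximation is quite good.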

[Figure: Histogram of X/n. Horizontal axis: X/n; vertical axis: Density.]
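One way to put a number on how good the normal approximation is, without simulation, is to compare an exact binomial probability with its normal counterpart. The sketch below does this for the event $0.47 \le \hat{p}_n \le 0.53$, with $n = 1000$ and $p = 0.5$ as in the illustration; the choice of interval is an assumption made here, not part of the notes.

```r
n <- 1000
p <- 0.5
se <- sqrt(p * (1 - p) / n)   # SD of p_hat under the normal approximation

# Exact: P(0.47 <= p_hat <= 0.53) = P(470 <= X <= 530) for X ~ Bin(n, p)
exact <- pbinom(530, n, p) - pbinom(469, n, p)

# Normal approximation (no continuity correction): p_hat ~ N(p, se^2)
approx <- pnorm(0.53, p, se) - pnorm(0.47, p, se)

print(c(exact = exact, normal = approx, abs_diff = abs(exact - approx)))
```

The two probabilities should differ only slightly, which is consistent with the close match between the histogram and the normal density.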

2 Examples

Example 2.1. Suppose that 22% of 18-year-old women in the US have a body mass index of 30 or more. In a large college, 200 females are randomly selected to report their heights and weights (from which their BMIs could be calculated). 31 of these students had BMIs greater than 30. Check if the conditions for using normal approximation are satisfied. Is this proportion of high-BMI students unusually small?

Solution: We first check that the conditions are satisfied for using the normal distribution to approximate the sampling distribution of $X$, where $X$ is the proportion of respondents with BMIs above 30:
(i) The females are randomly selected, OK.
(ii) The college is large, OK.
(iii) 31 “successes” and 169 “failures”, OK.

$$\hat{p}_n = \frac{31}{200} = 0.155; \qquad \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.22 \times 0.78}{200}} = 0.02929.$$

$$P(X \le \hat{p}_n) = P\left(\frac{X - 0.22}{0.02929} \le \frac{0.155 - 0.22}{0.02929}\right) = 0.0132.$$

Therefore, this proportion of high-BMI students is unusually small.

In Excel, type =NORMDIST((0.155-0.22)/0.02929,0,1,TRUE)
In R, type pnorm((0.155-0.22)/0.02929)

Example 2.2. Suppose that about 13% of the population is left-handed. A 200-seat school auditorium has been built with 15 “lefty seats”, seats that have the built-in desk on the left rather than the right arm of the chair. In a class of 90 students, what’s the probability that there will not be enough seats for the left-handed students (using normal approximation)?

Solution: Using normal approximation without continuity correction:

$$P(\mathrm{Bin}(90, 0.13) > 15) = P\left(\frac{\mathrm{Bin}(90, 0.13) - 90 \times 0.13}{\sqrt{0.13 \times 0.87 \times 90}} > \frac{15 - 90 \times 0.13}{\sqrt{0.13 \times 0.87 \times 90}}\right) \approx P(N(0, 1) > 1.034335) = 0.1505.$$

Using normal approximation with continuity correction:

$$P(\mathrm{Bin}(90, 0.13) > 15) = P(\mathrm{Bin}(90, 0.13) > 15.5) = P\left(\frac{\mathrm{Bin}(90, 0.13) - 90 \times 0.13}{\sqrt{0.13 \times 0.87 \times 90}} > \frac{15.5 - 90 \times 0.13}{\sqrt{0.13 \times 0.87 \times 90}}\right) \approx P(N(0, 1) > 1.191053) = 0.1168.$$

Exact solution: $P(\mathrm{Bin}(90, 0.13) > 15) = 0.1191.$
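The three answers for Example 2.2 can be reproduced with base R. This is a sketch using the numbers from the example; the variable names (`seats`, `no_cc`, `with_cc`) are chosen here for illustration and are not from the notes.

```r
n <- 90        # class size
p <- 0.13      # proportion of left-handers
seats <- 15    # number of lefty seats

mu <- n * p                      # mean of Bin(n, p)
sigma <- sqrt(n * p * (1 - p))   # SD of Bin(n, p)

# P(X > 15) by normal approximation, without and with continuity correction
no_cc <- pnorm((seats - mu) / sigma, lower.tail = FALSE)
with_cc <- pnorm((seats + 0.5 - mu) / sigma, lower.tail = FALSE)

# Exact binomial tail probability P(X > 15)
exact <- pbinom(seats, n, p, lower.tail = FALSE)

print(round(c(no_cc = no_cc, with_cc = with_cc, exact = exact), 4))
```

Using `lower.tail = FALSE` gives the upper-tail probability directly, which avoids writing `1 - pnorm(...)` and the small loss of precision that can come with it.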

Remark: The book rounds numbers in intermediate steps. Never round intermediate results to 2 significant figures; you need more digits to get an accurate final answer. Also, the book does not use the continuity correction, resulting in a less accurate approximation.

Example 2.3. After hearing of the national result that 44% of students engage in binge drinking (5 drinks at a sitting for men, 4 for women), a professor surveyed a random sample of 244 students at his college and found that 96 of them admitted to binge drinking in the past week. Should he be surprised at this result? Explain.

Solution: The SD of the sampling distribution is $\sqrt{0.44 \times 0.56 / 244} \approx 0.03178$. With the normal approximation, with about 95% chance the sample proportion will lie within two SDs of 0.44, that is, in $(0.376, 0.504)$. Therefore, as $\hat{p}_n = 96/244 \approx 0.393$, he should not be surprised.
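The calculation in Example 2.3 can be checked with a few lines of R. This is a sketch under the assumption that the 95% range is taken as the national proportion plus or minus two SDs, which reproduces the interval quoted in the solution; the variable names are illustrative only.

```r
p <- 0.44   # national proportion
n <- 244    # sample size
x <- 96     # students who admitted to binge drinking

se <- sqrt(p * (1 - p) / n)     # SD of the sampling distribution of p_hat
interval <- p + c(-2, 2) * se   # two-SD range (assumed rule, roughly 95%)
p_hat <- x / n                  # observed sample proportion
z <- (p_hat - p) / se           # how many SDs p_hat lies from p

inside <- (p_hat >= interval[1]) && (p_hat <= interval[2])
print(round(c(se = se, lower = interval[1], upper = interval[2],
              p_hat = p_hat, z = z), 4))
print(inside)
```

If `inside` is TRUE, the observed proportion falls within the two-SD range, which matches the conclusion that the professor should not be surprised.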
