Stats 250 Study Guide 2 PDF

Title Stats 250 Study Guide 2
Author John Bergeson
Course Introduction to Statistics and Data Analysis
Institution University of Michigan
Pages 7


Statistics: Exam 2 Study Guide

HW 3-7, all lab material.

Chapter 9

Section 1
  o Use a confidence interval to estimate the value of a population parameter
  o Hypothesis testing uses sample data to attempt to reject a hypothesis about the population

Section 2
  o Big Five Scenarios
      - For Categorical Variables
          - One population proportion
          - Difference in two population proportions
      - For Quantitative Variables
          - One population mean
          - Population mean of paired differences (dependent)
          - Difference in two population means (independent)

Section 3
  o The distribution of all possible values of a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic

Section 4
  o Normal Approximation to the Binomial Distribution
      - If X is a binomial random variable based on n trials with success probability p, and n is large, then X is approximately $N(np, \sqrt{np(1-p)})$
      - Conditions: the approximation works well when both $np$ and $n(1-p)$ are at least 10
      - If n is small, convert the question to a count and use the binomial distribution to work it out
      - If n is large, convert the question to a count and use the normal approximation for a count, or use the related normal approximation for a sample proportion
  o Sampling Distribution of $\hat{p}$
      - If the sample size n is large enough (namely, $np \ge 10$ and $n(1-p) \ge 10$), then $\hat{p}$ is approximately $N(p, \sqrt{p(1-p)/n})$
  o Standard Deviation of $\hat{p}$
      - s.d.($\hat{p}$) = $\sqrt{p(1-p)/n}$
      - Interpretation: approximately the average distance of the possible $\hat{p}$ values (for repeated samples of the same size n) from the true population proportion p
  o Standard Error of $\hat{p}$
      - s.e.($\hat{p}$) = $\sqrt{\hat{p}(1-\hat{p})/n}$
      - Estimate of the standard deviation of $\hat{p}$
      - Interpretation: estimating, approximately, the average distance of the possible $\hat{p}$ values (for repeated samples of the same size n) from the true population proportion p

Section 5
  o Two samples are said to be independent samples when the measurements in one sample are not related to the measurements in the other sample
      - Generated in a variety of ways
          - Random samples are taken separately from two populations
          - One random sample is taken and a variable is recorded for each individual, but the units are categorized as belonging to one population or another
          - Participants are randomly assigned to one of two treatment conditions
      - If the response variable is categorical, a researcher might compare two independent groups by looking at the difference between two proportions
  o Sampling Distribution of the Difference in Two (Independent) Sample Proportions
      - If the two sample proportions are based on independent random samples from the two populations and if all of the quantities $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, $n_2(1-\hat{p}_2)$ are at least 10, then $\hat{p}_1 - \hat{p}_2$ is approximately $N\left(p_1 - p_2, \sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}\right)$
  o Standard Error of the Difference in Sample Proportions
      - s.e.($\hat{p}_1 - \hat{p}_2$) = $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
      - Estimates, roughly, the average distance of the possible $\hat{p}_1 - \hat{p}_2$ values from $p_1 - p_2$

Section 6
  o Response being measured is quantitative
  o If all possible random samples of the same size n are taken and $\bar{x}$ is computed for each, then:
      - The average of all of the possible sample mean values (from all possible random samples of the same size n) is equal to the population mean $\mu$
          - Thus the sample mean is an unbiased estimator of the population mean
      - The standard deviation of all of the possible sample mean values is equal to the original population standard deviation divided by $\sqrt{n}$
          - Standard deviation of the sample mean: s.d.($\bar{x}$) = $\sigma/\sqrt{n}$
      - Shape of the sampling distribution
          - If the parent (original) population has a normal distribution, then the distribution of the possible values of $\bar{x}$, the sample mean, is normal
          - If the parent (original) population is not necessarily normally distributed but the sample size n is large (at least about 25 or 30), then the distribution of the possible values of $\bar{x}$ is approximately normal. This result is the Central Limit Theorem
  o Standard Deviation of $\bar{x}$
      - Interpretation: approximately the average distance of the possible sample mean values (for repeated samples of the same size n) from the true population mean $\mu$
      - If the sample size increases, the standard deviation decreases, which says the possible sample mean values will be closer to the true population mean (on average)
      - The s.d.($\bar{x}$) is a measure of the accuracy of the process of using a sample mean to estimate the population mean
  o Standard Error of the Mean
      - s.e.($\bar{x}$) = $s/\sqrt{n}$
      - Interpretation: estimating, approximately, the average distance of the possible $\bar{x}$ values (for repeated samples of the same size n) from the true population mean $\mu$

Section 7
  o Ways that paired data can occur
      - Each person or unit is measured twice and the two measurements of the same characteristic or trait are made under different conditions
      - Similar individuals or units are paired prior to an experiment and, during the experiment, each member of a pair receives a different treatment
  o For paired designs, it is the differences that we are interested in examining
  o Population parameter: $\mu_d$ = population mean of the differences in the two measurements
  o Sample estimate: $\bar{d}$ = the mean of the differences for a sample of the two measurements
  o Distribution of the Sample Mean Difference
      - If the population of differences is normal, and a random sample of any size is obtained, then the distribution of the sample mean difference $\bar{d}$ is also normal, with a mean of $\mu_d$ and a standard deviation of s.d.($\bar{d}$) = $\sigma_d/\sqrt{n}$
      - If the population of differences is not normal, but a large random sample of size n is obtained, then the distribution of the sample mean difference $\bar{d}$ is approximately normal, with a mean of $\mu_d$ and a standard deviation of s.d.($\bar{d}$) = $\sigma_d/\sqrt{n}$
      - An arbitrary level for what is 'large' enough has been 25
  o Standard error of the sample mean difference
      - s.e.($\bar{d}$) = $s_d/\sqrt{n}$
      - Interpretation: estimating, approximately, the average distance of the possible $\bar{d}$ values (for repeated samples of the same size n) from the population mean difference $\mu_d$

Section 8
  o Two samples are said to be independent samples when the measurements in one sample are not related to the measurements in the other sample
      - Generated in a variety of ways
          - Random samples are taken separately from two populations and the same response variable is recorded for each individual
          - One random sample is taken and a variable is recorded for each individual, but then units are categorized as belonging to one population or another
          - Participants are randomly assigned to one of two treatment conditions and the same response variable is recorded
      - Look at the difference between the two means
  o Sampling Distribution of the Difference in Two (Independent) Sample Means
      - If the two populations are normally distributed (or the sample sizes are both large enough), then $\bar{x}_1 - \bar{x}_2$ is approximately $N\left(\mu_1 - \mu_2, \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}\right)$
  o Standard Error of the Difference in Sample Means
      - s.e.($\bar{x}_1 - \bar{x}_2$) = $\sqrt{s_1^2/n_1 + s_2^2/n_2}$
      - Interpretation: estimates, roughly, the average distance of the possible $\bar{x}_1 - \bar{x}_2$ values from $\mu_1 - \mu_2$

Section 9
  o If we replace the population standard deviation $\sigma$ with the sample standard deviation s, then $t = (\bar{x} - \mu)/(s/\sqrt{n})$ is no longer approximately N(0,1)
  o Instead it has a t distribution with n - 1 degrees of freedom
  o Family of t-distributions
      - Symmetric, unimodal, centered at 0
      - Flatter with heavier tails compared to the N(0,1) distribution
      - As the degrees of freedom (df) increase, the t distribution approaches the N(0,1) distribution
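
As a rough illustration of the Section 9 statistic, the sketch below computes $t = (\bar{x} - \mu)/(s/\sqrt{n})$ and its degrees of freedom for a small sample. The function name `one_sample_t`, the data values, and the hypothesized mean of 4.0 are all made up for this example and do not come from the course material.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df), where t = (x_bar - mu0) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)             # standard error of the sample mean
    return (x_bar - mu0) / se, n - 1


# Hypothetical data, chosen only so the arithmetic is easy to follow.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_stat, df = one_sample_t(sample, mu0=4.0)
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```

Because s replaces $\sigma$, the resulting value would be compared against a t distribution with n - 1 degrees of freedom rather than N(0,1).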

Chapter 10

Section 2 Sampling Distribution of p o  If the sample size n is large and np>10 and n(1-p)>10, then p is approximately N(p, √((p(1-p))/n)) Confidence level o  Interpretation: the probability that the procedure that is used to determine the interval will provide an interval that includes the population parameter Section 3 We have two populations or groups from which independent samples are available o The response variable is also categorical and we are interested in comparing the proportions for the two o populations Two Independent-Samples z Confidence Interval for p1-p2 o  Conditions  Sample proportions are based on independent random samples from the two populations  All of the quantities n1p 1, n2p 2, etc. must be at least 10

Chapter 11 

Section 1
  o A confidence interval provides a range of reasonable values for the parameter with an associated high level of confidence
  o The 95% confidence level describes our confidence in the procedure we used to make the interval
  o Population parameter
      - $\mu$ = population mean maximum distance to read the sign for all drivers
  o Sample estimate
      - $\bar{x}$ = sample mean maximum distance to read the sign for the sampled drivers
  o Sampling Distribution of the sample mean
      - If $\bar{x}$ is the sample mean for a random sample of size n from a population with a normal model, then the distribution of the sample mean is $N(\mu, \sigma/\sqrt{n})$
      - Central Limit Theorem
          - If $\bar{x}$ is the sample mean for a random sample of size n from a population with any model, with mean $\mu$ and standard deviation $\sigma$, then when n is large, the sampling distribution of the sample mean is approximately $N(\mu, \sigma/\sqrt{n})$
  o The standard deviation of the sample mean, $\sigma/\sqrt{n}$, is roughly the average distance of the possible sample mean values from the population mean $\mu$
  o Standard Error of the Sample Mean
      - s.e.($\bar{x}$) = $s/\sqrt{n}$, where s = sample standard deviation
      - The standard error of $\bar{x}$ estimates, roughly, the average distance of the possible $\bar{x}$ values from $\mu$
  o One-sample t Confidence Interval for $\mu$
      - $\bar{x} \pm t^* \cdot$ s.e.($\bar{x}$)
      - Conditions
          - Random sample from a normal population
          - If the sample size is large (n > 30), the assumption of normality is not so crucial and the result is approximate

Section 3
  o For paired data designs, it is the differences that we are interested in analyzing
      - Have just one sample of observations (the differences)
  o One-sample t Confidence Interval for the Population Mean Difference $\mu_d$
      - $\bar{d} \pm t^* \cdot$ s.e.($\bar{d}$)
      - Conditions
          - Differences can be considered a random sample from a normal population
          - If the sample size is large, the assumption of normality is not so crucial and the result is approximate
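
The paired interval above can be sketched in a few lines, treating the differences as a single sample. The names `paired_t_ci` and `T_STAR_DF4_95` and the before/after values are hypothetical. The multiplier 2.776 is the standard t-table value for 95% confidence with 4 degrees of freedom and is supplied by hand here because the Python standard library has no t distribution.

```python
import math
import statistics


def paired_t_ci(before, after, t_star):
    """Return (low, high) for d_bar +/- t* * s_d / sqrt(n), with d = after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)     # sample standard deviation of the differences
    se = s_d / math.sqrt(n)           # s.e.(d_bar)
    return d_bar - t_star * se, d_bar + t_star * se


# Hypothetical paired measurements on five units.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]
T_STAR_DF4_95 = 2.776                 # assumed table value: df = n - 1 = 4, 95% confidence
low, high = paired_t_ci(before, after, T_STAR_DF4_95)
print(f"95% CI for mu_d: ({low:.3f}, {high:.3f})")
```

With only five differences, the normality condition on the differences matters; the sketch does not check it.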

Section 4
  o The General (Unpooled) Case
      - $\mu_1$ = mean for the first population; $\mu_2$ = mean for the second population
      - Parameter of interest: the difference in the population means $\mu_1 - \mu_2$
      - Sample estimate: the difference in the sample means $\bar{x}_1 - \bar{x}_2$
      - Standard error: s.e.($\bar{x}_1 - \bar{x}_2$) = $\sqrt{s_1^2/n_1 + s_2^2/n_2}$
      - General Two Independent-Samples t Confidence Interval for $\mu_1 - \mu_2$
          - $(\bar{x}_1 - \bar{x}_2) \pm t^* \cdot$ s.e.($\bar{x}_1 - \bar{x}_2$)
          - Conditions
              - Independent random samples from normal populations
              - If the sample sizes are large (both > 25), the assumption of normality is not so crucial and the result is approximate
  o The Pooled Case
      - Common population variance: $\sigma^2 = \sigma_1^2 = \sigma_2^2$
      - Pooled standard deviation
          - $s_p = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
      - Pooled standard error
          - $s_p\sqrt{1/n_1 + 1/n_2}$
      - Pooled Two Independent-Samples t Confidence Interval for $\mu_1 - \mu_2$
          - $(\bar{x}_1 - \bar{x}_2) \pm t^* \cdot$ (pooled s.e.($\bar{x}_1 - \bar{x}_2$))
          - Conditions
              - Independent random samples from normal populations with equal population variances
              - If the sample sizes are large (both > 25), the assumption of normality is not so crucial and the result is approximate
  o Levene's test
      - $H_0: \sigma_1^2 = \sigma_2^2$
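
To see how the unpooled and pooled standard errors from Section 4 differ in practice, here is a small sketch that evaluates both formulas on summary statistics. The function names and the numbers (s1 = 4, n1 = 16, s2 = 3, n2 = 9) are invented for illustration.

```python
import math


def unpooled_se(s1, n1, s2, n2):
    """s.e. = sqrt(s1^2/n1 + s2^2/n2), the general (unpooled) case."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def pooled_se(s1, n1, s2, n2):
    """Pooled s.e. = s_p * sqrt(1/n1 + 1/n2), assuming equal population variances."""
    s_p = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return s_p * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical summary statistics for two independent samples.
se_general = unpooled_se(4.0, 16, 3.0, 9)
se_pooled = pooled_se(4.0, 16, 3.0, 9)
print(f"unpooled s.e. = {se_general:.4f}, pooled s.e. = {se_pooled:.4f}")
```

When the two sample sizes are equal, the two formulas give the same standard error, so the choice between them affects the standard error only when the sample sizes differ.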

Chapter 12

Section 1
  o When the alternative hypothesis specifies a single direction, the test is called a one-sided or one-tailed hypothesis test
  o When the alternative hypothesis includes values in both directions from a specific standard, the test is called a two-sided or two-tailed hypothesis test
  o Data are summarized via a test statistic (z)
      - z = (sample statistic - null value) / (null standard error)
  o The p-value is the probability of seeing a test statistic as extreme or more extreme than observed, given that the null hypothesis is true
      - The smaller the p-value, the stronger the evidence against the null hypothesis

Section 2
  o Hypotheses About a Population Proportion
      - Conditions
          - Data are assumed to be a random sample
          - Check that $np_0 \ge 10$ and $n(1-p_0) \ge 10$
  o What if n is small?
      - Convert to a count
      - Use X ~ Bin(n, p)

Section 3
  o Possible null and alternative hypotheses
      - Ho: p1-p2=0
      - Ha: p1-p2>0, p1-p2μo, μ0, μd0, μ1-μ2...
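
The Section 1 test statistic and the Section 2 small-sample fallback can be sketched together for a single population proportion. This is an illustrative sketch only: the function names, the counts, and the choice of an upper-tail alternative (Ha: p > p0) are assumptions made for the example.

```python
import math
from statistics import NormalDist


def z_test_upper(x, n, p0):
    """Upper-tail z test of H0: p = p0; returns (z, p_value)."""
    p_hat = x / n
    null_se = math.sqrt(p0 * (1 - p0) / n)    # standard error computed under H0
    z = (p_hat - p0) / null_se                # (sample statistic - null value) / null s.e.
    return z, 1 - NormalDist().cdf(z)


def exact_binomial_upper(x, n, p0):
    """P(X >= x) for X ~ Bin(n, p0), for use when n is too small for the z test."""
    return sum(math.comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


# Large sample: n*p0 = 50 and n*(1 - p0) = 50, so the z test applies.
z, p_value = z_test_upper(60, 100, 0.5)
# Small sample: n*p0 = 5, so work with the count and the binomial distribution.
p_exact = exact_binomial_upper(9, 10, 0.5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}; exact small-sample p-value = {p_exact:.4f}")
```

For a lower-tail or two-sided alternative the tail area would be taken differently; only the upper-tail case is shown here.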

