Exam 3 notes - Professor Lock Morgan - virtual course - exam 3 summary PDF

Title Exam 3 notes - Professor Lock Morgan - virtual course - exam 3 summary
Course Elementary Statistics
Institution The Pennsylvania State University
Pages 7
File Size 372 KB
File Type PDF



Description

Inference with Normal and t-distributions
● Chapter 5-6
● Normal distribution
● Central limit theorem
● Formulas with SE given - StatKey
● Inference on Minitab
● Analyzing paired vs unpaired data

Regression
● 2.6, 9.1, 10.1
● Simple linear regression
● Test for correlation
● Multiple regression

Normal and t-distributions

Two ways to use N(0,1)
1. Given the standard error
● N(0,1) on StatKey
● z = (stat - null) / SE or stat +/- z* x SE
2. Given the raw data or summary statistics
● Normal or t based inference in Minitab

Given the Standard Error
● Testing (asking for p value)
○ Calculate z = (stat - null) / SE
○ Use StatKey to compare z to N(0,1) to find the p value
● Interval
○ Use StatKey and N(0,1) to find z*
■ z* is the number on the right on the x axis
○ Calculate stat +/- z* x SE

Given raw data or summary statistics
● Minitab → Basic Statistics → proportion, sample t, etc

Minitab settings/options
● For single proportion use the normal approximation method
● For difference in proportions use the pooled estimate of the proportion
● For confidence intervals make sure the alternative is set to ≠ (not equal)
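The "given the standard error" formulas can be sketched in a few lines of Python. This is only an illustration that uses the standard library's `NormalDist` in place of StatKey; the function names are made up for this sketch. The pooled SE is computed from the counts in the difference-of-proportions example question (90/15000 and 5/15000), on the assumption that the pooled formula is what Minitab's pooled option applies.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist(0, 1)  # stands in for the N(0,1) curve in StatKey


def z_statistic(stat, null, se):
    """Standardized test statistic: z = (stat - null) / SE."""
    return (stat - null) / se


def right_tail_p(z):
    """Area to the right of z under N(0,1)."""
    return 1 - STD_NORMAL.cdf(z)


def z_star(confidence):
    """Multiplier z* that leaves (1 - confidence) / 2 in each tail."""
    return STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)


def interval(stat, se, confidence):
    """Confidence interval: stat +/- z* x SE."""
    margin = z_star(confidence) * se
    return stat - margin, stat + margin


def pooled_se(x1, n1, x2, n2):
    """SE for a difference in proportions, using the pooled proportion."""
    pooled = (x1 + x2) / (n1 + n2)
    return sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))


# Counts from the difference-of-proportions example question.
se = pooled_se(90, 15000, 5, 15000)
z = z_statistic(90 / 15000 - 5 / 15000, 0, se)
print(round(se, 5), round(z, 2), round(z_star(0.90), 3))
```

With these counts the sketch gives a pooled SE of about .00065, z of about 8.73, and z* = 1.645 for 90% confidence.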

              Mean     Correlation   Proportion
Sample        x bar    r             p hat
Population    mu       rho           p

Appropriate Distributions
● For proportions (categorical variables), the normal distribution is used if counts are at least 10 in each category
● For means (quantitative variables), the t distribution is used if n > 30 or the data appear approximately normal (treat it as normal if the distribution looks symmetrical)
○ The larger the sample size, the closer the t distribution will be to normal
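As a quick check of these conditions, the rules of thumb above can be written as two small helper functions. This is a minimal sketch: the function names and the `looks_symmetric` flag are invented for illustration, and judging symmetry still means looking at a plot of the data.

```python
def normal_ok_for_proportion(successes, n, minimum=10):
    """Both categories (yes and no) need a count of at least `minimum`."""
    return successes >= minimum and (n - successes) >= minimum


def t_ok_for_mean(n, looks_symmetric=False):
    """t-based inference: n > 30, or a smaller sample that looks roughly symmetric."""
    return n > 30 or looks_symmetric


print(normal_ok_for_proportion(190, 560))  # 190 yes, 370 no
print(t_ok_for_mean(214))
print(t_ok_for_mean(12))
```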

Mean difference (mu_d) - Matched pairs
● 2 measurements from each unit (or from 2 comparable units)

Difference in means (mu_1 - mu_2) - Independent samples
● Same quantitative variable from 2 separate groups
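The difference between the two setups shows up in how the sample statistic is computed. The sketch below uses made-up numbers (all data here are hypothetical) to contrast the two calculations.

```python
from statistics import mean

# Hypothetical data, for illustration only.
before = [12.0, 15.0, 11.0, 14.0]  # first measurement on each unit
after = [10.0, 14.0, 11.0, 11.0]   # second measurement on the same units

# Matched pairs: take the difference within each pair, then average.
differences = [b - a for b, a in zip(before, after)]
mean_difference = mean(differences)  # estimates mu_d

group_1 = [12.0, 15.0, 11.0, 14.0]  # units from one group
group_2 = [9.0, 13.0, 12.0]         # different units; sizes may differ

# Independent samples: average each group separately, then subtract.
difference_in_means = mean(group_1) - mean(group_2)  # estimates mu_1 - mu_2

print(mean_difference, round(difference_in_means, 3))
```

Paired data must line up one-to-one, which is why `zip` works there; independent groups have no such pairing and can have different sample sizes.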

Regression
● Predicted Y for given X values
● Actual - predicted = residual
● Intercept = predicted Y when X value is 0
● Slope = predicted change in Y for every one-unit change in X (keeping other X variables constant, if more than one X)
● Inference for slope coefficients
● R^2 = proportion of variability in Y explained by Xs
● Multiple regression: coefficients and p values depend on what other X variables are in the model
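To make these definitions concrete, here is a minimal sketch of simple linear regression done by hand. The five data points are hypothetical and `fit_line` is a name chosen for this sketch; the last lines plug 2 twitches into the lizard equation from the example questions (Flee = 43.459 - 3.517 x twitches).

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]  # actual - predicted

mean_y = sum(ys) / len(ys)
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_resid = sum(e ** 2 for e in residuals)
r_squared = 1 - ss_resid / ss_total  # share of variability in Y explained by X

# Prediction from the lizard equation quoted in the example questions.
flee_at_2 = 43.459 - 3.517 * 2

print(round(intercept, 2), round(slope, 2), round(r_squared, 2), round(flee_at_2, 3))
```

For this toy data the fitted line is Y = 2.2 + 0.6X with R^2 = 0.6, and the lizard prediction is 36.425.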

Example question
1. Calculate the standardized test statistic given difference of proportions: 90/15000 and 5/15000 and SE = .00065
● Use z = (stat - null) / SE
● (p hat 1 - p hat 2 - 0) / SE = (.006 - .00033 - 0) / .00065 = 8.72 (8.73 with the unrounded SE)
2. Find the p value
● Click on normal distribution in StatKey
● Keep N(0,1)
● Click on right tail - look in question to see what tail to use
● Input the z statistic to the bottom right point
● P value given up top
3. Find z* for 90% confidence interval
● Use normal distribution in StatKey
● Keep N(0,1)
● Click two tail and adjust number to .90
● z* is the bottom right number
● z* = 1.645
● Use formula stat +/- z* x SE to find confidence interval if needed or can be done in Minitab
○ Stat → Basic Statistics → 2 Proportions → Summarized data → plug in values → adjust CI
4. Flee = 43.459 - 3.517 x twitches; what is the predicted time to flee for a lizard that averages 2 twitches?
● Plug 2 in for twitches and solve: 43.459 - 3.517(2) = 36.425

To find area above:
● Normal distribution
● Right tail and plug in # on the x axis

Central Limit Theorem
● The larger the sample size, the more normal the sampling distribution will be

P value < α

● Reject the null
● We have convincing evidence that the alternative is true
● Results are statistically significant

P value > α
● Do not reject the null
● Test is inconclusive
● Results are not statistically significant

● If a P% CI contains the null value, do not reject the null
● If a P% CI does not contain the null value, reject the null

Significance
● “Most significant” = the lowest p value (closest to 0)
● “Least significant” = the largest p value
● “Significant at 5%” = the p values that are less than .05

GSG notes Chapter 5

Normal distribution - centered around the mean
Standard deviation - measured by the gap between x axis coordinates

Example questions
1. What would our z score be if we had a sample stat of 80, a mean of 100, and a standard deviation of 10?
● Use (stat - mean) / SE
● (80 - 100) / 10 = -2
2. Using this z score, what would the statistical significance be if we were conducting a hypothesis test?
● To find the p value, click left tail since it's negative
● Input -2 to the bottom value, p value is .023 < .05
● It is significant

For normal distribution: N(mu, sd)
For hypothesis test: N(null value, sd)

Example question
1. If we have 2 sprinters and sprinter 1 beats sprinter 2 60% of the time, is it statistically significant to say that sprinter 1 is faster than sprinter 2? Standard deviation is .5 seconds
● Sample stat = .6
● Null value = .5
○ Because there are 2 sprinters, there's a 50/50 chance of each of them winning
● N(.5, .5)
● Plot sample statistic on distribution
● Right tail and plug in .6 = p value is .421 - not statistically significant

If we want to calculate a CI, we will center our normal distribution around our sample statistic: N(sample statistic, SE)
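The two hypothesis-test examples above can be reproduced without StatKey. The sketch below is only an illustration, using Python's standard-library `NormalDist`; the helper names `left_tail` and `right_tail` are made up here.

```python
from statistics import NormalDist


def left_tail(value, center=0.0, spread=1.0):
    """Area to the left of `value` under N(center, spread)."""
    return NormalDist(center, spread).cdf(value)


def right_tail(value, center=0.0, spread=1.0):
    """Area to the right of `value` under N(center, spread)."""
    return 1 - NormalDist(center, spread).cdf(value)


# Example 1 and 2: z score, then left-tail p value on N(0,1).
z = (80 - 100) / 10
p_left = left_tail(z)

# Sprinter example: sample stat .6 plotted on N(.5, .5), right tail.
p_sprinter = right_tail(0.6, center=0.5, spread=0.5)

alpha = 0.05
print(z, round(p_left, 3), p_left < alpha)
print(round(p_sprinter, 3), p_sprinter < alpha)
```

The first p value (.023) falls below .05 and the second (.421) does not, matching the conclusions in the notes.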



Once we have a normal distribution, we can calculate a CI using:
○ Sample stat +/- z* (SE)
○ z* found by using N(0,1) and looking at the right tail

Example questions
1. What is the 95% confidence interval for a standard normal distribution? N(0,1)
● Click 2 tail and .95
● CI = (-1.96, 1.96)
● OR use formula 0 +/- 1.96(1)
2. What is the 90% CI for N(45,12)?
● Input 45 for mean and 12 for SE
● Click 2 tail and .90
● CI = (25.262, 64.738)
● OR use formula 45 +/- 1.645(12)

Chapter 6

Normal distributions of proportions
● We do not have our SE, so we use Minitab to find it
● Need to meet requirements: n*p > 10 and n(1-p) > 10
○ n = sample size
○ p = population parameter
● Normal distribution will be centered around the population parameter
● Create normal distribution N(p, SE)

Example
1. When considering all drivers, about 3 in 10 indicate that they doze off while driving. Is there evidence to suggest that with STAT 200 students this rate may be higher?
● This is a hypothesis test
○ Ho: p = .3
○ Ha: p > .3
2. 560 students took the survey, 190 students said they have dozed off while driving
● Single proportion in Minitab
● Find the p value → .021
● .021 < .05
● Significant
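Both the N(45,12) interval and the dozing-off test can be checked by hand. The sketch below is an assumption about what the software is doing, not a description of Minitab itself: it uses the usual normal-approximation SE, sqrt(p(1-p)/n) with the null value of p, and the function names are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def normal_interval(center, se, confidence):
    """Middle `confidence` share of N(center, se), same as the two-tail view."""
    tail = (1 - confidence) / 2
    dist = NormalDist(center, se)
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


def one_proportion_right_tail(successes, n, null_p):
    """Right-tailed z test for one proportion; SE is built from the null value."""
    # Requirements from the notes: n*p > 10 and n(1-p) > 10.
    if not (n * null_p > 10 and n * (1 - null_p) > 10):
        raise ValueError("normal approximation requirements not met")
    p_hat = successes / n
    se = sqrt(null_p * (1 - null_p) / n)
    z = (p_hat - null_p) / se
    return z, 1 - NormalDist().cdf(z)


low, high = normal_interval(45, 12, 0.90)
z, p_value = one_proportion_right_tail(190, 560, 0.3)

print(round(low, 3), round(high, 3))
print(round(z, 2), round(p_value, 3))
```

The interval comes out as (25.262, 64.738) and the p value as about .021, in line with the answers recorded above.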

Distribution of mean
● Our mean of the sample data will be the center of the distribution
● Most of the time we do not have our population standard deviation
○ Instead we use the sample standard deviation
● Because we are using a sample SD, we must plot our distribution on a t distribution

T distribution
● Bell shaped and symmetric but it has heavier tails, which means it tends to produce values that fall far from the mean (center) more often than the normal distribution does
● Requirements:
○ n > 30
○ If n < 30, then the distribution must be symmetric
■ If not symmetric, we can't use the t distribution
● Df = n-1

CI for t distribution
● CI formula: sample statistic +/- t* (SE)
○ t* will be given or calculated
● Remember that you need to check if n > 30 or if it's symmetric
● Using StatKey all you need is sample size and the confidence level

Example
1. A study needs a 95% CI and has a sample size of 214
● Fits requirements: 214 > 30
● Create a t distribution and input 213 as df

Hypothesis Test for mean
● n > 30 or symmetric
● Plot t stat on t distribution to find the p value

Example
1. What if we wanted to test if the average time between geyser eruptions was not 71 minutes? What would the hypotheses be?
● Ho: mu = 71
● Ha: mu ≠ 71
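For the geyser hypotheses, the t statistic and df can be computed by hand before going to StatKey or Minitab for the p value. The notes give no data for this example, so the waiting times below are hypothetical, and `t_statistic` is a name chosen for this sketch. Python's standard library has no t distribution, so the p value itself is left to the software.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, null_mean):
    """t = (x bar - null) / (s / sqrt(n)), with df = n - 1."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)  # sample SD stands in for the population SD
    return (mean(sample) - null_mean) / se, n - 1


# Hypothetical waiting times in minutes, for illustration only.
# n = 8 is under 30, so the t distribution is only appropriate
# if the data look roughly symmetric.
waits = [68, 75, 80, 62, 77, 73, 70, 79]

t, df = t_statistic(waits, 71)  # Ho: mu = 71, Ha: mu ≠ 71
print(round(t, 3), df)
```

Because Ha is two-sided, the p value would be the two-tail area beyond +/- t on a t distribution with `df` degrees of freedom.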

A: paired t
B: 2 individual samples
** look at the people being studied and if they are related or not

X = explanatory variable
Y = response variable

95% confidence interval → 99% confidence interval
● Center stays the same
● Standard error stays the same
● Multiplier will increase
● Margin of error will increase
● Interval width will increase...
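The effect of moving from 95% to 99% confidence can be seen numerically. This is a small sketch using the normal multiplier z*; the SE of 12 is an arbitrary value picked for illustration, and `margin_of_error` is a made-up helper name.

```python
from statistics import NormalDist


def margin_of_error(se, confidence):
    """Margin of error = z* x SE, with z* taken from N(0,1)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * se


se = 12  # arbitrary SE, held fixed at both confidence levels
m95 = margin_of_error(se, 0.95)
m99 = margin_of_error(se, 0.99)

# Multiplier, margin of error, and full width at each level.
print(round(m95 / se, 3), round(m95, 2), round(2 * m95, 2))
print(round(m99 / se, 3), round(m99, 2), round(2 * m99, 2))
```

The center and SE never enter the comparison; only the multiplier grows (about 1.96 to about 2.576), which is what widens the interval.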

