AP Stats - Meaningful Interpretations PDF

Title AP Stats - Meaningful Interpretations
Author cefurey6
Course Prob & Stat Ii Statistics
Institution Michigan State University
Pages 4
File Size 228.4 KB
File Type PDF

Exploring Data – Giving Meaning to Numbers

Each entry below lists: Ch. (chapter), Terminology, Example w/ Meaning.

1

SOCS

When asked to “describe the distribution” of a quantitative variable shown in a graphical display such as a histogram, comment in context on “Shape, Outliers, Center, and Spread (or Variability)”.

1

interquartile range

A college reports that the IQR of the SAT scores for incoming freshmen is 215. This means that the range of the middle 50% of scores is 215 points.

1

standard deviation

A college reports that the standard deviation of the SAT scores for incoming freshmen is 70. This means that freshmen’s SAT scores typically vary from the average score of all freshmen by about 70 points.

2

percentile

Your SAT score report indicates that you scored in the 83rd percentile. This means that 83% of the scores are at or below yours.

2

z-score

A college reports that the mean and standard deviation of the SAT scores for incoming freshmen are 600 and 70 points, respectively. A certain freshman scored 740, so z = (740 − 600)/70 = 2. This means that this freshman’s SAT score is two standard deviations above the average score.
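The z-score arithmetic in this example can be checked with a short Python sketch. The function name `z_score` is a hypothetical helper introduced here for illustration; the numbers are the ones from the example.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# SAT example from the entry above: mean 600, standard deviation 70, score 740
z = z_score(740, 600, 70)
print(z)  # 2.0
```

A positive result means the score is above the mean; a negative result would mean it is below.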

3

DOFS

When asked to “describe the association between two quantitative variables” shown in a scatterplot, comment in context on “Direction, Outliers/Unusual Points, Form, and Strength”.

3

(Ch 3) After investigating quiz grades (out of 100 points) and time studied (in minutes) for her 15 students, a statistics teacher found the following: ŷ = 73 + 0.52x; r = 0.913; r² = 0.834; s = 4.8

3

correlation coefficient

r = 0.913 This means that there is a strong, positive, linear association between grades and minutes studied. In general, as the number of minutes studied increases, so does the grade.

3

method of least squares

Refers to the idea that the slope and y-intercept of the best fitting line are found in such a way as to minimize the sum of squared residuals.

3

slope

b1 = 0.52. This means that grades are predicted to increase by 0.52 points per one-minute increase in time studied.

3

y-intercept

b0 = 73. This means that the predicted grade for a student who studied 0 minutes is 73.

3

coefficient of determination

r² = 0.834. This means that 83.4% of the variation in grades is explained by the linear relationship between grades and minutes studied. (A high value of r² is one indicator that the linear model is “appropriate” for making predictions; the residual plot should confirm it.)
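For a least-squares line with one explanatory variable, the coefficient of determination is the square of the correlation coefficient. A minimal sketch confirming that the two reported values agree (the variable names are chosen here for illustration):

```python
r = 0.913                 # correlation reported by the teacher
r_squared = r ** 2        # coefficient of determination for simple linear regression

print(round(r_squared, 3))    # 0.834
print(f"{r_squared:.1%} of the variation in grades is explained by the linear model")
```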

3

residual

One student in the class studied for 20 minutes and got an 80 on the quiz. To get the residual for this student, first find ŷ = 73 + 0.52(20) = 83.4. Then, residual = y − ŷ = 80 − 83.4 = −3.4. This means that the predicted grade overestimates the actual grade by 3.4 points.
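The prediction and residual for this student can be reproduced with a small Python sketch. The function names `predict_grade` and `residual` are hypothetical helpers; the intercept and slope come from the teacher's regression line.

```python
B0 = 73      # y-intercept of the least-squares line
B1 = 0.52    # slope: predicted points gained per extra minute studied

def predict_grade(minutes):
    """Predicted quiz grade (y-hat) for a given number of minutes studied."""
    return B0 + B1 * minutes

def residual(actual, minutes):
    """Residual = actual - predicted; negative means the line over-predicted."""
    return actual - predict_grade(minutes)

y_hat = predict_grade(20)
res = residual(80, 20)
print(round(y_hat, 1), round(res, 1))  # 83.4 -3.4
```

The values are rounded before printing because floating-point arithmetic can leave tiny errors in the last decimal places.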

3

std. dev. of residuals

s = 4.8This means that the average prediction error is 4.8 points when using the LSRL to predict the grade for a given amount of minutes studied.

3

residual plot

When asked to comment on a residual plot, say whether or not the residuals are: 1) relatively small; 2) randomly scattered around zero with no clear pattern. If yes to both, then conclude that the linear model is appropriate for making predictions.

Sampling Design and Experimental Design Q & A

Q: How do you select a Simple Random Sample (SRS)?

A: Assign a numerical label to each individual. Read xxx digits at a time from the table of random digits, ignoring repeats and numbers over xxx. Stop when xxx numbers have been found. Identify the selected individuals based on the numbers selected.
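The random-digit procedure described above can be sketched in Python. This is an illustration only: the function `srs_from_digits`, the population size of 30, the sample size of 5, and the line of digits are all assumptions made for the example, not values from the original.

```python
def srs_from_digits(digits, population_size, sample_size):
    """Pick labels by reading fixed-width groups from a string of random digits.

    Individuals are assumed to be labeled 1..population_size. Groups that are
    0, larger than population_size, or already chosen are skipped.
    """
    width = len(str(population_size))      # digits to read at a time
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break                          # stop once enough labels are found
    return chosen

# Hypothetical line of random digits, written in groups of five as tables usually are
digits = "19223 95034 05756 28713 96409 12531 42544 82853".replace(" ", "")

sample = srs_from_digits(digits, population_size=30, sample_size=5)
print(sample)  # [19, 22, 5, 13, 25]
```

Reading two digits at a time gives 19, 22, 39, 50, 34, 05, 75, 62, 87, 13, ... and the pairs above 30 are skipped, which mirrors the by-hand method.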

Q: Why is a stratified sample not an SRS?

A: Because not every possible sample of size xxx can be selected. A stratified sample will always result in a set number of individuals from each stratum, whereas an SRS of size 100 could randomly result in any number of individuals from each group (for example, 90 females and 10 males).

Q: What advantage does a stratified design have over an SRS?

A: When individuals within each stratum are similar to one another, stratification will result in a more precise estimate of the true population parameter, since there will be less variability from sample to sample.
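A small simulation can illustrate the reduced sample-to-sample variability. Everything here is a hypothetical setup chosen for illustration: a population of 100 split into two equal strata with very different values, samples of size 10, and 2000 repetitions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: values are similar within a stratum, different between strata
stratum_a = [60 + (i % 5) for i in range(50)]   # values 60..64
stratum_b = [80 + (i % 5) for i in range(50)]   # values 80..84
population = stratum_a + stratum_b

def srs_mean(n):
    """Mean of a simple random sample of size n from the whole population."""
    return statistics.mean(rng.sample(population, n))

def stratified_mean(n_per_stratum):
    """Mean of a stratified sample: the same number drawn from each (equal-sized) stratum."""
    picks = rng.sample(stratum_a, n_per_stratum) + rng.sample(stratum_b, n_per_stratum)
    return statistics.mean(picks)

srs_means = [srs_mean(10) for _ in range(2000)]
strat_means = [stratified_mean(5) for _ in range(2000)]

sd_srs = statistics.stdev(srs_means)
sd_strat = statistics.stdev(strat_means)
print(f"SD of sample means, SRS:        {sd_srs:.2f}")
print(f"SD of sample means, stratified: {sd_strat:.2f}")
```

In this setup the stratified means vary much less, because every stratified sample contains exactly five individuals from each stratum while an SRS may be lopsided.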

Q: What is the advantage of having a large random sample in an observational study?

A: Randomness will reduce bias because the individuals will tend to be representative of the population, and increasing the sample size will reduce sampling error and make it more likely that the statistic will be an accurate estimate of the population parameter.

Q: What is the difference between an experiment and an observational study?

A: In an experiment, there are treatments that are deliberately imposed upon the subjects. In an observational study (or survey), the researcher simply observes and/or measures without attempting to influence the response.

Q: What is the purpose of “control” in an experimental design?

A: Controlling the effects of all lurking variables on the response is needed so that the only thing that can cause a change in the response is the treatment.

Q: What is the purpose of “replication” in an experimental design?

A: Replication of the treatment among many subjects in an experiment reduces the impact of chance variation on the results, thereby increasing our ability to detect a treatment effect.

Q: What is the purpose of “randomization” in an experimental design?

A: Randomization produces groups of experimental units that are roughly similar in all respects before the treatments are applied. Randomization “evens out” the potential effects of possible confounding variables. Randomization reduces bias and helps ensure that different responses are actually caused by different effects of the treatments.
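Random assignment can be sketched as shuffling the subjects and splitting the shuffled list. The function `randomly_assign`, the subject labels, and the seed are hypothetical choices for illustration.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i:02d}" for i in range(1, 21)]   # 20 hypothetical subjects
treatment, control = randomly_assign(subjects, seed=42)
print(len(treatment), len(control))  # 10 10
```

Because chance alone decides who lands in each group, neither the researcher nor the subjects can systematically favor one treatment.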

Q: Why should an experiment be double blind?

A: If possible, an experiment should not allow either the subject or the researcher to know which subjects are in which treatment groups. The purpose is to reduce the possibility of response bias.

Q: Can the results of an experiment be generalized to some larger population?

A: Yes, but ONLY if the subjects in the experiment were randomly selected from that population. In most experiments this does not happen, so the answer is usually “No”.

Q: Can an experiment establish cause and effect between the explanatory and response variables?

A: Yes, but ONLY if the subjects are randomly assigned to the treatment groups (which is usually the case in good experiments). Note that the answer remains “yes” even if the results of the study are not “statistically significant”.

Q: What is a statistically significant treatment effect?

A: An observed difference between two groups that is too large to be plausibly explained by chance alone.

Confidence Interval Summary (Single Proportion or Single Mean)

P

Parameter

Define p = some population proportion or μ = some population mean. Focus on whether you are estimating a proportion or a mean.

A

Assumptions

Check the Random, Normal, and Independent Conditions. Know that checking the Normal condition sounds different for each inference procedure.

N

Name the interval

“One Proportion z Interval” or “One Sample t Interval for a mean”

I

Interval

Use the given sample statistics to calculate the interval. Write down the formula and plug in the values. Let $\alpha = 1 - C$.

$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $z^* = \text{invNorm}(1 - \alpha/2,\ 0,\ 1)$

OR

$\bar{x} \pm t^* \frac{s}{\sqrt{n}}$, where $t^* = \text{invT}(1 - \alpha/2,\ n - 1)$

C

Conclusion

We can be (95%) confident that the interval from ____ to _____ captures the (true proportion…) or (true mean…).
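As a rough sketch of the Interval step for a proportion, the calculation can be done with Python's standard library, where `NormalDist().inv_cdf` plays the role of the calculator's invNorm. The sample counts (60 successes out of 100) are hypothetical. The t interval is not shown because the standard library has no inverse-t function.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p-hat +/- z* sqrt(p-hat (1 - p-hat) / n)."""
    p_hat = successes / n
    alpha = 1 - confidence
    z_star = NormalDist().inv_cdf(1 - alpha / 2)     # same role as invNorm(1 - alpha/2, 0, 1)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical sample: 60 successes in 100 trials, 95% confidence
low, high = one_proportion_z_interval(60, 100, confidence=0.95)
print(round(low, 3), round(high, 3))  # 0.504 0.696
```

The conclusion sentence would then read: we can be 95% confident that the interval from 0.504 to 0.696 captures the true proportion, stated in the context of the problem.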

The Confidence Level sentence: use context to fill in the parentheses in the statement.

If we repeatedly take samples of size (n) from the (population in this study), and each time construct a (95%) confidence interval, then approximately (95%) of the constructed intervals will capture the (true proportion of …) or (true mean…).

The Margin of Error sentence: use context to fill in the parentheses in the statement.

If we repeatedly take samples of size (n) from the (population in this study), then with (95%) confidence, we can expect the (sample proportion of …) or (sample mean…) to differ from the (true proportion) or (true mean) by at most (value of margin of error).

Sample Size Calculations: In order to estimate a proportion or a mean using a given level of confidence and a given margin of error, solve for n in these inequalities. (Remember to use p* = 0.5 if no information about p is given in the problem.)

$z^* \sqrt{\frac{p^*(1-p^*)}{n}} \le ME$

OR

$z^* \frac{\sigma}{\sqrt{n}} \le ME$

Width of CI: Width = twice the margin of error. Width decreases if the confidence level decreases or if n increases.

To de-construct the CI:
Point estimate = midpoint of interval = (left endpoint + right endpoint)/2
Margin of Error = (half of the width of interval) = (right endpoint – left endpoint)/2

The Standard Error sentence: use context to fill in the parentheses in the statement.

$s_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

OR

$s_{\bar{x}} = \frac{s}{\sqrt{n}}$

If we repeatedly take samples of size (n) from the (population in this study), then the difference between a (sample proportion of …) or (sample mean…) and the (true proportion) or (true mean) would be, on average, (value of standard error).

Is there evidence against a claim that a population parameter is equal to a value? Yes (no): since the claimed value (p = …) or (μ = …) is not (is) within the C% confidence interval, it is not (is) a plausible value for the parameter.
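The sample-size and de-construction steps can be sketched in Python. The helper names and the example inputs (a 3-point margin of error for a proportion, σ = 70 with a 10-point margin for a mean, and the interval 0.504 to 0.696) are assumptions for illustration; solving the inequalities for n gives n ≥ (z*/ME)² p*(1 − p*) and n ≥ (z*σ/ME)².

```python
from math import ceil
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* for a given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n with z* sqrt(p*(1 - p*)/n) <= margin (p* = 0.5 if nothing is known)."""
    return ceil((z_star(confidence) / margin) ** 2 * p_guess * (1 - p_guess))

def n_for_mean(margin, sigma, confidence=0.95):
    """Smallest n with z* sigma / sqrt(n) <= margin."""
    return ceil((z_star(confidence) * sigma / margin) ** 2)

def deconstruct(left, right):
    """Return (point estimate, margin of error) from the interval endpoints."""
    return (left + right) / 2, (right - left) / 2

print(n_for_proportion(0.03))        # 1068
print(n_for_mean(10, sigma=70))      # 189
point, margin = deconstruct(0.504, 0.696)
print(round(point, 3), round(margin, 3))  # 0.6 0.096
```

The sample size is always rounded up, since rounding down would give a margin of error slightly larger than requested.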

Hypothesis Testing Summary (Single Proportion or Single Mean)

P

Parameter

Define p = some population proportion or μ = some population mean. Focus on whether the claim is about a proportion or a mean; use present tense.

H

Hypotheses

State the null and alternative hypotheses. Use appropriate symbols. Focus on whether the claim (Ha) has a direction (one-sided test) or no direction (two-sided test).

A

Assumptions

Check the Random, Normal, and Independent Conditions. Know that checking the Normal condition sounds different for each inference procedure.

N

Name the test

“One Proportion z Test” or “One Sample t Test for a mean”

T

Test Statistic

Use the given sample statistics to calculate the test statistic. Write down the formula and plug in the values.

$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$

OR

$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$

O

Observe the P-value

The P-value is the tail area to the right or left of the test statistic, or twice that area for a two-sided test. Draw and shade a standard normal (z) or a t-distribution with df = n − 1. Calculate the area using Table A (z) or Table B (t), or use the normalcdf or tcdf calculator commands. Or use the calculator menu to get both the test statistic and P-value calculations done.

M

Make a decision

Reject H0 because the P-value is less than α. OR Fail to reject H0 because the P-value is greater than α.

S

State the conclusion

There is convincing evidence to conclude whatever Ha says. OR There is not convincing evidence to conclude whatever Ha says.
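The Test Statistic, P-value, and decision steps for a proportion can be sketched with Python's standard library, where `NormalDist().cdf` stands in for the calculator's normalcdf. The data (60 successes out of 100), the null value 0.5, and α = 0.05 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Return (z, P-value) for H0: p = p0 against a one- or two-sided alternative."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    upper_tail = 1 - NormalDist().cdf(z)           # area to the right of z
    if alternative == "greater":
        p_value = upper_tail
    elif alternative == "less":
        p_value = 1 - upper_tail                   # area to the left of z
    else:                                          # two-sided: double the smaller tail
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    return z, p_value

# Hypothetical test of H0: p = 0.5 versus Ha: p > 0.5
z, p_value = one_proportion_z_test(60, 100, p0=0.5, alternative="greater")
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(round(z, 2), round(p_value, 4), decision)  # 2.0 0.0228 reject H0
```

With the same data and a two-sided alternative, the P-value doubles to about 0.0455.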

The P-value sentence: use context to fill in the parentheses in the statement.

There is a (P-value) chance of observing a (sample proportion or sample mean) as extreme as or more extreme than (my value) through random sampling variation if (state H0 in words) is true.

Another P-value interpretation: The P-value represents the smallest level of significance that can be chosen and still reject H0.

The Power sentence: use context to fill in the parentheses in the statement.

There is a (power) chance of finding convincing evidence for Ha if the alternative value of the parameter is (some value that’s given).

Power should be very high because it’s the probability of “correctly rejecting a false H0.”

Five ways to increase power: increase n; increase α; reduce β (since power = 1 − β); increase effect size by picking an alternative value for the parameter further from the null value than the one given; decrease σ.

Type I Error: Reject H0 when H0 is actually true (it’s the mistake of rejecting a true null; or of putting an innocent person in jail).
α = P(Type I Error) = P(Reject H0 | H0 is true)

Type II Error: Fail to reject H0 when Ha is actually true (it’s the mistake of failing to reject a false null; or of setting a guilty person free).
β = P(Type II Error) = P(Fail to Reject H0 | H0 is false)

Use a level 100(1 − α)% CI to make a decision for a two-sided test at level α: Reject H0 if the value of the parameter specified in H0 is outside the confidence interval....
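The relationship between power, β, n, α, and effect size can be explored with a rough Python sketch. It assumes an upper-tailed one-proportion z test and uses the normal approximation under both the null and the alternative; the function name and the example values (p0 = 0.5, alternative 0.6, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided(p0, p_alt, n, alpha=0.05):
    """Approximate power of the upper-tailed one-proportion z test of H0: p = p0."""
    std_normal = NormalDist()
    # Smallest sample proportion that leads to rejecting H0 at level alpha
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Chance that p-hat exceeds the cutoff when the true proportion is p_alt
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)
    return 1 - std_normal.cdf((cutoff - p_alt) / sd_alt)

power = power_one_sided(0.5, 0.6, n=100)
beta = 1 - power                      # P(Type II Error) at this alternative
print(f"power = {power:.3f}, beta = {beta:.3f}")

# Each change below should raise the power
print(power_one_sided(0.5, 0.6, n=200))              # larger sample size
print(power_one_sided(0.5, 0.6, n=100, alpha=0.10))  # larger significance level
print(power_one_sided(0.5, 0.7, n=100))              # alternative further from the null
```

The three comparisons at the end mirror three of the listed ways to increase power: a larger n, a larger α, and a larger effect size.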

