Stats Semester 1 Final Study Guide PDF

Title Stats Semester 1 Final Study Guide
Author Kristin Sellers
Course AP Statistics
Institution High School - USA
Pages 6
File Size 191.8 KB
File Type PDF



Description

Stats Semester 1 Final Study Guide

Univariate Data

Distribution for univariate data:
Center- median or mean
When the mean > median, the data is right skewed
When the mean < median, the data is left skewed
Shape- unimodal, bimodal, symmetric, right/left skewed
Spread- IQR (Q3 - Q1) and standard deviation
Standard deviation- if it’s a sample, s; if it’s a population, σ
Variance = (standard deviation)^2
5 number summary: min, Q1, median, Q3, and max
Outlier (unusual y-value)- any value < Q1 - 1.5(IQR) or > Q3 + 1.5(IQR)
Influential point- unusual x-value
68-95-99.7 Rule (normality)- 68% of the data falls within one standard deviation of the mean, 95% of the data falls within two standard deviations of the mean, 99.7% of the data falls within three standard deviations of the mean

Dot Plot- place a dot above its value for every observation
[Figure: dot plot; axes labeled Data and Frequency]

Histogram- a dot plot, but the dots are replaced with boxes
[Figure: histogram with bin width = 0.5; axis labeled Data]

Stem-and-Leaf Plot- a table with the tens digit in the left column and the ones digits in the right column (each ones digit represents an observation). It looks like a histogram if you turn your head sideways!

Bar Chart- used for categorical data; the order of the bars doesn’t matter

Box Plots- graph that displays Q1, the median, and Q3 of the data with a box (long tail in the right direction → right-skewed, long tail in the left direction → left-skewed)

Z-score- number of standard deviations an observation is away from the mean: z = (observation - mean) / (standard deviation)
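As a quick illustration of z-scores and the normal-distribution rule from the univariate notes, here is a small sketch using Python's standard `statistics` module. The `heights` data and the helper names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

heights = [60, 62, 64, 66, 68]         # made-up sample
m, s = mean(heights), stdev(heights)   # stdev is the sample standard deviation (s)
z = z_score(68, m, s)                  # about 1.26 standard deviations above the mean

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))  # roughly 0.68, 0.95, 0.997
```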

Bivariate Data

Scatterplot- used for quantitative bivariate data

Bivariate distribution:
Trend- positive or negative
Shape- linear, nonlinear, hetero/homoscedastic (variation in spread of observations)
Strength- weak, moderate, strong
Correlation (R)- the strength of the linear relationship (if |R| is close to 1, the data is strongly linear)

Error (residual) = $y_i - \hat{y}_i$

$R^2 = \dfrac{(\text{total sum of squares}) - (\text{sum of errors squared})}{\text{total sum of squares}}$

Total sum of squares = sum of squared distances from each observation to the mean
Interpreting $R^2$: $(100 \times R^2)\%$ of the variation in [y] can be explained by the model

Residual Plot- assess the linearity of a relationship by plotting the residuals
If the residuals are spread evenly above and below the line residual = 0 (where $y = \hat{y}$), with no pattern, it’s linear
Residual = $y - \hat{y}$

Regression Equation: $\hat{y} = mx + b$
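To connect the regression equation, residuals, and R² defined above, here is a minimal least-squares sketch. It assumes the usual least-squares formulas for the slope and intercept (the guide does not state how m and b are found); the data and function names are made up.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y-hat = m*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b

def r_squared(xs, ys, m, b):
    """(total sum of squares - sum of squared errors) / total sum of squares."""
    y_bar = mean(ys)
    total_ss = sum((y - y_bar) ** 2 for y in ys)
    error_ss = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    return (total_ss - error_ss) / total_ss

xs = [1, 2, 3, 4, 5]   # made-up data
ys = [2, 4, 5, 4, 5]
m, b = fit_line(xs, ys)                                 # m = 0.6, b = 2.2
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]   # y minus y-hat for each point
r2 = r_squared(xs, ys, m, b)                            # 0.6
```

With these numbers, 60% of the variation in y is explained by the model. Plotting `residuals` against `xs` would give the residual plot.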

Probability

Simulations
1. Choose x digits (0 through x - 1)
2. Assign each outcome a digit
3. Record the number of digits not observed
On a calculator… Math → Prob → randInt(lower, upper, number of integers)

Mutually exclusive (disjoint)- no two of the events can happen at the same time (so they are NOT independent of each other)
If two events are disjoint…
P(A ∪ B) = P(A) + P(B)
P(A | B) = 0
P(A and B) = 0
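A digit-based simulation like the one outlined above can be sketched with Python's `random.randint`, which plays the role of the calculator's randInt here. The scenario (digits 0 to 9, with the lowest digits standing for "success") and all names are assumptions for illustration. The second part checks the disjoint-event rules on a fair die using exact fractions.

```python
import random
from fractions import Fraction

def simulate_proportion(success_digits, trials, seed=0):
    """Draw digits 0-9; digits below success_digits count as a success."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        digit = rng.randint(0, 9)   # inclusive on both ends
        if digit < success_digits:
            hits += 1
    return hits / trials

estimate = simulate_proportion(3, 10_000)   # should land near 0.3

# Disjoint events on one roll of a fair die (hypothetical example)
die = {1, 2, 3, 4, 5, 6}
A, B = {1, 2}, {5, 6}

def prob(event):
    return Fraction(len(event), len(die))

p_union = prob(A | B)   # set union: A or B
p_both = prob(A & B)    # set intersection: A and B
```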

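Random Variables

Discrete random variables- can take on a limited (countable) number of values
Continuous random variables- can take on any value (including decimals)

Mean: $\mu = E(X) = \sum x_i p_i$ (for a binomial random variable, $\mu = np$)
Standard deviation: $\sigma = \sqrt{\sum (x_i - \mu)^2 p_i}$ (for a binomial random variable, $\sigma = \sqrt{np(1-p)}$)

Multiplying by a constant: $\mu_{aX} = a\mu_X$, $\sigma_{aX} = |a|\sigma_X$
Doubling the same variable: $\mu_{X+X} = 2\mu_X$, $\sigma_{X+X} = 2\sigma_X$
Adding: $\mu_{X+Y} = \mu_X + \mu_Y$, $\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$
Subtracting: $\mu_{X-Y} = \mu_X - \mu_Y$, $\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$
(the variances add only when X and Y are independent)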
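The mean and standard deviation formulas for a discrete random variable can be checked numerically. A minimal sketch, using the number of heads in two fair coin flips as a made-up example (this is also binomial with n = 2 and p = 0.5, so the binomial shortcuts should agree):

```python
from math import sqrt

def mean_sd(values, probs):
    """Mean and standard deviation of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, sqrt(variance)

values = [0, 1, 2]           # number of heads in two fair coin flips
probs = [0.25, 0.5, 0.25]
mu, sigma = mean_sd(values, probs)   # mu = 1.0, sigma = sqrt(0.5)

# Multiplying the variable by a = 3 scales both the mean and the standard deviation by 3
scaled_mu, scaled_sigma = mean_sd([3 * x for x in values], probs)
```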

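Binomial Distribution: Something can be distributed binomially if each trial has two outcomes (with a fixed number of independent trials n and the same probability of success p on every trial)
X ~ B(n, p)
P(X = k) → binompdf
P(X ≤ k) → binomcdf
P(X ≥ k) = 1 - P(X ≤ k - 1)
If np ≥ 10 and n(1 - p) ≥ 10 → approximately normal distribution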
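The difference between the "pdf" and "cdf" calculations can be seen in a short sketch. The helper names `binom_pdf` and `binom_cdf` are hypothetical stand-ins written for this example, not the calculator functions themselves.

```python
from math import comb

def binom_pdf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p): add up P(X = 0) through P(X = k)."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

n, p = 10, 0.5                            # made-up example: 10 fair coin flips
p_exactly_3 = binom_pdf(n, p, 3)          # 120/1024
p_at_most_3 = binom_cdf(n, p, 3)          # 176/1024
p_at_least_3 = 1 - binom_cdf(n, p, 2)     # 1 - 56/1024; note k - 1 = 2, not 3
```

Subtracting P(X ≤ 3) instead of P(X ≤ 2) would wrongly drop the case X = 3 from "at least 3".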

Geometric Distribution: X = # of the trial on which the first success occurs
$P(X = k) = (1 - p)^{k-1} \, p$
$\mu_X = 1/p$
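A small sketch of the geometric formulas, with a made-up success probability of 0.2. The mean is approximated by summing k · P(X = k) over the first 1000 trials, which should come out very close to 1/p.

```python
def geom_pdf(p, k):
    """P(first success happens on trial k): k - 1 failures, then one success."""
    return (1 - p) ** (k - 1) * p

p = 0.2                               # made-up success probability
p_first_on_third = geom_pdf(p, 3)     # 0.8 * 0.8 * 0.2 = 0.128

# Approximate the mean with a long (truncated) sum; the exact mean is 1/p = 5
approx_mean = sum(k * geom_pdf(p, k) for k in range(1, 1001))
```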

Sampling Distributions: the distribution of a statistic (such as the sample mean $\bar{x}$) over all the different samples of a set size (n) that could be drawn from the population
$\mu_{\bar{x}} = \mu_x$
$\sigma_{\bar{x}} = \dfrac{\sigma_x}{\sqrt{n}}$ (standard error)
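The standard error formula can be checked with a simulation: draw many samples of size n, record each sample mean, and compare the spread of those means with σ/√n. This is only a sketch; the population (uniform values from 0 to 100), the sample size, and the number of repetitions are arbitrary choices, and sampling is done with replacement so the formula applies directly.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(1)   # fixed seed so the run is repeatable
population = [rng.uniform(0, 100) for _ in range(5_000)]
sigma = pstdev(population)   # population standard deviation

n = 25
sample_means = [mean(rng.choices(population, k=n)) for _ in range(2_000)]

observed_se = pstdev(sample_means)   # spread of the simulated sample means
predicted_se = sigma / sqrt(n)       # standard error from the formula
print(round(observed_se, 2), round(predicted_se, 2))
```

The two printed values should be close, and the average of `sample_means` should be close to the population mean.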

Central Limit Theorem (CLT)- as n (# of observations) approaches infinity, the sampling distribution of $\bar{x}$ becomes normally distributed
If the population is normal (for any n, even n = 1) or n > 30, the sampling distribution of $\bar{x}$ is (approximately) normally distributed

p = population proportion
$\hat{p}$ = sample proportion
Mean of Sampling Distribution for Sample Proportion: $\mu_{\hat{p}} = p$
Standard Deviation of Sampling Distribution for Sample Proportion: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$
$\hat{p} \sim N\left(p, \sqrt{\dfrac{p(1-p)}{n}}\right)$ if np ≥ 10 and n(1 - p) ≥ 10 → if not, then it’s distributed binomially (the count of successes $n\hat{p}$ follows B(n, p))
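The sample-proportion formulas and the np ≥ 10, n(1 - p) ≥ 10 check can be wrapped in one small helper. The function name and the example values are hypothetical.

```python
from math import sqrt

def p_hat_distribution(p, n):
    """Mean and standard deviation of p-hat, plus whether the normal condition holds."""
    mean = p
    sd = sqrt(p * (1 - p) / n)
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, normal_ok

mean_a, sd_a, ok_a = p_hat_distribution(0.3, 100)    # np = 30, n(1-p) = 70: normal is fine
mean_b, sd_b, ok_b = p_hat_distribution(0.05, 100)   # np = 5: use the binomial instead
```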

Designing a Survey

Population- the set of people/things that you want to survey
Experimental Units- subjects
Sample- the set of units that you study
Census- a sample that is the entire population
Sampling Frame- the source (list) from which a sample is drawn

Types of Bias:
Size Bias
Convenience Bias
Questionnaire Bias
Measurement Bias

Different Types of Surveying:
Simple Random Sample (SRS)- a sample where each unit is equally likely to be selected (and every possible sample of size n is equally likely)
Stratified Random Sample- divide the population into strata (units grouped by similarity) and randomly select from each stratum
Cluster Sample- view the population as clusters (existing groups of units) and randomly select clusters to be surveyed
Multistage Sample- SRS within strata or clusters
Systematic Sample with Random Start- drawing a uniform sample across the population at specific intervals (ex. 10th, 20th, 30th, etc. on a list)

Designing a Survey
1. Define the population
2. Specify the sampling frame
3. Specify the sampling method
4. Determine the sample size
5. Implement sampling plan/collect data
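Three of the sampling methods listed above can be sketched with Python's `random` module. The sampling frame (ID numbers 1 to 100) and the two strata are made up for illustration, and the function names are hypothetical.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: every set of n units is equally likely to be chosen."""
    return rng.sample(frame, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Take a separate SRS from each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def systematic_sample(frame, step, rng):
    """Random start within the first `step` units, then every step-th unit."""
    start = rng.randrange(step)
    return frame[start::step]

rng = random.Random(0)            # fixed seed so the run is repeatable
frame = list(range(1, 101))       # sampling frame: ID numbers 1-100 (made up)
strata = {"juniors": list(range(1, 51)), "seniors": list(range(51, 101))}

srs = simple_random_sample(frame, 10, rng)
strat = stratified_sample(strata, 5, rng)
syst = systematic_sample(frame, 10, rng)
```

A cluster sample would instead pick whole groups at random and survey every unit in the chosen groups.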

Designing an Experiment

Controls- positive/negative, used to interpret the accuracy of the results
Replication
Randomization
Blocking- splitting the experimental units into groups (blocks) of similar units, then randomly assigning treatments within each block
If blocks = size 1 → use repeated measures (i.e. randomize test order)
If blocks = size 2 → matched pairs
Experiment = assigning people treatments
Observation = recording observations of people’s everyday tendencies...
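Randomizing within blocks can be sketched as follows for a matched-pairs design. This is a minimal sketch, assuming each block has exactly as many units as there are treatments; the subject labels and the function name are made up.

```python
import random

def assign_within_blocks(blocks, treatments, rng):
    """Shuffle the treatments separately inside each block."""
    assignment = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)   # random order of treatments for this block
        for unit, treatment in zip(block, order):
            assignment[unit] = treatment
    return assignment

rng = random.Random(0)   # fixed seed so the run is repeatable
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]   # matched pairs (blocks of size 2)
assignment = assign_within_blocks(pairs, ["treatment", "control"], rng)
```

Every pair ends up with one treated unit and one control unit, but which member gets which is left to chance.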

