Title | BMGT230 Final Exam Info |
---|---|
Course | Business Statistics |
Institution | University of Maryland |
Pages | 2 |
File Size | 5 MB |
All the info for Final Exam...
BMGT230B Final Exam – DIAGNOSTICS ON

Chapter 6 – Correlation & Linear Regression
• Scatterplot: plots one quantitative variable against another; explanatory variable on the x-axis, response on the y-axis
  o Direction – positive, negative, no direction
  o Strength – how much variation or scatter there is (weak, somewhat strong, strong…)
  o Form – linear, curved, clustered, no pattern
  o Outliers – data with a low probability of occurrence (unusual/unexpected)
• Standardizing using z-scores: $z = \frac{x - \bar{x}}{s}$, or (value − mean)/SD
• Correlation Coefficient: $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\, s_x s_y} = \frac{\sum z_x z_y}{n-1}$
  o between -1 and +1; measures direction and strength; 0 means no linear relationship
  o calculated using the mean and SD of both the x and y variables; only for 2 quantitative variables
  o no units of measurement; sensitive to unusual observations
• Ecological Correlations – based on rates/averages, tend to overstate the strength of associations
• Correlation is NOT causation – not enough evidence that a change in x causes a change in y
• Linear Model – equation of a straight line through the data
  o Residual: $e = y - \hat{y}$, the vertical distance from each point to the LSRL, where $\hat{y}$ is the predicted value
• Least Squares Regression Line/Line of Best Fit: $\hat{y} = b_0 + b_1 x$
  o Sum of residuals is always 0
  o b1 is the slope of the line, the estimated change in y for a one-unit change in x: $b_1 = r \frac{s_y}{s_x}$
  o b0 is the y-intercept, the predicted value of y when x is 0: $b_0 = \bar{y} - b_1 \bar{x}$
• Coefficient of determination, $r^2$, is the percentage of the variance in y that can be explained by the linear relationship with x
• Residual Plot – if CURVED, your relationship is not linear; a change in variability across the plot (ex. increasing spread) is a warning
  o BEWARE of lurking variables, AVOID extrapolating, REMEMBER correlation does not mean causation, LOOK for outliers
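The Chapter 6 formulas can be checked by hand or with a few lines of code. The sketch below is only an illustration: the data values and the helper names (`sample_sd`, `correlation`, `least_squares_line`) are made up for this example and are not from the course. It computes r from z-scores, then the slope and intercept from r, and confirms that the residuals sum to (numerically) zero.

```python
import math


def mean(values):
    """Arithmetic mean."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(x, y):
    """r = sum of z_x * z_y, divided by (n - 1)."""
    mx, my = mean(x), mean(y)
    sx, sy = sample_sd(x), sample_sd(y)
    z_products = [((a - mx) / sx) * ((b - my) / sy) for a, b in zip(x, y)]
    return sum(z_products) / (len(x) - 1)


def least_squares_line(x, y):
    """Return (b0, b1) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    r = correlation(x, y)
    b1 = r * sample_sd(y) / sample_sd(x)
    b0 = mean(y) - b1 * mean(x)
    return b0, b1


# Made-up illustration data (assumption, not course data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)
b0, b1 = least_squares_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
print(f"y-hat = {b0:.2f} + {b1:.2f} x")
print(f"sum of residuals = {sum(residuals):.2e}")
```

For this data the slope is 0.6 and the intercept is 2.2, and $r^2$ is 0.6, so about 60% of the variation in y is explained by the line.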
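Chapter 11 – Confidence Intervals & Hypothesis Tests for Means
• Sampling distribution of $\bar{X}$: mean $\mu$, SD $\frac{\sigma}{\sqrt{n}}$; estimated by the standard error $SE(\bar{X}) = \frac{s}{\sqrt{n}}$
• Central Limit Theorem: when randomly sampling from any population with mean $\mu$ and SD $\sigma$, and when n is large enough, the sampling distribution of $\bar{X}$ is approximately normal, $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ (rule of thumb: $n \ge 40$ or 25); larger sample = better approximation
• Confidence Interval: $\bar{X} \pm t^*_{n-1} \cdot SE(\bar{X})$, or $\bar{X} \pm ME$
  o Standard Deviation: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$; on the calculator, STAT-Edit-L1 then STAT-CALC-1-VarStats
  o Standard Error of the Mean: $SE(\bar{X}) = \frac{s}{\sqrt{n}}$, and Margin of error: $m = t^*_{n-1} \frac{s}{\sqrt{n}}$
• T distributions
  o Test statistic: $t_{n-1} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$; use tcdf(lower, upper, df), where each bound is either the test statistic or −1E99 / 1E99
• Hypothesis: $H_0: \mu = \mu_0$ (no effect) and $H_a: \mu \ne \mu_0$, $\mu < \mu_0$, or $\mu > \mu_0$ (effect)
• Sample Size: $ME = z^* \frac{\sigma}{\sqrt{n}}$, or $n = \left(\frac{z^* \sigma}{ME}\right)^2$; if n is LESS THAN 60, $ME = t^*_{n-1} \frac{s}{\sqrt{n}}$, or $n = \left(\frac{t^*_{n-1}\, s}{ME}\right)^2$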
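As a rough sketch of how the Chapter 11 pieces fit together, the code below builds a t confidence interval, a one-sample t statistic, and a sample size from a target margin of error. The function names and the sample are hypothetical, and the critical value `t_star` is assumed to be looked up in a t table (2.776 is the 95% value for 4 degrees of freedom), since the Python standard library has no t distribution.

```python
import math
from statistics import NormalDist, mean, stdev


def t_interval(sample, t_star):
    """x_bar +/- t* * s / sqrt(n); t_star comes from a t table with n - 1 df."""
    n = len(sample)
    se = stdev(sample) / math.sqrt(n)  # standard error of the mean
    me = t_star * se                   # margin of error
    x_bar = mean(sample)
    return x_bar - me, x_bar + me


def t_statistic(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def sample_size_for_margin(sigma, margin, confidence=0.95):
    """n = (z* * sigma / ME)^2, rounded up to the next whole observation."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)


# Made-up sample (assumption, not course data).
sample = [10, 12, 14, 16, 18]

low, high = t_interval(sample, t_star=2.776)
t = t_statistic(sample, mu0=12)
n_needed = sample_size_for_margin(sigma=10, margin=2)

print(f"95% CI: ({low:.3f}, {high:.3f})")
print(f"t = {t:.4f} with df = {len(sample) - 1}")
print(f"sample size for ME = 2: {n_needed}")
```

The sample size is rounded up rather than to the nearest integer so that the margin of error is not larger than requested.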
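Chapter 12 – Comparing Two Groups
• $p > \alpha$: fail to reject the null hypothesis
• Two Sample Z
  o $Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$; $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, and df is equal to the smallest of $(n_1 - 1, n_2 - 1)$
• Two Sample T
  o Hypothesis: $H_0: \mu_1 - \mu_2 = 0$ and $H_a: \mu_1 - \mu_2 \ne 0$, $< 0$, or $> 0$
  o $t_{df} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$; use tcdf, with df being the smallest of $(n_1 - 1, n_2 - 1)$
  o Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
• Pooled Standard Deviation: $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
• USE POOLED WHEN IT SAYS “Normally distributed with equal variances”
  o CI: $(\bar{x}_1 - \bar{x}_2) \pm t^* s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, where df = $n_1 + n_2 - 2$
  o Hypothesis test: $t_{df} = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$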
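To see how the unpooled and pooled versions differ, here is a small sketch that computes both t statistics from summary statistics. The function names and the numbers are invented for illustration. The p-value would still come from tcdf (or a t table) using the returned degrees of freedom.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t with the conservative df = min(n1 - 1, n2 - 1)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (mean1 - mean2) / se
    df = min(n1 - 1, n2 - 1)
    return t, df, se


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled two-sample t (assumes equal variances), df = n1 + n2 - 2."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / se
    df = n1 + n2 - 2
    return t, df, se


# Made-up summary statistics (assumption, not course data).
t_unpooled, df_unpooled, se_unpooled = two_sample_t(20, 4, 16, 17, 3, 9)
t_pooled, df_pooled, se_pooled = pooled_t(20, 4, 16, 17, 3, 9)

print(f"unpooled: t = {t_unpooled:.4f}, df = {df_unpooled}")
print(f"pooled:   t = {t_pooled:.4f}, df = {df_pooled}")
```

The pooled version gains degrees of freedom (23 instead of 8 here), which is why it should only be used when the equal-variance condition is stated.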
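• Paired T Test
  o Hypothesis: $H_0: \mu_d = 0$ (no difference, no effect) and $H_a: \mu_d \ne 0$, $< 0$, or $> 0$ (effect); rejecting the null means there’s an effect
  o Mean Paired Difference: $\bar{d} = \frac{\sum d_i}{n}$
  o Confidence Interval: $\bar{d} \pm t^*_{n-1} \frac{s_d}{\sqrt{n}}$
  o Hypothesis Testing: $t = \frac{\bar{d}}{s_d / \sqrt{n}}$, where $s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}}$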
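A paired t test is just a one-sample t test on the differences. The sketch below shows this with invented before/after values; `paired_t` is a hypothetical helper name, and the direction of subtraction (after minus before) is an assumption that only affects the sign of t.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (d_bar, s_d, t, df) for the paired differences after - before."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = mean(d)                # mean paired difference
    s_d = stdev(d)                 # SD of the differences (divides by n - 1)
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1


# Made-up paired measurements (assumption, not course data).
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]

d_bar, s_d, t_paired, df_paired = paired_t(before, after)
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}")
print(f"t = {t_paired:.4f} with df = {df_paired}")
```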
Chapter 14 – Inference for Regression (degrees of freedom for critical t: n − 2)
• FIT: $\hat{\mu}_y = b_0 + b_1 x$, and Data = fit + residual ($e_i$)
• Regression standard error: $s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n-2}}$, granted the observations are independent, the relationship is linear, the variation is normal, and the SD is the same for all values of x
• CI for b0: $b_0 \pm t^* SE(b_0)$, where $SE(b_0) = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x_i - \bar{x})^2}}$
• CI for b1: $b_1 \pm t^* SE(b_1)$, where $SE(b_1) = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{s_e}{s_x \sqrt{n-1}}$
• Significance test for the slope
  o We can test the hypothesis $H_0: \beta_1 = 0$ versus a 1- or 2-sided alternative.
  o We calculate $t = b_1 / SE(b_1)$, which has the t(n − 2) distribution, to find the p-value of the test.
• Testing the hypothesis of no relationship
  o Test the hypothesis that the regression slope parameter $\beta_1$ is equal to zero.
  o $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \ne 0$; IF YOU REJECT THE NULL, it means the slope is significant
  o Testing $H_0: \beta_1 = 0$ also tests the hypothesis of no correlation between x and y in the population.
  o Slope: $b_1 = r \frac{s_y}{s_x}$
• Confidence interval for $\mu_y$: $\hat{\mu}_y \pm t^*_{n-2}\, SE(\hat{\mu})$
  o The standard error of the mean response $\mu_y$ is: $SE(\hat{\mu}) = \sqrt{SE^2(b_1)(x^* - \bar{x})^2 + \frac{s_e^2}{n}} = s_e \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
• The standard error for predicting an individual response $\hat{y}$ is: $SE(\hat{y}) = \sqrt{SE^2(b_1)(x^* - \bar{x})^2 + \frac{s_e^2}{n} + s_e^2} = s_e \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$
• CI – used to measure the accuracy of the mean response of all individuals in a population (parameter)
• Prediction Intervals – measure the accuracy of a single individual’s predicted value (single random variable)
• Outlier – large residual/distance; influential if the regression model has a different slope without it
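The difference between the two standard errors is easier to see with numbers. The following sketch, using invented data and hypothetical function names, fits the line, computes $s_e$ and $SE(b_1)$, and then evaluates both $SE(\hat{\mu})$ and $SE(\hat{y})$ at a chosen $x^*$. The slope is computed as $\sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$, which is algebraically the same as $r \, s_y / s_x$.

```python
import math


def regression_inference(x, y):
    """Fit y-hat = b0 + b1 x and return the quantities needed for inference."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))   # regression standard error
    se_b1 = s_e / math.sqrt(sxx)     # standard error of the slope
    return {"n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
            "s_e": s_e, "se_b1": se_b1, "t_slope": b1 / se_b1}


def se_mean_response(fit, x_star):
    """SE of the estimated mean response at x_star."""
    return fit["s_e"] * math.sqrt(
        1 / fit["n"] + (x_star - fit["x_bar"]) ** 2 / fit["sxx"])


def se_prediction(fit, x_star):
    """SE for predicting one individual response at x_star."""
    return fit["s_e"] * math.sqrt(
        1 + 1 / fit["n"] + (x_star - fit["x_bar"]) ** 2 / fit["sxx"])


# Made-up illustration data (assumption, not course data).
fit = regression_inference([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
se_mean = se_mean_response(fit, x_star=3)
se_pred = se_prediction(fit, x_star=3)

print(f"s_e = {fit['s_e']:.4f}, SE(b1) = {fit['se_b1']:.4f}, t = {fit['t_slope']:.4f}")
print(f"SE(mean response) = {se_mean:.4f}, SE(prediction) = {se_pred:.4f}")
```

The prediction standard error is always the larger of the two because it adds the individual-level variation $s_e^2$ under the square root, so prediction intervals are wider than confidence intervals for the mean response.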
Chapter 15 – Multiple Regression
• ANOVA table, where k is the number of predictor variables:

Source | df | Sum of squares | Mean square | F | P value |
---|---|---|---|---|---|
Model (regression) | k | SSR (explained) | MSR = SSR/k | MSR/MSError | |
Error (residual) | n−k−1 | SSError (unexplained) | MSError = SSError/(n−k−1) | | |
Total | n−1 | SST = SSError + SSR | MST = SST/(n−1) | | |

• $R^2 = \frac{SSR}{SST} = 1 - \frac{SSError}{SST}$ tells what percentage of the variation in the response is explained by the linear relationship between y and the predictors
• Adjusted $R^2 = 1 - \frac{MSError}{MST}$; decreases when you add predictors that aren’t significant
• Testing the Model: $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$, $H_a$: at least one $\beta$ is not 0; REJECT NULL = LINEAR RELATIONSHIP
• CI: $b_j \pm t^*_{n-k-1}\, SE(b_j)$, where $SE(b_j)$ is the standard error of coefficient $b_j$
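• Residual: $e = y - \hat{y}$
• $s_e = \sqrt{\frac{SSError}{n-k-1}} = \sqrt{MSError}$ and $s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}}$
• Test for a single coefficient: $H_0: \beta_j = 0$, $H_a: \beta_j \ne 0$, with $t = \frac{b_j}{SE(b_j)}$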
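The ANOVA quantities all follow from SSR, SSError, n, and k, so they can be recomputed from a regression printout. A minimal sketch, with invented sums of squares and a hypothetical function name:

```python
import math


def anova_summary(ssr, sse, n, k):
    """Derive the ANOVA-table quantities from SSR, SSError, n and k predictors."""
    sst = ssr + sse
    msr = ssr / k
    mse = sse / (n - k - 1)
    mst = sst / (n - 1)
    return {
        "F": msr / mse,
        "r2": ssr / sst,            # same as 1 - sse / sst
        "adj_r2": 1 - mse / mst,
        "s_e": math.sqrt(mse),
    }


# Made-up values (assumption, not course data): n = 23 observations, k = 2 predictors.
summary = anova_summary(ssr=80, sse=20, n=23, k=2)
for name, value in summary.items():
    print(f"{name} = {value:.4f}")
```

With these numbers $R^2$ is 0.80 while adjusted $R^2$ is 0.78; the adjustment penalizes each added predictor through the n − k − 1 divisor.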
EXCESS
• If 0 is within the confidence interval, it means THERE IS NO SIGNIFICANCE IN PROPORTIONS BETWEEN X AND Y...