Stats 101 Cheatsheet PDF

Title	Stats 101 Cheatsheet
Course	Introduction to Statistics
Institution	Singapore Management University
Pages	2
File Size	371.9 KB
File Type	PDF
Total Downloads	47
Total Views	134



Description

CATEGORICAL/QUALITATIVE/DEP (Y)
√ observed (index, blood pressure readings), X measured
Nominal (X order, e.g. sex) VS Ordinal (√ order, e.g. months)

UNIVARIATE CATEGORICAL (1 VARIABLE)
1) Bar Charts
2) Pie Charts, 360°
3) Pareto Diagram (many categories)
   a. Bars arranged highest → lowest
   b. Cumulative percentage polygon (line)
Description: Highest, 2nd highest, lowest

BIVARIATE CATEGORICAL (2 VARIABLES)
1) Side by Side Bar Charts → HYPO TEST FOR CATS
   a. Easier to compare when all values v. similar

NUMERICAL/QUANTITATIVE/INDEP (X)
√ measured (height, weight, area, temp)
Discrete (√ countable) VS Continuous (X countable)

UNIVARIATE NUMERICAL (1 VARIABLE)
1) Histogram
   • X = number line → cannot rearrange bars
   • Meaningful intervals with equal widths
   • Freq Table: find min, max, range first; Width = $\frac{range}{n}$ (e.g. n = 10 classes)
       Class         Bin (upper limit)   Freq
       9.5 – 12.0    12.0
   • Uniform/symmetric: Median = Mean
   • Right-skewed/+ve tailed: Median < Mean
   • Left-skewed/-ve tailed: Median > Mean
2) Box and Whisker Plot / 5-number summary

Arithmetic Mean: $\bar{x} = \frac{\sum x}{n}$
Median: Q2 (50%); lowest to highest → mid value
Mode: most freq (can be >1)
Range = Max - Min
SD = s = reasonable variation = $\sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1}}$   *SD is non-negative!!
Interquartile Range = Q3 (75%) - Q1 (25%). The middle 50% of X spans a range of (IQR).
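
The summary measures above can be checked numerically. The sketch below is illustrative only: the data list is a made-up sample, and `sample_sd` is a hypothetical helper name that applies the shortcut SD formula from this sheet. Quartile conventions differ between textbooks and software, so the IQR printed by a calculator may not match.

```python
import math
import statistics


def sample_sd(xs):
    """Sample SD via the shortcut formula sqrt((Σx² - (Σx)²/n) / (n - 1))."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, not from the cheatsheet

mean = sum(data) / len(data)          # Σx / n
median = statistics.median(data)      # Q2, middle value of the sorted data
data_range = max(data) - min(data)    # Max - Min
sd = sample_sd(data)

# statistics.quantiles uses its default "exclusive" method here;
# other quartile conventions can give slightly different Q1 and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(mean, median, data_range, round(sd, 4), iqr)
```

The shortcut formula and `statistics.stdev` should agree, since both divide by n - 1.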

BIVARIATE NUMERICAL (2 VARIABLES) - relationship
#1: Find $\sum x$ ; $\sum y$ ; $\sum x^2$ ; $\sum y^2$ ; $\sum xy$ ; $n$ (draw table)
#2: let X = ____ (units) ; let Y = ___ (units)
#3: Linear Correlation Coefficient/multiple r (4 dp)
    $r = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]\left[\sum y^2 - \frac{(\sum y)^2}{n}\right]}}$
    *Correlation does NOT imply causation!! Intervening factors: ↑ go out, ↑ accidents
#4: Interpret r/slope (“trend” / properties)
    r (number)     Interpretation
    close to -1    V. strong -ve linear correlation btwn X & Y
    0.0-0.2        V. weak, negligible r/s; 0 = X sig correlated
    0.2-0.4        Weak/moderate r/s
    0.4-0.6        Fairly +ve r/s
    0.6-0.8        Strong +ve r/s
    0.8-1.0        V. strong +ve linear correlation btwn X & Y
    The scatterplot/line of best fit/regression line/normal prob plot shows a strong +/-/negligible r/s btwn X & Y, with a good degree of linearity/strong normality of data (X major deviation from plotted points). We can conclude that there is a strong +/- correlation btwn X & Y. The longer the X, the longer Y. ∴, we are @ liberty to apply least sqs mtd to find eqn of the regression line.
#5: Coefficient of Determination, $r^2$ (%/proportion)
    Since $r^2$ = __% , __% of the variation in $y$ is explained by the variability of $x$. The remaining (100 - __) = __ % of the variability in $y$ is due to factors other than what the linear regression model can explain, i.e. the variability in predictions this model yields is ↓ by $r^2$ %.
#6: Linear Regression Eqn (predict): $y = b_1 X + b_0$

#7: Gradient: $b_1 = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$   *regression coefficients
#8: y-intercept: $b_0 = \left(\frac{\sum y}{n}\right) - b_1\left(\frac{\sum x}{n}\right)$
    *n = sample pairs ; X = √ control (indep) ; Y = X control (dep)
#9: write linear eqn: pulse in bpm = -3.5 (time) + 4.5
#10: The sample regression slope, $b_1$, represents the estimated expected ↑/↓ in $y$ per unit ↑ in $x$. In context, for every (unit) ↑ in $x$, $y$ rises (if $b_1 > 0$) or drops (if $b_1 < 0$) by $|b_1|$.
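
As a rough check of steps #1 to #8, the sums-based formulas for r, b1 and b0 can be sketched as follows. The data pairs and the function name `linear_fit` are assumptions made for illustration; they do not come from the cheatsheet.

```python
import math


def linear_fit(xs, ys):
    """Return (r, b1, b0) from the column sums, as in steps #1, #3, #7, #8."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    sxy = sum_xy - sum_x * sum_y / n   # numerator shared by r and b1
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n

    r = sxy / math.sqrt(sxx * syy)     # linear correlation coefficient
    b1 = sxy / sxx                     # gradient
    b0 = sum_y / n - b1 * sum_x / n    # y-intercept
    return r, b1, b0


# Hypothetical sample pairs (x, y)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, b1, b0 = linear_fit(xs, ys)
r_squared = r ** 2  # coefficient of determination, step #5
print(round(r, 4), round(b1, 4), round(b0, 4), round(r_squared, 4))
```

Note that r and b1 share the same numerator, so they always have the same sign.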

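CHANCE/PROBABILITY
Classical: √ assumption; Qn: X numbers
    P(2B,1G) = (*BBG, BGB, GBB)
    *BBG = $\frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \left(\frac{1}{2}\right)^3$, so P(2B,1G) = $3 \times \left(\frac{1}{2}\right)^3 = \frac{3}{8}$
Empirical: X assumption, “survey”; Qn: √ numbers
    P(2B,1G) = $\frac{89}{232}$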

PASCAL’S ∆: $^nC_2$ = 10 combi / ways to select 2 out of n coins
MONTY HALL PARADOX: switch to x2 chances of win: $\frac{1}{3}$ (my choice) VS $\frac{2}{3}$ (remaining)
SIMPSON’S/ONE PROBABILITY PARADOX: Reversal of effect materialises when 1 factor is omitted/included. $x$ is a lurking variable, one which may exert dramatic influences if omitted from a study, cos an association may look quite diff after adjusting for the effect of this 3rd variable by grouping the data according to its values. ∴, we cannot summarily state that __, but rather the reverse.

GENERAL ADDITION RULE:

P(A or B) = P(A∪B) = P(A) + P(B) – P(A∩B)
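
A small enumeration can confirm the addition rule. The example below assumes one roll of a fair six-sided die, with A = "even" and B = "greater than 3"; these events are chosen only for illustration.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                  # fair die: 1..6, equally likely
A = {x for x in outcomes if x % 2 == 0}      # even: {2, 4, 6}
B = {x for x in outcomes if x > 3}           # greater than 3: {4, 5, 6}


def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(outcomes))


p_union_direct = prob(A | B)                       # count A∪B directly
p_union_rule = prob(A) + prob(B) - prob(A & B)     # P(A) + P(B) – P(A∩B)

print(p_union_direct, p_union_rule)
```

Subtracting P(A∩B) removes the outcomes (here 4 and 6) that would otherwise be counted twice.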

MULTIPLICATION RULE (TEST FOR INDEPENDENCE):
• Outcome of 1 event DOES NOT affect probability of another event occurring → multiply!
#1: P(A and B) = P(A∩B)
#2: P(A) X P(B), e.g. $\frac{35}{70} \times \frac{35}{70}$ VS $\frac{23+33}{70}$
#1 = #2 (OR P(A|B) = P(A)) → √ indep ; ≠ → dep

BAYES’ RULE:
*$P(A|B) = \frac{P(A \cap B)}{P(B)}$ OR $\frac{P(B|A)P(A)}{P(B)}$ OR $\frac{P(B|A)P(A)}{P(B|A)P(A) + P(B|A')P(A')}$
*P(A’) = 1 – P(A) OR P(win) = 1 – P(lose)
*P(F’|W’) = cannot find if/given not there = 1
*P(≥2 share birthdays) = 1 – P(no one shares) = $1 - \left(\frac{365}{365} \times \frac{364}{365} \times \dots \times \frac{316}{365}\right)$
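
Bayes' rule and the birthday complement trick can be sketched as below. The input probabilities passed to `bayes` are hypothetical, and both function names are made up for this example. The birthday product runs from 365/365 down to 316/365, which corresponds to a group of 50 people.

```python
def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|A')P(A')]."""
    numerator = p_b_given_a * p_a
    return numerator / (numerator + p_b_given_not_a * (1 - p_a))


def p_shared_birthday(k):
    """P(at least 2 of k people share a birthday) = 1 - P(no one shares)."""
    p_none = 1.0
    for i in range(k):
        p_none *= (365 - i) / 365   # 365/365 x 364/365 x ... x (365-k+1)/365
    return 1 - p_none


# Hypothetical inputs: P(A) = 0.01, P(B|A) = 0.9, P(B|A') = 0.05
posterior = bayes(0.9, 0.01, 0.05)
p50 = p_shared_birthday(50)

print(round(posterior, 4), round(p50, 4))
```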

CONTINUOUS DISTRIBUTIONS “CHANCE”
NORMAL DIST (pop) / SAMPLE NORM DIST
X count (height, weight, time, temp)
• Bell VS well curve (retail, hotel, income: avg @ bottom)
• How to det normal behv? See Scatter Plot.

DISCRETE DISTRIBUTIONS “CHANCE”
√ count (number of…)

DISCRETE “expected winnings, true/weighted avg”
#1: let X = player’s winnings in $ ; m = amt paid to play
#2: Outcome    X ($)                 Prob    xP
    # hits     win – amt to play
    Total                            1       *E(X)
• Mean = E(X)   *For a fair game, E(X) = 0
• “Expected Winnings” = win – lose = – $100 ; “Expected Loss” = + $100 → opp signs!!
SD(X) = $\sqrt{E(X^2) - [E(X)]^2}$

BINOMIAL
• 2 outcomes only! head/tail, Y/N, M/F (p = 0.5) ; P(success) + P(failure) = 1
• x = 0 to n (√ limit) ; CONSIDER IF X CAN BE -VE!
• Mean = E(X) = $np$ ; SD(X) = $\sqrt{np(1-p)}$ (full dp!)
#1: let X = number of (qn)
#2: X ~ B(n,p) ; p = prob of success ; 1 - p = prob of failure ; n = ∞ → p = 100%
#3: P(0 ≤ X ≤ 1), where P(X = x) = $\binom{n}{x} p^x (1-p)^{n-x}$
    *$\binom{x}{0} = 1$ ; $\binom{x}{1} = x$ ; $x^0 = 1$

POISSON
• units of measure “per__/in__yrs”, ∞ X limits   **∆ if necessary!!
• Mean = E(X) = $\lambda$/unit ; SD(X) = $\sqrt{\lambda}$
#1: let X = number of … every 10 mins/year
#2: X ~ $\varphi(\lambda)$
#3: P(X ≥ 3) = 1 – P(X = 0, 1, 2) = $1 - \sum_{x=0}^{2} \frac{e^{-\lambda}\lambda^x}{x!}$ OR $1 - e^{-\lambda}\left[\frac{\lambda^0}{0!} + \frac{\lambda^1}{1!} + \frac{\lambda^2}{2!}\right]$
**Poisson (discrete) VS Exponential (continuous; T btwn…)
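
The binomial and Poisson calculations above can be sketched with only the standard library. The parameter values (n = 10, p = 0.5, λ = 2) are assumptions chosen for illustration, and the helper names are not from the cheatsheet.

```python
import math


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) p^x (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam: e^(-lam) lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Binomial example (hypothetical): n = 10 trials, p = 0.5
n, p = 10, 0.5
p_binom_0_to_1 = sum(binom_pmf(x, n, p) for x in (0, 1))   # P(0 ≤ X ≤ 1)
binom_mean = n * p
binom_sd = math.sqrt(n * p * (1 - p))

# Poisson example (hypothetical): lam = 2 events per unit
lam = 2.0
p_pois_at_least_3 = 1 - sum(poisson_pmf(x, lam) for x in (0, 1, 2))  # P(X ≥ 3)

print(p_binom_0_to_1, binom_mean, round(binom_sd, 4), round(p_pois_at_least_3, 4))
```

Using the complement for P(X ≥ 3) avoids summing an infinite tail, which is the point of step #3 in the Poisson section.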

APPROX BINOMIAL → POISSON “using suitable approx”
#1: let X = number of (qn)
#2: X ~ B(n, p)
#3: Since n = __ is v. large and p = $\frac{1}{4000}$ is v. small, X ~ $\varphi(\lambda)$   **$\lambda = np$ = mean
    P(X ≤ 2) = P(X = 0, 1, 2) ≈ $\sum_{x=0}^{2} \frac{e^{-\lambda}\lambda^x}{x!}$ OR $e^{-\lambda}\left[\frac{\lambda^0}{0!} + \frac{\lambda^1}{1!} + \frac{\lambda^2}{2!}\right]$   *$x^0 = 1$ ; 0! = 1

NORMAL DIST / SAMPLE NORM DIST (steps)
#1: $\mu$ / $\bar{x}$* = __ , $\sigma$ / $\frac{\sigma}{\sqrt{n}}$ = __ , n = __
#2: let X = ____   *might have to draw table
#3: Since n ≥ 30 / nPs ≥ 5 and n(1-Ps) ≥ 5, by CLT,
    X ~ N($\mu, \sigma^2$) OR $\bar{x}$ ~ N$\left(\mu, \left(\frac{\sigma}{\sqrt{n}}\right)^2\right)$
#4: P(0 ≤ $x$ ≤ 14) = $P\left(\frac{0-\mu}{\sigma} < Z < \frac{14-\mu}{\sigma}\right)$ ; for $\bar{x}$ use $Z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$ (Z to 2 dp)
    ** ≈ 0.5 – 0.5 (if too far off graph) ≈ 0.0000
    If X $\sigma$ → P(-2$\sigma$ < X – $\mu$ < 2$\sigma$) = P(-2 < Z < 2) = see table
    **“__% fall within/beyond 2SD above norm/mean?”
#5: Although population was not described as normal, √ CLT came into effect as √ $\sigma$ was known and random samples n=36 > 30, which means $\bar{x}$ approx N dist. OR A2) CI of $\mu$ if X $\sigma$ is unknown
    0.0001% chance of getting at least as extreme a result as this. If mean were ___g, the result is more likely (not) due to random variation/by chance. Rather, mean was likely to be below ___g.

SAMPLE PROPORTION, Ps → √ count!   Ps = $\frac{x}{n}$ (0 ≤ Ps ≤ 1)
            Population                  Sample
    Mean    $\pi$ = true proportion     E(Ps) = $\pi$ (when dk $\pi$, use Ps)
    SD      $\sigma$ = true SD          s = $\sqrt{\frac{\pi(1-\pi)}{n}}$
*might have to draw table
#1: $\pi$/Ps = __ (%) , n = __
#2: let Ps = sample proportion of ___
#3: Since n$\pi$/Ps = __ ≥ 5 , n(1- $\pi$/Ps) = __ ≥ 5, by CLT, Ps ~ N$\left(\pi, \left(\sqrt{\frac{\pi(1-\pi)}{n}}\right)^2\right)$
#4: P(Ps > 0.74) = $P\left(Z > \frac{Ps-\pi}{\sqrt{\frac{\pi(1-\pi)}{n}}}\right)$   *if qn didn’t state, 0.74 > 0.5, so >
#5: __% chance of … (>/< 5%? → see table for descrip)

PROBABILITY/P-VALUE
*KEYWORDS: within/no more than/@most ≤, no less ≥
*SMALL N: skewed distribution which violates the necessary assumption of normality.
>/< lowest sig lvl commonly employed?
    p-value < 5% or 1% → reject H0        p-value > 5% or 1%
    Stat SIG diff esp life & death        Statistically INsig
    Small, v. rare occurrence             Likely occurrence
    √ statistically unusual, extreme      X statistically unusual
    X due to random variation/chance      √ due to random variation/chance
    Beyond reasonable doubt               Very common

UNIFORM “symmetrical/rectangle/equal probability”
intervals of equal length (alarm every 5 min)
#1: Mean = E(X) = $\frac{a+b}{2}$ , SD(X) = $\sqrt{\frac{(b-a)^2}{12}}$
#2: let X = ____
#3: X ~ Uni ( a , b )
#4: P(X < 20) = length x height (area)
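
The continuous-distribution steps on this sheet (standardising to Z for a normal variable or a sample mean, and "length x height" for a uniform variable) can be sketched as follows. All numbers are hypothetical: μ = 10, σ = 4, n = 36 for the normal example and Uni(0, 30) for the uniform one.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and SD 1

# P(-2σ < X - μ < 2σ) = P(-2 < Z < 2)
p_within_2sd = Z.cdf(2) - Z.cdf(-2)

# Sample mean: by CLT, x̄ ~ N(μ, (σ/√n)²). Hypothetical μ, σ, n.
mu, sigma, n = 10.0, 4.0, 36
standard_error = sigma / n ** 0.5
z_score = (11.0 - mu) / standard_error      # standardise x̄ = 11
p_mean_above_11 = 1 - Z.cdf(z_score)        # P(x̄ > 11) = P(Z > z)

# Uniform: X ~ Uni(a, b), density height = 1 / (b - a). Hypothetical a, b.
a, b = 0.0, 30.0
height = 1 / (b - a)
p_uniform_below_20 = (20.0 - a) * height    # length x height (area)
uniform_mean = (a + b) / 2
uniform_sd = ((b - a) ** 2 / 12) ** 0.5

print(round(p_within_2sd, 4), round(p_mean_above_11, 4), round(p_uniform_below_20, 4))
```

`NormalDist.cdf` replaces the printed Z table here, so values may differ from table lookups in the last decimal place because the table rounds Z to 2 dp.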

#5: Limit = (lower [mean - SD] , upper [mean + SD])

EXPONENTIAL “waiting time (T)”, “mean T btwn…”
#1: Mean = E(X) = SD(X) = E(T) = $\frac{1}{\lambda}$ ; $\lambda$ = poisson avg “mean/avg per unit”
#2: let T = time between ____ in sec/mins/hours
#3: T ~ exp ($\lambda$)
#4: *P(T > arrival time) = $e^{-\lambda t}$
    *if P(T ≤ 10) = 1 – P(T > 10) = 1 – $e^{-\lambda t}$

FINDING n, e, C.I., SIG LVL $\alpha$
Since n ≥ 30 / nPs ≥ 5 and n(1-Ps) ≥ 5, by CLT, $\bar{x}$ / Ps ~ N,

    $\alpha$ sig lvl (2-tailed)    C.I. 100(1-$\alpha$)    $Z_{\alpha/2}$
    0.10                           90%                     1.645
    0.05                           95%                     1.96
    0.02                           98%                     2.326
    0.01                           99%                     2.576

MEAN
$n = \frac{(Z_{\alpha/2})^2 \times \sigma^2}{e^2}$ ; $e = Z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ “within ±5”
*If $\sigma$ unknown, estimate from past studies/surveys

PROPORTION, Ps
true %, √ count (study, vote, poll)
$n = \frac{(Z_{\alpha/2})^2 \times Ps(1-Ps)}{e^2}$ ; $e = Z_{\alpha/2}\sqrt{\frac{Ps(1-Ps)}{n}}$
*If Ps unknown, assume Ps = 0.5 (max value)
↑e, ↓n = allow/tolerate more error for less work
e has more effect on error than on CI, bcos $e^2$

• C.I. of $\mu$ when √ $\sigma$ known (pop) / X $\sigma$ UNknown (s)
let $\mu$ = true mean ___
Since n ≥ 30, by CLT, $\bar{x}$ ~ normal,
a __% C.I.* for $\mu$ = $\left(\bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ = (– , +) units 4dp   *Confidence coefficient
Assuming the distribution is not too skewed & since $\sigma$ is unknown,
a __% C.I.* for $\mu$ = $\left(\bar{x} \pm t_{\alpha/2;\,n-1}\frac{S}{\sqrt{n}}\right)$
** Conclusion: → draw intervals!
1) We are __% confident that the true mean of ($\mu$) lies btwn __ and __ (units).
2) Valid claim? 0 falls within CI = X sig diff btwn X & Y
3) X overlap → Since the lower confi limit of the prop of X exceeds the upper limit of that of Y, the true mean X is significantly bigger than that of Y, with __% confidence.
4) √ overlap → Since X & Y overlap, cannot claim that 1 (has more) → no sig diff btwn 2
5) lower Y = upper X → equal chances of winning
6) “> chance can explain?” → lower limit exceeds 50%
7) C.I. = net that encompasses $\mu$

ONE-SAMPLE HYPO TESTS “SUFFICIENT EVI?”
MEANS
• √ $\sigma$ known (pop) OR X $\sigma$ UNknown (sample)
$\bar{x}$ = ___, $\sigma$ / s = __ , n = __
let $\mu$ = true mean ___
#1: H0 : $\mu$ ≤/≥/= __   “at least 0.4” → H1 : $\mu$ < 0.4
#2: H1 : $\mu$ >/< __
…
#5: Reject H0 if P ( Z >/< z ) < $\alpha$ OR if P ( Z > |z| ) < $\frac{\alpha}{2}$ (2-tailed)
    OR Reject H0 if t >/< $t_{\alpha;\,n-1}$ = critical t (see table!!)
    Reject H0 if |t| > $t_{\alpha/2;\,n-1}$ = see table (2-tailed)
#6: Under H0 , ___
#7: Test: P (Z >/< #4) = P (Z > 2 dp) = p-value 4dp < $\alpha$?
    P (Z > |#4|) = p-value < $\frac{\alpha}{2}$? (2-tailed)
#8: ∴ We do not (> $\alpha$ / $\frac{\alpha}{2}$) / reject (< $\alpha$ / $\frac{\alpha}{2}$) H0
#9: We do not / have enough evidence, at __ % sig lvl, to say that H0 / H1 .
*X guilty ≠ innocent ; X guilty = not, not innocent (insuff evi to prove innocence)
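
The sample-size and confidence-interval formulas for means and proportions on this sheet can be sketched as below. The inputs (σ = 15, e = 5, x̄ = 50, n = 36) are hypothetical, and the function names are made up for illustration. `NormalDist().inv_cdf` gives unrounded critical values, so results can differ slightly from ones computed with the rounded table values 1.645, 1.96, 2.326 and 2.576.

```python
import math
from statistics import NormalDist


def z_crit(alpha):
    """Two-tailed critical value Z_(alpha/2) for significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def n_for_mean(sigma, e, alpha):
    """n = (Z * sigma / e)^2, rounded up to a whole observation."""
    return math.ceil((z_crit(alpha) * sigma / e) ** 2)


def n_for_proportion(e, alpha, ps=0.5):
    """n = Z^2 * Ps(1 - Ps) / e^2; Ps = 0.5 is the worst case if Ps is unknown."""
    return math.ceil(z_crit(alpha) ** 2 * ps * (1 - ps) / e ** 2)


def ci_mean(xbar, sigma, n, alpha):
    """C.I. for mu with sigma known: xbar ± Z * sigma / sqrt(n)."""
    margin = z_crit(alpha) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


n_mean = n_for_mean(sigma=15, e=5, alpha=0.05)
n_prop = n_for_proportion(e=0.05, alpha=0.05)
lower, upper = ci_mean(xbar=50, sigma=15, n=36, alpha=0.05)

print(n_mean, n_prop, round(lower, 4), round(upper, 4))
```

Rounding n up (never down) keeps the margin of error at or below the requested e.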

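PROPORTIONS (Ps) 1 SAMPLE / 2 SAMPLE
Evi of a diff btwn true PROP of __
Ps = 6 of 10, once every 7 days, n = _ OR $Ps_1$, $Ps_2$, $n_1$, $n_2$
let $\pi$ = true %/proportion of ___ ; pooled $\pi = \frac{x_1 + x_2}{n_1 + n_2}$
#1: H0 : $\pi$ = __ OR H0 : $\pi_1 = \pi_2$
#2: H1 : $\pi$ >/< __
…
MEANS (2 SAMPLE) • √ $\sigma_1$ & $\sigma_2$ known
…#4: Since $n_1$, $n_2$ > 30, / Assuming the distribution is not too skewed,
    $Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim N(0, 1)$
• __% C.I.* for $\mu_1 - \mu_2$ = $\left((\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right)$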

• X $\sigma_1$ & $\sigma_2$ UNknown (F test → T tests)

1) F-test: “test equality/assumption of variances”
let $\sigma_1^2$ = true variance of ___ in (units) ; $\sigma_2^2$ = ___
#1: H0 : $\sigma_1^2 = \sigma_2^2$ → not UNequal
#2: H1 : $\sigma_1^2 \neq \sigma_2^2$ → not equal, sig diff
#3: $\alpha$ = 0.05 (2-tailed since ≠, X direction)
#4: Assuming populations are approx. N, $F = \frac{S_1^2}{S_2^2} \sim F_{n_1-1,\,n_2-1}$
#5: Reject H0 if F > $F_{\alpha/2;\,n_1-1,\,n_2-1}$ = critical F (see table!)
#6: Under H0 , $\sigma_1 = \sigma_2$
#7: Test: F = #4 = __ / p-value (Significance F) (4 dp) >/< #5?
#8: ∴ We do not reject (< #5) / reject (> #5) H0
#9: We do not/have enough evi, at __% sig lvl, to say true variances btwn $\sigma_1$ & $\sigma_2$ are H0 / H1 (see red #1 , #2)

2) T-tests: “2 better” = lesser error
let $\mu_1$ = true mean (error) of __ in (units) ; $\mu_2$ = _
…#3: $\alpha$ = 0.05 (1 tailed)   *See F-test $\alpha$ 1/2 tailed? ÷2?
#4: Since $n_1 = n_2$ > 30, CLT, / Assuming the distribution is not too skewed, & since $\sigma_1$ and $\sigma_2$ are unknown & not UNequal/equal,
    F-test do not reject; $s_p^2$; pooled SD/not UNequal variance:
        $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim t_{n_1+n_2-2}$ , where $s_p^2 = \frac{S_1^2(n_1-1) + S_2^2(n_2-1)}{n_1+n_2-2}$
    F-test reject; $s_1^2$, $s_2^2$; separate SD/not equal variance:
        $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \sim t_{d.f.}$
#5: Reject H0 if t > $t_{\alpha;\,n_1+n_2-2}$ = critical t (4dp) table!!   **|2-tailed| → $\frac{\alpha}{2}$
#6: Under H0 , $\mu_1 = \mu_2$
#7: Test: #4 = __ >/< #5? (4dp)

• C.I. of diff $\mu$ when X $\sigma_1$ & $\sigma_2$ UNknown   *Previous $\alpha$ 1/2 tailed?
a __% C.I.* for $\mu_1 - \mu_2$ = $\left((\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2;\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}\right)$ ; for separate variances use $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

• Differences D = X − Y:
Find $\sum D$ ; $\sum D^2$ ; $n_D$ ; $\bar{D} = \frac{\sum D}{n_D}$ ; $s_D$   *Assume X-Y (higher) = -ve
let $\mu_D$ = true mean “X-Y” difference in units.
…#4: … $t = \frac{\bar{D} - \mu_D}{s_D / \sqrt{n_D}} \sim t_{n-1}$   *refer to ONE SAMPLE HYPO TEST!
• C.I when X $\sigma_D$ UNknown “C.I. of true mean diff”
a __% C.I.* for $\mu_D$ = $\left(\bar{D} \pm t_{\alpha/2;\,n_D-1}\frac{s_D}{\sqrt{n_D}}\right)$
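
The F ratio and the pooled two-sample t statistic can be sketched from summary statistics alone. The sample sizes, means and variances below are hypothetical, and `pooled_variance` and `pooled_t` are illustrative names. Critical values still come from the F and t tables, which this sketch does not reproduce.

```python
import math


def pooled_variance(s1_sq, n1, s2_sq, n2):
    """s_p² = [S1²(n1 - 1) + S2²(n2 - 1)] / (n1 + n2 - 2)."""
    return (s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2)


def pooled_t(xbar1, xbar2, s1_sq, n1, s2_sq, n2, diff_under_h0=0.0):
    """Pooled-variance t statistic with n1 + n2 - 2 degrees of freedom."""
    sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
    standard_error = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    t_stat = ((xbar1 - xbar2) - diff_under_h0) / standard_error
    return t_stat, n1 + n2 - 2


# Hypothetical summary statistics for two independent samples
n1, xbar1, s1_sq = 10, 20.0, 4.0
n2, xbar2, s2_sq = 12, 18.0, 5.0

f_ratio = s1_sq / s2_sq                        # F = S1² / S2², df (n1 - 1, n2 - 1)
sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
t_stat, df = pooled_t(xbar1, xbar2, s1_sq, n1, s2_sq, n2)

print(f_ratio, sp_sq, round(t_stat, 4), df)
```

The pooled form only applies when the F-test does not reject equal variances; otherwise the separate-variance statistic on the sheet is the one to use.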

HYPO TESTS FOR CATS “SIG LVL, EVI”

$\chi^2$ GOODNESS OF FIT TEST (1 variable) “sum of diff”
#1: H0 : ___ ok/fine/unbiased
#2: H1 : ___ NOT ok/NOT fine/biased → “disobey law”
…#4.1:
    Outcomes    O (A/F)    Prob              E = Prob x Y    $\frac{(O-E)^2}{E}$
    Red                    18/38 or 18/37                    Cal “A”
    Black                  18/38 or 18/37                    Cal “B”
    Green                  2/38 or 1/37                      Cal “C”
    Total       Y          1                 Y
    $\chi^2 = \sum \frac{(O-E)^2}{E}$
e.g. Roulette: American (38 pockets, 2G) VS French/EU (37 pockets, 1G)

$\chi^2$ TEST OF INDEPENDENCE (2 variables)
#1: H0 : X & Y are independent, i.e. no evi of r/s
#2: H1 : X & Y dep, i.e. ∃ r/s → X direction, only related
…#4.1: 1) Add on to qn: Row TT, Column TT, Grand TT
       2)          c1                                   c2
                   O     *E     $\frac{(O-E)^2}{E}$     O     *E     $\frac{(O-E)^2}{E}$
            r1           “A”                                  “C”
            r2           “B”                                  “D”
            cTT    Q     Q      #7: “X”                 W     W      #7: “Y”
       *E = (r TT x c TT) / grand TT (n)
#4.2: Since each E > 5 , & sample size is large, $\chi^2 = \sum \frac{(O-E)^2}{E} \sim \chi^2_{(r-1)(c-1)}$
#5: Reject H0 if $\chi^2 > \chi^2_{(r-1)(c-1)}$ = critical $\chi^2$ (see table!!)
#6: Under H0 , independence
#7: Test: $\chi^2 = \sum \frac{(O-E)^2}{E}$ = “X” + “Y” in table >/< #5?...
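
Both chi-square statistics can be sketched in a few lines. The observed counts below are hypothetical (380 spins of an American wheel, and a made-up 2 x 2 table), and the function names are not from the cheatsheet. The comparison against the critical value still needs the chi-square table.

```python
def chi_square_gof(observed, probs):
    """Goodness of fit: E = Prob x total, then sum of (O - E)² / E."""
    total = sum(observed)
    expected = [p * total for p in probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_independence(table):
    """Independence: E = (row TT x col TT) / grand TT, df = (r - 1)(c - 1)."""
    row_tt = [sum(row) for row in table]
    col_tt = [sum(col) for col in zip(*table)]
    grand_tt = sum(row_tt)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_tt[i] * col_tt[j] / grand_tt
            chi2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical roulette counts for Red, Black, Green on an American wheel
observed = [170, 190, 20]
american_probs = [18 / 38, 18 / 38, 2 / 38]
chi2_gof = chi_square_gof(observed, american_probs)

# Hypothetical 2 x 2 contingency table (rows r1, r2; columns c1, c2)
chi2_ind, df_ind = chi_square_independence([[10, 20], [30, 40]])

print(round(chi2_gof, 4), round(chi2_ind, 4), df_ind)
```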