ST2132 Finals Cheatsheet PDF

Title ST2132 Finals Cheatsheet
Author Jingxuan Zhang
Course Mathematical Statistics
Institution National University of Singapore
Pages 4



Description

ST2132 Finals Cheatsheet (AY18/19 Semester 1)

1 ST2131 stuff

1.1 Probability

• Bayes' theorem: P(B) = P(B|A)P(A) + P(B|A^c)P(A^c)
• Bayes' 1st formula: P(B) = Σ_i P(B|A_i)P(A_i)
• Bayes' 2nd formula: P(A_i|B) = P(B|A_i)P(A_i) / Σ_j P(B|A_j)P(A_j)

1.2 Random Variables

Binomial Distribution

Definition 1. The Binomial random variable X is the total no. of successes in n Bernoulli trials with probability of success p.
• PMF: P(X = k) = (n choose k) p^k (1 − p)^(n−k), k = 0, 1, ..., n
• E(X) = np, Var(X) = np(1 − p)
• I(θ) = n / [p(1 − p)]
• When n large, p < 0.1, np moderate: X ~ Bin(n, p) ≈ Po(np)
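The Poisson approximation in the last bullet can be checked numerically. A minimal sketch, assuming nothing beyond the formulas above (the values n = 1000, p = 0.005 are made up for illustration):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    # P(X = k) for X ~ Bin(n, p)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def pois_pmf(k, lam):
    # P(X = k) for X ~ Po(lam)
    return exp(-lam) * lam ** k / factorial(k)

# n large, p < 0.1, np moderate, so Bin(n, p) ≈ Po(np)
n, p = 1000, 0.005                  # np = 5
for k in range(15):
    assert abs(binom_pmf(k, n, p) - pois_pmf(k, n * p)) < 5e-3
```

The pointwise differences here are on the order of 10^-3, which is why the approximation is considered safe in this regime.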

Poisson Distribution

Definition 2. X ~ Po(λ) if P(X = k) = e^(−λ) λ^k / k!, k = 0, 1, 2, ...
• E(X) = λ, Var(X) = λ
• MGF: M_X(t) = e^(λ(e^t − 1))
• I(θ) = 1/λ

Geometric Distribution

• E(X) = 1/p, Var(X) = (1 − p)/p²

Exponential Distribution

• CDF: F(x) = ∫_{−∞}^{x} f(u) du = 1 − e^(−λx) for x ≥ 0, and 0 for x < 0
• E(X) = 1/λ, Var(X) = 1/λ²
• I(θ) = 1/λ²
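The Exponential CDF and moments above can be sanity-checked by simulation. A minimal sketch; the rate λ = 2, the evaluation point, and the sample size are arbitrary choices for the example:

```python
import random
from math import exp

random.seed(42)
lam = 2.0
xs = [random.expovariate(lam) for _ in range(200_000)]

# Empirical CDF at x0 vs the closed form F(x) = 1 - e^{-lam*x}, x >= 0
x0 = 0.5
ecdf = sum(v <= x0 for v in xs) / len(xs)
F = 1 - exp(-lam * x0)              # 1 - e^{-1} ≈ 0.6321

# Sample mean and variance vs E(X) = 1/lam, Var(X) = 1/lam^2
mean = sum(xs) / len(xs)
var = sum((v - mean) ** 2 for v in xs) / len(xs)

assert abs(ecdf - F) < 0.01
assert abs(mean - 1 / lam) < 0.01
assert abs(var - 1 / lam ** 2) < 0.01
```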

Method of Moments

3 steps of MOM:
1. Calculate µ1, µ2, ...: find expressions for the moments in terms of the parameters.
2. Make the parameters the subject of the formulas.
3. Substitute the sample moments µ̂_k = (1/n) Σ X_i^k into these expressions to obtain the estimates.

• Poisson: λ̂ = X̄
• Normal: µ̂ = X̄, σ̂² = (1/n) Σ (X_i − X̄)²
• SE = SD(θ̂_0); for the sample mean, SE = SD(X̄) = σ/√n

Sampling facts:
• Note that the sampling distribution of X̄ ~ N(µ, σ²/n) and nσ̂²/σ² ~ χ²_{n−1}. Furthermore, X̄ and σ̂² are independent.
• (X̄ − µ)/(S/√n) ~ t_{n−1}
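The three MOM steps can be traced on a concrete case. A minimal sketch for Gamma(α, λ) in the rate parametrisation, on simulated data (the true values α = 3, λ = 2 are arbitrary): step 1 gives mean = α/λ and variance = α/λ², step 2 solves these to λ = X̄/σ̂² and α = X̄²/σ̂², and step 3 plugs in the sample moments.

```python
import random

random.seed(0)
alpha_true, lam_true = 3.0, 2.0
# random.gammavariate takes (shape, scale); scale = 1/rate
x = [random.gammavariate(alpha_true, 1 / lam_true) for _ in range(100_000)]

# Step 1: moments in terms of parameters: mean = alpha/lam, var = alpha/lam^2
xbar = sum(x) / len(x)
s2 = sum((v - xbar) ** 2 for v in x) / len(x)

# Steps 2 & 3: solve for the parameters and plug in the sample moments
lam_hat = xbar / s2
alpha_hat = xbar ** 2 / s2

assert abs(alpha_hat - alpha_true) < 0.2
assert abs(lam_hat - lam_true) < 0.2
```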

Estimation of Ratio

• With SRS, the expectation of R is given approximately by E(R) ≈ r + …, i.e. R has a bias of order 1/n.
• With SRS, the X_i are not independent, thus the above CLT doesn't apply.
• CI for X̄: X̄ ± z_{α/2} σ_{X̄}

MLE

• Likelihood function: L(θ) = ∏ f(x_i|θ)
• The loglikelihood function is defined as: l(θ) = Σ log f(x_i|θ)
• Poisson: l(λ) = Σ x_i log λ − nλ − Σ log x_i!; MLE: λ̂ = X̄
• Normal: l(µ, σ) = −n log(σ) − (n/2) log(2π) − (1/(2σ²)) Σ (x_i − µ)²; MLE: µ̂ = X̄, σ̂² = (1/n) Σ (x_i − X̄)²
• Gamma: l(α, λ) = nα log(λ) + (α − 1) Σ log(x_i) − λ Σ x_i − n log Γ(α); MLE: λ̂ = α̂/X̄, no closed form for α̂

Multinomial

• An experiment has m possible outcomes E_1, ..., E_m with probabilities p_1, ..., p_m. Let X_i be the number of times E_i occurs in n independent runs of the experiment.
• The X_i are not independent since they sum to n.
• The loglikelihood function is: l(p_1, ..., p_m) = log(n!) − Σ log x_i! + Σ x_i log p_i
• Since we have the constraint Σ p_i = 1, we apply a Lagrange multiplier and maximise L(p_1, ..., p_m, λ) = log(n!) − Σ log x_i! + Σ x_i log p_i + λ(Σ p_i − 1), which gives p̂_i = x_i/n.

Large Sample Dist of MLE

Theorem 2. Let θ̂ denote the MLE of θ_0. Under smoothness conditions on f, the probability distribution of √(nI(θ_0)) (θ̂ − θ_0) tends to N(0, 1) as n → ∞, so the large sample distribution of the MLE is approximately normal.

• The asymptotic variance of the MLE is 1/(nI(θ_0)) = −1/E[l″(θ_0)].
• Consistency: P(|θ̂ − θ_0| > ǫ) → 0 as n → ∞.

CI of MLEs

• Since √(nI(θ_0))(θ̂ − θ_0) ≈ N(0, 1), an approximate 100(1 − α)% CI for θ_0 is θ̂ ± z(α/2)/√(nI(θ̂)).

Efficiency

• An unbiased estimate whose variance achieves the Cramér-Rao lower bound is said to be efficient.
• Since the asymptotic variance of an MLE is equal to the lower bound, the MLE is said to be asymptotically efficient.

Rao-Blackwell Theorem

Theorem 5. Let θ̂ be an estimator of θ with finite second moment, E(θ̂²) < ∞ for all θ. Suppose that T is sufficient for θ, and let θ̃ = E(θ̂|T). Then for all θ, E(θ̃ − θ)² ≤ E(θ̂ − θ)².
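The approximate CI for an MLE can be worked through for Poisson data: I(λ) = 1/λ and λ̂ = X̄, so the CI is X̄ ± z(α/2)·√(X̄/n). A minimal sketch on made-up counts (the data values are hypothetical):

```python
from math import sqrt

x = [3, 1, 4, 2, 5, 2, 0, 3, 2, 4, 1, 3]   # hypothetical Poisson counts
n = len(x)

lam_hat = sum(x) / n                        # MLE: lambda_hat = xbar = 2.5
se = sqrt(lam_hat / n)                      # 1 / sqrt(n * I(lam_hat))
z = 1.96                                    # z(alpha/2) for alpha = 0.05
ci = (lam_hat - z * se, lam_hat + z * se)
print(ci)                                   # ≈ (1.61, 3.39)
```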

6 Hypothesis Testing

6.1 Neyman-Pearson Paradigm

6.1.1 Terminology & Likelihood Ratio

• Type I error: rejecting H0 when it is true. Probability of type I error = significance level = α.
• Type II error: accepting H0 when it is false. Probability of type II error = β. P(H0 rejected | H0 is false) = 1 − β is called the power of the test.
• The decision rule is based on a test statistic. The set of values of the test statistic that leads to rejection of H0 is the rejection or critical region.
• The set of values that leads to accepting H0 is the acceptance region.
• The probability distribution of the test statistic when H0 is true is called the null distribution.
• The likelihood ratio is defined as the ratio of the two likelihoods (one under H0 and the other under H1).

6.1.2 Neyman-Pearson Lemma

Definition 17. Suppose that H0 and H1 are simple hypotheses and that the test rejects H0 whenever the likelihood ratio is less than c, with significance level α. Then any other test for which the significance level is less than or equal to α has power less than or equal to that of the likelihood ratio test.

• Roughly restated: among all tests with a given P(type I error), the likelihood ratio test minimizes P(type II error).
• The LR of H0 to H1 based on IID X_1, ..., X_n is Λ(x) = ∏ f_0(x_i) / ∏ f_1(x_i). The smaller Λ(x), the more evidence X has against H0; a reasonable critical region consists of x with small Λ(x).
• The Neyman-Pearson Lemma only considers a simple null hypothesis.

6.2 α and p-value

• α is the probability of rejecting H0 when it is true.
• The p-value is defined to be the smallest significance level at which H0 would be rejected. The smaller the p-value, the stronger the evidence against H0.

6.3 Uniformly Most Powerful Tests

Theorem 6. If the alternative hypothesis H1 is composite, a test that is most powerful for every simple alternative in H1 is said to be uniformly most powerful. However, the likelihood ratio test is not uniformly most powerful when H1: µ ≠ µ0 is two-sided.

6.4 Duality of CI & Hypo Tests

Theorem 7. Suppose that for every value θ0 in Θ there is a test at level α of H0: θ = θ0. Denote the acceptance region of the test by A(θ0). Then the set C(X) = {θ : X ∈ A(θ)} is a 100(1 − α)% confidence region for θ.

Theorem 8. Suppose that C(X) is a 100(1 − α)% confidence region for θ; i.e. for every θ0, P[θ0 ∈ C(X) | θ = θ0] = 1 − α. Then an acceptance region for a test at level α of the hypothesis H0: θ = θ0 is A(θ0) = {X : θ0 ∈ C(X)}.

6.5 Generalised Likelihood Ratio Tests

• Suppose that the observations X = (X_1, ..., X_n) have a joint density or frequency function f(x|θ). Let Ω be the set of all possible values of θ.
• Let ω0, ω1 be subsets of Ω such that ω0 is disjoint from ω1 and Ω = ω0 ∪ ω1.
• The test is of the form H0: θ ∈ ω0 vs H1: θ ∈ ω1.
• The generalised likelihood ratio test statistic is Λ* = max_{θ∈ω0}[lik(θ)] / max_{θ∈ω1}[lik(θ)]. Small values of Λ* tend to discredit H0.
• We define Λ = max_{θ∈ω0}[lik(θ)] / max_{θ∈Ω}[lik(θ)], which equals min{Λ*, 1}, so small values of Λ* correspond to small values of Λ.
• The generalised likelihood ratio test rejects H0 if Λ ≤ λ0, where λ0 is a constant determined by P(Λ ≤ λ0 | H0) = α.

Theorem 9. Under smoothness conditions on the probability densities or frequency functions involved, the null distribution of −2 log Λ ≈ χ² with df = dim(Ω) − dim(ω0), as n → ∞.

Theorem 10. For the Multinomial Distribution we can use the Pearson Chi-Square Test; its test statistic is X² = Σ (O_i − E_i)²/E_i = Σ [x_i − np_i(θ̂)]² / (np_i(θ̂)), where O_i = np̂_i and E_i = np_i(θ̂) are the observed & expected cell counts. Reject H0 if −2 log Λ = X² > χ²_{m−k−1}(α).

7 Comparing 2 Samples

7.1 Equal Variance based on Normal Dist

7.1.1 Some Facts

• A natural estimate of µ_X − µ_Y is X̄ − Ȳ, and X̄ − Ȳ is the MLE of µ_X − µ_Y.
• X̄ − Ȳ ~ N[µ_X − µ_Y, σ²(1/n + 1/m)]
• If σ² is known then Z = ((X̄ − Ȳ) − (µ_X − µ_Y)) / (σ √(1/n + 1/m)) ~ N(0, 1).
• The 100(1 − α)% CI for µ_X − µ_Y would be: X̄ − Ȳ ± z(α/2) σ √(1/n + 1/m).
• If σ² is unknown, then it can be estimated from the data using the pooled sample variance s_p² = [(n − 1)S_X² + (m − 1)S_Y²] / (m + n − 2).
• The estimated SE of X̄ − Ȳ is then: s_{X̄−Ȳ} = s_p √(1/n + 1/m).

Theorem 11. Let X_1, ..., X_n be IID N(µ_X, σ²) and Y_1, ..., Y_m be IID N(µ_Y, σ²). The statistic t = ((X̄ − Ȳ) − (µ_X − µ_Y)) / s_{X̄−Ȳ} ~ t_{m+n−2}. The 100(1 − α)% CI for µ_X − µ_Y is: (X̄ − Ȳ) ± t_{m+n−2}(α/2) s_{X̄−Ȳ}.

7.1.2 Likelihood Ratio Test

The test rejects for large values of 1 + [mn/(m + n)] (X̄ − Ȳ)² / (S_xx + S_yy), or equivalently for large |X̄ − Ȳ| / √(S_xx + S_yy).

7.2 Unequal Variance based on Normal Dist

• Var(X̄ − Ȳ) = S_X²/n + S_Y²/m
• Test statistic: (X̄ − Ȳ) / √(S_X²/n + S_Y²/m)
• The null distribution of this statistic can be closely approximated by the t distribution with df = (S_X²/n + S_Y²/m)² / [ (S_X²/n)²/(n − 1) + (S_Y²/m)²/(m − 1) ].

7.3 Factors Affecting Power

The power of a two-sample t test depends on four factors:
• The larger the real difference ∆ = µ_X − µ_Y, the greater the power.
• The larger the significance level α, the more powerful the test.
• The smaller σ, the larger the power.
• The larger the sample sizes n and m, the greater the power.

7.4 Comparing Paired Samples

• Paired samples are more effective: let D_i = X_i − Y_i; then Var(D̄) ≤ Var(X̄ − Ȳ).
• When σ_X = σ_Y = σ: Var(D̄) = 2σ²(1 − ρ)/n and Var(X̄ − Ȳ) = 2σ²/n, so the relative efficiency is Var(D̄)/Var(X̄ − Ȳ) = 1 − ρ.
• 100(1 − α)% CI for µ_D: D̄ ± t_{n−1}(α/2) S_{D̄}.

7.5 Non-Parametric Methods

Let X_1, ..., X_n be IID with CDF F and Y_1, ..., Y_m be IID with CDF G. We consider H0: F = G.

7.5.1 Independent Samples: Mann-Whitney Test

• Group all (n + m) observations together as a pooled sample, and let Rank(Z) = i if Z is the ith smallest value within the pooled sample.
• Define rank sum scores: R_X = Σ Rank(X_i), R_Y = Σ Rank(Y_i).
• Take the smaller sample, of size n = min(n, m), and compute the sum of ranks R from that sample.
• Let R′ = n(m + n + 1) − R. The Mann-Whitney test statistic is R* = min(R, R′); reject H0 if R* is too small.

7.5.2 Paired: Wilcoxon Signed Rank Test

• Based on the differences D_i; H0 assumes D_i is symmetrically distributed about 0.
• Let Rank(D) = i if D has the ith smallest absolute value in the sample.
• Define W+ to be the sum of ranks among the positive D_i and W− the sum of ranks among the negative D_i, and reject H0 when either one is too large or too small.
• If zeros are present, they are discarded. If there are ties, then the D_i's are given an average rank within all the ties.
• The signed rank test is to the paired t-test as the Mann-Whitney test is to the two-sample t-test.
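The Mann-Whitney steps in 7.5.1 can be traced by hand. A minimal sketch on two small made-up samples with no ties (the data values are hypothetical):

```python
# Mann-Whitney rank-sum statistic R* computed by hand, following 7.5.1.
x = [1.3, 2.1, 0.8, 3.4]          # smaller sample, n = 4
y = [2.5, 3.0, 1.9, 4.2, 2.8]     # m = 5

pooled = sorted(x + y)
rank = {v: i + 1 for i, v in enumerate(pooled)}  # valid since no ties here

n, m = len(x), len(y)
R = sum(rank[v] for v in x)        # rank sum of the smaller sample
R_prime = n * (m + n + 1) - R      # R' = n(m + n + 1) - R
R_star = min(R, R_prime)
print(R, R_prime, R_star)          # 15 25 15
```

With ties, the ranks would instead be averaged within each tied group, as noted in 7.5.2.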

