Title | Summary: Fundamentals of Mathematical Statistics
---|---
Course | Intro Prob Solv & Programming
Institution | University of Texas at Austin
Fundamentals of Mathematical Statistics: Definitions/Lemmas by Roman Böhringer
ETH Zürich, January 2, 2021

Probability Theory

Conditional Probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

Bayes' Theorem: $P(B \mid A) = P(A \mid B)\,\frac{P(B)}{P(A)}$

Law of Total Probability: For a partition $\{B_j\}$ ($B_j \cap B_k = \emptyset$ for all $j \neq k$ and $P(\cup_j B_j) = 1$):
$$P(A) = \sum_j P(A \mid B_j)\,P(B_j)$$

Marginal Density:
$$f_X(\cdot) = \int f_{X,Y}(\cdot, y)\,dy, \qquad f_X(x) = \int f_X(x \mid y)\,f_Y(y)\,dy$$

Conditional Density:
$$f_X(x \mid y) := \frac{f_{X,Y}(x, y)}{f_Y(y)}, \qquad f_Y(y \mid x) = f_X(x \mid y)\,\frac{f_Y(y)}{f_X(x)}$$

Conditional Expectation:
$$E[g(X, Y) \mid Y = y] := \int f_X(x \mid y)\,g(x, y)\,dx$$

Iterated Expectations Lemma: $E[E[g(X, Y) \mid Y]] = E\,g(X, Y)$

Law of Total Variance: $\mathrm{var}(Y) = \mathrm{var}(E(Y \mid Z)) + E\,\mathrm{var}(Y \mid Z)$

Distributions

Multinomial Distribution:
$$P(N_1 = n_1, \ldots, N_k = n_k) = \binom{n}{n_1 \cdots n_k}\,p_1^{n_1} \cdots p_k^{n_k}, \qquad \binom{n}{n_1 \cdots n_k} := \frac{n!}{n_1! \cdots n_k!}$$

Poisson Distribution: $P(X = x) = e^{-\lambda}\,\frac{\lambda^x}{x!}$

Normal Distribution: density $\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$

Sum of Independent Normal / Poisson Variables: For $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$ with $X$ and $Y$ independent and $Z = X + Y$, then $Z \sim \mathcal{N}(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$. Likewise $X \sim \mathcal{P}(\lambda)$, $Y \sim \mathcal{P}(\mu)$ $\Rightarrow$ $Z \sim \mathcal{P}(\lambda + \mu)$.

Chi-Square Distribution: Let $Z_1, \ldots, Z_p$ be i.i.d. $\mathcal{N}(0, 1)$-distributed and define the $p$-vector $Z := (Z_1, \ldots, Z_p)^T$. Then $Z$ is $\mathcal{N}(0, I)$-distributed, and the $\chi^2$-distribution with $p$ degrees of freedom is defined by
$$\|Z\|_2^2 := \sum_{j=1}^p Z_j^2 \sim \chi_p^2$$

Distribution of Maximum: $Z := \max\{X_1, X_2\}$ with $X_1$, $X_2$ independent, each having distribution function $F$ and density $f$:
$$f_Z(z) = 2\,F(z)\,f(z)$$

Exponential Families: A $k$-dimensional exponential family is a family of distributions with densities of the form
$$p_\theta(x) = \exp\Big[\sum_{j=1}^k c_j(\theta)\,T_j(x) - d(\theta)\Big]\,h(x)$$
The family is in canonical form if
$$p_\theta(x) = \exp\Big[\sum_{j=1}^k \theta_j\,T_j(x) - d(\theta)\Big]\,h(x), \qquad d(\theta) = \log \int \exp\Big(\sum_{j=1}^k \theta_j\,T_j(x)\Big)\,h(x)\,d\nu(x)$$

Estimation

Estimator: An estimator $T(X)$ is a function $T(\cdot)$ evaluated at the observations $X$. The function $T(\cdot)$ is not allowed to depend on unknown parameters.

Empirical Distribution Function:
$$\hat{F}_n(\cdot) := \frac{\#\{X_i \leq \cdot,\ 1 \leq i \leq n\}}{n}$$

Method of Moments: Given the first $p$ moments of $X$:
$$\mu_j(\theta) = E_\theta X^j = \int x^j\,dF_\theta(x), \qquad j = 1, \ldots, p$$
And the map $m$ with inverse $m^{-1}$:
$$m(\theta) = [\mu_1(\theta), \ldots, \mu_p(\theta)]$$
We calculate the empirical moments
$$\hat{\mu}_j := \frac{1}{n}\sum_{i=1}^n X_i^j = \int x^j\,d\hat{F}_n(x), \qquad j = 1, \ldots, p$$
and plug in:
$$\hat{\theta} := m^{-1}(\hat{\mu}_1, \ldots, \hat{\mu}_p)$$

Maximum Likelihood Estimator: Given the likelihood function
$$L_X(\vartheta) := \prod_{i=1}^n p_\vartheta(X_i), \qquad \vartheta \in \Theta$$
the maximum likelihood estimator is
$$\hat{\theta} = \arg\max_{\vartheta \in \Theta} \log L_X(\vartheta) = \arg\max_{\vartheta \in \Theta} \sum_{i=1}^n \log p_\vartheta(X_i)$$
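As a concrete illustration (not part of the original summary), here is a minimal pure-Python sketch of the method of moments and a grid-search MLE for i.i.d. Poisson data; for the Poisson family $\mu_1(\lambda) = \lambda$, so both estimators reduce to the sample mean. The true parameter, sample size, and grid are illustrative choices.

```python
import math
import random

random.seed(0)

def poisson_sample(lam):
    # Inversion sampling from the Poisson pmf P(X = x) = e^{-lam} lam^x / x!.
    u, x, p = random.random(), 0, math.exp(-lam)
    cdf = p
    while u > cdf:
        x += 1
        p *= lam / x
        cdf += p
    return x

data = [poisson_sample(3.0) for _ in range(2000)]

# Method of moments: solve mu_1(theta) = sample mean for theta.
mom = sum(data) / len(data)

# MLE by grid search; the x! term is constant in lambda and dropped.
def loglik(lam):
    return sum(data) * math.log(lam) - len(data) * lam

mle = max((0.5 + 0.01 * k for k in range(500)), key=loglik)
print(mom, mle)  # both close to the true lambda = 3
```

Here the two estimators agree up to the grid resolution, as the Poisson likelihood is maximized exactly at the sample mean.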
Sufficiency

Sufficiency: A given map $S: \mathcal{X} \to \mathcal{Y}$ is called sufficient for $\theta \in \Theta$ if for all $\theta$ and all possible $s$, the following conditional distribution does not depend on $\theta$:
$$P_\theta(X \in \cdot \mid S(X) = s)$$

Factorization Theorem of Neyman (PR): Given densities $p_\theta$, $S$ is sufficient if and only if there are functions $g_\theta(\cdot) \geq 0$ and $h(\cdot) \geq 0$ such that we can write:
$$p_\theta(x) = g_\theta(S(x))\,h(x) \qquad \forall x, \theta$$

Sufficiency for Exponential Families: For a $k$-dimensional exponential family, the $k$-dimensional statistic $S(X) = (T_1(X), \ldots, T_k(X))$ is sufficient for $\theta$. For $n$ i.i.d. samples, the following statistic is sufficient:
$$S(X) = \Big(\frac{1}{n}\sum_{i=1}^n T_1(X_i), \ldots, \frac{1}{n}\sum_{i=1}^n T_k(X_i)\Big)$$

Minimal Sufficiency: Two likelihoods $L_x(\theta)$ and $L_{\tilde{x}}(\theta)$ are proportional at $(x, \tilde{x})$ if
$$L_x(\theta) = L_{\tilde{x}}(\theta)\,c(x, \tilde{x}) \qquad \forall \theta$$
for some constant $c(x, \tilde{x})$. A sufficient statistic $S$ is called minimal sufficient if $S(x) = S(\tilde{x})$ for all $x$ and $\tilde{x}$ where the likelihoods are proportional.

Completeness: A sufficient statistic $S$ is called complete if (where $h$ is a function not depending on $\theta$):
$$E_\theta h(S) = 0 \ \ \forall \theta \quad \Rightarrow \quad h(S) = 0, \ P_\theta\text{-a.s.} \ \forall \theta$$

Completeness for Exponential Families: Given a $k$-dimensional exponential family and
$$\mathcal{C} := \{(c_1(\theta), \ldots, c_k(\theta)) : \theta \in \Theta\} \subset \mathbb{R}^k$$
If $\mathcal{C}$ is truly $k$-dimensional, $S := (T_1, \ldots, T_k)$ is complete.
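A small exact check (illustrative, not from the summary) of why $S = \sum_i X_i$ is sufficient for i.i.d. Bernoulli data: the conditional law of $X$ given $S = s$ is uniform over the $\binom{n}{s}$ arrangements, whatever the success probability $p$ is. The sample size and outcome below are arbitrary choices.

```python
from fractions import Fraction
from itertools import product

n, s = 4, 2

def conditional_prob(x, p):
    # P_p(X = x | sum(X) = s) for i.i.d. Bernoulli(p) coordinates, exactly.
    num = p**sum(x) * (1 - p)**(n - sum(x))
    den = sum(p**sum(y) * (1 - p)**(n - sum(y))
              for y in product([0, 1], repeat=n) if sum(y) == s)
    return num / den

x = (1, 0, 1, 0)  # any outcome with sum(x) == s
probs = [conditional_prob(x, Fraction(p, 10)) for p in (2, 5, 9)]
print(probs)  # identical for every p: 1 / C(4, 2) = 1/6
```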
Fisher Information

Score Function:
$$s_\theta(x) := \frac{d}{d\theta} \log p_\theta(x) = \frac{\dot{p}_\theta(x)}{p_\theta(x)}, \qquad E_\theta s_\theta(X) = 0$$
For $n$ i.i.d. observations:
$$s_\theta(x) = \sum_{i=1}^n s_\theta(x_i)$$

Fisher Information:
$$I(\theta) := \mathrm{var}_\theta(s_\theta(X)), \qquad I(\theta) = -E_\theta\,\dot{s}_\theta(X)$$
For $n$ i.i.d. observations: $I_n(\theta) = n\,I(\theta)$.

Expectation/Covariance of Sufficient Statistic for Exponential Families: Given an exponential family in canonical form and
$$\dot{d}(\theta) := \frac{\partial}{\partial \theta} d(\theta), \qquad \ddot{d}(\theta) := \frac{\partial^2}{\partial \theta\,\partial \theta'} d(\theta) = \Big[\frac{\partial^2}{\partial \theta_{j_1}\,\partial \theta_{j_2}} d(\theta)\Big]$$
$$T(X) := \begin{pmatrix} T_1(X) \\ \vdots \\ T_k(X) \end{pmatrix}, \qquad E_\theta T(X) := \begin{pmatrix} E_\theta T_1(X) \\ \vdots \\ E_\theta T_k(X) \end{pmatrix}, \qquad \mathrm{Cov}_\theta(T(X)) := E_\theta T(X)T'(X) - E_\theta T(X)\,E_\theta T'(X)$$
We have (PR):
$$E_\theta T(X) = \dot{d}(\theta), \qquad \mathrm{Cov}_\theta(T(X)) = \ddot{d}(\theta)$$
If the family is not in canonical form:
$$E_\theta T(X) = \frac{\dot{d}(\theta)}{\dot{c}(\theta)}, \qquad \mathrm{var}_\theta(T(X)) = \frac{1}{[\dot{c}(\theta)]^2}\Big(\ddot{d}(\theta) - \frac{\dot{d}(\theta)}{\dot{c}(\theta)}\,\ddot{c}(\theta)\Big)$$

Fisher Information for Exponential Families:
$$I(\theta) = \ddot{d}(\theta) - \frac{\dot{d}(\theta)}{\dot{c}(\theta)}\,\ddot{c}(\theta)$$
And for $\gamma = c(\theta)$:
$$I_0(\gamma) = \ddot{d}_0(\gamma) = \frac{I(\theta)}{[\dot{c}(\theta)]^2}$$

Higher-Dimensional Extensions (Score Vector & Fisher Information Matrix):
$$s_\theta(\cdot) := \begin{pmatrix} \partial \log p_\theta / \partial \theta_1 \\ \vdots \\ \partial \log p_\theta / \partial \theta_k \end{pmatrix}, \qquad I(\theta) = E_\theta s_\theta(X)\,s_\theta'(X) = \mathrm{Cov}_\theta(s_\theta(X))$$
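For a concrete case (an illustration, not from the summary), the Bernoulli$(\theta)$ score is $s_\theta(x) = x/\theta - (1-x)/(1-\theta)$, and exact rational arithmetic verifies $E_\theta s_\theta(X) = 0$ and $I(\theta) = \mathrm{var}_\theta(s_\theta(X)) = 1/(\theta(1-\theta))$:

```python
from fractions import Fraction

# Score of Bernoulli(theta): d/dtheta [x log(theta) + (1-x) log(1-theta)].
def score(x, theta):
    return Fraction(x) / theta - Fraction(1 - x) / (1 - theta)

theta = Fraction(3, 10)
pmf = {0: 1 - theta, 1: theta}

mean_score = sum(pmf[x] * score(x, theta) for x in (0, 1))
var_score = sum(pmf[x] * score(x, theta) ** 2 for x in (0, 1)) - mean_score ** 2

print(mean_score)                               # 0, as the lemma requires
print(var_score == 1 / (theta * (1 - theta)))   # True: Fisher information
```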
Bias, Variance

Bias: $\mathrm{bias}_\theta(T) := E_\theta T - g(\theta)$. $T$ is unbiased if $\mathrm{bias}_\theta(T) = 0\ \forall \theta$.

Mean Square Error (PR):
$$\mathrm{MSE}_\theta(T) := E_\theta(T - g(\theta))^2 = \mathrm{bias}_\theta^2(T) + \mathrm{var}_\theta(T)$$

Uniform Minimum Variance Unbiased: An unbiased estimator $T^*$ is UMVU if for any other unbiased estimator $T$:
$$\mathrm{var}_\theta(T^*) \leq \mathrm{var}_\theta(T) \qquad \forall \theta$$

Conditioning on Sufficient Statistic: If $T$ is unbiased, $S$ sufficient, and $T^* := E(T \mid S)$:
$$E_\theta(T^*) = g(\theta), \qquad \mathrm{var}_\theta(T^*) \leq \mathrm{var}_\theta(T) \quad \forall \theta$$

Lehmann-Scheffé Lemma: If $T$ is an unbiased estimator of $g(\theta)$ with finite variance (for all $\theta$) and $S$ is sufficient and complete, then $T^* := E(T \mid S)$ is UMVU.

Cramér-Rao Lower Bound: If the support of $p_\theta$ does not depend on $\theta$ and $p_\theta$ is differentiable in $L_2$, then for an unbiased estimator $T$ of $g(\theta)$ (with derivative $\dot{g}(\theta)$) we have
$$\dot{g}(\theta) = \mathrm{cov}_\theta(T, s_\theta(X)), \qquad \mathrm{var}_\theta(T) \geq \frac{\dot{g}^2(\theta)}{I(\theta)} \quad \forall \theta$$

CRLB for Exponential Families: If $T$ is unbiased and reaches the CRLB, then there exist functions $c(\theta)$, $d(\theta)$, and $h(x)$ such that for all $\theta$:
$$p_\theta(x) = \exp[c(\theta)\,T(x) - d(\theta)]\,h(x), \quad x \in \mathcal{X}, \qquad g(\theta) = \dot{d}(\theta)/\dot{c}(\theta)$$

Higher-Dimensional CRLB: For an unbiased estimator $T$ of $g(\theta)$:
$$\mathrm{var}_\theta(T) \geq \dot{g}(\theta)'\,I(\theta)^{-1}\,\dot{g}(\theta)$$
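The MSE decomposition above can be checked exactly by enumeration. This sketch (an illustration, not from the summary) uses a deliberately biased shrunken-mean estimator $T = (X_1 + X_2)/3$ of $g(\theta) = \theta$ for two Bernoulli$(\theta)$ observations:

```python
from fractions import Fraction

theta = Fraction(1, 2)
outcomes = [(x1, x2) for x1 in (0, 1) for x2 in (0, 1)]

def prob(x):
    return (theta if x[0] else 1 - theta) * (theta if x[1] else 1 - theta)

def T(x):
    return Fraction(x[0] + x[1], 3)

ET = sum(prob(x) * T(x) for x in outcomes)
mse = sum(prob(x) * (T(x) - theta) ** 2 for x in outcomes)
bias = ET - theta
var = sum(prob(x) * (T(x) - ET) ** 2 for x in outcomes)

print(mse == bias**2 + var)  # True: MSE = bias^2 + variance, exactly
```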
Comparison

Risk: Given a loss function $L(\cdot, \cdot)$:
$$R(\theta, T) := E_\theta L(\theta, T(X))$$

Risk and Sufficiency: Let $S$ be sufficient for $\theta$ and $d: \mathcal{X} \to \mathcal{A}$ some decision. Then there is a randomized decision $\delta(S)$ such that
$$R(\theta, \delta(S)) = R(\theta, d) \qquad \forall \theta$$

Rao-Blackwell (PR): Let $S$ be sufficient for $\theta$, $\mathcal{A} \subset \mathbb{R}^p$ convex, and $a \mapsto L(\theta, a)$ convex for all $\theta$. For a decision $d: \mathcal{X} \to \mathcal{A}$ and $d'(s) := E(d(X) \mid S = s)$:
$$R(\theta, d') \leq R(\theta, d) \qquad \forall \theta$$

Sensitivity/Robustness: Influence function
$$l(x) := (n+1)\big(T_{n+1}(X_1, \ldots, X_n, x) - T_n(X_1, \ldots, X_n)\big), \qquad x \in \mathbb{R}$$
For $m \leq n$:
$$\epsilon(m) := \sup_{x_1^*, \ldots, x_m^*} |T(x_1^*, \ldots, x_m^*, X_{m+1}, \ldots, X_n)|$$

Breakdown Point: $\epsilon^* := \min\{m : \epsilon(m) = \infty\}/n$
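A hedged numerical sketch of the breakdown point (not from the summary): contaminate $m$ of $n$ observations with a huge value as a proxy for the supremum, and see when $|T|$ can be driven arbitrarily large. The sample mean breaks down with a single corrupted point; the median needs about half the sample.

```python
def mean(xs):
    return sum(xs) / len(xs)

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

def sup_after_contamination(T, m, big=1e12):
    # Replace the first m points by one huge value x* (proxy for the sup).
    return abs(T([big] * m + data[m:]))

print(sup_after_contamination(mean, 1) > 1e10)    # True: mean already unbounded
print(sup_after_contamination(median, 1) < 10)    # True: median still sane
print(sup_after_contamination(median, 4) > 1e10)  # True: median breaks near m = n/2
```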
Equivariant Statistics

Location Equivariant Statistic: For all constants $c \in \mathbb{R}$ and $x = (x_1, \ldots, x_n)$:
$$T(x_1 + c, \ldots, x_n + c) = T(x_1, \ldots, x_n) + c$$

Location Invariant Loss Function: For all constants $c \in \mathbb{R}$:
$$L(\theta + c, a + c) = L(\theta, a), \qquad (\theta, a) \in \mathbb{R}^2$$

Risk for Equivariant Statistics / Invariant Loss Functions:
$$R(\theta, T) = E_\theta L(0, T(X - \theta)) = R(0, T) = E\,L_0[T(\varepsilon)]$$

Uniform Minimum Risk Equivariant: $T$ is UMRE if
$$R(\theta, T) = \min_{d\ \mathrm{equivariant}} R(\theta, d) \quad \forall \theta, \qquad \text{equivalently} \quad R(0, T) = \min_{d\ \mathrm{equivariant}} R(0, d)$$

Maximal Invariant: A map $Y: \mathbb{R}^n \to \mathbb{R}^n$ is maximal invariant if
$$Y(x) = Y(x') \ \Leftrightarrow \ \exists c : x = x' + c$$

UMRE Estimator Construction: Let $d(X)$ be equivariant and $Y := X - d(X)$. Then
$$T^*(Y) := \arg\min_v E[L_0(v + d(\varepsilon)) \mid Y]$$
and $T^*(X) := T^*(Y) + d(X)$ is UMRE.

UMRE Estimator for Quadratic Loss: $T$ is UMRE $\Leftrightarrow$ $E_0(T(X) \mid X - T(X)) = 0$

Pitman Estimator: $T^*(X) = X_n - E(\varepsilon_n \mid Y)$
Tests and Confidence Intervals

Quantile Functions:
$$q_{\sup}^F(u) := \sup\{x : F(x) \leq u\}, \qquad q_{\inf}^F(u) := \inf\{x : F(x) \geq u\} =: F^{-1}(u)$$

Test: For $\gamma_0 \in \Gamma$ and $\alpha \in [0, 1]$, a test for $H_0: \gamma = \gamma_0$ is a statistic $\phi(X, \gamma_0) \in \{0, 1\}$ such that
$$P_\theta(\phi(X, \gamma_0) = 1) \leq \alpha \quad \text{for all } \theta \in \{\vartheta : g(\vartheta) = \gamma_0\}$$

Pivot: A function $Z(X, \gamma)$ such that for all $\theta \in \Theta$, the following distribution does not depend on $\theta$:
$$P_\theta(Z(X, g(\theta)) \leq \cdot) =: G(\cdot)$$
We can construct a test for $H_{\gamma_0}$ with
$$q_L := q_{\sup}^G\Big(\frac{\alpha}{2}\Big), \qquad q_R := q_{\inf}^G\Big(1 - \frac{\alpha}{2}\Big), \qquad \phi(X, \gamma_0) := \begin{cases} 1 & \text{if } Z(X, \gamma_0) \notin [q_L, q_R] \\ 0 & \text{else} \end{cases}$$

Basu's Lemma: Let $X$ have distribution $P_\theta$, suppose $T$ is sufficient and complete, and $Y = Y(X)$ has a distribution that does not depend on $\theta$. Then $T$ and $Y$ are independent under $P_\theta$ for all $\theta$.
Student's Test: Assume the data are normally distributed with the same variance. Then:
$$T := Z(X, Y, 0), \qquad Z(X, Y, \gamma) := \frac{\bar{Y} - \bar{X} - \gamma}{S}\sqrt{\frac{nm}{n+m}}$$
$$S^2 := \frac{1}{m+n-2}\Big[\sum_{i=1}^n (X_i - \bar{X})^2 + \sum_{j=1}^m (Y_j - \bar{Y})^2\Big]$$
And a one-sided test at level $\alpha$ for $H_0: \gamma = 0$ against $H_1: \gamma < 0$ is:
$$\phi(X, Y) := \begin{cases} 1 & \text{if } T < -t_{n+m-2}(1 - \alpha) \\ 0 & \text{if } T \geq -t_{n+m-2}(1 - \alpha) \end{cases}$$
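The pooled statistic above is straightforward to compute directly. A minimal sketch (the two small samples are made-up illustrative numbers, not from the summary):

```python
import math

# Pooled two-sample t statistic T = (Ybar - Xbar)/S * sqrt(nm/(n+m)).
def pooled_t(xs, ys):
    n, m = len(xs), len(ys)
    xbar, ybar = sum(xs) / n, sum(ys) / m
    s2 = (sum((x - xbar) ** 2 for x in xs) +
          sum((y - ybar) ** 2 for y in ys)) / (m + n - 2)
    return (ybar - xbar) / math.sqrt(s2) * math.sqrt(n * m / (n + m))

xs = [5.1, 4.9, 5.3, 5.0, 4.7]
ys = [4.2, 4.5, 4.0, 4.4]
t = pooled_t(xs, ys)
print(round(t, 3))  # clearly negative: the Y sample has the smaller mean
```

For the one-sided test one would compare $t$ against $-t_{n+m-2}(1-\alpha)$ from a $t$ table.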
Wilcoxon's Test (PR): Let $Z = (Z_1, \ldots, Z_N) := (X_1, \ldots, X_n, Y_1, \ldots, Y_m)$ be the pooled sample ($N = n + m$) and $R_i := \mathrm{rank}(Z_i)$ among the pooled sample. Then:
$$T := \sum_{i=1}^n R_i = \#\{(i, j) : Y_j < X_i\} + \frac{n(n+1)}{2}$$
And (the distribution is often tabulated), with $r$ ranging over the $N!$ equally likely rank assignments:
$$P_{H_0}(T = t) = \frac{\#\{r : \sum_{i=1}^n r_i = t\}}{N!}$$
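The rank-sum identity above can be checked on a tiny pooled sample with distinct values (an illustrative sketch, not from the summary):

```python
# Verifies T = sum of X-ranks = #{(i,j): Y_j < X_i} + n(n+1)/2.
def rank_sum(xs, ys):
    pooled = sorted(xs + ys)
    return sum(pooled.index(x) + 1 for x in xs)  # ranks are 1-based

xs = [2.0, 7.0, 9.0]
ys = [1.0, 4.0, 5.0, 8.0]
n = len(xs)

T = rank_sum(xs, ys)
pairs = sum(1 for x in xs for y in ys if y < x)
print(T, pairs + n * (n + 1) // 2)  # equal by the identity above
```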
Uniformly Most Powerful Tests

Level: $\phi$ is a test at level $\alpha$ if
$$\sup_{\theta \in \Theta_0} E_\theta \phi(X) \leq \alpha$$
A test $\phi$ is UMP if it has level $\alpha$ and for all tests $\phi'$ with level $\alpha$:
$$E_\theta \phi'(X) \leq E_\theta \phi(X) \qquad \forall \theta \in \Theta_1$$

Neyman-Pearson Lemma (PR): $H_0: \theta = \theta_0$ and $H_1: \theta = \theta_1$.
$$R(\theta, \phi) := \begin{cases} E_\theta \phi(X), & \theta = \theta_0 \\ 1 - E_\theta \phi(X), & \theta = \theta_1 \end{cases} \qquad \phi_{\mathrm{NP}} := \begin{cases} 1 & \text{if } p_1/p_0 > c \\ q & \text{if } p_1/p_0 = c \\ 0 & \text{if } p_1/p_0 < c \end{cases}$$
$$R(\theta_1, \phi_{\mathrm{NP}}) - R(\theta_1, \phi) \leq c\,[R(\theta_0, \phi) - R(\theta_0, \phi_{\mathrm{NP}})]$$

One-Sided UMP Test (PR): Given $n$ i.i.d. copies of a Bernoulli random variable with success parameter $\theta$ and with $T := \sum_{i=1}^n X_i$ the number of successes, the following test is UMP for $H_0: \theta \geq c$, $H_1: \theta < c$ (and also for the weaker hypotheses $H_0: \theta = c$, $H_1: \theta = c'$ with $c' < c$, or $H_0: \theta = c$, $H_1: \theta < c$):
$$\phi(T) := \begin{cases} 1 & \text{if } T < t_0 \\ q & \text{if } T = t_0 \\ 0 & \text{if } T > t_0 \end{cases}$$
Where $t_0$ is chosen such that $P_{\theta_0}(T \leq t_0 - 1) \leq \alpha$, $P_{\theta_0}(T \leq t_0) > \alpha$, and $q$ such that $P_{\theta_0}(H_0 \text{ rejected}) = P_{\theta_0}(T \leq t_0 - 1) + q\,P_{\theta_0}(T = t_0) = \alpha$, i.e.:
$$q = \frac{\alpha - P_{\theta_0}(T \leq t_0 - 1)}{P_{\theta_0}(T = t_0)}$$

UMP Tests for Exponential Families: $H_0: \theta \leq \theta_0$, $H_1: \theta > \theta_0$, and $c(\theta)$ strictly increasing. Then a UMP test is:
$$\phi(T(x)) := \begin{cases} 1 & \text{if } T(x) > t_0 \\ q & \text{if } T(x) = t_0 \\ 0 & \text{if } T(x) < t_0 \end{cases}$$

Unbiased Tests: A test $\phi$ is unbiased if for all $\theta \in \Theta_0$, $\vartheta \in \Theta_1$:
$$E_\theta \phi(X) \leq E_\vartheta \phi(X)$$

Uniformly Most Powerful Unbiased: An unbiased test $\phi$ is UMPU if it has level $\alpha$ and for all unbiased tests $\phi'$ with level $\alpha$:
$$E_\theta \phi'(X) \leq E_\theta \phi(X) \qquad \forall \theta \in \Theta_1$$

UMPU for a One-Dimensional Exponential Family: Given a one-dimensional exponential family with $c(\theta)$ strictly increasing in $\theta$, a UMPU test is:
$$\phi(T(x)) := \begin{cases} 1 & \text{if } T(x) < t_L \text{ or } T(x) > t_R \\ q_L & \text{if } T(x) = t_L \\ q_R & \text{if } T(x) = t_R \\ 0 & \text{if } t_L < T(x) < t_R \end{cases}$$
With constants $t_R$, $t_L$, $q_R$, and $q_L$ such that:
$$E_{\theta_0} \phi(X) = \alpha, \qquad \frac{d}{d\theta} E_\theta \phi(X)\Big|_{\theta = \theta_0} = 0$$

Confidence Intervals

Confidence Set: A subset $I = I(X) \subset \Gamma$, depending only on the data, is a confidence set for $\gamma$ at level $1 - \alpha$ if:
$$P_\theta(\gamma \in I) \geq 1 - \alpha \qquad \forall \theta \in \Theta$$

Confidence Interval: $I := [\underline{\gamma}, \bar{\gamma}]$ with $\underline{\gamma} = \underline{\gamma}(X)$, $\bar{\gamma} = \bar{\gamma}(X)$.

Confidence Sets / Tests: Given for each $\gamma_0 \in \mathbb{R}$ a test at level $\alpha$ for $H_{\gamma_0}$, the following is a $(1 - \alpha)$-confidence set for $\gamma$:
$$I(X) := \{\gamma : \phi(X, \gamma) = 0\}$$
Conversely, given a $(1 - \alpha)$-confidence set for $\gamma$, the following is a test at level $\alpha$ of $H_{\gamma_0}: \gamma = \gamma_0$ for all $\gamma_0$:
$$\phi(X, \gamma_0) = \begin{cases} 1 & \text{if } \gamma_0 \notin I(X) \\ 0 & \text{else} \end{cases}$$

Decision Theory

Admissible Decision: A decision $d'$ is strictly better than $d$ if
$$R(\theta, d') \leq R(\theta, d) \ \ \forall \theta \qquad \text{and} \qquad \exists \theta : R(\theta, d') < R(\theta, d)$$
$d$ is called inadmissible when there exists a $d'$ that is strictly better than $d$, and admissible otherwise.

Admissibility for the Neyman-Pearson Test: A Neyman-Pearson test is admissible if and only if its power is strictly less than 1 or it has minimal level among all tests with power 1.

Admissible Estimators for the Normal Mean (PR): $X \sim \mathcal{N}(\theta, 1)$, $\Theta := \mathbb{R}$, and $R(\theta, T) := E_\theta(T - \theta)^2$. If we consider estimators of the form $T = aX + b$, $a > 0$, $b \in \mathbb{R}$, then $T$ is admissible if and only if one of the following cases holds:
1. $a < 1$
2. $a = 1$ and $b = 0$

Minimax Decisions: $d$ is minimax if
$$\sup_\theta R(\theta, d) = \inf_{d'} \sup_\theta R(\theta, d')$$

Minimax for the Neyman-Pearson Test: A Neyman-Pearson test is minimax if and only if $R(\theta_0, \phi_{\mathrm{NP}}) = R(\theta_1, \phi_{\mathrm{NP}})$.
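A hedged sketch of the Neyman-Pearson likelihood-ratio test for two simple hypotheses about $n$ i.i.d. Bernoulli trials; the parameters $\theta_0, \theta_1$, the sample $x$, and the cutoffs $c$ below are illustrative choices, not from the summary:

```python
from fractions import Fraction

n = 5
theta0, theta1 = Fraction(1, 2), Fraction(4, 5)

def likelihood(theta, x):
    s = sum(x)
    return theta ** s * (1 - theta) ** (n - s)

def phi_NP(x, c):
    # Reject (return 1) when p1/p0 > c; randomization at equality omitted.
    return 1 if likelihood(theta1, x) / likelihood(theta0, x) > c else 0

x = (1, 1, 1, 0, 1)  # 4 successes out of 5
ratio = likelihood(theta1, x) / likelihood(theta0, x)
print(ratio)                       # exactly 8192/3125, about 2.62
print(phi_NP(x, 2), phi_NP(x, 4))  # reject at c = 2, accept at c = 4
```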
Bayes Decisions

Bayes Risk: Given a probability measure $\Pi$ (prior distribution) on $\Theta$ with density $w := d\Pi/d\mu$:
$$r(\Pi, d) := \int_\Theta R(\vartheta, d)\,d\Pi(\vartheta) = \int_\Theta R(\vartheta, d)\,w(\vartheta)\,d\mu(\vartheta) =: r_w(d)$$

Bayes Decision: A decision $d$ is called Bayes if:
$$r(\Pi, d) = \inf_{d'} r(\Pi, d')$$

A Posteriori Density: Given $p_\theta(x) = p(x \mid \theta)$ and the marginal density
$$p(\cdot) := \int_\Theta p(\cdot \mid \vartheta)\,w(\vartheta)\,d\mu(\vartheta)$$
the a posteriori density of $\theta$ is:
$$w(\vartheta \mid x) = p(x \mid \vartheta)\,\frac{w(\vartheta)}{p(x)}, \qquad \vartheta \in \Theta,\ x \in \mathcal{X}$$

Bayes Decision Construction: Let
$$l(x, a) := E[L(\theta, a) \mid X = x] = \int_\Theta L(\vartheta, a)\,w(\vartheta \mid x)\,d\mu(\vartheta)$$
Then the Bayes decision is:
$$d^{\mathrm{Bayes}}(X) = \arg\min_{a \in \mathcal{A}} l(X, a) = \arg\min_{a \in \mathcal{A}} \int_\Theta L(\vartheta, a)\,g_\vartheta(S)\,w(\vartheta)\,d\mu(\vartheta)$$
(the second form uses the factorization $p_\vartheta(x) = g_\vartheta(S(x))\,h(x)$ for a sufficient statistic $S$).

Bayes Estimator for Quadratic Loss: For $L(\theta, a) := (\theta - a)^2$:
$$d^{\mathrm{Bayes}}(X) = E(\theta \mid X)$$
For $T = E(\theta \mid X)$, the Bayes risk of an estimator $T'$ is:
$$r_w(T') = E\,\mathrm{var}(\theta \mid X) + E(T - T')^2$$

Bayes Estimator/MAP/MLE: For $L(\theta, a) := 1\{|\theta - a| > c\}$ and $c$ small, the Bayes rule is approximately the maximum a posteriori (MAP) estimator, which is equivalent to the MLE for a uniform prior. With quadratic loss, the Bayes estimator is the expectation of the posterior, whereas the MAP is its maximum.

Credibility Interval: A $(1 - \alpha)$-credibility interval is
$$I := \big[\hat{\theta}_L(X), \hat{\theta}_R(X)\big], \qquad \int_{\hat{\theta}_L(X)}^{\hat{\theta}_R(X)} w(\vartheta \mid X)\,d\vartheta = 1 - \alpha$$

Bayes Test: Assume $H_0: \theta = \theta_0$, $H_1: \theta = \theta_1$, $L(\theta_0, a) := a$, $L(\theta_1, a) := 1 - a$, $w(\theta_0) =: w_0$, and $w(\theta_1) =: w_1 = 1 - w_0$. The Bayes test is then (for an arbitrary $q$):
$$\phi_{\mathrm{Bayes}} := \begin{cases} 1 & \text{if } p_1/p_0 > w_0/w_1 \\ q & \text{if } p_1/p_0 = w_0/w_1 \\ 0 & \text{if } p_1/p_0 < w_0/w_1 \end{cases}$$

Extended Bayes Decision: $T$ is called extended Bayes if there exists a sequence of prior densities $\{w_m\}_{m=1}^\infty$ such that $r_{w_m}(T) - \inf_{T'} r_{w_m}(T') \to 0$ as $m \to \infty$.

Minimaxity (PR): Suppose $T$ is a statistic with risk $R(\theta, T) = R(T)$ not depending on $\theta$. Then:
1. $T$ admissible $\Rightarrow$ $T$ minimax
2. $T$ Bayes $\Rightarrow$ $T$ minimax
3. $T$ extended Bayes $\Rightarrow$ $T$ minimax

Admissibility (PR): Suppose $T$ is Bayes for prior density $w$. Then either of the following conditions is sufficient for admissibility:
1. $T$ is unique Bayes ($r_w(T) = r_w(T')$ implies $T = T'$, $P_\theta$-almost surely, for all $\theta$)
2. For all $T'$, $R(\theta, T')$ is continuous in $\theta$, and for all open $U \subset \Theta$, the prior probability $\int_U w(\vartheta)\,d\mu(\vartheta)$ of $U$ is strictly positive.

Admissibility, Extended Bayes (PR): Suppose $T$ is extended Bayes and for all $T'$, $R(\theta, T')$ is continuous in $\theta$. Furthermore, with $\Pi_m(U) := \int_U w_m(\vartheta)\,d\mu_m(\vartheta)$ being the probability of $U$ under the prior $\Pi_m$, suppose (for every open $U \subset \Theta$):
$$\frac{r_{w_m}(T) - \inf_{T'} r_{w_m}(T')}{\Pi_m(U)} \to 0$$
Then $T$ is admissible.
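As a worked example of the Bayes estimator under quadratic loss (an illustration assuming a conjugate Beta prior, which the summary does not cover): with a Beta$(a, b)$ prior on a Bernoulli success probability $\theta$, the posterior is Beta$(a + s, b + n - s)$, so $E(\theta \mid X)$ has the closed form below.

```python
from fractions import Fraction

def posterior_mean(a, b, data):
    # Bayes estimator E(theta | X) = (a + s) / (a + b + n) for Beta(a, b) prior.
    s, n = sum(data), len(data)
    return Fraction(a + s, a + b + n)

data = [1, 0, 1, 1, 0, 1]          # s = 4 successes out of n = 6
est = posterior_mean(1, 1, data)   # uniform prior Beta(1, 1)
mle = Fraction(sum(data), len(data))

print(est)   # 5/8: shrinks the MLE 2/3 toward the prior mean 1/2
print(mle)
```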
The Linear Model

Least Squares Estimator: Given the (augmented) design matrix $X \in \mathbb{R}^{n \times p}$, the least squares estimator is the projection of $Y$ on $\{Xb : b \in \mathbb{R}^p\}$:
$$\hat{\beta} := \arg\min_{b \in \mathbb{R}^p} \|Y - Xb\|_2^2 = (X^T X)^{-1} X^T Y$$

Distribution of the Least Squares Estimator (PR): For $f = EY$, let $\beta^* := (X^T X)^{-1} X^T f$, so that $X\beta^*$ is the best linear approximation of $f$. For $E\epsilon\epsilon^T = \sigma^2 I$, $\epsilon := Y - f$:
1. $E\hat{\beta} = \beta^*$, $\mathrm{Cov}(\hat{\beta}) = \sigma^2 (X^T X)^{-1}$
2. $E\|X(\hat{\beta} - \beta^*)\|_2^2 = \sigma^2 p$
3. $E\|X\hat{\beta} - f\|_2^2 = \sigma^2 p + \|X\beta^* - f\|_2^2$

Least Squares Estimator under Normal Errors (PR): When $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, we have
$$\hat{\beta} - \beta^* \sim \mathcal{N}\big(0, \sigma^2 (X^T X)^{-1}\big) \qquad \text{and} \qquad \frac{\|X(\hat{\beta} - \beta^*)\|_2^2}{\sigma^2} \sim \chi_p^2$$
A test for $H_0: \beta = \beta_0$ is to reject $H_0$ when $\|X(\hat{\beta} - \beta_0)\|_2^2 / \sigma_0^2 > G_p^{-1}(1 - \alpha)$, where $G_p$ is the distribution function of a $\chi_p^2$-distributed random variable.

Testing a Linear Hypothesis: $Y = X\beta + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$, and we want to test $H_0: B\beta = 0$. Under $H_0$, the following quantity is $\chi_q^2$-distributed:
$$\frac{\|Y - X\hat{\beta}_0\|_2^2 - \|Y - X\hat{\beta}\|_2^2}{\sigma^2}$$
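A minimal sketch of $\hat{\beta} = (X^T X)^{-1} X^T Y$ solved directly via the normal equations for a single-covariate model with intercept ($p = 2$), using the explicit 2x2 inverse; the noiseless toy data are an illustrative choice, not from the summary:

```python
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]   # columns: intercept, x
Y = [1.0, 3.0, 5.0, 7.0]                               # exactly 1 + 2x

def lstsq_2(X, Y):
    # Normal equations (X^T X) beta = X^T Y for p = 2, solved by Cramer's rule.
    a = sum(r[0] * r[0] for r in X); b = sum(r[0] * r[1] for r in X)
    c = b;                           d = sum(r[1] * r[1] for r in X)
    u = sum(r[0] * y for r, y in zip(X, Y))
    v = sum(r[1] * y for r, y in zip(X, Y))
    det = a * d - b * c
    return ((d * u - b * v) / det, (a * v - c * u) / det)

beta = lstsq_2(X, Y)
print(beta)  # (1.0, 2.0): recovers intercept and slope exactly
```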
Asymptotic Theory

We assume an estimator $T_n(X_1, \ldots, X_n)$ of $\gamma$ is defined for all $n$, i.e. we consider a sequence of estimators.

Markov's/Chebyshev's Inequality: For all increasing functions $\psi: [0, \infty) \to [0, \infty)$:
$$P(\|Z\| \geq \epsilon) \leq \frac{E\,\psi(\|Z\|)}{\psi(\epsilon)}$$

Almost Sure Convergence: $Z_n$ converges almost surely to $Z$ if
$$P\big(\lim_{n \to \infty} Z_n = Z\big) = 1$$

Convergence in Probability: $Z_n$ converges in probability to $Z$ ($Z_n \overset{P}{\longrightarrow} Z$) if for all $\epsilon > 0$:
$$\lim_{n \to \infty} P(\|Z_n - Z\| > \epsilon) = 0$$
Almost sure convergence implies convergence in probability, but not the other way around.

Convergence in Distribution: $Z_n$ converges in distribution to $Z$ ($Z_n \overset{D}{\longrightarrow} Z$) if for all continuous and bounded functions $f$:
$$\lim_{n \to \infty} E f(Z_n) = E f(Z)$$
Convergence in probability implies convergence in distribution, but not the other way around.

Stochastic Order Symbols: Let $r_n$ be strictly positive random variables. $Z_n = O_P(1)$ ($Z_n$ bounded in probability) if
$$\lim_{M \to \infty} \limsup_{n \to \infty} P(\|Z_n\| > M) = 0$$
$Z_n = O_P(r_n)$ if $Z_n / r_n = O_P(1)$. When $Z_n$ converges in distribution, $Z_n = O_P(1)$. If $Z_n$ converges in probability to zero, $Z_n = o_P(1)$, and $Z_n = o_P(r_n)$ if $Z_n / r_n = o_P(1)$.

Slutsky's Theorem (PR): Assume that $Z_n \overset{D}{\longrightarrow} Z$ and $A_n \overset{P}{\longrightarrow} a$. Then:
$$A_n^T Z_n \overset{D}{\longrightarrow} a^T Z$$

Central Limit Theorem: Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu$ and variance $\sigma^2$. Then:
$$\sqrt{n}\,(\bar{X}_n - \mu) \overset{D}{\longrightarrow} \mathcal{N}(0, \sigma^2)$$

Portmanteau Theorem: The following statements are equivalent:
1. $Z_n \overset{D}{\longrightarrow} Z$ (i.e., $E f(Z_n) \to E f(Z)$ for all $f$ bounded and continuous)
2. $E f(Z_n) \to E f(Z)$ for all $f$ bounded and Lipschitz ($f$ is Lipschitz if for a constant $C_L$, $|f(z) - f(\tilde{z})| \leq C_L \|z - \tilde{z}\|$)
3. $E f(Z_n) \to E f(Z)$ for all $f$ bounded and $Q$-a.s. continuous (where $Q$ is the distribution of $Z$)
4. $P(Z_n \leq z) \to G(z)$ for all $G$-continuity points $z$ (where $G = Q(Z \leq \cdot)$ is the distribution function of $Z$)

Cramér-Wold Device:
$$Z_n \overset{D}{\longrightarrow} Z \quad \Leftrightarrow \quad a^T Z_n \overset{D}{\longrightarrow} a^T Z \ \ \forall a \in \mathbb{R}^p$$

Consistent Estimators: A sequence of estimators $T_n$ is consistent if $T_n \overset{P_\theta}{\longrightarrow} \gamma$.

Asymptotically Normal Estimators: A sequence of estimators $T_n$ is asymptotically normal with covariance matrix $V_\theta$ if
$$\sqrt{n}\,(T_n - \gamma) \overset{D_\theta}{\longrightarrow} \mathcal{N}(0, V_\theta)$$

Asymptotically Linear Estimators: A sequence of estimators $T_n$ is asymptotically linear if, for an (influence) function $l_\theta: \mathcal{X} \to \mathbb{R}^p$ with $E_\theta l_\theta(X) = 0$ and $E_\theta l_\theta(X) l_\theta^T(X) =: V_\theta < \infty$:
$$T_n - \gamma = \frac{1}{n}\sum_{i=1}^n l_\theta(X_i) + o_{P_\theta}(1/\sqrt{n})$$

The δ-Technique: Let $h$ be differentiable at $c$ and suppose
$$(T_n - c)/r_n \overset{D}{\longrightarrow} Z$$
Then:
$$(h(T_n) - h(c))/r_n \overset{D}{\longrightarrow} \dot{h}(c)^T Z$$
using the expansion $h(T_n) = h(c) + \dot{h}(c)^T (T_n - c) + o_P(r_n)$.
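A hedged simulation sketch of the δ-technique (an illustration, not from the summary): for $X_i$ i.i.d. Uniform$(0, 1)$, $\sqrt{n}(\bar{X}_n - 1/2) \to \mathcal{N}(0, 1/12)$, and with $h(t) = t^2$ we have $\dot{h}(1/2) = 1$, so $\sqrt{n}(h(\bar{X}_n) - h(1/2))$ should also show variance close to $1/12$. The sample size and replication count are arbitrary choices.

```python
import math
import random

random.seed(1)
n, reps = 400, 2000

def h(t):
    return t * t

vals = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    vals.append(math.sqrt(n) * (h(xbar) - h(0.5)))

mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / reps
print(round(mean, 3), round(var, 3))  # mean near 0, variance near 1/12 ~ 0.083
```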