Title | MTL106-1-9 - Lecture notes 1-9
---|---
Author | Arpit Chauhan
Course | Introduction to Probability Theory and Stochastic Processes
Institution | Indian Institute of Technology Delhi
PROBABILITY AND STOCHASTIC PROCESS (MTL106)

ANANTA KUMAR MAJEE
1. Modes of Convergence

Lemma 1.1 (Markov inequality). If $X$ is a non-negative random variable whose expected value exists, then for all $a > 0$,
\[ P(X > a) \le \frac{E[X]}{a}. \]

Proof. Observe that, since $X$ is non-negative,
\[ E[X] = E\big[X 1_{\{X > a\}} + X 1_{\{X \le a\}}\big] \ge E\big[X 1_{\{X > a\}}\big] \ge a\, P(X > a). \]
Hence the result follows. □
Corollary 1.2. If $X$ is a random variable such that $E[|X|] < +\infty$, then for all $a > 0$,
\[ P(|X| > a) \le \frac{E[|X|]}{a}. \]
Lemma 1.3 (Chebyshev's inequality). Let $Y$ be an integrable random variable such that $\mathrm{Var}(Y) < +\infty$. Then for any $\varepsilon > 0$,
\[ P(|Y - E(Y)| > \varepsilon) \le \frac{\mathrm{Var}(Y)}{\varepsilon^2}. \]

Proof. To get the result, take $X = |Y - E(Y)|^2$ and $a = \varepsilon^2$ in the Markov inequality. □
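These bounds are easy to check numerically. A minimal sanity check of the Markov bound (the Exponential(1) distribution and the sample size are illustrative choices, not from the notes):

```python
# Empirical check of Markov's inequality P(X > a) <= E[X]/a for a
# non-negative random variable; Exponential(1) has E[X] = 1.
import random

random.seed(0)
samples = [random.expovariate(1.0) for _ in range(100_000)]

for a in (1.0, 2.0, 4.0):
    empirical = sum(x > a for x in samples) / len(samples)
    bound = 1.0 / a                      # E[X]/a with E[X] = 1
    print(f"a = {a}: P(X > a) ~ {empirical:.4f} <= bound {bound:.4f}")
```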
Example 1.1. Is there any random variable $X$ for which
\[ P(\mu - 3\sigma \le X \le \mu + 3\sigma) = \frac{1}{2}, \]
where $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$?

Solution: Observe that
\[ P(\mu - 3\sigma \le X \le \mu + 3\sigma) = P(|X - \mu| \le 3\sigma) = 1 - P(|X - E(X)| > 3\sigma). \]
By Chebyshev's inequality, we get that $P(|X - E(X)| > 3\sigma) \le \frac{\sigma^2}{9\sigma^2} = \frac{1}{9}$, and hence
\[ P(\mu - 3\sigma \le X \le \mu + 3\sigma) \ge 1 - \frac{1}{9} = \frac{8}{9}. \]
Since $\frac{1}{2} < \frac{8}{9}$, there exists NO random variable $X$ satisfying the given condition.

Theorem 1.4 (Weak Law of Large Numbers). Let $\{X_i\}$ be a sequence of i.i.d. random variables with finite mean $\mu$ and variance $\sigma^2$. Then for any $\varepsilon > 0$,
\[ P\Big(\Big|\frac{S_n}{n} - \mu\Big| > \varepsilon\Big) \le \frac{\sigma^2}{n\varepsilon^2}, \]
where $S_n = \sum_{i=1}^{n} X_i$. In particular,
\[ \lim_{n \to \infty} P\Big(\Big|\frac{S_n}{n} - \mu\Big| > \varepsilon\Big) = 0. \]

Proof. The first inequality follows from Chebyshev's inequality applied to $S_n/n$, since $E(S_n/n) = \mu$ and $\mathrm{Var}(S_n/n) = \sigma^2/n$. Letting $n \to \infty$ in the first inequality, we arrive at the second result. □
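The theorem is easy to visualize by simulation. A minimal sketch, assuming fair coin flips mapped to $\pm 1$ (the distribution, tolerance, and sample sizes are illustrative choices):

```python
# WLLN: the empirical probability that |S_n/n - mu| > eps shrinks as n grows.
# Here X_i = +/-1 with equal probability, so mu = 0 and sigma^2 = 1.
import random

random.seed(1)
eps, trials = 0.05, 500

for n in (100, 1000, 10_000):
    exceed = 0
    for _ in range(trials):
        s = sum(random.choice((-1, 1)) for _ in range(n))
        if abs(s / n) > eps:
            exceed += 1
    print(f"n = {n:6d}: P(|S_n/n| > {eps}) ~ {exceed / trials:.3f}"
          f"  (Chebyshev bound {1 / (n * eps**2):.3f})")
```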
An application to Real Analysis:
Theorem 1.5 (Weierstrass Theorem). Let $f : [0,1] \to \mathbb{R}$ be a continuous function. Let
\[ B_{n,f}(x) = \sum_{k=0}^{n} f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1-x)^{n-k}, \qquad x \in [0,1], \]
be the Bernstein polynomial of order $n$ for the function $f$. Then $B_{n,f} \to f$ uniformly on $[0,1]$.

Proof. Let $x \in [0,1]$ be fixed. Let $\{X_n\}$ be a sequence of i.i.d. Bernoulli($x$) random variables. Set $S_n = \sum_{i=1}^{n} X_i$. Then $P(S_n = k) = \binom{n}{k} x^k (1-x)^{n-k}$ and hence
\[ E\Big[f\Big(\frac{S_n}{n}\Big)\Big] = \sum_{k=0}^{n} f\Big(\frac{k}{n}\Big) P(S_n = k) = \sum_{k=0}^{n} f\Big(\frac{k}{n}\Big) \binom{n}{k} x^k (1-x)^{n-k} = B_{n,f}(x). \]
Hence, for any $\delta > 0$,
\[ |f(x) - B_{n,f}(x)| = \Big|f(x) - E\Big[f\Big(\frac{S_n}{n}\Big)\Big]\Big| \le E\Big[\Big|f(x) - f\Big(\frac{S_n}{n}\Big)\Big|\Big] = E\Big[\Big|f(x) - f\Big(\frac{S_n}{n}\Big)\Big|\big(1_{|x - \frac{S_n}{n}| > \delta} + 1_{|x - \frac{S_n}{n}| \le \delta}\big)\Big]. \]
Since $f$ is continuous on $[0,1]$, it is uniformly continuous. Hence, given $\varepsilon > 0$, there exists $\delta_0 > 0$ such that
\[ |x - y| \le \delta_0 \implies |f(x) - f(y)| < \varepsilon. \]
Taking $\delta = \delta_0$ in the previous inequality, we get
\[ |f(x) - B_{n,f}(x)| \le 2\|f\|\, P\Big(\Big|x - \frac{S_n}{n}\Big| > \delta_0\Big) + \varepsilon. \]
Observe that $E\big(\frac{S_n}{n}\big) = x$ and $\mathrm{Var}\big(\frac{S_n}{n}\big) = \frac{x(1-x)}{n}$. Hence, by Chebyshev's inequality, we get
\[ P\Big(\Big|x - \frac{S_n}{n}\Big| > \delta_0\Big) \le \frac{x(1-x)}{n\delta_0^2} \le \frac{1}{4n\delta_0^2} \qquad \forall\, x \in [0,1]. \]
Combining these two inequalities, we have, for all $x \in [0,1]$,
\[ |f(x) - B_{n,f}(x)| \le \frac{\|f\|}{2n\delta_0^2} + \varepsilon. \]
Sending $n \to \infty$ and then letting $\varepsilon \to 0$, we conclude the desired result, i.e., $B_{n,f} \to f$ uniformly on $[0,1]$. □
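Theorem 1.5 is constructive, so it is easy to test numerically. A minimal sketch (the test function and evaluation grid are illustrative choices, not from the notes):

```python
# Bernstein polynomial B_{n,f} from Theorem 1.5; the sup-norm error on a
# grid shrinks as n grows. f(x) = |x - 1/2| is an arbitrary continuous
# test function (not differentiable at 1/2).
from math import comb

def bernstein(f, n, x):
    """Evaluate B_{n,f}(x) = sum_k f(k/n) C(n,k) x^k (1-x)^(n-k)."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)
grid = [i / 100 for i in range(101)]
for n in (10, 50, 200):
    err = max(abs(f(x) - bernstein(f, n, x)) for x in grid)
    print(f"n = {n:4d}: sup-norm error on grid ~ {err:.4f}")
```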
We shall discuss various modes of convergence for a given sequence of random variables $\{X_n\}$ defined on a given probability space $(\Omega, \mathcal{F}, P)$.

Definition 1.1 (Convergence in probability). We say that $\{X_n\}$ converges to a random variable $X$ defined on the same probability space $(\Omega, \mathcal{F}, P)$ in probability, denoted by $X_n \xrightarrow{P} X$, if for every $\varepsilon > 0$,
\[ \lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0. \]
Example 1.2. Let $\{X_n\}$ be a sequence of random variables such that $P(X_n = 0) = 1 - \frac{1}{n}$ and $P(X_n = n) = \frac{1}{n}$. Then $X_n \xrightarrow{P} 0$. Indeed, for any $\varepsilon > 0$,
\[ P(|X_n| > \varepsilon) = \begin{cases} \frac{1}{n} & \text{if } \varepsilon < n, \\ 0 & \text{if } \varepsilon \ge n. \end{cases} \]
Hence, $\lim_{n \to \infty} P(|X_n| > \varepsilon) = 0$.
Example 1.3. Let $\{X_n\}$ be a sequence of i.i.d. random variables with $P(X_n = 1) = \frac{1}{2}$ and $P(X_n = -1) = \frac{1}{2}$. Then $\frac{1}{n}\sum_{i=1}^{n} X_i$ converges to $0$ in probability. Indeed, for any $\varepsilon > 0$, thanks to the weak law of large numbers, we have
\[ P\Big(\Big|\frac{1}{n} S_n - \mu\Big| > \varepsilon\Big) \le \frac{\mathrm{Var}(X_1)}{n\varepsilon^2}, \]
where $\mu = E(X_1)$. Observe that $\mu = 0$ and $\mathrm{Var}(X_1) = 1$. Hence
\[ P\Big(\Big|\frac{1}{n}\sum_{i=1}^{n} X_i\Big| > \varepsilon\Big) \le \frac{1}{n\varepsilon^2} \to 0 \quad \text{as } n \to \infty. \]
Theorem 1.6. $X_n \xrightarrow{P} X$ if and only if $\lim_{n \to \infty} E\Big[\frac{|X_n - X|}{1 + |X_n - X|}\Big] = 0$.

Proof. Without loss of generality, take $X = 0$; thus we want to show that $X_n \xrightarrow{P} 0$ if and only if $\lim_{n \to \infty} E\big[\frac{|X_n|}{1 + |X_n|}\big] = 0$.

Suppose $X_n \xrightarrow{P} 0$. Then, given $\varepsilon > 0$, we have $\lim_{n \to \infty} P(|X_n| > \varepsilon) = 0$. Now,
\[ \frac{|X_n|}{1 + |X_n|} = \frac{|X_n|}{1 + |X_n|} 1_{|X_n| > \varepsilon} + \frac{|X_n|}{1 + |X_n|} 1_{|X_n| \le \varepsilon} \le 1_{|X_n| > \varepsilon} + \varepsilon \]
\[ \implies E\Big[\frac{|X_n|}{1 + |X_n|}\Big] \le P(|X_n| > \varepsilon) + \varepsilon \implies \limsup_{n \to \infty} E\Big[\frac{|X_n|}{1 + |X_n|}\Big] \le \varepsilon. \]
Since $\varepsilon > 0$ is arbitrary, we have $\lim_{n \to \infty} E\big[\frac{|X_n|}{1 + |X_n|}\big] = 0$.

Conversely, let $\lim_{n \to \infty} E\big[\frac{|X_n|}{1 + |X_n|}\big] = 0$. Observe that the function $f(x) = \frac{x}{1 + x}$ is strictly increasing on $[0, \infty)$. Thus,
\[ \frac{\varepsilon}{1 + \varepsilon} 1_{|X_n| > \varepsilon} \le \frac{|X_n|}{1 + |X_n|} 1_{|X_n| > \varepsilon} \le \frac{|X_n|}{1 + |X_n|} \implies \frac{\varepsilon}{1 + \varepsilon} P(|X_n| > \varepsilon) \le E\Big[\frac{|X_n|}{1 + |X_n|}\Big] \]
\[ \implies \limsup_{n \to \infty} P(|X_n| > \varepsilon) \le \frac{1 + \varepsilon}{\varepsilon} \lim_{n \to \infty} E\Big[\frac{|X_n|}{1 + |X_n|}\Big] = 0 \implies X_n \xrightarrow{P} 0. \quad \Box \]
Exercise 1.1. Show that $X_n \xrightarrow{P} X$ if and only if $\lim_{n \to \infty} E\big[1 \wedge |X_n - X|\big] = 0$.
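Both characterizations can be checked concretely. A minimal sketch using the sequence from Example 1.2 (where $P(X_n = n) = 1/n$), for which the expectations admit closed forms:

```python
# For X_n from Example 1.2: P(X_n = 0) = 1 - 1/n, P(X_n = n) = 1/n.
# Both E[|X_n|/(1+|X_n|)] and E[1 ^ |X_n|] are computable exactly,
# and both tend to 0, matching Theorem 1.6 and Exercise 1.1.
for n in (10, 100, 1000, 10_000):
    e_bounded = (1 / n) * (n / (1 + n))   # = 1/(n+1)
    e_min = (1 / n) * min(1, n)           # = 1/n
    print(f"n = {n:6d}: E[|Xn|/(1+|Xn|)] = {e_bounded:.5f}, "
          f"E[1 ^ |Xn|] = {e_min:.5f}")
```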
Definition 1.2 (Convergence in $r$-th mean). Let $X, \{X_n\}$ be random variables defined on a given probability space $(\Omega, \mathcal{F}, P)$ such that, for $r \in \mathbb{N}$, $E[|X|^r] < \infty$ and $E[|X_n|^r] < \infty$ for all $n$. We say that $\{X_n\}$ converges in the $r$-th mean to $X$, denoted by $X_n \xrightarrow{r} X$, if the following holds:
\[ \lim_{n \to \infty} E[|X_n - X|^r] = 0. \]

Example 1.4. Let $\{X_n\}$ be i.i.d. random variables with $E[X_n] = \mu$ and $\mathrm{Var}(X_n) = \sigma^2$. Define $Y_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. Then $Y_n \xrightarrow{2} \mu$. Indeed,
\[ E[|Y_n - \mu|^2] = E\Big[\Big|\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\Big|^2\Big] = \frac{1}{n^2} E[|S_n - E(S_n)|^2] = \frac{1}{n^2} \mathrm{Var}(S_n) = \frac{\sigma^2}{n}, \]
where $S_n = \sum_{i=1}^{n} X_i$. Hence $Y_n \xrightarrow{2} \mu$.
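The $\sigma^2/n$ rate in Example 1.4 shows up clearly in simulation. A minimal sketch, assuming standard normal summands (an illustrative choice):

```python
# Example 1.4: E[|Y_n - mu|^2] = sigma^2 / n. Estimate the left side by
# averaging over repeated samples of Y_n; X_i ~ N(0,1), so mu=0, sigma^2=1.
import random

random.seed(2)
trials = 2000
for n in (10, 100, 1000):
    mse = sum((sum(random.gauss(0, 1) for _ in range(n)) / n) ** 2
              for _ in range(trials)) / trials
    print(f"n = {n:5d}: E|Y_n|^2 ~ {mse:.5f}  (theory {1 / n:.5f})")
```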
Theorem 1.7. The following hold:
i) $X_n \xrightarrow{r} X \implies X_n \xrightarrow{P} X$ for any $r \ge 1$.
ii) Let $f$ be a given continuous function. If $X_n \xrightarrow{P} X$, then $f(X_n) \xrightarrow{P} f(X)$.

Proof. Proof of (i) follows from Markov's inequality. Indeed, for any given $\varepsilon > 0$,
\[ P(|X_n - X| > \varepsilon) \le \frac{E[|X_n - X|^r]}{\varepsilon^r} \implies \lim_{n \to \infty} P(|X_n - X| > \varepsilon) \le \frac{1}{\varepsilon^r} \lim_{n \to \infty} E[|X_n - X|^r] = 0. \]

Proof of (ii): For any $k > 0$, we see that
\[ \{|f(X_n) - f(X)| > \varepsilon\} \subset \{|f(X_n) - f(X)| > \varepsilon,\ |X| \le k\} \cup \{|X| > k\}. \]
Since $f$ is continuous, it is uniformly continuous on any bounded interval. Therefore, for any given $\varepsilon > 0$, there exists $\delta > 0$ such that $|f(x) - f(y)| \le \varepsilon$ if $|x - y| \le \delta$ for $x$ and $y$ in $[-k, k]$. This means that
\[ \{|f(X_n) - f(X)| > \varepsilon,\ |X| \le k\} \subset \{|X_n - X| > \delta,\ |X| \le k\} \subset \{|X_n - X| > \delta\}. \]
Thus we have
\[ \{|f(X_n) - f(X)| > \varepsilon\} \subset \{|X_n - X| > \delta\} \cup \{|X| > k\} \implies P(|f(X_n) - f(X)| > \varepsilon) \le P(|X_n - X| > \delta) + P(|X| > k). \]
Since $X_n \xrightarrow{P} X$ and $\lim_{k \to \infty} P(|X| > k) = 0$ (so $k$ can first be chosen to make the last term arbitrarily small), we obtain that $\lim_{n \to \infty} P(|f(X_n) - f(X)| > \varepsilon) = 0$. This completes the proof. □

In general, convergence in probability does not imply convergence in $r$-th mean. To see this, consider the following example.
Example 1.5. Let $\Omega = [0,1]$, $\mathcal{F} = \mathcal{B}([0,1])$ and $P(dx) = dx$. Let $X_n = n\, 1_{(0, 1/n)}$. Then $X_n \xrightarrow{P} 0$ but $X_n \not\xrightarrow{r} 0$ for all $r \ge 1$. To show this, observe that
\[ P(|X_n| > \varepsilon) \le \frac{1}{n} \implies \lim_{n \to \infty} P(|X_n| > \varepsilon) = 0, \quad \text{i.e., } X_n \xrightarrow{P} 0. \]
On the other hand, for $r \ge 1$,
\[ E[|X_n|^r] = \int_0^{1/n} n^r \, dx = n^{r-1} \not\to 0 \quad \text{as } n \to \infty. \]
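A quick simulation of Example 1.5, sampling $\omega$ uniformly from $[0,1]$ (the sample size and $\varepsilon$ are illustrative choices), shows both effects at once:

```python
# Example 1.5: X_n = n on (0, 1/n), else 0. Empirically, P(|X_n| > eps)
# shrinks like 1/n while E[|X_n|^2] grows like n^(2-1) = n (here r = 2).
import random

random.seed(3)
omegas = [random.random() for _ in range(1_000_000)]

for n in (10, 100, 1000):
    xs = [n if 0 < w < 1 / n else 0 for w in omegas]
    p_exceed = sum(x > 0.5 for x in xs) / len(xs)         # eps = 0.5
    second_moment = sum(x * x for x in xs) / len(xs)      # ~ n
    print(f"n = {n:5d}: P(|Xn|>0.5) ~ {p_exceed:.4f}, E|Xn|^2 ~ {second_moment:.1f}")
```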
Definition 1.3 (Almost sure convergence). Let $X, \{X_n\}$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$. We say that $\{X_n\}$ converges to $X$ almost surely (or with probability 1), denoted by $X_n \xrightarrow{a.s.} X$, if the following holds:
\[ P\big(\lim_{n \to \infty} X_n = X\big) = 1. \]
Example 1.6. Let $\Omega = [0,1]$, $\mathcal{F} = \mathcal{B}([0,1])$ and $P(dx) = dx$. Define
\[ X_n(\omega) = \begin{cases} 1 & \text{if } \omega \in \big(0, 1 - \frac{1}{n}\big), \\ n & \text{otherwise.} \end{cases} \]
It is easy to check that if $\omega = 0$ or $\omega = 1$, then $\lim_{n \to \infty} X_n(\omega) = \infty$. For any $\omega \in (0,1)$, we can find $n_0 \in \mathbb{N}$ such that $\omega \in \big(0, 1 - \frac{1}{n}\big)$ for any $n \ge n_0$. As a consequence, $X_n(\omega) = 1$ for any $n \ge n_0$. In other words, for $\omega \in (0,1)$, $\lim_{n \to \infty} X_n(\omega) = 1$. Define $X(\omega) = 1$ for all $\omega \in [0,1]$. Then
\[ P\big(\omega \in [0,1] : \{X_n(\omega)\} \text{ does not converge to } X(\omega)\big) = P(\{0, 1\}) = 0 \implies X_n \xrightarrow{a.s.} 1. \]

Let $\{A_n\}$ be a sequence of events in $\mathcal{F}$. Define
\[ \limsup_n A_n = \bigcap_{n=1}^{\infty} \bigcup_{m \ge n} A_m = \lim_{n \to \infty} \bigcup_{m \ge n} A_m. \]
This can be interpreted probabilistically as
\[ \limsup_n A_n = \text{``$A_n$ occurs infinitely often''}. \]
We denote this as $\{A_n \text{ i.o.}\} = \limsup_n A_n$.
Theorem 1.8 (Borel-Cantelli lemma). Let $\{A_n\}$ be a sequence of events in $(\Omega, \mathcal{F}, P)$.
i) If $\sum_{n=1}^{\infty} P(A_n) < +\infty$, then $P(A_n \text{ i.o.}) = 0$.
ii) If the $A_n$ are mutually independent events, and if $\sum_{n=1}^{\infty} P(A_n) = \infty$, then $P(A_n \text{ i.o.}) = 1$.

Remark 1.1. For mutually independent events $A_n$, since $\sum_{n=1}^{\infty} P(A_n)$ is either finite or infinite, the event $\{A_n \text{ i.o.}\}$ has probability either 0 or 1. This is sometimes called a zero-one law. As a consequence of the Borel-Cantelli lemma, we have the following proposition.

Proposition 1.9. Let $\{X_n\}$ be a sequence of random variables defined on a probability space $(\Omega, \mathcal{F}, P)$. If $\sum_{n=1}^{\infty} P(|X_n| > \varepsilon) < +\infty$ for every $\varepsilon > 0$, then $X_n \xrightarrow{a.s.} 0$.
Proof. Fix $\varepsilon > 0$. Let $A_n = \{|X_n| > \varepsilon\}$. Then $\sum_{n=1}^{\infty} P(A_n) < +\infty$, and hence by the Borel-Cantelli lemma, $P(A_n \text{ i.o.}) = 0$. Now
\[ \big(\limsup_n A_n\big)^c = \{\omega : \exists\, n_0(\omega) \text{ such that } |X_n(\omega)| \le \varepsilon \ \forall\, n \ge n_0(\omega)\} =: B_{\varepsilon}. \]
Thus, $P(B_{\varepsilon}) = 1$. Let $B = \bigcap_{r=1}^{\infty} B_{\frac{1}{r}}$. Since $P(B_{\frac{1}{r}}) = 1$, we have $P(B_{\frac{1}{r}}^c) = 0$. Moreover, $B^c = \bigcup_{r=1}^{\infty} B_{\frac{1}{r}}^c$. Observe that
\[ \{\omega : \lim_{n \to \infty} |X_n(\omega)| = 0\} = \bigcap_{r=1}^{\infty} B_{\frac{1}{r}} = B. \]
Again, $P(B^c) \le \sum_{r=1}^{\infty} P(B_{\frac{1}{r}}^c) = 0$, and hence $P(B) = 1$. In other words,
\[ P\big(\{\omega : \lim_{n \to \infty} |X_n(\omega)| = 0\}\big) = 1, \quad \text{i.e., } X_n \xrightarrow{a.s.} 0. \quad \Box \]
Example 1.7. Let $\{X_n\}$ be a sequence of i.i.d. random variables such that $P(X_n = 1) = \frac{1}{2}$ and $P(X_n = -1) = \frac{1}{2}$. Let $S_n = \sum_{i=1}^{n} X_i$. Then $\frac{1}{n^2} S_{n^2} \xrightarrow{a.s.} 0$. To show the result, we use Proposition 1.9. Note that $E[|S_{n^2}|^2] = \mathrm{Var}(S_{n^2}) = n^2$, so
\[ P\Big(\frac{1}{n^2} |S_{n^2}| > \varepsilon\Big) \le \frac{E[|S_{n^2}|^2]}{n^4 \varepsilon^2} = \frac{1}{n^2 \varepsilon^2} \implies \sum_{n=1}^{\infty} P\Big(\frac{1}{n^2} |S_{n^2}| > \varepsilon\Big) < \infty \implies \frac{1}{n^2} S_{n^2} \xrightarrow{a.s.} 0. \]
Let us consider the following example.

Example 1.8. Let $\Omega = [0,1]$, $\mathcal{F} = \mathcal{B}([0,1])$ and $P(dx) = dx$. Define
\[ X_n = 1_{[\frac{j}{2^k}, \frac{j+1}{2^k}]}, \qquad n = 2^k + j, \quad j = 0, 1, \ldots, 2^k - 1, \quad k = 0, 1, 2, \ldots. \]
Note that, for each positive integer $n$, there exist integers $j$ and $k$ (uniquely determined) such that
\[ n = 2^k + j, \qquad j = 0, 1, \ldots, 2^k - 1, \qquad k = 0, 1, 2, \ldots \]
(for $n = 1$, $k = j = 0$; for $n = 5$, $k = 2$, $j = 1$; and so on). Let $A_n = \{X_n > 0\}$. Then, clearly, $P(A_n) = 2^{-k} \to 0$. Consequently, $X_n \xrightarrow{P} 0$, but $X_n(\omega) \not\to 0$ for all $\omega \in \Omega$, since each $\omega$ lies in infinitely many of the dyadic intervals, so $X_n(\omega) = 1$ infinitely often.
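A short script makes this "moving bump" behavior concrete (a minimal sketch; the fixed $\omega$ and the range of $n$ are illustrative choices):

```python
# Example 1.8 (the "typewriter" sequence): for a fixed omega, X_n(omega)
# equals 1 whenever the n-th dyadic interval covers omega, which keeps
# happening in every generation k, even though P(X_n > 0) = 2^-k -> 0.
def X(n, omega):
    k = n.bit_length() - 1          # n = 2^k + j with 0 <= j < 2^k
    j = n - (1 << k)
    return 1 if j / 2**k <= omega <= (j + 1) / 2**k else 0

omega = 0.3
hits = [n for n in range(1, 129) if X(n, omega) == 1]
print(hits)   # one hit in each block [2^k, 2^(k+1)): X_n(omega) = 1 i.o.
```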
Theorem 1.10. The following hold.
i) If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{P} X$.
ii) If $X_n \xrightarrow{P} X$, then there exists a subsequence $\{X_{n_k}\}$ of $\{X_n\}$ such that $X_{n_k} \xrightarrow{a.s.} X$.
iii) If $X_n \xrightarrow{a.s.} X$, then for any continuous function $f$, $f(X_n) \xrightarrow{a.s.} f(X)$.
Proof. Proof of i): For any $\varepsilon > 0$, define $A_n^{\varepsilon} = \{|X_n - X| > \varepsilon\}$ and $B_m^{\varepsilon} = \bigcup_{n=m}^{\infty} A_n^{\varepsilon}$. Since $X_n \xrightarrow{a.s.} X$, we have $P(\bigcap_m B_m^{\varepsilon}) = 0$ (indeed, $\bigcap_m B_m^{\varepsilon} = \{|X_n - X| > \varepsilon \text{ i.o.}\}$ is contained in the null set on which $X_n$ does not converge to $X$). Note that $\{B_m^{\varepsilon}\}$ is a nested, decreasing sequence of events. Hence, from the continuity of the probability measure $P$, we have
\[ \lim_{m \to \infty} P(B_m^{\varepsilon}) = P\Big(\bigcap_m B_m^{\varepsilon}\Big) = 0. \]
Since $A_m^{\varepsilon} \subset B_m^{\varepsilon}$, we have $P(A_m^{\varepsilon}) \le P(B_m^{\varepsilon})$. This implies that $\lim_{m \to \infty} P(A_m^{\varepsilon}) = 0$. In other words, $X_n \xrightarrow{P} X$.

Proof of ii): We will use the Borel-Cantelli lemma. Since $X_n \xrightarrow{P} X$, we can choose a subsequence $\{X_{n_k}\}$ such that $P(|X_{n_k} - X| > \frac{1}{k}) \le \frac{1}{2^k}$. Let $A_k := \{|X_{n_k} - X| > \frac{1}{k}\}$. Then $\sum_{k=1}^{\infty} P(A_k) < +\infty$. Hence, by the Borel-Cantelli lemma, $P(A_k \text{ i.o.}) = 0$. This implies that
\[ P\Big(\bigcup_{n=1}^{\infty} \bigcap_{m=n}^{\infty} A_m^c\Big) = 1 \implies P\big(\{\omega \in \Omega : \exists\, n_0 : \forall k \ge n_0,\ |X_{n_k}(\omega) - X(\omega)| \le \tfrac{1}{k}\}\big) = 1 \implies X_{n_k} \xrightarrow{a.s.} X. \]

Proof of iii): Let $N = \{\omega : \lim_{n \to \infty} X_n(\omega) \ne X(\omega)\}$. Then $P(N) = 0$. If $\omega \notin N$, then by the continuity property of $f$, we have
\[ \lim_{n \to \infty} f(X_n(\omega)) = f\big(\lim_{n \to \infty} X_n(\omega)\big) = f(X(\omega)). \]
This is true for any $\omega \notin N$ and $P(N) = 0$. Hence $f(X_n) \xrightarrow{a.s.} f(X)$. □
Definition 1.4 (Convergence in distribution). Let $X, X_1, X_2, \ldots$ be real-valued random variables with distribution functions $F_X, F_{X_1}, F_{X_2}, \ldots$ respectively. We say that $(X_n)$ converges to $X$ in distribution, denoted by $X_n \xrightarrow{d} X$, if
\[ \lim_{n \to \infty} F_{X_n}(x) = F_X(x) \]
for all continuity points $x$ of $F_X$.
Remark 1.2. In the above definition, the random variables $X, \{X_n\}$ need not be defined on the same probability space.

Example 1.9. Let $X_n = \frac{1}{n}$ and $X = 0$. Then
\[ F_{X_n}(x) = P(X_n \le x) = \begin{cases} 1 & \text{if } x \ge \frac{1}{n}, \\ 0 & \text{otherwise,} \end{cases} \qquad \text{and} \qquad F_X(x) = \begin{cases} 1 & x \ge 0, \\ 0 & x < 0. \end{cases} \]
Observe that $0$ is the only discontinuity point of $F_X$, and $\lim_{n \to \infty} F_{X_n}(x) = F_X(x)$ for $x \ne 0$. Thus, $X_n \xrightarrow{d} 0$.
Example 1.10. Let $X$ be a real-valued random variable with distribution function $F$. Define $X_n = X + \frac{1}{n}$. Then
\[ F_{X_n}(x) = P\big(X + \tfrac{1}{n} \le x\big) = F\big(x - \tfrac{1}{n}\big) \implies \lim_{n \to \infty} F_{X_n}(x) = F(x-) = F(x) \quad \text{for every continuity point } x \text{ of } F. \]
This implies that $X_n \xrightarrow{d} X$.
Theorem 1.11. $X_n \xrightarrow{P} X$ implies that $X_n \xrightarrow{d} X$.

Proof. Let $\varepsilon > 0$. Since $F_{X_n}(t) = P(X_n \le t)$, we have
\[ F_{X_n}(t) = P(X_n \le t, |X_n - X| > \varepsilon) + P(X_n \le t, |X_n - X| \le \varepsilon) \le P(|X_n - X| > \varepsilon) + P(X \le t + \varepsilon) = P(|X_n - X| > \varepsilon) + F_X(t + \varepsilon). \]
Similarly,
\[ F_X(t - \varepsilon) = P(X \le t - \varepsilon, |X_n - X| > \varepsilon) + P(X \le t - \varepsilon, |X_n - X| \le \varepsilon) \le P(|X_n - X| > \varepsilon) + P(X_n \le t) = P(|X_n - X| > \varepsilon) + F_{X_n}(t). \]
Thus, since $\lim_{n \to \infty} P(|X_n - X| > \varepsilon) = 0$, we obtain from the above inequalities
\[ F_X(t - \varepsilon) \le \liminf_{n \to \infty} F_{X_n}(t) \le \limsup_{n \to \infty} F_{X_n}(t) \le F_X(t + \varepsilon). \]
Let $t$ be a continuity point of $F_X$. Then, sending $\varepsilon \to 0$ in the above inequality, we get
\[ \lim_{n \to \infty} F_{X_n}(t) = F_X(t), \]
i.e., $X_n \xrightarrow{d} X$. □
The converse of this theorem is NOT true in general.

Example 1.11. Let $X \sim N(0,1)$. Define $X_n = -X$ for $n = 1, 2, 3, \ldots$. Then $X_n \sim N(0,1)$ and hence $X_n \xrightarrow{d} X$. But
\[ P(|X_n - X| > \varepsilon) = P(|2X| > \varepsilon) = P\big(|X| > \tfrac{\varepsilon}{2}\big) \not\to 0 \implies X_n \not\xrightarrow{P} X. \]
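Example 1.11 is easy to reproduce numerically: the empirical distribution functions of $X_n$ and $X$ agree, yet $|X_n - X| = 2|X|$ does not shrink. A minimal sketch (the sample size and test points are illustrative choices):

```python
# X ~ N(0,1) and X_n = -X have the same CDF (so X_n -> X in distribution),
# but |X_n - X| = 2|X| is not small: no convergence in probability.
import random

random.seed(4)
xs = [random.gauss(0, 1) for _ in range(100_000)]

for t in (-1.0, 0.0, 1.0):
    F_X = sum(x <= t for x in xs) / len(xs)
    F_Xn = sum(-x <= t for x in xs) / len(xs)
    print(f"t = {t:+.1f}: F_X(t) ~ {F_X:.3f}, F_Xn(t) ~ {F_Xn:.3f}")

eps = 0.5
print("P(|Xn - X| > eps) ~", sum(2 * abs(x) > eps for x in xs) / len(xs))
```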
Theorem 1.12 (Continuity theorem). Let $X, \{X_n\}$ be random variables having characteristic functions $\phi_X, \{\phi_{X_n}\}$ respectively. Then the following are equivalent.
i) $X_n \xrightarrow{d} X$.
ii) $E(g(X_n)) \to E(g(X))$ for every bounded Lipschitz continuous function $g$.
iii) $\lim_{n \to \infty} \phi_{X_n}(t) = \phi_X(t)$ for all $t \in \mathbb{R}$.
Theorem 1.13 (Strong law of large numbers). Let $\{X_i\}$ be a sequence of i.i.d. random variables with finite mean $\mu$ and variance $\sigma^2$. Then
\[ \frac{S_n}{n} \xrightarrow{a.s.} \mu, \qquad \text{where } S_n = \sum_{i=1}^{n} X_i. \]
The special case assuming a finite fourth-order moment is referred to as Borel's SLLN. To prove the theorem, we need the following lemma.

Lemma 1.14. Let $\{X_i\}$ be a sequence of random variables defined on a given probability space $(\Omega, \mathcal{F}, P)$.
i) If the $X_n$ are positive, then
\[ E\Big[\sum_{n=1}^{\infty} X_n\Big] = \sum_{n=1}^{\infty} E[X_n]. \tag{1.1} \]
ii) If $\sum_{n=1}^{\infty} E[|X_n|] < \infty$, then $\sum_{i=1}^{\infty} X_i$ converges almost surely and (1.1) holds as well.
Proof of Theorem 1.13: Without loss of generality we can assume that $\mu = 0$. Set $Y_n = \frac{S_n}{n}$. Observe that, thanks to independence,
\[ E[Y_n] = 0, \qquad E[Y_n^2] = \frac{1}{n^2} \sum_{1 \le j,k \le n} E(X_j X_k) = \frac{1}{n^2} \sum_{j=1}^{n} E[X_j^2] = \frac{\sigma^2}{n}. \]
Thus, $\lim_{n \to \infty} E[Y_n^2] = 0$, and hence along a subsequence $Y_n$ converges to 0 almost surely. But we need to show that the original sequence converges to 0 with probability 1. To do so, we proceed as follows. Since $E[Y_{n^2}^2] = \frac{\sigma^2}{n^2}$, we see that $\sum_{n=1}^{\infty} E[Y_{n^2}^2] = \sum_{n=1}^{\infty} \frac{\sigma^2}{n^2} < +\infty$, and hence by Lemma 1.14 ii), $\sum_{n=1}^{\infty} Y_{n^2}^2$ converges almost surely. Thus,
\[ \lim_{n \to \infty} Y_{n^2} = 0 \quad \text{with probability 1.} \tag{1.2} \]
Let $n \in \mathbb{N}$. Then there exists $m(n) \in \mathbb{N}$ such that $(m(n))^2 \le n < (m(n)+1)^2$. Now
\[ Y_n - \frac{(m(n))^2}{n} Y_{(m(n))^2} = \frac{1}{n} \sum_{i=1}^{n} X_i - \frac{1}{n} \sum_{i=1}^{(m(n))^2} X_i = \frac{1}{n} \sum_{i=1+(m(n))^2}^{n} X_i \]
\[ \implies E\Big[\Big(Y_n - \frac{(m(n))^2}{n} Y_{(m(n))^2}\Big)^2\Big] = \frac{1}{n^2} \sum_{i=1+(m(n))^2}^{n} E[X_i^2] = \frac{n - (m(n))^2}{n^2} \sigma^2 \le \frac{2m(n)+1}{n^2} \sigma^2 \le \frac{2\sqrt{n}+1}{n^2} \sigma^2 \le \frac{3\sigma^2}{n^{3/2}} \]
(since $n < (m(n)+1)^2$ and $m(n) \le \sqrt{n}$)
\[ \implies \sum_{n=1}^{\infty} E\Big[\Big(Y_n - \frac{(m(n))^2}{n} Y_{(m(n))^2}\Big)^2\Big] \le \sum_{n=1}^{\infty} \frac{3\sigma^2}{n^{3/2}} < +\infty. \]
Thus, again by Lemma 1.14 ii), we conclude that
\[ \lim_{n \to \infty} \Big(Y_n - \frac{(m(n))^2}{n} Y_{(m(n))^2}\Big) = 0 \quad \text{with probability 1.} \tag{1.3} \]
Observe that $\lim_{n \to \infty} \frac{(m(n))^2}{n} = 1$. Thus, in view of (1.2) and (1.3), we conclude that
\[ \lim_{n \to \infty} Y_n = 0 \quad \text{with probability 1.} \]
This completes the proof. □

Example 1.12. 1) Let $\{X_n\}$ be a sequence of i.i.d. random variables that are bounded, i.e., there exists $C < \infty$ such that $P(|X_1| \le C) = 1$. Then $\frac{S_n}{n} \xrightarrow{a.s.} E(X_1)$.
2) Let $\{X_n\}$ be a sequence of i.i.d. Bernoulli($p$) random variables. Then
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i = p \quad \text{with probability 1.} \]
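Part 2) is the classical "relative frequency converges to $p$" statement, visible along a single simulated sample path. A minimal sketch (the value of $p$ and the checkpoints are illustrative choices):

```python
# SLLN for Bernoulli(p): along one sample path, the running average
# S_n / n settles down to p (here p = 0.3).
import random

random.seed(5)
p, s = 0.3, 0
for n in range(1, 100_001):
    s += random.random() < p
    if n in (100, 1000, 10_000, 100_000):
        print(f"n = {n:6d}: S_n/n = {s / n:.4f}")
```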
Theorem 1.15 (Kolmogorov's strong law of large numbers). Let $\{X_n\}$ be a sequence of i.i.d. random variables and $\mu \in \mathbb{R}$. Then $\lim_{n \to \infty} \frac{S_n}{n} = \mu$ a.s. if and only if $E[X_n] = \mu$. In this case, the convergence also holds in $L^1$.

Example 1.13 (Monte Carlo approximation). Let $f$ be a measurable function on $[0,1]$ such that $\int_0^1 |f(x)|\, dx < \infty$. Let $\alpha = \int_0^1 f(x)\, dx$. In general we cannot obtain a closed-form expression for $\alpha$ and need to estimate it. Let $\{U_j\}$ be a sequence of independent random variables, uniformly distributed on $[0,1]$. Then, by Theorem 1.15,
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} f(U_j) = E[f(U_j)] = \int_0^1 f(x)\, dx \]
a.s. and in $L^1$. Thus, to get an approximation of $\int_0^1 f(x)\, dx$, we need only simulate the uniform random variables $U_j$ (by using a random number generator).
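A minimal sketch of this estimator (the integrand $f(x) = \sqrt{1 - x^2}$, whose integral is $\pi/4$, is an illustrative choice, not from the notes):

```python
# Monte Carlo estimate of alpha = integral of f over [0,1] via the SLLN:
# (1/n) * sum f(U_j) -> alpha almost surely. Here f(x) = sqrt(1 - x^2),
# so alpha = pi/4 and the estimate should approach 0.7853...
import math
import random

random.seed(6)

def f(x):
    return math.sqrt(1.0 - x * x)

for n in (100, 10_000, 1_000_000):
    est = sum(f(random.random()) for _ in range(n)) / n
    print(f"n = {n:8d}: estimate = {est:.5f}  (pi/4 = {math.pi / 4:.5f})")
```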
Theorem 1.16 (Central limit theorem). Let $\{X_n\}$ be a sequence of i.i.d. random variables with finite mean $\mu$ and variance $\sigma^2$, with $0 < \sigma^2 < +\infty$. Let $Y_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}$. Then $Y_n$ converges in distribution to $Y$, where $\mathcal{L}(Y) = N(0,1)$.

Proof. Without loss of generality, we assume that $\mu = 0$. Let $\Phi$ be the characteristic function of the $X_j$. Since the $\{X_j\}$ are i.i.d., we have
\[ \Phi_{Y_n}(u) = E[e^{iuY_n}] = E\big[e^{\frac{iu S_n}{\sigma\sqrt{n}}}\big] = E\Big[\prod_{i=1}^{n} e^{\frac{iu X_i}{\sigma\sqrt{n}}}\Big] = \prod_{i=1}^{n} E\big[e^{\frac{iu X_i}{\sigma\sqrt{n}}}\big] = \Big(\Phi\Big(\frac{u}{\sigma\sqrt{n}}\Big)\Big)^n. \]
Since $E[|X_j|^2] < +\infty$, the function $\Phi$ has two continuous derivatives. In particular,
\[ \Phi'(u) = i\,E[X_j e^{iuX_j}], \qquad \Phi''(u) = -E[X_j^2 e^{iuX_j}] \implies \Phi'(0) = 0, \quad \Phi''(0) = -\sigma^2. \]
Expanding $\Phi$ in a Taylor expansion about $u = 0$, we have
\[ \Phi(u) = 1 - \frac{\sigma^2 u^2}{2} + h(u)\, u^2, \qquad \text{where } h(u) \to 0 \text{ as } u \to 0. \]
Thus, we get
\[ \Phi_{Y_n}(u) = e^{n \log \Phi\left(\frac{u}{\sigma\sqrt{n}}\right)} = e^{n \log\left(1 - \frac{u^2}{2n} + \frac{u^2}{n\sigma^2}\, h\left(\frac{u}{\sigma\sqrt{n}}\right)\right)} \implies \lim_{n \to \infty} \Phi_{Y_n}(u) = e^{-\frac{u^2}{2}} = \Phi_Y(u). \]
Hence, by Lévy's continuity theorem, we conclude that $Y_n$ converges in distribution to $Y$ with $\mathcal{L}(Y) = N(0,1)$. □
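A minimal simulation of Theorem 1.16 (the uniform summands, sample sizes, and test point are illustrative choices): the normalized sums' empirical CDF approaches the standard normal CDF.

```python
# CLT: for X_i uniform on [0,1] (mu = 1/2, sigma^2 = 1/12), the normalized
# sum Y_n = (S_n - n*mu) / (sigma * sqrt(n)) is approximately N(0,1).
# Compare the empirical P(Y_n <= 1) against Phi(1) ~ 0.8413.
import math
import random

random.seed(7)
mu, sigma = 0.5, math.sqrt(1 / 12)
trials = 20_000

for n in (2, 10, 50):
    count = 0
    for _ in range(trials):
        s = sum(random.random() for _ in range(n))
        count += (s - n * mu) / (sigma * math.sqrt(n)) <= 1.0
    phi1 = 0.5 * (1 + math.erf(1 / math.sqrt(2)))   # standard normal CDF at 1
    print(f"n = {n:3d}: P(Y_n <= 1) ~ {count / trials:.4f}  (Phi(1) = {phi1:.4f})")
```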
Remark 1.3. If $\sigma^2 = 0$, then $X_j = \mu$ a.s. for all $j$, and hence $\frac{S_n}{n} = \mu$ a.s.
One can weaken slightly the hypotheses of Theorem 1.16. Indeed, we have the following central limit theorem.

Theorem 1.17. Let $\{X_n\}$ be independent but not necessarily identically distributed. Let $E[X_n] = 0$ for all $n$, and let $\sigma_n^2 = \mathrm{Var}(X_n)$. Assume that
\[ \sup_n E[|X_n|^{2+\varepsilon}] < +\infty \quad \text{for some } \varepsilon > 0, \]