
EEEEN40130: Advanced Signal Processing

Lecture 8: Linear Adaptive Filters

1 Least-Mean-Square Algorithm (LMS)

Recall that the optimal Wiener filter is found as (see Lecture 5)

    R h_o = g  ⇔  h_o = R^{−1} g

where R is the autocorrelation matrix of the input and g is the cross-correlation vector between the input and the reference (or target) signal. In practice R is difficult to estimate and even more difficult to invert. However, we can determine h iteratively by using a gradient (steepest descent) algorithm.

Let us briefly review the steepest descent algorithm for minimising a function of one variable, say f(x). At a current point, say x_old, f(x) decreases fastest if we move from x_old in the direction of the negative derivative of f(x) at x_old, i.e. −f′(x_old). That is, if

    x_new = x_old − α f′(x_old)                                                  (1)

for small enough α > 0, then f(x_old) ≥ f(x_new). This is illustrated in Figure 1.

If the cost function to be minimised depends on multiple variables, the same idea still applies, but the first derivative is replaced by the gradient. The gradient is the vector of all partial derivatives with respect to each variable. For example, if f(x_1, x_2) = 2x_1² + x_2² − 3x_1x_2, then the gradient of f(x_1, x_2) is defined as

    ∇f(x_1, x_2) = [ ∂f/∂x_1 , ∂f/∂x_2 ]^T = [ 4x_1 − 3x_2 , 2x_2 − 3x_1 ]^T     (2)

We now apply the gradient method to find the optimal Wiener filter. The cost function in our context is the mean square error E{e²[n]}. The derivative of E{e²[n]} with respect to h[j] is given by

    ∂E{e²[n]}/∂h[j] = 2 E{e[n] x[n−j]}
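To see the one-step decrease property in (1) on the example above, here is a minimal Python sketch (not part of the original notes); the starting point and the step size α = 0.05 are arbitrary illustrative choices.

```python
import numpy as np

def f(x):
    # Example cost from the notes: f(x1, x2) = 2*x1^2 + x2^2 - 3*x1*x2
    x1, x2 = x
    return 2 * x1**2 + x2**2 - 3 * x1 * x2

def grad_f(x):
    # Gradient from equation (2): [4*x1 - 3*x2, 2*x2 - 3*x1]^T
    x1, x2 = x
    return np.array([4 * x1 - 3 * x2, 2 * x2 - 3 * x1])

x_old = np.array([1.0, 2.0])             # arbitrary starting point (assumed)
alpha = 0.05                             # small positive step size (assumed)
x_new = x_old - alpha * grad_f(x_old)    # one steepest-descent step, as in (1)

print(f(x_old), f(x_new))                # f(x_old) >= f(x_new)
```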


Figure 1: Illustration of the steepest descent method.

Applying the gradient method, we update the Wiener filter coefficients as

    h_new[j] = h_old[j] − α E{e[n] x[n−j]},     j = 1, . . . , N

E{e[n]x[n−j]} can be estimated by time averaging. In practice we do not bother to average at all; we simply use

    h_new[j] = h_old[j] − α e[n] x[n−j],        j = 1, . . . , N

The above equation describes the stochastic gradient algorithm, also known as the LMS (least mean squares) algorithm.
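To make the update concrete, here is a minimal NumPy sketch of the LMS loop (not from the original notes). The filter length N, step size α, and the system-identification scenario are illustrative assumptions; the error is formed as filter output minus reference, which is the sign convention that makes the update above a descent step.

```python
import numpy as np

def lms_filter(x, z, N=8, alpha=0.01):
    """Adapt an N-tap filter h so that its output tracks the reference z.

    Uses the stochastic-gradient (LMS) update h <- h - alpha * e[n] * x_vec,
    with e[n] = (filter output - reference), matching the update above.
    N and alpha are illustrative choices.
    """
    h = np.zeros(N)
    e = np.zeros(len(x))
    for n in range(N - 1, len(x)):
        x_vec = x[n - N + 1:n + 1][::-1]    # [x[n], x[n-1], ..., x[n-N+1]]
        y = h @ x_vec                       # current filter output
        e[n] = y - z[n]                     # error signal
        h = h - alpha * e[n] * x_vec        # LMS coefficient update
    return h, e

# Example use: identify an unknown 8-tap FIR system from its input and output.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
h_true = rng.standard_normal(8)
z = np.convolve(x, h_true)[:len(x)]         # reference = unknown system output
h_est, e = lms_filter(x, z, N=8, alpha=0.01)
print(np.round(h_est - h_true, 3))          # small residual after adaptation
```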

2 Convergence Analysis of the LMS Algorithm

LMS update:

    h_{n+1}[j] = h_n[j] − α e[n] x[n−j],        j = 1, 2, . . . , N

In vector form this is

    h[n+1] = h[n] − α e[n] x[n]                                                  (3)

where

    h[n] = [ h_n[1], h_n[2], . . . , h_n[N] ]^T

and

    x[n] = [ x[n], x[n−1], . . . , x[n−N+1] ]^T

contains the N most recent input samples, from x[n] back to x[n−N+1].

The convergence of h is a statistical process, so let us look at how it converges in the mean. Taking the expected value of both sides of equation (3):

    E{h[n+1]} = E{h[n]} − α E{e[n] x[n]}                                         (4)

From Wiener filter theory we can see that

    E{e[n] x[n]} = R h[n] − g

where R is the autocorrelation matrix of {x_n} and g is the cross-correlation vector between {z_n} and {x_n}. Using the standard assumption that h[n] is independent of the current input x[n], equation (4) becomes

    E{h[n+1]} = E{h[n]} − α (R E{h[n]} − g)

i.e.

    E{h[n+1]} = (I − αR) E{h[n]} + αg                                            (5)

where I is the N × N identity matrix. Subtracting h_o from both sides of (5), and using g = R h_o, we obtain

    E{h[n+1] − h_o} = (I − αR) E{h[n]} + αg − h_o
                    = (I − αR) E{h[n]} + αR h_o − h_o
                    = (I − αR) E{h[n] − h_o}                                     (6)

Let us denote v[n] = h[n] − h_o, so that v[n+1] = h[n+1] − h_o. Then the above equation implies

    E{v[n+1]} = (I − αR) E{v[n]}                                                 (7)
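As a quick numerical check of (5) (not in the original notes), the sketch below iterates the deterministic mean recursion for an assumed 2 × 2 R and g and shows E{h[n]} approaching the Wiener solution h_o = R^{−1} g; the particular R, g and α are illustrative.

```python
import numpy as np

# Illustrative (assumed) autocorrelation matrix and cross-correlation vector.
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])
g = np.array([1.0, 0.3])
h_opt = np.linalg.solve(R, g)        # Wiener solution h_o = R^{-1} g

alpha = 0.1                          # step size within the stable range
h_mean = np.zeros(2)                 # E{h[0]}, starting from zero coefficients
for _ in range(300):
    h_mean = (np.eye(2) - alpha * R) @ h_mean + alpha * g   # recursion (5)

print(np.allclose(h_mean, h_opt, atol=1e-8))   # True: E{h[n]} -> h_o
```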

We are about to show that E{v[n]} → 0 as n → ∞. To study the convergence of a recursive equation such as (7), we note the following facts.

• So far we have considered real-valued input, and thus the autocorrelation matrix R is symmetric. If the input is complex-valued, then R is Hermitian.

• A Hermitian matrix is a complex square matrix that is equal to its own conjugate transpose. If we denote the conjugate transpose by (·)^H, then a Hermitian matrix R satisfies

    R^H = R                                                                      (8)


• For a Hermitian matrix R, we can write

    R = U Λ U^H                                                                  (9)

where Λ is a diagonal matrix whose diagonal elements λ_k, k = 1, 2, . . . , N, are the eigenvalues of R, and also

    U U^H = U^H U = I                                                            (10)

U is called a unitary matrix, and (9) is the eigenvalue (spectral) decomposition of R; since the autocorrelation matrix R is positive semidefinite, this coincides with its singular value decomposition.
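A small NumPy check of (9) and (10) (not part of the notes): np.linalg.eigh is used because it targets Hermitian/symmetric matrices, and the sample autocorrelation estimate here is just an illustrative way to obtain such an R.

```python
import numpy as np

# Form a symmetric (Hermitian for real data) autocorrelation-like matrix R.
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 1000))       # 3-dimensional input snapshots
R = X @ X.T / X.shape[1]                 # sample autocorrelation estimate

lam, U = np.linalg.eigh(R)               # eigenvalues lambda_k and unitary U
Lambda = np.diag(lam)

print(np.allclose(R, U @ Lambda @ U.T))          # R = U Lambda U^H   (9)
print(np.allclose(U.T @ U, np.eye(3)))           # U^H U = I          (10)
```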

Putting (9) into (7) gives

    E{v[n+1]} = (I − α U Λ U^H) E{v[n]}

Multiplying both sides by U^H gives

    E{U^H v[n+1]} = (U^H − α U^H U Λ U^H) E{v[n]} = (I − αΛ) U^H E{v[n]}

which gives

    E{ṽ[n+1]} = (I − αΛ) E{ṽ[n]}                                                 (11)

where ṽ[n] = U^H v[n]. Looking at the jth coefficient in equation (11), we have

    E{ṽ_j[n+1]} = (1 − αλ_j) E{ṽ_j[n]} = (1 − αλ_j)² E{ṽ_j[n−1]} = · · · = (1 − αλ_j)^{n+1} E{ṽ_j[0]}     (12)

Thus the above sequence converges to zero if and only if |1 − αλ_j| < 1, which is equivalent to 0 < α < 2/λ_j, j = 1, 2, . . . , N. Thus the range of values of α that ensures stability is

    0 < α < 2/λ_max

where λ_max is the largest eigenvalue of R.
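As a numerical illustration of this bound (not part of the notes), the sketch below iterates the mean recursion (7) with a step size just inside and just outside 2/λ_max, reusing the illustrative R from the earlier sketch; the factors 0.9 and 1.1 are arbitrary.

```python
import numpy as np

# The same illustrative (assumed) autocorrelation matrix as before.
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])
lam_max = np.max(np.linalg.eigvalsh(R))          # largest eigenvalue of R

def final_mean_error(alpha, steps=500):
    """Iterate E{v[n+1]} = (I - alpha R) E{v[n]} and return the final norm."""
    v = np.array([1.0, 1.0])                     # arbitrary initial mean error
    for _ in range(steps):
        v = (np.eye(2) - alpha * R) @ v
    return np.linalg.norm(v)

print(final_mean_error(0.9 * 2 / lam_max))       # inside the bound: ~0
print(final_mean_error(1.1 * 2 / lam_max))       # outside the bound: diverges
```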

