The Matrix Cookbook
[ http://matrixcookbook.com ]

Kaare Brandt Petersen
Michael Syskind Pedersen

Version: November 15, 2012


Introduction

What is this? These pages are a collection of facts (identities, approximations, inequalities, relations, ...) about matrices and matters relating to them. It is collected in this form for the convenience of anyone who wants a quick desktop reference.

Disclaimer: The identities, approximations and relations presented here were obviously not invented but collected, borrowed and copied from a large number of sources. These sources include similar but shorter notes found on the internet and appendices in books; see the references for a full list.

Errors: Very likely there are errors, typos, and mistakes, for which we apologize. We would be grateful to receive corrections at [email protected].

It's ongoing: The project of keeping a large repository of relations involving matrices is naturally ongoing, and the version will be apparent from the date in the header.

Suggestions: Your suggestions for additional content or elaboration of some topics are most welcome at [email protected].

Keywords: Matrix algebra, matrix relations, matrix identities, derivative of determinant, derivative of inverse matrix, differentiate a matrix.

Acknowledgements: We would like to thank the following for contributions and suggestions: Bill Baxter, Brian Templeton, Christian Rishøj, Christian Schröppel, Dan Boley, Douglas L. Theobald, Esben Hoegh-Rasmussen, Evripidis Karseras, Georg Martius, Glynne Casteel, Jan Larsen, Jun Bin Gao, Jürgen Struckmeier, Kamil Dedecius, Karim T. Abou-Moustafa, Korbinian Strimmer, Lars Christiansen, Lars Kai Hansen, Leland Wilkinson, Liguo He, Loic Thibaut, Markus Froeb, Michael Hubatka, Miguel Barão, Ole Winther, Pavel Sakov, Stephan Hattinger, Troels Pedersen, Vasile Sima, Vincent Rabaud, Zhaoshui He. We would also like to thank The Oticon Foundation for funding our PhD studies.

Petersen & Pedersen, The Matrix Cookbook, Version: November 15, 2012, Page 2

Contents

1 Basics
1.1 Trace
1.2 Determinant
1.3 The Special Case 2x2

2 Derivatives
2.1 Derivatives of a Determinant
2.2 Derivatives of an Inverse
2.3 Derivatives of Eigenvalues
2.4 Derivatives of Matrices, Vectors and Scalar Forms
2.5 Derivatives of Traces
2.6 Derivatives of vector norms
2.7 Derivatives of matrix norms
2.8 Derivatives of Structured Matrices

3 Inverses
3.1 Basic
3.2 Exact Relations
3.3 Implication on Inverses
3.4 Approximations
3.5 Generalized Inverse
3.6 Pseudo Inverse

4 Complex Matrices
4.1 Complex Derivatives
4.2 Higher order and non-linear derivatives
4.3 Inverse of complex sum

5 Solutions and Decompositions
5.1 Solutions to linear equations
5.2 Eigenvalues and Eigenvectors
5.3 Singular Value Decomposition
5.4 Triangular Decomposition
5.5 LU decomposition
5.6 LDM decomposition
5.7 LDL decompositions

6 Statistics and Probability
6.1 Definition of Moments
6.2 Expectation of Linear Combinations
6.3 Weighted Scalar Variable

7 Multivariate Distributions
7.1 Cauchy
7.2 Dirichlet
7.3 Normal
7.4 Normal-Inverse Gamma
7.5 Gaussian
7.6 Multinomial
7.7 Student's t
7.8 Wishart
7.9 Wishart, Inverse

8 Gaussians
8.1 Basics
8.2 Moments
8.3 Miscellaneous
8.4 Mixture of Gaussians

9 Special Matrices
9.1 Block matrices
9.2 Discrete Fourier Transform Matrix, The
9.3 Hermitian Matrices and skew-Hermitian
9.4 Idempotent Matrices
9.5 Orthogonal matrices
9.6 Positive Definite and Semi-definite Matrices
9.7 Singleentry Matrix, The
9.8 Symmetric, Skew-symmetric/Antisymmetric
9.9 Toeplitz Matrices
9.10 Transition matrices
9.11 Units, Permutation and Shift
9.12 Vandermonde Matrices

10 Functions and Operators
10.1 Functions and Series
10.2 Kronecker and Vec Operator
10.3 Vector Norms
10.4 Matrix Norms
10.5 Rank
10.6 Integral Involving Dirac Delta Functions
10.7 Miscellaneous

A One-dimensional Results
A.1 Gaussian
A.2 One Dimensional Mixture of Gaussians

B Proofs and Details
B.1 Misc Proofs


Notation and Nomenclature

$\mathbf{A}$	Matrix
$\mathbf{A}_{ij}$	Matrix indexed for some purpose
$\mathbf{A}_i$	Matrix indexed for some purpose
$\mathbf{A}^{ij}$	Matrix indexed for some purpose
$\mathbf{A}^n$	Matrix indexed for some purpose or the n.th power of a square matrix
$\mathbf{A}^{-1}$	The inverse matrix of the matrix $\mathbf{A}$
$\mathbf{A}^+$	The pseudo inverse matrix of the matrix $\mathbf{A}$ (see Sec. 3.6)
$\mathbf{A}^{1/2}$	The square root of a matrix (if unique), not elementwise
$(\mathbf{A})_{ij}$	The (i, j).th entry of the matrix $\mathbf{A}$
$A_{ij}$	The (i, j).th entry of the matrix $\mathbf{A}$
$[\mathbf{A}]_{ij}$	The ij-submatrix, i.e. $\mathbf{A}$ with i.th row and j.th column deleted
$\mathbf{a}$	Vector (column-vector)
$\mathbf{a}_i$	Vector indexed for some purpose
$a_i$	The i.th element of the vector $\mathbf{a}$
$a$	Scalar
$\Re z$	Real part of a scalar
$\Re \mathbf{z}$	Real part of a vector
$\Re \mathbf{Z}$	Real part of a matrix
$\Im z$	Imaginary part of a scalar
$\Im \mathbf{z}$	Imaginary part of a vector
$\Im \mathbf{Z}$	Imaginary part of a matrix
$\det(\mathbf{A})$	Determinant of $\mathbf{A}$
$\mathrm{Tr}(\mathbf{A})$	Trace of the matrix $\mathbf{A}$
$\mathrm{diag}(\mathbf{A})$	Diagonal matrix of the matrix $\mathbf{A}$, i.e. $(\mathrm{diag}(\mathbf{A}))_{ij} = \delta_{ij} A_{ij}$
$\mathrm{eig}(\mathbf{A})$	Eigenvalues of the matrix $\mathbf{A}$
$\mathrm{vec}(\mathbf{A})$	The vector-version of the matrix $\mathbf{A}$ (see Sec. 10.2.2)
$\sup$	Supremum of a set
$\|\mathbf{A}\|$	Matrix norm (subscript if any denotes what norm)
$\mathbf{A}^T$	Transposed matrix
$\mathbf{A}^{-T}$	The inverse of the transposed and vice versa, $\mathbf{A}^{-T} = (\mathbf{A}^{-1})^T = (\mathbf{A}^T)^{-1}$
$\mathbf{A}^*$	Complex conjugated matrix
$\mathbf{A}^H$	Transposed and complex conjugated matrix (Hermitian)
$\mathbf{A} \circ \mathbf{B}$	Hadamard (elementwise) product
$\mathbf{A} \otimes \mathbf{B}$	Kronecker product
$\mathbf{0}$	The null matrix. Zero in all entries.
$\mathbf{I}$	The identity matrix
$\mathbf{J}^{ij}$	The single-entry matrix, 1 at (i, j) and zero elsewhere
$\mathbf{\Sigma}$	A positive definite matrix
$\mathbf{\Lambda}$	A diagonal matrix

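1 Basics

$$ (\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1}\mathbf{A}^{-1} \tag{1} $$
$$ (\mathbf{A}\mathbf{B}\mathbf{C}\ldots)^{-1} = \ldots\mathbf{C}^{-1}\mathbf{B}^{-1}\mathbf{A}^{-1} \tag{2} $$
$$ (\mathbf{A}^T)^{-1} = (\mathbf{A}^{-1})^T \tag{3} $$
$$ (\mathbf{A} + \mathbf{B})^T = \mathbf{A}^T + \mathbf{B}^T \tag{4} $$
$$ (\mathbf{A}\mathbf{B})^T = \mathbf{B}^T\mathbf{A}^T \tag{5} $$
$$ (\mathbf{A}\mathbf{B}\mathbf{C}\ldots)^T = \ldots\mathbf{C}^T\mathbf{B}^T\mathbf{A}^T \tag{6} $$
$$ (\mathbf{A}^H)^{-1} = (\mathbf{A}^{-1})^H \tag{7} $$
$$ (\mathbf{A} + \mathbf{B})^H = \mathbf{A}^H + \mathbf{B}^H \tag{8} $$
$$ (\mathbf{A}\mathbf{B})^H = \mathbf{B}^H\mathbf{A}^H \tag{9} $$
$$ (\mathbf{A}\mathbf{B}\mathbf{C}\ldots)^H = \ldots\mathbf{C}^H\mathbf{B}^H\mathbf{A}^H \tag{10} $$

1.1 Trace

$$ \mathrm{Tr}(\mathbf{A}) = \sum_i A_{ii} \tag{11} $$
$$ \mathrm{Tr}(\mathbf{A}) = \sum_i \lambda_i, \qquad \lambda_i = \mathrm{eig}(\mathbf{A}) \tag{12} $$
$$ \mathrm{Tr}(\mathbf{A}) = \mathrm{Tr}(\mathbf{A}^T) \tag{13} $$
$$ \mathrm{Tr}(\mathbf{A}\mathbf{B}) = \mathrm{Tr}(\mathbf{B}\mathbf{A}) \tag{14} $$
$$ \mathrm{Tr}(\mathbf{A} + \mathbf{B}) = \mathrm{Tr}(\mathbf{A}) + \mathrm{Tr}(\mathbf{B}) \tag{15} $$
$$ \mathrm{Tr}(\mathbf{A}\mathbf{B}\mathbf{C}) = \mathrm{Tr}(\mathbf{B}\mathbf{C}\mathbf{A}) = \mathrm{Tr}(\mathbf{C}\mathbf{A}\mathbf{B}) \tag{16} $$
$$ \mathbf{a}^T\mathbf{a} = \mathrm{Tr}(\mathbf{a}\mathbf{a}^T) \tag{17} $$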
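The trace identities (14), (16) and (17) are easy to confirm on small integer matrices. The following is a minimal sketch in plain Python; the helper names `matmul` and `trace` and the example matrices are illustrative assumptions, not part of the original text.

```python
# Check of trace identities (14), (16) and (17) on small integer matrices.
# The helpers and the example values are illustrative assumptions.

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    """Sum of the diagonal entries, as in identity (11)."""
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, 0], [1, 3]]

# (14): Tr(AB) = Tr(BA), even though AB and BA differ in general.
tr_ab = trace(matmul(A, B))
tr_ba = trace(matmul(B, A))

# (16): the trace is invariant under cyclic permutations of a product.
tr_abc = trace(matmul(matmul(A, B), C))
tr_bca = trace(matmul(matmul(B, C), A))
tr_cab = trace(matmul(matmul(C, A), B))

# (17): a^T a = Tr(a a^T) for a column vector a.
a = [3, -1, 2]
inner = sum(x * x for x in a)
outer_trace = trace([[x * y for y in a] for x in a])

print(tr_ab, tr_ba, tr_abc, tr_bca, tr_cab, inner, outer_trace)
```

Integer entries keep the comparison exact, so no floating-point tolerance is needed here.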

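1.2 Determinant

Let $\mathbf{A}$ be an $n \times n$ matrix.

$$ \det(\mathbf{A}) = \prod_i \lambda_i, \qquad \lambda_i = \mathrm{eig}(\mathbf{A}) \tag{18} $$
$$ \det(c\mathbf{A}) = c^n \det(\mathbf{A}), \quad \text{if } \mathbf{A} \in \mathbb{R}^{n \times n} \tag{19} $$
$$ \det(\mathbf{A}^T) = \det(\mathbf{A}) \tag{20} $$
$$ \det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A})\det(\mathbf{B}) \tag{21} $$
$$ \det(\mathbf{A}^{-1}) = 1/\det(\mathbf{A}) \tag{22} $$
$$ \det(\mathbf{A}^n) = \det(\mathbf{A})^n \tag{23} $$
$$ \det(\mathbf{I} + \mathbf{u}\mathbf{v}^T) = 1 + \mathbf{u}^T\mathbf{v} \tag{24} $$

For $n = 2$:

$$ \det(\mathbf{I} + \mathbf{A}) = 1 + \det(\mathbf{A}) + \mathrm{Tr}(\mathbf{A}) \tag{25} $$

For $n = 3$:

$$ \det(\mathbf{I} + \mathbf{A}) = 1 + \det(\mathbf{A}) + \mathrm{Tr}(\mathbf{A}) + \frac{1}{2}\mathrm{Tr}(\mathbf{A})^2 - \frac{1}{2}\mathrm{Tr}(\mathbf{A}^2) \tag{26} $$

For $n = 4$:

$$ \det(\mathbf{I} + \mathbf{A}) = 1 + \det(\mathbf{A}) + \mathrm{Tr}(\mathbf{A}) + \frac{1}{2}\mathrm{Tr}(\mathbf{A})^2 - \frac{1}{2}\mathrm{Tr}(\mathbf{A}^2) + \frac{1}{6}\mathrm{Tr}(\mathbf{A})^3 - \frac{1}{2}\mathrm{Tr}(\mathbf{A})\mathrm{Tr}(\mathbf{A}^2) + \frac{1}{3}\mathrm{Tr}(\mathbf{A}^3) \tag{27} $$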

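For small $\varepsilon$, the following approximation holds to second order in $\varepsilon$ (the $\det(\mathbf{A})$ term of the exact expansions above enters only at order $\varepsilon^n$ and is therefore omitted):

$$ \det(\mathbf{I} + \varepsilon\mathbf{A}) \cong 1 + \varepsilon\,\mathrm{Tr}(\mathbf{A}) + \frac{1}{2}\varepsilon^2\,\mathrm{Tr}(\mathbf{A})^2 - \frac{1}{2}\varepsilon^2\,\mathrm{Tr}(\mathbf{A}^2) \tag{28} $$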
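The exact determinant identities (24), (25) and (26) can be checked directly on small matrices. The sketch below uses plain Python with a Laplace-expansion determinant; all helper names and example values are assumptions made for illustration.

```python
# Check of determinant identities (24), (25) and (26).
# Helper names and example matrices are illustrative assumptions.

def det(M):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# (24): det(I + u v^T) = 1 + u^T v
u = [1, 2, 3]
v = [4, -1, 2]
outer = [[ui * vj for vj in v] for ui in u]
lhs_24 = det(add(identity(3), outer))
rhs_24 = 1 + sum(ui * vi for ui, vi in zip(u, v))

# (25): n = 2
A2 = [[1, 2], [3, 4]]
lhs_25 = det(add(identity(2), A2))
rhs_25 = 1 + det(A2) + trace(A2)

# (26): n = 3
A3 = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]
t = trace(A3)
t2 = trace(matmul(A3, A3))
lhs_26 = det(add(identity(3), A3))
rhs_26 = 1 + det(A3) + t + (t * t - t2) / 2

print(lhs_24, rhs_24, lhs_25, rhs_25, lhs_26, rhs_26)
```

The recursive `det` is exponential in the matrix size and is only meant for these 2x2 and 3x3 checks.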

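1.3 The Special Case 2x2

Consider the matrix $\mathbf{A}$

$$ \mathbf{A} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} $$

Determinant and trace

$$ \det(\mathbf{A}) = A_{11}A_{22} - A_{12}A_{21} \tag{29} $$
$$ \mathrm{Tr}(\mathbf{A}) = A_{11} + A_{22} \tag{30} $$

Eigenvalues

$$ \lambda^2 - \lambda \cdot \mathrm{Tr}(\mathbf{A}) + \det(\mathbf{A}) = 0 $$
$$ \lambda_1 = \frac{\mathrm{Tr}(\mathbf{A}) + \sqrt{\mathrm{Tr}(\mathbf{A})^2 - 4\det(\mathbf{A})}}{2} \qquad \lambda_2 = \frac{\mathrm{Tr}(\mathbf{A}) - \sqrt{\mathrm{Tr}(\mathbf{A})^2 - 4\det(\mathbf{A})}}{2} $$
$$ \lambda_1 + \lambda_2 = \mathrm{Tr}(\mathbf{A}) \qquad \lambda_1\lambda_2 = \det(\mathbf{A}) $$

Eigenvectors

$$ \mathbf{v}_1 \propto \begin{bmatrix} A_{12} \\ \lambda_1 - A_{11} \end{bmatrix} \qquad \mathbf{v}_2 \propto \begin{bmatrix} A_{12} \\ \lambda_2 - A_{11} \end{bmatrix} $$

Inverse

$$ \mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix} \tag{31} $$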
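As a worked instance of the 2x2 formulas, the sketch below computes eigenvalues, eigenvectors and the inverse from (29), (30) and (31). The example matrix is an assumption chosen so that the eigenvalues are real and $A_{12} \neq 0$; the eigenvector formula degenerates when $A_{12} = 0$.

```python
import math

# Worked 2x2 example; the matrix is an illustrative assumption with
# real eigenvalues and a nonzero off-diagonal entry A12.
A = [[4.0, 1.0], [2.0, 3.0]]
(a11, a12), (a21, a22) = A

tr = a11 + a22                      # (30)
det = a11 * a22 - a12 * a21         # (29)

# Roots of lambda^2 - lambda * Tr(A) + det(A) = 0.
disc = math.sqrt(tr * tr - 4.0 * det)
lam1 = (tr + disc) / 2.0
lam2 = (tr - disc) / 2.0

# Eigenvectors up to scale: v_k is proportional to (A12, lambda_k - A11).
v1 = (a12, lam1 - a11)
v2 = (a12, lam2 - a11)

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix and a 2-vector."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

# Inverse from (31).
A_inv = [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]

print(lam1, lam2, v1, v2, A_inv)
```

For this matrix the trace is 7 and the determinant is 10, so the eigenvalues are 5 and 2, which sum to the trace and multiply to the determinant as stated above.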

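2 Derivatives

This section covers differentiation of a number of expressions with respect to a matrix $\mathbf{X}$. Note that it is always assumed that $\mathbf{X}$ has no special structure, i.e. that the elements of $\mathbf{X}$ are independent (e.g. not symmetric, Toeplitz, positive definite). See section 2.8 for differentiation of structured matrices. The basic assumptions can be written in a formula as

$$ \frac{\partial X_{kl}}{\partial X_{ij}} = \delta_{ik}\delta_{lj} \tag{32} $$

that is for e.g. vector forms,

$$ \left[\frac{\partial \mathbf{x}}{\partial y}\right]_i = \frac{\partial x_i}{\partial y} \qquad \left[\frac{\partial x}{\partial \mathbf{y}}\right]_i = \frac{\partial x}{\partial y_i} \qquad \left[\frac{\partial \mathbf{x}}{\partial \mathbf{y}}\right]_{ij} = \frac{\partial x_i}{\partial y_j} $$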

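The following rules are general and very useful when deriving the differential of an expression ([19]):

$$ \partial\mathbf{A} = 0 \quad (\mathbf{A} \text{ is a constant}) \tag{33} $$
$$ \partial(\alpha\mathbf{X}) = \alpha\,\partial\mathbf{X} \tag{34} $$
$$ \partial(\mathbf{X} + \mathbf{Y}) = \partial\mathbf{X} + \partial\mathbf{Y} \tag{35} $$
$$ \partial(\mathrm{Tr}(\mathbf{X})) = \mathrm{Tr}(\partial\mathbf{X}) \tag{36} $$
$$ \partial(\mathbf{X}\mathbf{Y}) = (\partial\mathbf{X})\mathbf{Y} + \mathbf{X}(\partial\mathbf{Y}) \tag{37} $$
$$ \partial(\mathbf{X} \circ \mathbf{Y}) = (\partial\mathbf{X}) \circ \mathbf{Y} + \mathbf{X} \circ (\partial\mathbf{Y}) \tag{38} $$
$$ \partial(\mathbf{X} \otimes \mathbf{Y}) = (\partial\mathbf{X}) \otimes \mathbf{Y} + \mathbf{X} \otimes (\partial\mathbf{Y}) \tag{39} $$
$$ \partial(\mathbf{X}^{-1}) = -\mathbf{X}^{-1}(\partial\mathbf{X})\mathbf{X}^{-1} \tag{40} $$
$$ \partial(\det(\mathbf{X})) = \mathrm{Tr}(\mathrm{adj}(\mathbf{X})\,\partial\mathbf{X}) \tag{41} $$
$$ \partial(\det(\mathbf{X})) = \det(\mathbf{X})\,\mathrm{Tr}(\mathbf{X}^{-1}\partial\mathbf{X}) \tag{42} $$
$$ \partial(\ln(\det(\mathbf{X}))) = \mathrm{Tr}(\mathbf{X}^{-1}\partial\mathbf{X}) \tag{43} $$
$$ \partial\mathbf{X}^T = (\partial\mathbf{X})^T \tag{44} $$
$$ \partial\mathbf{X}^H = (\partial\mathbf{X})^H \tag{45} $$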

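2.1 Derivatives of a Determinant

2.1.1 General form

$$ \frac{\partial \det(\mathbf{Y})}{\partial x} = \det(\mathbf{Y})\,\mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right] \tag{46} $$
$$ \sum_k \frac{\partial \det(\mathbf{X})}{\partial X_{ik}} X_{jk} = \delta_{ij}\det(\mathbf{X}) \tag{47} $$
$$ \frac{\partial^2 \det(\mathbf{Y})}{\partial x^2} = \det(\mathbf{Y})\left[\mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial\frac{\partial \mathbf{Y}}{\partial x}}{\partial x}\right] + \mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right]\mathrm{Tr}\left[\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right] - \mathrm{Tr}\left[\left(\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right)\left(\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\right)\right]\right] \tag{48} $$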

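2.1.2 Linear forms

$$ \frac{\partial \det(\mathbf{X})}{\partial \mathbf{X}} = \det(\mathbf{X})(\mathbf{X}^{-1})^T \tag{49} $$
$$ \sum_k \frac{\partial \det(\mathbf{X})}{\partial X_{ik}} X_{jk} = \delta_{ij}\det(\mathbf{X}) \tag{50} $$
$$ \frac{\partial \det(\mathbf{A}\mathbf{X}\mathbf{B})}{\partial \mathbf{X}} = \det(\mathbf{A}\mathbf{X}\mathbf{B})(\mathbf{X}^{-1})^T = \det(\mathbf{A}\mathbf{X}\mathbf{B})(\mathbf{X}^T)^{-1} \tag{51} $$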
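Identity (49) can be sanity-checked against a central finite difference. The sketch below does this for a 2x2 matrix in plain Python; the helper names, the step size `h` and the example matrix are illustrative assumptions.

```python
# Finite-difference check of (49): d det(X) / dX = det(X) (X^{-1})^T.
# Helper names, step size and the example matrix are illustrative assumptions.

def det2(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv2(X):
    d = det2(X)
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def numerical_gradient(f, X, h=1e-6):
    """Central difference of a scalar function f with respect to each entry of X."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            grad[i][j] = (f(Xp) - f(Xm)) / (2.0 * h)
    return grad

X = [[2.0, 1.0], [0.5, 3.0]]
d = det2(X)
X_inv = inv2(X)

# Entry (i, j) of det(X) (X^{-1})^T is det(X) * X_inv[j][i].
analytic = [[d * X_inv[j][i] for j in range(2)] for i in range(2)]
numeric = numerical_gradient(det2, X)

max_err = max(abs(analytic[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
print(analytic, max_err)
```

Because the determinant is linear in each individual entry, the central difference is exact here up to floating-point rounding.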

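2.1.3 Square forms

If $\mathbf{X}$ is square and invertible, then

$$ \frac{\partial \det(\mathbf{X}^T\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = 2\det(\mathbf{X}^T\mathbf{A}\mathbf{X})\,\mathbf{X}^{-T} \tag{52} $$

If $\mathbf{X}$ is not square but $\mathbf{A}$ is symmetric, then

$$ \frac{\partial \det(\mathbf{X}^T\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = 2\det(\mathbf{X}^T\mathbf{A}\mathbf{X})\,\mathbf{A}\mathbf{X}(\mathbf{X}^T\mathbf{A}\mathbf{X})^{-1} \tag{53} $$

If $\mathbf{X}$ is not square and $\mathbf{A}$ is not symmetric, then

$$ \frac{\partial \det(\mathbf{X}^T\mathbf{A}\mathbf{X})}{\partial \mathbf{X}} = \det(\mathbf{X}^T\mathbf{A}\mathbf{X})\left(\mathbf{A}\mathbf{X}(\mathbf{X}^T\mathbf{A}\mathbf{X})^{-1} + \mathbf{A}^T\mathbf{X}(\mathbf{X}^T\mathbf{A}^T\mathbf{X})^{-1}\right) \tag{54} $$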

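2.1.4 Other nonlinear forms

Some special cases are (See [9, 7])

$$ \frac{\partial \ln\det(\mathbf{X}^T\mathbf{X})}{\partial \mathbf{X}} = 2(\mathbf{X}^+)^T \tag{55} $$
$$ \frac{\partial \ln\det(\mathbf{X}^T\mathbf{X})}{\partial \mathbf{X}^+} = -2\mathbf{X}^T \tag{56} $$
$$ \frac{\partial \ln|\det(\mathbf{X})|}{\partial \mathbf{X}} = (\mathbf{X}^{-1})^T = (\mathbf{X}^T)^{-1} \tag{57} $$
$$ \frac{\partial \det(\mathbf{X}^k)}{\partial \mathbf{X}} = k\det(\mathbf{X}^k)\,\mathbf{X}^{-T} \tag{58} $$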

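2.2 Derivatives of an Inverse

From [27] we have the basic identity

$$ \frac{\partial \mathbf{Y}^{-1}}{\partial x} = -\mathbf{Y}^{-1}\frac{\partial \mathbf{Y}}{\partial x}\mathbf{Y}^{-1} \tag{59} $$

from which it follows

$$ \frac{\partial (\mathbf{X}^{-1})_{kl}}{\partial X_{ij}} = -(\mathbf{X}^{-1})_{ki}(\mathbf{X}^{-1})_{jl} \tag{60} $$
$$ \frac{\partial \mathbf{a}^T\mathbf{X}^{-1}\mathbf{b}}{\partial \mathbf{X}} = -\mathbf{X}^{-T}\mathbf{a}\mathbf{b}^T\mathbf{X}^{-T} \tag{61} $$
$$ \frac{\partial \det(\mathbf{X}^{-1})}{\partial \mathbf{X}} = -\det(\mathbf{X}^{-1})(\mathbf{X}^{-1})^T \tag{62} $$
$$ \frac{\partial \mathrm{Tr}(\mathbf{A}\mathbf{X}^{-1}\mathbf{B})}{\partial \mathbf{X}} = -(\mathbf{X}^{-1}\mathbf{B}\mathbf{A}\mathbf{X}^{-1})^T \tag{63} $$
$$ \frac{\partial \mathrm{Tr}((\mathbf{X} + \mathbf{A})^{-1})}{\partial \mathbf{X}} = -((\mathbf{X} + \mathbf{A})^{-1}(\mathbf{X} + \mathbf{A})^{-1})^T \tag{64} $$

From [32] we have the following result: Let $\mathbf{A}$ be an $n \times n$ invertible square matrix, let $\mathbf{W}$ be the inverse of $\mathbf{A}$, and let $J(\mathbf{A})$ be an $n \times n$-variate function that is differentiable with respect to $\mathbf{A}$. Then the partial differentials of $J$ with respect to $\mathbf{A}$ and $\mathbf{W}$ satisfy

$$ \frac{\partial J}{\partial \mathbf{A}} = -\mathbf{A}^{-T}\frac{\partial J}{\partial \mathbf{W}}\mathbf{A}^{-T} $$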
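Identity (61) follows from (59) by writing the differential of $\mathbf{a}^T\mathbf{X}^{-1}\mathbf{b}$ as $-\mathbf{a}^T\mathbf{X}^{-1}(\partial\mathbf{X})\mathbf{X}^{-1}\mathbf{b}$. A minimal numerical check for the 2x2 case is sketched below; the helper names, step size and example values are assumptions for illustration.

```python
# Finite-difference check of (61): d(a^T X^{-1} b)/dX = -X^{-T} a b^T X^{-T}.
# Helper names, step size and example values are illustrative assumptions.

def inv2(X):
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def f(X, a, b):
    """Scalar a^T X^{-1} b."""
    Xi = inv2(X)
    return sum(a[i] * Xi[i][j] * b[j] for i in range(2) for j in range(2))

X = [[2.0, 1.0], [0.5, 3.0]]
a = [1.0, -2.0]
b = [0.5, 4.0]

Xi = inv2(X)
p = [sum(Xi[k][i] * a[k] for k in range(2)) for i in range(2)]  # X^{-T} a
q = [sum(Xi[j][l] * b[l] for l in range(2)) for j in range(2)]  # X^{-1} b
# Entry (i, j) of -X^{-T} a b^T X^{-T} equals -(X^{-T} a)_i (X^{-1} b)_j.
analytic = [[-p[i] * q[j] for j in range(2)] for i in range(2)]

h = 1e-6
numeric = [[0.0, 0.0], [0.0, 0.0]]
for i in range(2):
    for j in range(2):
        Xp = [row[:] for row in X]
        Xm = [row[:] for row in X]
        Xp[i][j] += h
        Xm[i][j] -= h
        numeric[i][j] = (f(Xp, a, b) - f(Xm, a, b)) / (2.0 * h)

max_err = max(abs(analytic[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
print(max_err)
```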

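2.3 Derivatives of Eigenvalues

$$ \frac{\partial}{\partial \mathbf{X}}\sum \mathrm{eig}(\mathbf{X}) = \frac{\partial}{\partial \mathbf{X}}\mathrm{Tr}(\mathbf{X}) = \mathbf{I} \tag{65} $$
$$ \frac{\partial}{\partial \mathbf{X}}\prod \mathrm{eig}(\mathbf{X}) = \frac{\partial}{\partial \mathbf{X}}\det(\mathbf{X}) = \det(\mathbf{X})\,\mathbf{X}^{-T} \tag{66} $$

If $\mathbf{A}$ is real and symmetric, $\lambda_i$ and $\mathbf{v}_i$ are distinct eigenvalues and eigenvectors of $\mathbf{A}$ (see (276)) with $\mathbf{v}_i^T\mathbf{v}_i = 1$, then [33]

$$ \partial\lambda_i = \mathbf{v}_i^T\,\partial(\mathbf{A})\,\mathbf{v}_i \tag{67} $$
$$ \partial\mathbf{v}_i = (\lambda_i\mathbf{I} - \mathbf{A})^+\,\partial(\mathbf{A})\,\mathbf{v}_i \tag{68} $$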
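The first-order eigenvalue perturbation (67) can be illustrated on a symmetric 2x2 matrix, where the eigenvalues are available in closed form from the trace and determinant. The sketch below is an assumption-laden illustration: the matrix, the perturbation direction and the size `eps` are arbitrary choices.

```python
import math

# First-order check of (67) for a symmetric 2x2 matrix.
# The matrix, perturbation and eps are illustrative assumptions.

def largest_eig(A):
    """Largest eigenvalue of a symmetric 2x2 matrix via trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0

A = [[2.0, 1.0], [1.0, 3.0]]
lam = largest_eig(A)

# Unit-norm eigenvector, proportional to (A12, lam - A11).
vx, vy = A[0][1], lam - A[0][0]
norm = math.hypot(vx, vy)
v = [vx / norm, vy / norm]

# Small symmetric perturbation dA.
eps = 1e-6
dA = [[1.0 * eps, 0.5 * eps], [0.5 * eps, 2.0 * eps]]

# Prediction from (67): d(lambda) = v^T dA v.
predicted = sum(v[i] * dA[i][j] * v[j] for i in range(2) for j in range(2))

A_pert = [[A[i][j] + dA[i][j] for j in range(2)] for i in range(2)]
actual = largest_eig(A_pert) - lam

print(predicted, actual)
```

The two numbers agree up to terms of second order in `eps`, which is what a first-order differential promises.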

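2.4 Derivatives of Matrices, Vectors and Scalar Forms

2.4.1 First Order

$$ \frac{\partial \mathbf{x}^T\mathbf{a}}{\partial \mathbf{x}} = \frac{\partial \mathbf{a}^T\mathbf{x}}{\partial \mathbf{x}} = \mathbf{a} \tag{69} $$
$$ \frac{\partial \mathbf{a}^T\mathbf{X}\mathbf{b}}{\partial \mathbf{X}} = \mathbf{a}\mathbf{b}^T \tag{70} $$
$$ \frac{\partial \mathbf{a}^T\mathbf{X}^T\mathbf{b}}{\partial \mathbf{X}} = \mathbf{b}\mathbf{a}^T \tag{71} $$
$$ \frac{\partial \mathbf{a}^T\mathbf{X}\mathbf{a}}{\partial \mathbf{X}} = \frac{\partial \mathbf{a}^T\mathbf{X}^T\mathbf{a}}{\partial \mathbf{X}} = \mathbf{a}\mathbf{a}^T \tag{72} $$
$$ \frac{\partial \mathbf{X}}{\partial X_{ij}} = \mathbf{J}^{ij} \tag{73} $$
$$ \frac{\partial (\mathbf{X}\mathbf{A})_{ij}}{\partial X_{mn}} = \delta_{im}(\mathbf{A})_{nj} = (\mathbf{J}^{mn}\mathbf{A})_{ij} \tag{74} $$
$$ \frac{\partial (\mathbf{X}^T\mathbf{A})_{ij}}{\partial X_{mn}} = \delta_{in}(\mathbf{A})_{mj} = (\mathbf{J}^{nm}\mathbf{A})_{ij} \tag{75} $$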

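2.4.2 Second Order

$$ \frac{\partial}{\partial X_{ij}}\sum_{klmn} X_{kl}X_{mn} = 2\sum_{kl} X_{kl} \tag{76} $$
$$ \frac{\partial \mathbf{b}^T\mathbf{X}^T\mathbf{X}\mathbf{c}}{\partial \mathbf{X}} = \mathbf{X}(\mathbf{b}\mathbf{c}^T + \mathbf{c}\mathbf{b}^T) \tag{77} $$
$$ \frac{\partial (\mathbf{B}\mathbf{x} + \mathbf{b})^T\mathbf{C}(\mathbf{D}\mathbf{x} + \mathbf{d})}{\partial \mathbf{x}} = \mathbf{B}^T\mathbf{C}(\mathbf{D}\mathbf{x} + \mathbf{d}) + \mathbf{D}^T\mathbf{C}^T(\mathbf{B}\mathbf{x} + \mathbf{b}) \tag{78} $$
$$ \frac{\partial (\mathbf{X}^T\mathbf{B}\mathbf{X})_{kl}}{\partial X_{ij}} = \delta_{lj}(\mathbf{X}^T\mathbf{B})_{ki} + \delta_{kj}(\mathbf{B}\mathbf{X})_{il} \tag{79} $$
$$ \frac{\partial (\mathbf{X}^T\mathbf{B}\mathbf{X})}{\partial X_{ij}} = \mathbf{X}^T\mathbf{B}\mathbf{J}^{ij} + \mathbf{J}^{ji}\mathbf{B}\mathbf{X}, \qquad (\mathbf{J}^{ij})_{kl} = \delta_{ik}\delta_{jl} \tag{80} $$

See Sec 9.7 for useful properties of the Single-entry matrix $\mathbf{J}^{ij}$.

$$ \frac{\partial \mathbf{x}^T\mathbf{B}\mathbf{x}}{\partial \mathbf{x}} = (\mathbf{B} + \mathbf{B}^T)\mathbf{x} \tag{81} $$
$$ \frac{\partial \mathbf{b}^T\mathbf{X}^T\mathbf{D}\mathbf{X}\mathbf{c}}{\partial \mathbf{X}} = \mathbf{D}^T\mathbf{X}\mathbf{b}\mathbf{c}^T + \mathbf{D}\mathbf{X}\mathbf{c}\mathbf{b}^T \tag{82} $$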

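$$ \frac{\partial}{\partial \mathbf{X}}(\mathbf{X}\mathbf{b} + \mathbf{c})^T\mathbf{D}(\mathbf{X}\mathbf{b} + \mathbf{c}) = (\mathbf{D} + \mathbf{D}^T)(\mathbf{X}\mathbf{b} + \mathbf{c})\mathbf{b}^T \tag{83} $$

Assume $\mathbf{W}$ is symmetric, then

$$ \frac{\partial}{\partial \mathbf{s}}(\mathbf{x} - \mathbf{A}\mathbf{s})^T\mathbf{W}(\mathbf{x} - \mathbf{A}\mathbf{s}) = -2\mathbf{A}^T\mathbf{W}(\mathbf{x} - \mathbf{A}\mathbf{s}) \tag{84} $$
$$ \frac{\partial}{\partial \mathbf{x}}(\mathbf{x} - \mathbf{s})^T\mathbf{W}(\mathbf{x} - \mathbf{s}) = 2\mathbf{W}(\mathbf{x} - \mathbf{s}) \tag{85} $$
$$ \frac{\partial}{\partial \mathbf{s}}(\mathbf{x} - \mathbf{s})^T\mathbf{W}(\mathbf{x} - \mathbf{s}) = -2\mathbf{W}(\mathbf{x} - \mathbf{s}) \tag{86} $$
$$ \frac{\partial}{\partial \mathbf{x}}(\mathbf{x} - \mathbf{A}\mathbf{s})^T\mathbf{W}(\mathbf{x} - \mathbf{A}\mathbf{s}) = 2\mathbf{W}(\mathbf{x} - \mathbf{A}\mathbf{s}) \tag{87} $$
$$ \frac{\partial}{\partial \mathbf{A}}(\mathbf{x} - \mathbf{A}\mathbf{s})^T\mathbf{W}(\mathbf{x} - \mathbf{A}\mathbf{s}) = -2\mathbf{W}(\mathbf{x} - \mathbf{A}\mathbf{s})\mathbf{s}^T \tag{88} $$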
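Identity (84) is the gradient of a weighted least-squares objective with respect to the coefficient vector $\mathbf{s}$. The sketch below compares it with a central finite difference in plain Python; the helper names, the dimensions (3 observations, 2 coefficients) and all numerical values are assumptions for illustration.

```python
# Finite-difference check of (84):
# d/ds (x - A s)^T W (x - A s) = -2 A^T W (x - A s), with W symmetric.
# Helper names, sizes and values are illustrative assumptions.

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def objective(s, x, A, W):
    """Weighted squared residual (x - A s)^T W (x - A s)."""
    r = [xi - yi for xi, yi in zip(x, matvec(A, s))]
    return sum(ri * wi for ri, wi in zip(r, matvec(W, r)))

A = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
W = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]  # symmetric weights
x = [1.0, -2.0, 0.5]
s = [0.3, -0.7]

residual = [xi - yi for xi, yi in zip(x, matvec(A, s))]
analytic = [-2.0 * g for g in matvec(transpose(A), matvec(W, residual))]

h = 1e-6
numeric = []
for k in range(len(s)):
    sp = s[:]
    sm = s[:]
    sp[k] += h
    sm[k] -= h
    numeric.append((objective(sp, x, A, W) - objective(sm, x, A, W)) / (2.0 * h))

max_err = max(abs(g - n) for g, n in zip(analytic, numeric))
print(analytic, max_err)
```

Setting this gradient to zero gives the weighted normal equations $\mathbf{A}^T\mathbf{W}\mathbf{A}\mathbf{s} = \mathbf{A}^T\mathbf{W}\mathbf{x}$, which is the usual reason this identity is needed.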

As a case with complex values the following holds

$$ \frac{\partial (a - \mathbf{x}^H\mathbf{b})^2}{\partial \mathbf{x}} = -2\mathbf{b}(a - \mathbf{x}^H\mathbf{b})^* \tag{89} $$

This formula is also known from the LMS algorithm [14].

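2.4.3 Higher-order and non-linear

$$ \frac{\partial (\mathbf{X}^n)_{kl}}{\partial X_{ij}} = \sum_{r=0}^{n-1}(\mathbf{X}^r\mathbf{J}^{ij}\mathbf{X}^{n-1-r})_{kl} \tag{90} $$

For proof of the above, see B.1.3.

$$ \frac{\partial}{\partial \mathbf{X}}\mathbf{a}^T\mathbf{X}^n\mathbf{b} = \sum_{r=0}^{n-1}(\mathbf{X}^r)^T\mathbf{a}\mathbf{b}^T(\mathbf{X}^{n-1-r})^T \tag{91} $$
$$ \frac{\partial}{\partial \mathbf{X}}\mathbf{a}^T(\mathbf{X}^n)^T\mathbf{X}^n\mathbf{b} = \sum_{r=0}^{n-1}\left[\mathbf{X}^{n-1-r}\mathbf{a}\mathbf{b}^T(\mathbf{X}^n)^T\mathbf{X}^r + (\mathbf{X}^r)^T\mathbf{X}^n\mathbf{a}\mathbf{b}^T(\mathbf{X}^{n-1-r})^T\right] \tag{92} $$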

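See B.1.3 for a proof.

Assume $\mathbf{s}$ and $\mathbf{r}$ are functions of $\mathbf{x}$, i.e. $\mathbf{s} = \mathbf{s}(\mathbf{x})$, $\mathbf{r} = \mathbf{r}(\mathbf{x})$, and that $\mathbf{A}$ is a constant, then

$$ \frac{\partial}{\partial \mathbf{x}}\mathbf{s}^T\mathbf{A}\mathbf{r} = \left[\frac{\partial \mathbf{s}}{\partial \mathbf{x}}\right]^T\mathbf{A}\mathbf{r} + \left[\frac{\partial \mathbf{r}}{\partial \mathbf{x}}\right]^T\mathbf{A}^T\mathbf{s} \tag{93} $$
$$ \frac{\partial}{\partial \mathbf{x}}\frac{(\mathbf{A}\mathbf{x})^T(\mathbf{A}\mathbf{x})}{(\mathbf{B}\mathbf{x})^T(\mathbf{B}\mathbf{x})} = \frac{\partial}{\partial \mathbf{x}}\frac{\mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x}}{\mathbf{x}^T\mathbf{B}^T\mathbf{B}\mathbf{x}} \tag{94} $$
$$ = 2\frac{\mathbf{A}^T\mathbf{A}\mathbf{x}}{\mathbf{x}^T\mathbf{B}^T\mathbf{B}\mathbf{x}} - 2\frac{\mathbf{x}^T\mathbf{A}^T\mathbf{A}\mathbf{x}\,\mathbf{B}^T\mathbf{B}\mathbf{x}}{(\mathbf{x}^T\mathbf{B}^T\mathbf{B}\mathbf{x})^2} \tag{95} $$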

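2.4.4 Gradient and Hessian

Using the above we have, for

$$ f = \mathbf{x}^T\mathbf{A}\mathbf{x} + \mathbf{b}^T\mathbf{x} \tag{96} $$

the Hessian $\partial^2 f / \partial\mathbf{x}\,\partial\mathbf{x}^T$ and the gradient $\nabla_{\mathbf{x}} f = \partial f / \partial\mathbf{x}$ =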

(A + A ...