Machine Learning Notes PDF

Title	Machine Learning Notes (Appunti Machine Learning)
Course	Big Data Analytics [2932]
Institution	Politecnico di Bari
Pages	55
File Size	5 MB
File Type	PDF
Total Downloads	73
Total Views	186

Summary

Notes for studying the Machine Learning part of the Big Data Analytics course....

Description

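Notation: $h$, $n$, $x^{(i)}$, $y^{(i)}$, $x_j^{(i)}$.

Hypothesis $h$ with parameters $\theta_i$: $h_\theta(x) = h(x) = \theta_0 + \theta_1 x$. For a training pair $(x, y)$, $h_\theta(x)$ is the prediction and $y$ the target; $m$ is the number of pairs.

$$\min_{\theta_0, \theta_1} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Cost function $J$:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

$$\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$

Summary:

Hypothesis: $h_\theta(x) = h(x) = \theta_0 + \theta_1 x$
Parameters: $\theta_0, \theta_1$
Cost function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$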
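As an illustration of the squared-error cost $J(\theta_0, \theta_1)$, here is a minimal Python sketch. The function names and the toy data are assumptions made for this example; they do not come from the notes.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = 1/(2m) * sum over i of (h(x_i) - y_i)^2."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical toy data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

perfect_fit_cost = cost(1.0, 2.0, xs, ys)  # parameters that reproduce the data
zero_params_cost = cost(0.0, 0.0, xs, ys)  # (1 + 9 + 25 + 49) / 8
```

With the parameters that generated the data the cost is zero; any other choice gives a strictly positive value, which is what the minimization over $\theta_0, \theta_1$ exploits.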

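Gradient descent on the parameters $\theta_i$:

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} J(\theta_0, \dots, \theta_i)$$

where $:=$ denotes assignment and $\alpha$ is the learning rate applied to the gradient of $J$.

Candidate values of $\alpha$, spaced by a factor of 10: $\dots, 0.001, 0.01, 0.1, 1, \dots$; spaced by a factor of about 3: $\dots, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, \dots$

For the one-variable hypothesis:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$$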

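Multiple features $x_1, \dots, x_n$; $x_j^{(i)}$ is feature $j$ of example $i$.

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$$

With $x_0 = 1$:

$$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$$

$$x = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 1 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^{n+1}, \qquad \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} \in \mathbb{R}^{n+1}$$

$$h_\theta(x) = \theta^\top x$$

Cost function $J$ for this $h$:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} J(\theta)$$

$$\theta_k := \theta_k - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

Since $x_0 = 1$, the same rule covers $\theta_0$, with $x_k^{(i)} = 1$ for $k = 0$.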

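Feature scaling, with values roughly between $-1$ and $+1$.

Using the mean $\mu$:

$$x_i := \frac{x_i - \mu_i}{\max(x_i) - \min(x_i)}$$

Using the standard deviation $\sigma$:

$$x_i := \frac{x_i - \mu_i}{\sigma_i}$$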
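A small Python sketch of the two scaling rules, applied to one feature column. Using the population standard deviation (`statistics.pstdev`) for $\sigma_i$ is an assumption, since the notes do not say which estimator is meant; the sample values are hypothetical.

```python
import statistics


def mean_normalize(values):
    """x := (x - mu) / (max - min), applied to one feature column."""
    mu = statistics.fmean(values)
    spread = max(values) - min(values)
    return [(v - mu) / spread for v in values]


def standardize(values):
    """x := (x - mu) / sigma, with sigma the population standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column.
sizes = [50.0, 100.0, 150.0, 200.0]
normalized = mean_normalize(sizes)
standardized = standardize(sizes)
```

Both functions divide by the spread of the column, so a constant column (spread zero) would need separate handling that this sketch leaves out.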

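Polynomial forms of $h$:

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$$

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3$$

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 \sqrt{x}$$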

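Normal equation: the $\theta$ that minimizes $J$ in closed form, for $X \in \mathbb{R}^{m \times (n+1)}$ ($n$ features, $m$ examples) and $y \in \mathbb{R}^m$:

$$\theta = \left( X^\top X \right)^{-1} X^\top y$$

$$X = \begin{bmatrix} (x^{(1)})^\top \\ (x^{(2)})^\top \\ \vdots \\ (x^{(m)})^\top \end{bmatrix}, \qquad y = \begin{bmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{bmatrix}$$

Gradient descent requires choosing $\alpha$; the normal equation needs no $\alpha$, but computing $\left( X^\top X \right)^{-1}$ costs $O(n^3)$.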
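The normal equation can be sketched in plain Python as below. Instead of forming $(X^\top X)^{-1}$ explicitly, this sketch solves the linear system $(X^\top X)\theta = X^\top y$ by Gauss-Jordan elimination, which gives the same $\theta$ when $X^\top X$ is invertible. The helper names and the data are assumptions made for this example.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a * t = b by Gauss-Jordan elimination with partial pivoting.

    Assumes a is square and invertible; a singular matrix raises ZeroDivisionError.
    """
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def normal_equation(X, y):
    """Return theta satisfying (X^T X) theta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Hypothetical design matrix with x0 = 1 in the first column; data from y = 1 + 2x.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = normal_equation(X, y)
```

No learning rate or iteration count appears anywhere, which is the practical difference from gradient descent; the price is the elimination step, whose cost grows cubically with the number of parameters.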

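Classification: $y \in \{0, 1\}$ (two classes) or $y \in \{1, \dots, n\}$ (several classes).

A linear $h_\theta(x)$ can take values $< 0$ or $> 1$, with the classes separated by $h_\theta(x) > 0.5$ and $h_\theta(x) < 0.5$. Logistic regression instead requires $0 \le h_\theta(x) \le 1$, obtained through a function $g$:

$$h_\theta(x) = g(\theta^\top x), \qquad g(z) = \frac{1}{1 + e^{-z}}, \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^\top x}}$$

Interpretation of $h$ as the probability that $y = 1$ for input $x$ (and of $y = 0$ as its complement):

$$h_\theta(x) = P(y = 1 \mid x; \theta), \qquad P(y = 0 \mid x; \theta) = 1 - P(y = 1 \mid x; \theta)$$

Decision rule with threshold $0.5$:

$h_\theta(x) \ge 0.5 \iff \theta^\top x \ge 0$: predict $y = 1$
$h_\theta(x) < 0.5 \iff \theta^\top x < 0$: predict $y = 0$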
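A minimal Python sketch of the logistic hypothesis and of the $0.5$ threshold rule. The parameter vector `theta` is a hypothetical example (its decision boundary is $x_1 + x_2 = 3$), not a value from the notes.

```python
import math


def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)); very negative z may overflow math.exp (not handled)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(theta, x):
    """h_theta(x) = g(theta^T x), read as P(y = 1 | x; theta)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def predict(theta, x):
    """Predict 1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0."""
    return 1 if predict_probability(theta, x) >= 0.5 else 0


# Hypothetical parameters; each input vector starts with x0 = 1.
theta = [-3.0, 1.0, 1.0]
above = predict(theta, [1.0, 2.0, 2.0])  # theta^T x = 1
below = predict(theta, [1.0, 1.0, 1.0])  # theta^T x = -1
```

Because $g(0) = 0.5$ and $g$ is increasing, thresholding $h_\theta(x)$ at $0.5$ and thresholding $\theta^\top x$ at $0$ give the same prediction.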

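Squared-error cost applied to the logistic $h$:

$$J(\theta) = \frac{1}{2n} \sum_{i=1}^{n} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

With the logistic $h$, this $J$ is not convex, so a different $J$ is used:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)})$$

$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & y = 1 \\ -\log(1 - h_\theta(x)) & y = 0 \end{cases}$$

Behaviour of Cost:

$y = 1$: Cost is $0$ for $h = 1$ and grows without bound as $h \to 0$
$y = 0$: Cost is $0$ for $h = 0$ and grows without bound as $h \to 1$

Since $y$ is either $1$ or $0$, Cost can be written in one line:

$$\mathrm{Cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$$

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)}) = -\frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$$

Gradient descent to find the $\theta$ that minimizes $J$:

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} J(\theta)$$

$$\theta_k := \theta_k - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

The rule has the same form as for linear regression; only $h$ differs.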
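The logistic cost $J(\theta)$ and its gradient-descent update can be sketched in Python as follows. The data set, the learning rate and the number of iterations are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def h(theta, x):
    """Logistic hypothesis g(theta^T x)."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))


def cost(theta, X, y):
    """J = -(1/m) * sum of y*log(h) + (1 - y)*log(1 - h)."""
    m = len(X)
    total = sum(yi * math.log(h(theta, xi)) + (1 - yi) * math.log(1 - h(theta, xi))
                for xi, yi in zip(X, y))
    return -total / m


def gradient_step(theta, X, y, alpha):
    """One simultaneous update theta_k := theta_k - alpha/m * sum (h - y) * x_k."""
    m = len(X)
    errors = [h(theta, xi) - yi for xi, yi in zip(X, y)]
    return [t - alpha * sum(e * xi[k] for e, xi in zip(errors, X)) / m
            for k, t in enumerate(theta)]


# Hypothetical one-feature data set; first column is x0 = 1.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]]
y = [0, 0, 1, 1]

theta = [0.0, 0.0]
initial_cost = cost(theta, X, y)  # every h is 0.5 here, so J = log(2)
for _ in range(1000):
    theta = gradient_step(theta, X, y, alpha=0.5)
final_cost = cost(theta, X, y)
predictions = [1 if h(theta, xi) >= 0.5 else 0 for xi in X]
```

The update line is character for character the linear-regression one; the only change is that `h` passes $\theta^\top x$ through the sigmoid.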

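Multi-class classification (one against all): for each class $i$, train a classifier $h_\theta^{(i)}(x)$ with that class as $y = 1$ and all the others as $y = 0$. For a new $x$, choose the class $i$ that gives $\max_i h_\theta^{(i)}(x)$.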

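Overfitting: the training cost is driven to

$$J(\theta) = \frac{1}{2(n+1)} \sum_{i=1}^{n+1} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \approx 0$$

yet new values of $y$ are predicted poorly. Regularized $J$:

$$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{i=1}^{n} \theta_i^2 \right]$$

$\theta_0$ is not penalized. $\lambda$ is the regularization parameter: with $\lambda = 0$ there is no regularization, while a very large $\lambda$ pushes the $\theta_i$ toward $0$ and leaves only $\theta_0$.

Gradient descent on the regularized $J$:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_0^{(i)}$$

$$\theta_k := \theta_k - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)} + \frac{\lambda}{m} \theta_k \right] = \theta_k \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

At each step $\theta_k$ is first multiplied by $1 - \alpha \frac{\lambda}{m}$, a factor slightly below $1$ (for example $0.99$); $\theta_0$ is updated without this factor.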
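The regularized update can be sketched in Python as below, written in its "shrink, then step" form: $\theta_k$ is multiplied by $1 - \alpha\lambda/m$ before the usual gradient term is subtracted, and $\theta_0$ is left unshrunk. The function name, the data and the values of $\alpha$ and $\lambda$ are assumptions for this example.

```python
def regularized_step(theta, X, y, alpha, lam):
    """One gradient-descent step for regularized linear regression."""
    m = len(X)
    errors = [sum(t * v for t, v in zip(theta, xi)) - yi for xi, yi in zip(X, y)]
    new_theta = []
    for k, t in enumerate(theta):
        grad = sum(e * xi[k] for e, xi in zip(errors, X)) / m
        # theta_0 is not regularized, so its shrink factor is exactly 1.
        shrink = 1.0 if k == 0 else 1.0 - alpha * lam / m
        new_theta.append(t * shrink - alpha * grad)
    return new_theta


# Hypothetical data fitted exactly by theta = [1, 2], so the error term vanishes
# and only the shrink factor acts.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
stepped = regularized_step([1.0, 2.0], X, y, alpha=0.1, lam=4.0)
```

In this example the shrink factor is $1 - 0.1 \cdot 4 / 4 = 0.9$, so $\theta_1$ moves from $2$ to $1.8$ while $\theta_0$ stays at $1$.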



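Regularized normal equation, with $\lambda \ge 0$:

$$\theta = \left( X^\top X + \lambda \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix} \right)^{-1} X^\top y$$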

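Regularized logistic regression:

$$J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right] + \frac{\lambda}{2m} \sum_{i=1}^{n} \theta_i^2$$

Gradient descent on this $J$:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_0^{(i)}$$

$$\theta_k := \theta_k - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)} + \frac{\lambda}{m} \theta_k \right] = \theta_k \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

The update for $\theta$ has the same form as in the linear case; only $h_\theta$ differs.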

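$O(n^2)$, $O(n^3)$

Single unit: input $x$, output $h_\theta(x)$, parameters $\theta$:

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^\top x}}$$

[Figure: single unit with bias unit $x_0 = 1$, input units $x_1$, $x_2$, $x_3$ and output $h_\Theta(x)$]

$a_i^{(j)}$: activation of unit $i$ in layer $j$. $\Theta^{(j)}$: matrix of weights from layer $j$ to layer $j+1$.

$$a_1^{(2)} = g\left( \Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3 \right)$$

$$a_2^{(2)} = g\left( \Theta_{20}^{(1)} x_0 + \Theta_{21}^{(1)} x_1 + \Theta_{22}^{(1)} x_2 + \Theta_{23}^{(1)} x_3 \right)$$

$$a_3^{(2)} = g\left( \Theta_{30}^{(1)} x_0 + \Theta_{31}^{(1)} x_1 + \Theta_{32}^{(1)} x_2 + \Theta_{33}^{(1)} x_3 \right)$$

$$h_\Theta(x) = a_1^{(3)} = g\left( \Theta_{10}^{(2)} a_0^{(2)} + \Theta_{11}^{(2)} a_1^{(2)} + \Theta_{12}^{(2)} a_2^{(2)} + \Theta_{13}^{(2)} a_3^{(2)} \right)$$

If layer $j$ has $s_j$ units and layer $j+1$ has $s_{j+1}$ units, then $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j + 1)$.

[Figure: network with layer 1 (input $x$: bias unit $x_0$, input units $x_1$, $x_2$, $x_3$), layer 2 (hidden: bias unit $a_0^{(2)}$, units $a_1^{(2)}$, $a_2^{(2)}$, $a_3^{(2)}$) and layer $n$ (output $y$: $a_1^{(n)} = h_\Theta(x)$)]

Vectorized form, going from input $x$ to output $y$ through $z$ and $a$:

$$x = a^{(1)}, \qquad z^{(i+1)} = \Theta^{(i)} a^{(i)}, \qquad a^{(i+1)} = g(z^{(i+1)}), \qquad a_0^{(i)} = 1$$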
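Forward propagation, $z^{(i+1)} = \Theta^{(i)} a^{(i)}$ followed by $a^{(i+1)} = g(z^{(i+1)})$ with a bias entry $a_0^{(i)} = 1$ prepended at each layer, can be sketched in Python as follows. Representing each $\Theta^{(i)}$ as a list of rows is an assumption of this sketch, and the weights in the example are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Propagate x through the network.

    thetas[l] is the weight matrix between two consecutive layers, stored as a
    list of rows; each row has one weight for the bias plus one per unit of the
    previous layer, i.e. the shape is s_{l+1} x (s_l + 1).
    """
    a = list(x)
    for theta in thetas:
        a_with_bias = [1.0] + a  # a_0 = 1
        z = [sum(w * v for w, v in zip(row, a_with_bias)) for row in theta]
        a = [sigmoid(v) for v in z]
    return a


# Hypothetical 3-3-1 network: the hidden weights are all zero, so every hidden
# activation is g(0) = 0.5; the output unit only uses its bias weight 2.0.
theta1 = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
theta2 = [[2.0, 0.0, 0.0, 0.0]]
output = forward([theta1, theta2], [1.0, 2.0, 3.0])
```

The row length check is implicit: with three inputs, each row of `theta1` has four entries, matching the $s_{j+1} \times (s_j + 1)$ dimension rule.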

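[Figure: examples plotted in the $(x_1, x_2)$ plane, with both axes going from $0$ to $1$]

Behaviour of $g$: $g(x) \to 0$ for $x \to -\infty$ and $g(x) \to 1$ for $x \to \infty$; $g(-4.6) \approx 0.01$ and $g(4.6) \approx 0.99$.

AND: $\Theta_{AND}^{(1)} = \begin{bmatrix} -30 & +20 & +20 \end{bmatrix}$, $h_{\Theta_{AND}}(x) = g(-30 + 20 x_1 + 20 x_2)$

$(x_1, x_2) = (0, 0)$: $g(-30) \approx 0$
$(x_1, x_2) = (0, 1)$: $g(-10) \approx 0$
$(x_1, x_2) = (1, 0)$: $g(-10) \approx 0$
$(x_1, x_2) = (1, 1)$: $g(10) \approx 1$

OR: $\Theta_{OR}^{(1)} = \begin{bmatrix} -10 & +20 & +20 \end{bmatrix}$, $h_{\Theta_{OR}}(x) = g(-10 + 20 x_1 + 20 x_2)$

$(x_1, x_2) = (0, 0)$: $g(-10) \approx 0$
$(x_1, x_2) = (0, 1)$: $g(10) \approx 1$
$(x_1, x_2) = (1, 0)$: $g(10) \approx 1$
$(x_1, x_2) = (1, 1)$: $g(30) \approx 1$

NOT: $\Theta_{NOT}^{(1)} = \begin{bmatrix} +10 & -20 \end{bmatrix}$, $h_{\Theta_{NOT}}(x) = g(10 - 20 x_1)$

$x_1 = 0$: $g(10) \approx 1$
$x_1 = 1$: $g(-10) \approx 0$

(NOT) AND (NOT): $\Theta_{NOT\,AND\,NOT}^{(1)} = \begin{bmatrix} +10 & -20 & -20 \end{bmatrix}$, $h_{\Theta_{NOT\,AND\,NOT}}(x) = g(10 - 20 x_1 - 20 x_2)$

$(x_1, x_2) = (0, 0)$: $g(10) \approx 1$
$(x_1, x_2) = (0, 1)$: $g(-10) \approx 0$
$(x_1, x_2) = (1, 0)$: $g(-10) \approx 0$
$(x_1, x_2) = (1, 1)$: $g(-30) \approx 0$

[Figure: XNOR network. Layer 1 (input $x$): bias unit $+1$, input units $x_1$, $x_2$. Layer 2 (hidden): bias unit $+1$, $a_1^{(2)}$ computing AND, $a_2^{(2)}$ computing (NOT) AND (NOT). Output layer ($y$): $a_1^{(3)} = h_\Theta(x)$ computing OR.]

XNOR:

$$\Theta_{XNOR}^{(1)} = \begin{bmatrix} -30 & +20 & +20 \\ +10 & -20 & -20 \end{bmatrix}, \qquad \Theta_{XNOR}^{(2)} = \begin{bmatrix} -10 & +20 & +20 \end{bmatrix}$$

$$h_{\Theta_{XNOR}}(x) = g(-10 + 20 a_1^{(2)} + 20 a_2^{(2)})$$

$(x_1, x_2) = (0, 0)$: $a_1^{(2)} \approx 0$, $a_2^{(2)} \approx 1$, $h_{\Theta_{XNOR}}(x) = g(10) \approx 1$
$(x_1, x_2) = (0, 1)$: $a_1^{(2)} \approx 0$, $a_2^{(2)} \approx 0$, $h_{\Theta_{XNOR}}(x) = g(-10) \approx 0$
$(x_1, x_2) = (1, 0)$: $a_1^{(2)} \approx 0$, $a_2^{(2)} \approx 0$, $h_{\Theta_{XNOR}}(x) = g(-10) \approx 0$
$(x_1, x_2) = (1, 1)$: $a_1^{(2)} \approx 1$, $a_2^{(2)} \approx 0$, $h_{\Theta_{XNOR}}(x) = g(10) \approx 1$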
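The gate weights listed in the notes can be checked numerically with a short Python sketch. The helper `neuron` and the constant names are assumptions of this example; the weight values $(-30, +20, +20)$, $(-10, +20, +20)$ and $(+10, -20, -20)$ are the ones given for AND, OR and (NOT) AND (NOT).

```python
import math


def g(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(weights, inputs):
    """Single unit: weights[0] multiplies the bias 1, the rest multiply the inputs."""
    return g(weights[0] + sum(w * x for w, x in zip(weights[1:], inputs)))


THETA_AND = [-30, 20, 20]
THETA_OR = [-10, 20, 20]
THETA_NOT_AND_NOT = [10, -20, -20]


def xnor(x1, x2):
    """Two-layer network: hidden units AND and (NOT) AND (NOT), output unit OR."""
    a1 = neuron(THETA_AND, [x1, x2])
    a2 = neuron(THETA_NOT_AND_NOT, [x1, x2])
    return neuron(THETA_OR, [a1, a2])


# Round the sigmoid outputs to the nearest integer to read them as truth values.
truth_table = {(x1, x2): round(xnor(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)}
```

XNOR is not computable by a single unit of this kind, which is why the hidden layer is needed: the output is true when either the AND unit or the (NOT) AND (NOT) unit fires.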

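Multi-class output: $h_\Theta(x) \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$. With $n$ classes, $y$ is one of the $n$ vectors containing a single $1$ and $0$ elsewhere:

$$y = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^\top, \quad \begin{bmatrix} 0 & 1 & \cdots & 0 \end{bmatrix}^\top, \quad \dots, \quad \begin{bmatrix} 0 & 0 & \cdots & 1 \end{bmatrix}^\top$$

and the network output approximates the vector of the class:

$$h_\Theta(x) \approx \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^\top, \quad h_\Theta(x) \approx \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^\top, \quad \dots$$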

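$L$: number of layers. $s_l$: number of units in layer $l$. $K$: number of output units, so $h_\Theta(x) \in \mathbb{R}^K$ and $h_\Theta(x)_k$ is its $k$-th component.

Cost function over $\Theta$:

$$J(\Theta) = -\frac{1}{n+1} \left[ \sum_{i=1}^{n+1} \sum_{k=1}^{K} y_k^{(i)} \log(h_\Theta(x^{(i)})_k) + (1 - y_k^{(i)}) \log(1 - h_\Theta(x^{(i)})_k) \right] + \frac{\lambda}{2(n+1)} \sum_{l=1}^{L-1} \sum_{i=0}^{s_l} \sum_{j=0}^{s_{l+1}} \left( \Theta_{ji}^{(l)} \right)^2$$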

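$a_j^{(l)}$: activation of unit $j$ in layer $l$. Forward propagation:

$a^{(1)} = x$
$z^{(2)} = \Theta^{(1)} a^{(1)}$, $a^{(2)} = g(z^{(2)})$
$\vdots$
$z^{(L)} = \Theta^{(L-1)} a^{(L-1)}$, $a^{(L)} = h_\Theta(x) = g(z^{(L)})$

$\delta_j^{(l)}$: error of unit $j$ in layer $l$. Backpropagation:

$\delta_j^{(L)} = a_j^{(L)} - y_j$
$\delta^{(L-1)} = (\Theta^{(L-1)})^\top \delta^{(L)} \mathbin{.*} g'(z^{(L-1)})$
$\vdots$
$\delta^{(2)} = (\Theta^{(2)})^\top \delta^{(3)} \mathbin{.*} g'(z^{(2)})$

where $.*$ is the element-wise product and the derivative of $g$ is $g'(z^{(l)}) = a^{(l)} \mathbin{.*} (1 - a^{(l)})$. There is no $\delta^{(1)}$.

Ignoring $\lambda$:

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = a_j^{(l)} \delta_i^{(l+1)}$$

Algorithm, given the training set $\{ (x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)}) \}$:

Set $\Delta_{ij}^{(l)} := 0$
For $i = 1$ to $m$:
    $a^{(1)} := x^{(i)}$
    compute $a^{(2)}, a^{(3)}, \dots, a^{(L)}$
    $\delta^{(L)} := a^{(L)} - y^{(i)}$
    compute $\delta^{(L-1)}, \dots, \delta^{(2)}$
    $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$
$D_{ij}^{(l)} := \Delta_{ij}^{(l)} + \lambda \Theta_{ij}^{(l)}$ if $j \neq 0$
$D_{ij}^{(l)} := \Delta_{ij}^{(l)}$ if $j = 0$

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)}$$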

Initial values of $\Theta$: $0$, or random values in $[-\epsilon, +\epsilon]$.

1

4

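Evaluation of a hypothesis $h$ against the targets $y$. Symbols used: $d$, $\lambda$, $m$, and the errors $J_{train}(\Theta)$, $J_{cv}(\Theta)$, $J_{test}(\Theta)$.

$J_{train}(\Theta)$ and $J_{cv}(\Theta)$ are compared as $d$ varies, as $\lambda$ varies, and as $m$ varies.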

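$$\text{accuracy} = \frac{\#\text{CorrectlyClassified}}{\#\text{Classified}}$$

Skewed classes: if only $0.5\%$ of the examples have $y = 1$, a classifier that always predicts $y = 0$ reaches accuracy $= 99.5\%$.

Taking $y = 1$ as the positive class:

$$\text{precision} = \frac{\#\text{TruePositive}}{\#\text{PredictedPositives}} = \frac{\#\text{TruePositive}}{\#\text{TruePositive} + \#\text{FalsePositive}}$$

$$\text{recall} = \frac{\#\text{TruePositive}}{\#\text{ActualPositives}} = \frac{\#\text{TruePositive}}{\#\text{TruePositive} + \#\text{FalseNegative}}$$

Values: $0.9$, $0.4$, $0.6$

$$F_\beta = \frac{(1 + \beta^2) P R}{\beta^2 P + R}$$

With $\beta = 1$:

$$F_1 = \frac{2 P R}{P + R}$$

$0.1$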
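The metrics above can be sketched in Python as follows. The label lists are hypothetical, and returning `0.0` when a denominator is zero is a convention assumed for this example rather than something stated in the notes.

```python
def accuracy(y_true, y_pred):
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall with y = 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: a metric with an empty denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_beta(p, r, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


# Skewed classes: 0.5% positives, classifier that always predicts 0.
skewed_true = [0] * 199 + [1]
always_zero = [0] * 200
skewed_accuracy = accuracy(skewed_true, always_zero)
skewed_precision, skewed_recall = precision_recall(skewed_true, always_zero)

# A small balanced example: 2 true positives, 1 false positive, 1 false negative.
p, r = precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
f1 = f_beta(p, r)
```

The skewed example reproduces the situation in the notes: accuracy looks excellent while recall is zero, which is why precision, recall and $F_\beta$ are reported alongside it.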

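Starting from the logistic cost for an example $(x, y)$:

$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & y = 1 \\ -\log(1 - h_\theta(x)) & y = 0 \end{cases}$$

the two branches are replaced by $\mathrm{Cost}_1$ (for $y = 1$) and $\mathrm{Cost}_0$ (for $y = 0$), which gives the cost $J$:

$$J(\theta) = \frac{1}{n+1} \left[ \sum_{i=1}^{n+1} y^{(i)} \mathrm{Cost}_1(\theta^\top x^{(i)}) + (1 - y^{(i)}) \mathrm{Cost}_0(\theta^\top x^{(i)}) \right]$$

A regularized $J$ of the form $A + \lambda B$ (with $\frac{\lambda}{n+1}$) is rewritten as $C A + B$, with a constant $C$ in place of $\lambda$:

$$J(\theta) = C \left[ \sum_{i=1}^{n+1} y^{(i)} \mathrm{Cost}_1(\theta^\top x^{(i)}) + (1 - y^{(i)}) \mathrm{Cost}_0(\theta^\top x^{(i)}) \right] + \frac{1}{2} \sum_{i=1}^{n} \theta_i^2$$

Hypothesis $h$:

$$h_\theta(x) = \begin{cases} 1 & \text{if } \theta^\top x \ge 0 \\ 0 & \text{if } \theta^\top x < 0 \end{cases}$$

With a very large $C$, minimizing $J$ requires $A = 0$, that is $\theta^\top x^{(i)} \ge 1$ when $y^{(i)} = 1$ (so that $\mathrm{Cost}_1 = 0$) and $\theta^\top x^{(i)} \le -1$ when $y^{(i)} = 0$ (so that $\mathrm{Cost}_0 = 0$). The problem becomes:

$$\min_\theta \frac{1}{2} \sum_{j=1}^{n} \theta_j^2 \quad \text{subject to} \quad \begin{cases} \theta^\top x^{(i)} \ge 1 & \text{if } y^{(i)} = 1 \\ \theta^\top x^{(i)} \le -1 & \text{if } y^{(i)} = 0 \end{cases}$$

$\theta^\top x^{(i)}$ can be read through the projection $p^{(i)}$ of $x^{(i)}$ onto $\theta$: $\theta^\top x^{(i)} = p^{(i)} \cdot \|\theta\|$. A larger $p^{(i)}$ allows a smaller $\|\theta\|$, so a very large $C$ favours the $\theta$ for which the projections $p^{(i)}$ are large.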

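Kernels: landmarks $l^{(i)}$ and features $f_i$, one feature $f_i$ per landmark $l^{(i)}$, computed from $x$:

$$f_i = \mathrm{sim}(x, l^{(i)}) = \exp\left( -\frac{\| x - l^{(i)} \|^2}{2 \sigma^2} \right)$$

If $x \approx l^{(1)}$ then $f_1 \approx 1$; if $x$ is far from $l^{(1)}$ then $f_1 \approx 0$.

With the features $f_i$ in place of the $x_i$, predict $y = 1$ when $\theta_0 + \theta_1 f_1 + \dots + \theta_n f_n \ge 0$.

$f^{(i)}$ is the feature vector computed from $x^{(i)}$, with $f_0 = 1$:

$$J(\theta) = C \left[ \sum_{i=1}^{n+1} y^{(i)} \mathrm{Cost}_1(\theta^\top f^{(i)}) + (1 - y^{(i)}) \mathrm{Cost}_0(\theta^\top f^{(i)}) \right] + \frac{1}{2} \sum_{i=1}^{n+1} \theta_i^2$$

The term $\theta^\top \theta$ can be replaced by $\theta^\top M \theta$ for a matrix $M$.

Parameters to choose: $C$ (in relation to $\lambda$) and $\sigma$.

Sizes mentioned: $50000+$, $10 - 10000$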
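The Gaussian similarity $f_i = \exp(-\|x - l^{(i)}\|^2 / 2\sigma^2)$ can be sketched in Python as below. The function names, the landmarks and the value of $\sigma$ are assumptions for this example.

```python
import math


def gaussian_kernel(x, landmark, sigma):
    """sim(x, l) = exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def kernel_features(x, landmarks, sigma):
    """Feature vector [f_0 = 1, f_1, ..., f_n], one f_i per landmark."""
    return [1.0] + [gaussian_kernel(x, l, sigma) for l in landmarks]


# Hypothetical landmarks and query point.
landmarks = [[0.0, 0.0], [5.0, 5.0]]
f = kernel_features([0.0, 0.0], landmarks, sigma=1.0)
```

Here the query point coincides with the first landmark, so $f_1 = 1$, while the second landmark is far away and $f_2$ is close to $0$. A larger $\sigma$ makes the similarity fall off more slowly with distance.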

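Stochastic gradient descent on $\theta$. Cost of a single example and overall cost:

$$\mathrm{cost}(\theta, (x^{(i)}, y^{(i)})) = \frac{1}{2} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2, \qquad J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(\theta, (x^{(i)}, y^{(i)}))$$

Update using the gradient of $\mathrm{cost}(\theta, (x^{(i)}, y^{(i)}))$ for one example:

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} \mathrm{cost}(\theta, (x^{(i)}, y^{(i)})) = \theta_k - \alpha \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

for $i := 1, \dots, m$
    for $j := 0, \dots, n$
        $\theta_j := \theta_j - \alpha \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_j^{(i)}$

Mini-batch version with batch size $b$ ($b = m$ is batch gradient descent, $b = 1$ is stochastic gradient descent):

for $i := 1, \dots, m$ in steps of $b$
    for $j := 0, \dots, n$
        $\theta_j := \theta_j - \alpha \frac{1}{b} \sum_{k=i}^{i+b-1} \left( h_\theta(x^{(k)}) - y^{(k)} \right) \cdot x_j^{(k)}$

Convergence check: instead of $J(\theta)$, average $\mathrm{cost}(\theta, (x^{(i)}, y^{(i)}))$ over the last $t$ examples, for instance $t = 1000$ or $t = 5000$.

For $\theta$ to settle, $\alpha$ can be decreased over time, with constants $const1$ and $const2$:

$$\alpha = \frac{const1}{iterationNumber + const2}$$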
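The mini-batch update can be sketched in Python as below; with `b=1` it reduces to stochastic gradient descent and with `b=len(X)` to batch gradient descent. A linear hypothesis, a fixed $\alpha$, no shuffling between passes and the toy data are all assumptions of this sketch.

```python
def minibatch_gd(X, y, alpha, b, epochs):
    """Mini-batch gradient descent for h(x) = theta^T x with batch size b."""
    m = len(X)
    theta = [0.0] * len(X[0])
    for _ in range(epochs):
        for i in range(0, m, b):
            batch = list(zip(X[i:i + b], y[i:i + b]))
            # Errors h(x_k) - y_k for the examples in this batch only.
            errors = [sum(t * v for t, v in zip(theta, xk)) - yk for xk, yk in batch]
            # Simultaneous update of every theta_j from the batch average.
            theta = [t - alpha * sum(e * xk[j] for e, (xk, _) in zip(errors, batch)) / len(batch)
                     for j, t in enumerate(theta)]
    return theta


# Hypothetical noise-free data from y = 1 + 2x; first column is x0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

theta_sgd = minibatch_gd(X, y, alpha=0.1, b=1, epochs=1000)
theta_mini = minibatch_gd(X, y, alpha=0.1, b=2, epochs=1000)
theta_batch = minibatch_gd(X, y, alpha=0.1, b=4, epochs=1000)
```

On this noise-free data all three settings reach the same parameters. With noisy data the stochastic variant would keep oscillating around the minimum under a constant $\alpha$, which is the motivation for a decreasing schedule.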

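K-means. Input: the number of clusters $K$ and the set $\{ x^{(1)}, x^{(2)}, \dots, x^{(m)} \}$ with $x^{(i)} \in \mathbb{R}^n$ (the $x_0 = 1$ convention is not used).

Initialize the $K$ centroids $\mu_1, \mu_2, \dots, \mu_K \in \mathbb{R}^n$, then repeat:

for $i = 1$ to $m$
    $c^{(i)} := \arg\min_k \| x^{(i)} - \mu_k \|^2$
for $k = 1$ to $K$
    $\mu_k := \frac{1}{z} \sum_{i : c^{(i)} = k} x^{(i)}$, with $z = \mathrm{num}(c^{(i)} = k)$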
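The two alternating steps of K-means (assign each $x^{(i)}$ to its closest centroid, then move each centroid to the mean of its points) can be sketched in Python as follows. Passing the initial centroids explicitly, running a fixed number of iterations and leaving an empty cluster's centroid unchanged are assumptions of this sketch; the points are hypothetical.

```python
def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, initial_centroids, iterations=10):
    """Return (centroids, assignments) after a fixed number of iterations."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    assignments = []
    for _ in range(iterations):
        # Assignment step: c_i is the index of the closest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(x, centroids[k]))
            for x in points
        ]
        # Update step: mu_k is the mean of the points assigned to cluster k.
        for k in range(len(centroids)):
            members = [x for x, c in zip(points, assignments) if c == k]
            if members:  # assumed: an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Two hypothetical, well-separated groups of 2-D points.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, assignments = kmeans(points, [[0.0, 0.0], [10.0, 10.0]])
```

The result depends on the initial centroids; a common practice is to pick them among the data points and to try several initializations.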

1

K...