Title | Appunti Machine Learning |
---|---|
Course | Big Data Analytics [2932] |
Institution | Politecnico di Bari |
Pages | 55 |
File Size | 5 MB |
Notes for studying the Machine Learning part of the Big Data Analytics course....
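Notation: $h$ is the hypothesis, $n$ the number of features, $m$ the number of training examples $(x^{(i)}, y^{(i)})$, and $x_j^{(i)}$ the $j$-th feature of the $i$-th example.

Linear regression with one variable and parameters $\theta_i$:

$$h_\theta(x) = h(x) = \theta_0 + \theta_1 x$$

The parameters are chosen so that $h_\theta(x)$ is close to $y$ on the $m$ training pairs $(x, y)$:

$$\min_{\theta_0, \theta_1} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Cost function $J$:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2, \qquad \min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$$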
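Gradient descent updates all the $\theta_i$ simultaneously, with learning rate $\alpha$ (the symbol $:=$ denotes assignment):

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} J(\theta_0, \dots, \theta_i)$$

Candidate values for $\alpha$, spaced by a factor of 10: $\dots, 0.001, 0.01, 0.1, 1, \dots$; or by a factor of about 3: $\dots, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, 3, \dots$

For linear regression with one variable:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$$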
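The two update rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names, the toy data, and the choice of $\alpha = 0.1$ with 2000 iterations are all assumptions.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) = 1/(2m) * sum of squared errors."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients use the old parameters.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (hypothetical example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
```

On this noise-free data the parameters approach $\theta_0 = 1$ and $\theta_1 = 2$; a much larger $\alpha$ would make the iteration diverge.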
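With several features $x_1, \dots, x_n$ (where $x_j^{(i)}$ is feature $j$ of example $i$), the hypothesis $h$ with parameters $\theta_i$ becomes

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$$

Setting $x_0 = 1$, $h$ can be written as

$$h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$$

In vector form:

$$x = \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^{n+1}, \qquad \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \\ \vdots \\ \theta_n \end{bmatrix} \in \mathbb{R}^{n+1}, \qquad h_\theta(x) = \theta^\top x$$

Cost function $J$ for $h$ over the $m$ examples:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

Gradient descent:

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} J(\theta)$$

$$\theta_k := \theta_k - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

Because $x_0 = 1$, the same rule covers $\theta_0$, whose factor $x_k^{(i)}$ equals 1.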
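Feature scaling brings every feature into roughly the range $[-1, +1]$.

Mean normalization, with $\mu_i$ the mean of feature $i$:

$$x_i := \frac{x_i - \mu_i}{\max(x_i) - \min(x_i)}$$

Standardization, with $\sigma_i$ the standard deviation of feature $i$:

$$x_i := \frac{x_i - \mu_i}{\sigma_i}$$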
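The two scaling formulas can be tried out with the standard library. A minimal sketch: the function names and the sample values are hypothetical, and the population standard deviation is assumed for $\sigma_i$ (the sample standard deviation is an equally common choice).

```python
import statistics


def mean_normalize(values):
    """x := (x - mu) / (max - min); results lie within [-1, +1]."""
    mu = statistics.mean(values)
    spread = max(values) - min(values)
    return [(v - mu) / spread for v in values]


def standardize(values):
    """x := (x - mu) / sigma, using the population standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


sizes = [1000.0, 2000.0, 3000.0]  # hypothetical feature column
normalized = mean_normalize(sizes)
standardized = standardize(sizes)
```

Both functions assume the feature is not constant; a constant column would give a zero denominator.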
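Polynomial regression: the hypothesis $h$ can use powers or roots of a feature.

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$$

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3$$

$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 \sqrt{x}$$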
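Normal equation: $J$ can be minimized with respect to $\theta$ analytically. With $X \in \mathbb{R}^{m \times (n+1)}$ ($n$ features, $m$ examples) and $y \in \mathbb{R}^m$:

$$X = \begin{bmatrix} (x^{(1)})^\top \\ (x^{(2)})^\top \\ \vdots \\ (x^{(m)})^\top \end{bmatrix}, \qquad y = \begin{bmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(m)} \end{bmatrix}$$

$$\theta = \left( X^\top X \right)^{-1} X^\top y$$

Unlike gradient descent, the normal equation needs no learning rate $\alpha$, but computing $\left( X^\top X \right)^{-1}$ costs about $O(n^3)$.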
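The normal equation can be checked on a tiny data set without external libraries. In this sketch the system $(X^\top X)\theta = X^\top y$ is solved by Gaussian elimination with partial pivoting rather than by forming the inverse explicitly; that substitution, the function name, and the data are assumptions made for illustration.

```python
def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y; each row of X already includes x0 = 1."""
    m, size = len(X), len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    M = [[sum(X[i][r] * X[i][c] for i in range(m)) for c in range(size)]
         + [sum(X[i][r] * y[i] for i in range(m))]
         for r in range(size)]
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    theta = [0.0] * size
    for r in reversed(range(size)):
        known = sum(M[r][c] * theta[c] for c in range(r + 1, size))
        theta[r] = (M[r][size] - known) / M[r][r]
    return theta


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]  # generated from y = 1 + 2x
theta = normal_equation(X, y)
```

The sketch assumes $X^\top X$ is non-singular; redundant features would produce a zero pivot.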
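Classification: binary, $y \in \{0, 1\}$; multiclass, $y \in \{1, \dots, n\}$.

With linear regression, $h_\theta(x)$ can be $< 0$ or $> 1$; a threshold would predict $y = 1$ if $h_\theta(x) > 0.5$ and $y = 0$ if $h_\theta(x) < 0.5$.

Logistic regression guarantees $0 \le h_\theta(x) \le 1$ by using the sigmoid function $g$:

$$h_\theta(x) = g(\theta^\top x), \qquad g(z) = \frac{1}{1 + e^{-z}}, \qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^\top x}}$$

$h$ is interpreted as the probability that $y = 1$ given $x$:

$$h_\theta(x) = P(y = 1 \mid x; \theta), \qquad P(y = 0 \mid x; \theta) = 1 - P(y = 1 \mid x; \theta)$$

Decision boundary, with threshold $0.5$:

$$y = 1 \iff h_\theta(x) \ge 0.5 \iff \theta^\top x \ge 0$$

$$y = 0 \iff h_\theta(x) < 0.5 \iff \theta^\top x < 0$$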
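The hypothesis and the decision rule above translate directly into code. A minimal sketch; the parameter vector used in the example is hypothetical and describes the boundary $x_1 + x_2 = 3$.

```python
import math


def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)); overflows for very negative z (below about -709)."""
    return 1.0 / (1.0 + math.exp(-z))


def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x must include x0 = 1."""
    return sigmoid(sum(t * v for t, v in zip(theta, x)))


def predict(theta, x):
    """Predict y = 1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0."""
    return 1 if hypothesis(theta, x) >= 0.5 else 0


theta = [-3.0, 1.0, 1.0]  # hypothetical parameters: boundary x1 + x2 = 3
inside = predict(theta, [1.0, 2.0, 2.0])   # theta^T x = 1
outside = predict(theta, [1.0, 1.0, 1.0])  # theta^T x = -1
```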
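With the sigmoid hypothesis $h$, the squared-error cost

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

is non-convex in $\theta$, so logistic regression uses a different $J$:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)})$$

$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & y = 1 \\ -\log(1 - h_\theta(x)) & y = 0 \end{cases}$$

Behaviour of $\mathrm{Cost}$: for $y = 1$ it is $0$ at $h = 1$ and grows without bound as $h \to 0$; for $y = 0$ it is $0$ at $h = 0$ and grows without bound as $h \to 1$.

Since $y$ is always $1$ or $0$, $\mathrm{Cost}$ can be written on one line:

$$\mathrm{Cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$$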
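Cost function $J$ for logistic regression:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)}) = -\frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right]$$

Gradient descent on $\theta$:

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} J(\theta)$$

$$\theta_k := \theta_k - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

The update has the same form as for linear regression, but $h$ is now the sigmoid hypothesis.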
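The logistic cost and its gradient step can be sketched as follows. The data set, the value $\alpha = 0.5$, and the 200 iterations are assumptions chosen only to show the cost decreasing.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def h(theta, x):
    return sigmoid(sum(t * v for t, v in zip(theta, x)))


def cost(theta, X, y):
    """J(theta) = -1/m * sum(y log h + (1 - y) log(1 - h))."""
    m = len(X)
    return -sum(yi * math.log(h(theta, xi)) + (1 - yi) * math.log(1 - h(theta, xi))
                for xi, yi in zip(X, y)) / m


def gradient_step(theta, X, y, alpha):
    """One simultaneous update of every theta_k."""
    m = len(X)
    errors = [h(theta, xi) - yi for xi, yi in zip(X, y)]
    return [t - alpha * sum(e * xi[k] for e, xi in zip(errors, X)) / m
            for k, t in enumerate(theta)]


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # x0 = 1 in every row
y = [0, 0, 1, 1]
theta = [0.0, 0.0]
initial_cost = cost(theta, X, y)  # equals log(2) at theta = 0
for _ in range(200):
    theta = gradient_step(theta, X, y, alpha=0.5)
final_cost = cost(theta, X, y)
```

At $\theta = 0$ every prediction is $0.5$, so the initial cost is $\log 2$; each step then lowers it.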
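Multiclass classification (one-vs-all): for each class $i$, train a classifier $h_\theta^{(i)}(x)$ that treats class $i$ as $y = 1$ and every other class as $y = 0$. For a new input $x$, choose the class $i$ that gives $\max_i h_\theta^{(i)}(x)$.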
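Overfitting: the hypothesis fits the training set almost perfectly,

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \approx 0$$

but fails to predict $y$ on new examples. Regularization adds a penalty to $J$:

$$J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{i=1}^{n} \theta_i^2 \right]$$

$\theta_0$ is not penalized. With $\lambda = 0$ there is no regularization; if $\lambda$ is too large, every $\theta_i$ is pushed towards $0$ and only $\theta_0$ remains.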
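Gradient descent on the regularized $J$:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_0^{(i)}$$

$$\theta_k := \theta_k - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)} + \frac{\lambda}{m} \theta_k \right] = \theta_k \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

At every step $\theta_k$ is first shrunk by the factor $1 - \alpha \frac{\lambda}{m}$, a number slightly below 1 such as $0.99$.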
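The shrinking effect of the regularized update is easy to see in code. In this sketch the function name and the numbers are hypothetical; the parameters are chosen so that the data term of the gradient is exactly zero and only the shrink factor acts.

```python
def regularized_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent step for linear regression.

    theta[0] is not regularized; every other theta_k is multiplied by
    (1 - alpha * lam / m) before the usual gradient term is subtracted.
    """
    m = len(X)
    errors = [sum(t * v for t, v in zip(theta, xi)) - yi for xi, yi in zip(X, y)]
    updated = []
    for k, t in enumerate(theta):
        grad = sum(e * xi[k] for e, xi in zip(errors, X)) / m
        shrink = 1.0 if k == 0 else 1.0 - alpha * lam / m
        updated.append(t * shrink - alpha * grad)
    return updated


X = [[1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0]
theta = [0.0, 1.0]  # fits the data exactly, so all errors are zero
stepped = regularized_step(theta, X, y, alpha=0.1, lam=2.0)
```

Here $1 - \alpha \lambda / m = 1 - 0.1 \cdot 2 / 2 = 0.9$, so $\theta_1$ moves from $1$ to $0.9$ while $\theta_0$ stays at $0$.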
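Normal equation with regularization ($\lambda \ge 0$); the $0$ in the top-left corner leaves $\theta_0$ unpenalized:

$$\theta = \left( X^\top X + \lambda \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix} \right)^{-1} X^\top y$$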
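Regularized logistic regression:

$$J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} y^{(i)} \log(h_\theta(x^{(i)})) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)})) \right] + \frac{\lambda}{2m} \sum_{i=1}^{n} \theta_i^2$$

Gradient descent on $J$:

$$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_0^{(i)}$$

$$\theta_k := \theta_k - \alpha \left[ \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)} + \frac{\lambda}{m} \theta_k \right] = \theta_k \left( 1 - \alpha \frac{\lambda}{m} \right) - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

The update for $\theta$ has the same form as in linear regression, with $h_\theta$ the sigmoid hypothesis.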
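With $n$ original features, the number of polynomial features grows as $O(n^2)$ for quadratic terms and $O(n^3)$ for cubic terms.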
[Figure: a single neuron with parameters $\theta$. Inputs: bias unit $x_0 = 1$ and input units $x_1, x_2, x_3$. Output: $h_\theta(x) = \frac{1}{1 + e^{-\theta^\top x}}$.]
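Notation: $a_i^{(j)}$ is the activation of unit $i$ in layer $j$; $\Theta^{(j)}$ is the matrix of weights mapping layer $j$ to layer $j + 1$.

$$a_1^{(2)} = g\left( \Theta_{10}^{(1)} x_0 + \Theta_{11}^{(1)} x_1 + \Theta_{12}^{(1)} x_2 + \Theta_{13}^{(1)} x_3 \right)$$

$$a_2^{(2)} = g\left( \Theta_{20}^{(1)} x_0 + \Theta_{21}^{(1)} x_1 + \Theta_{22}^{(1)} x_2 + \Theta_{23}^{(1)} x_3 \right)$$

$$a_3^{(2)} = g\left( \Theta_{30}^{(1)} x_0 + \Theta_{31}^{(1)} x_1 + \Theta_{32}^{(1)} x_2 + \Theta_{33}^{(1)} x_3 \right)$$

$$h_\Theta(x) = a_1^{(3)} = g\left( \Theta_{10}^{(2)} a_0^{(2)} + \Theta_{11}^{(2)} a_1^{(2)} + \Theta_{12}^{(2)} a_2^{(2)} + \Theta_{13}^{(2)} a_3^{(2)} \right)$$

If layer $j$ has $s_j$ units and layer $j + 1$ has $s_{j+1}$ units, then $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j + 1)$.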
[Figure: neural network. Layer 1 (input $x$): bias unit $x_0$ and input units $x_1, x_2, x_3$. Layer 2 (hidden): bias unit $a_0^{(2)}$ and units $a_1^{(2)}, a_2^{(2)}, a_3^{(2)}$. Layer $n$ (output $y$).]
---
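Forward propagation in vector form, from the input $x$ to the output $y$, with activations $a$ and weighted sums $z$:

$$x = a^{(1)}, \qquad z^{(i+1)} = \Theta^{(i)} a^{(i)} \ \text{ with } a_0^{(i)} = 1, \qquad a^{(i+1)} = g(z^{(i+1)}), \qquad h_\Theta(x) = a_1^{(n)}$$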
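[Figure: binary inputs $x_1$ and $x_2$, each taking the values 0 and 1.]

Sigmoid $g$: $g(x) \to 0$ as $x \to -\infty$ and $g(x) \to 1$ as $x \to \infty$; $g(-4.6) \approx 0.01$ and $g(4.6) \approx 0.99$.

AND:

$$\Theta_{AND}^{(1)} = \begin{bmatrix} -30 & +20 & +20 \end{bmatrix}, \qquad h_{\Theta_{AND}}(x) = g(-30 + 20 x_1 + 20 x_2)$$

$x_1$ | $x_2$ | $h_{\Theta_{AND}}(x)$ |
---|---|---|
0 | 0 | $g(-30) \approx 0$ |
0 | 1 | $g(-10) \approx 0$ |
1 | 0 | $g(-10) \approx 0$ |
1 | 1 | $g(10) \approx 1$ |

OR:

$$\Theta_{OR}^{(1)} = \begin{bmatrix} -10 & +20 & +20 \end{bmatrix}, \qquad h_{\Theta_{OR}}(x) = g(-10 + 20 x_1 + 20 x_2)$$

$x_1$ | $x_2$ | $h_{\Theta_{OR}}(x)$ |
---|---|---|
0 | 0 | $g(-10) \approx 0$ |
0 | 1 | $g(10) \approx 1$ |
1 | 0 | $g(10) \approx 1$ |
1 | 1 | $g(30) \approx 1$ |

NOT:

$$\Theta_{NOT}^{(1)} = \begin{bmatrix} +10 & -20 \end{bmatrix}, \qquad h_{\Theta_{NOT}}(x) = g(10 - 20 x_1)$$

$x_1$ | $h_{\Theta_{NOT}}(x)$ |
---|---|
0 | $g(10) \approx 1$ |
1 | $g(-10) \approx 0$ |

(NOT $x_1$) AND (NOT $x_2$):

$$\Theta_{NOT\,AND\,NOT}^{(1)} = \begin{bmatrix} +10 & -20 & -20 \end{bmatrix}, \qquad h_{\Theta_{NOT\,AND\,NOT}}(x) = g(10 - 20 x_1 - 20 x_2)$$

$x_1$ | $x_2$ | $h_{\Theta_{NOT\,AND\,NOT}}(x)$ |
---|---|---|
0 | 0 | $g(10) \approx 1$ |
0 | 1 | $g(-10) \approx 0$ |
1 | 0 | $g(-10) \approx 0$ |
1 | 1 | $g(-30) \approx 0$ |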
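[Figure: XNOR network. Layer 1 (input $x$): bias unit and input units $x_1, x_2$. Layer 2 (hidden): bias unit, $a_1^{(2)}$ computing AND, and $a_2^{(2)}$ computing (NOT $x_1$) AND (NOT $x_2$). Layer $n$ (output $y$): $a_1^{(3)} = h_\Theta(x)$ computing OR.]

$$\Theta_{XNOR}^{(1)} = \begin{bmatrix} -30 & +20 & +20 \\ +10 & -20 & -20 \end{bmatrix}, \qquad \Theta_{XNOR}^{(2)} = \begin{bmatrix} -10 & +20 & +20 \end{bmatrix}$$

$$h_{\Theta_{XNOR}}(x) = g(-10 + 20 a_1^{(2)} + 20 a_2^{(2)})$$

$x_1$ | $x_2$ | $a_1^{(2)}$ | $a_2^{(2)}$ | $h_{\Theta_{XNOR}}(x)$ |
---|---|---|---|---|
0 | 0 | 0 | 1 | $g(10) \approx 1$ |
0 | 1 | 0 | 0 | $g(-10) \approx 0$ |
1 | 0 | 0 | 0 | $g(-10) \approx 0$ |
1 | 1 | 1 | 0 | $g(10) \approx 1$ |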
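The XNOR network can be checked by running forward propagation with the weights given above. A minimal sketch; the helper names `layer` and `xnor` are hypothetical.

```python
import math


def g(z):
    return 1.0 / (1.0 + math.exp(-z))


# Weights from the notes: rows are units, column 0 multiplies the bias unit.
THETA1 = [[-30.0, 20.0, 20.0],   # a1 = x1 AND x2
          [10.0, -20.0, -20.0]]  # a2 = (NOT x1) AND (NOT x2)
THETA2 = [[-10.0, 20.0, 20.0]]   # output = a1 OR a2


def layer(theta, activations):
    """Prepend the bias unit (value 1) and apply g to each weighted sum."""
    with_bias = [1.0] + list(activations)
    return [g(sum(w * a for w, a in zip(row, with_bias))) for row in theta]


def xnor(x1, x2):
    hidden = layer(THETA1, [x1, x2])
    return layer(THETA2, hidden)[0]


truth_table = {(x1, x2): round(xnor(x1, x2)) for x1 in (0, 1) for x2 in (0, 1)}
```

Rounding the output reproduces the truth table: the network answers 1 exactly when $x_1 = x_2$.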
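Multiclass classification with $n$ classes: $h_\Theta(x) \in \mathbb{R}^n$ and $y \in \mathbb{R}^n$, where $y$ has a single $1$ in the position of the correct class:

$$y = \begin{bmatrix} 0 & \cdots & 1 & \cdots & 0 \end{bmatrix}^\top$$

The network is trained so that

$$h_\Theta(x) \approx \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^\top \text{ for class 1}, \qquad h_\Theta(x) \approx \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \end{bmatrix}^\top \text{ for class 2}, \qquad \dots$$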
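Notation: $L$ is the number of layers, $s_l$ the number of units in layer $l$ (bias unit excluded), and $K$ the number of output units, so that $h_\Theta(x) \in \mathbb{R}^K$ and $h_\Theta(x)_k$ is the $k$-th output.

Cost function for the weights $\Theta$:

$$J(\Theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log(h_\Theta(x^{(i)})_k) + (1 - y_k^{(i)}) \log(1 - h_\Theta(x^{(i)})_k) \right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}} \left( \Theta_{ji}^{(l)} \right)^2$$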
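The weights of the bias units (index $0$) are not regularized.

Forward propagation, with $a_j^{(l)}$ the activation of unit $j$ in layer $l$:

$$a^{(1)} = x, \qquad z^{(2)} = \Theta^{(1)} a^{(1)}, \qquad a^{(2)} = g(z^{(2)}), \qquad \dots, \qquad z^{(L)} = \Theta^{(L-1)} a^{(L-1)}, \qquad a^{(L)} = h_\Theta(x) = g(z^{(L)})$$

Backpropagation computes $\delta_j^{(l)}$, the error of unit $j$ in layer $l$ ($.*$ is the element-wise product):

$$\delta_j^{(L)} = a_j^{(L)} - y_j$$

$$\delta^{(L-1)} = (\Theta^{(L-1)})^\top \delta^{(L)} \ .\!* \ g'(z^{(L-1)})$$

$$\vdots$$

$$\delta^{(2)} = (\Theta^{(2)})^\top \delta^{(3)} \ .\!* \ g'(z^{(2)})$$

where $g'(z^{(l)}) = a^{(l)} \ .\!* \ (1 - a^{(l)})$ for the sigmoid $g$. There is no $\delta^{(1)}$.

Ignoring regularization ($\lambda = 0$):

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = a_j^{(l)} \delta_i^{(l+1)}$$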
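Backpropagation algorithm on the training set $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$:

Set $\Delta_{ij}^{(l)} := 0$ for all $l, i, j$.

For $i = 1, \dots, m$:

$a^{(1)} := x^{(i)}$

compute $a^{(2)}, a^{(3)}, \dots, a^{(L)}$ by forward propagation

$\delta^{(L)} := a^{(L)} - y^{(i)}$

compute $\delta^{(L-1)}, \dots, \delta^{(2)}$

$\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$

After the loop:

$$D_{ij}^{(l)} := \frac{1}{m} \left( \Delta_{ij}^{(l)} + \lambda \Theta_{ij}^{(l)} \right) \quad \text{if } j \ne 0, \qquad D_{ij}^{(l)} := \frac{1}{m} \Delta_{ij}^{(l)} \quad \text{if } j = 0$$

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = D_{ij}^{(l)}$$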
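The algorithm above can be sketched in plain Python and compared against a numerical derivative of $J(\Theta)$. The comparison (gradient checking), the fixed starting weights, the tiny 2-2-1 network, and all function names are assumptions added for illustration; the gradient is scaled by $1/m$ so that it matches a cost with the $\frac{1}{m}$ and $\frac{\lambda}{2m}$ factors.

```python
import math


def g(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Activations of every layer; all layers except the output carry a_0 = 1."""
    acts = [[1.0] + list(x)]
    for l, theta in enumerate(thetas):
        a = [g(sum(w * v for w, v in zip(row, acts[-1]))) for row in theta]
        acts.append(a if l == len(thetas) - 1 else [1.0] + a)
    return acts


def cost(thetas, X, Y, lam):
    m = len(X)
    total = 0.0
    for x, y in zip(X, Y):
        out = forward(thetas, x)[-1]
        total -= sum(yk * math.log(hk) + (1 - yk) * math.log(1 - hk)
                     for yk, hk in zip(y, out))
    reg = sum(w * w for theta in thetas for row in theta for w in row[1:])
    return total / m + lam * reg / (2 * m)


def backprop(thetas, X, Y, lam):
    m = len(X)
    Delta = [[[0.0] * len(row) for row in theta] for theta in thetas]
    for x, y in zip(X, Y):
        acts = forward(thetas, x)
        delta = [a - yk for a, yk in zip(acts[-1], y)]  # delta^(L)
        for l in reversed(range(len(thetas))):
            for i, d in enumerate(delta):
                for j, a in enumerate(acts[l]):
                    Delta[l][i][j] += a * d
            if l > 0:
                # Error of the previous layer; the bias entry (j = 0) is dropped.
                a_l = acts[l]
                delta = [sum(thetas[l][i][j] * delta[i] for i in range(len(delta)))
                         * a_l[j] * (1 - a_l[j]) for j in range(1, len(a_l))]
    return [[[(Delta[l][i][j] + (lam * w if j != 0 else 0.0)) / m
              for j, w in enumerate(row)]
             for i, row in enumerate(theta)]
            for l, theta in enumerate(thetas)]


def numerical_gradient(thetas, X, Y, lam, eps=1e-5):
    grads = [[[0.0] * len(row) for row in theta] for theta in thetas]
    for l, theta in enumerate(thetas):
        for i, row in enumerate(theta):
            for j in range(len(row)):
                original = row[j]
                row[j] = original + eps
                plus = cost(thetas, X, Y, lam)
                row[j] = original - eps
                minus = cost(thetas, X, Y, lam)
                row[j] = original
                grads[l][i][j] = (plus - minus) / (2 * eps)
    return grads


# Fixed small weights (instead of random ones) keep the example reproducible.
thetas = [[[0.1, -0.2, 0.3], [-0.3, 0.2, 0.1]], [[0.2, -0.1, 0.3]]]
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[1], [0], [0], [1]]
analytic = backprop(thetas, X, Y, lam=0.5)
numeric = numerical_gradient(thetas, X, Y, lam=0.5)
max_diff = max(abs(a - b)
               for la, lb in zip(analytic, numeric)
               for ra, rb in zip(la, lb)
               for a, b in zip(ra, rb))
```

If the two gradients disagree by more than a very small amount, the backpropagation code is likely wrong; here `max_diff` should be tiny.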
The weights $\Theta$ are not initialized to $0$ but to random values in $[-\epsilon, +\epsilon]$.
1
4
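Errors used to evaluate a hypothesis $h$: the training error $J_{train}(\Theta)$, the cross-validation error $J_{cv}(\Theta)$, and the test error $J_{test}(\Theta)$.

[Figures: $J_{train}(\Theta)$ and $J_{cv}(\Theta)$ plotted against the polynomial degree $d$, the regularization parameter $\lambda$, and the number of training examples $m$.]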
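$$\text{accuracy} = \frac{\#\text{CorrectlyClassified}}{\#\text{Classified}}$$

Skewed classes: if only $0.5\%$ of the examples have $y = 1$, a classifier that always predicts $0$ reaches accuracy $= 99.5\%$.

Precision, over the examples predicted as $1$:

$$\text{precision} = \frac{\#\text{TruePositive}}{\#\text{PredictedPositive}} = \frac{\#\text{TruePositive}}{\#\text{TruePositive} + \#\text{FalsePositive}}$$

Recall, over the examples that really are $1$:

$$\text{recall} = \frac{\#\text{TruePositive}}{\#\text{TruePositive} + \#\text{FalseNegative}} = \frac{\#\text{TruePositive}}{\#\text{ActualPositive}}$$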
0.9 0.4
0.6
$$F_\beta = \frac{(1 + \beta^2) P R}{\beta^2 P + R}$$

With $\beta = 1$:

$$F_1 = \frac{2 P R}{P + R}$$
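Precision, recall, and $F_\beta$ can be computed from two label lists as follows. A minimal sketch: the function names and the example labels are hypothetical, and returning $0$ when a denominator is zero is an assumed convention.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # assumed convention
    recall = tp / (tp + fn) if tp + fn else 0.0     # assumed convention
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) P R / (beta^2 P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]  # 2 true positives, 1 false positive, 1 false negative
P, R = precision_recall(y_true, y_pred)
F1 = f_beta(P, R)
```

A classifier that always predicts $0$ gets recall $0$ and therefore $F_1 = 0$, even when its accuracy is high.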
0.1
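Support vector machines start from the logistic-regression $\mathrm{Cost}$ for an example $(x, y)$:

$$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & y = 1 \\ -\log(1 - h_\theta(x)) & y = 0 \end{cases}$$

In $J$, the branch for $y = 1$ is replaced by $\mathrm{Cost}_1$ and the branch for $y = 0$ by $\mathrm{Cost}_0$.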
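$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \mathrm{Cost}_1(\theta^\top x^{(i)}) + (1 - y^{(i)}) \mathrm{Cost}_0(\theta^\top x^{(i)}) \right] + \frac{\lambda}{2m} \sum_{i=1}^{n} \theta_i^2$$

This $J$ has the form $A + \lambda B$, where $\lambda$ weights the regularization term $B$. The SVM uses the form $CA + B$ instead, where $C$ plays the role of $1/\lambda$ and the constant factor $\frac{1}{m}$ is dropped:

$$J(\theta) = C \sum_{i=1}^{m} \left[ y^{(i)} \mathrm{Cost}_1(\theta^\top x^{(i)}) + (1 - y^{(i)}) \mathrm{Cost}_0(\theta^\top x^{(i)}) \right] + \frac{1}{2} \sum_{i=1}^{n} \theta_i^2$$

Hypothesis $h$:

$$h_\theta(x) = \begin{cases} 1 & \text{if } \theta^\top x \ge 0 \\ 0 & \text{if } \theta^\top x < 0 \end{cases}$$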
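With a very large $C$, minimizing $J$ drives the first term to $A = 0$. This requires $\theta^\top x^{(i)} \ge 1$ when $y^{(i)} = 1$ (so that $\mathrm{Cost}_1 = 0$) and $\theta^\top x^{(i)} \le -1$ when $y^{(i)} = 0$ (so that $\mathrm{Cost}_0 = 0$). The problem becomes

$$\min_\theta \frac{1}{2} \sum_{j=1}^{n} \theta_j^2 \quad \text{subject to} \quad \begin{cases} \theta^\top x^{(i)} \ge 1 & \text{if } y^{(i)} = 1 \\ \theta^\top x^{(i)} \le -1 & \text{if } y^{(i)} = 0 \end{cases}$$

Writing $\theta^\top x^{(i)} = p^{(i)} \lVert \theta \rVert$, where $p^{(i)}$ is the projection of $x^{(i)}$ onto $\theta$, the constraints become $p^{(i)} \lVert \theta \rVert \ge 1$ and $p^{(i)} \lVert \theta \rVert \le -1$: the larger $\lvert p^{(i)} \rvert$, the smaller $\lVert \theta \rVert$ can be.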
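Kernels: choose landmarks $l^{(i)}$ and replace the features $x_i$ with new features $f_i$, where $f_i$ measures the similarity between $x$ and the landmark $l^{(i)}$:

$$f_i = \mathrm{sim}(x, l^{(i)}) = \exp\left( -\frac{\lVert x - l^{(i)} \rVert^2}{2 \sigma^2} \right)$$

If $x \approx l^{(1)}$ then $f_1 \approx 1$; if $x$ is far from $l^{(1)}$ then $f_1 \approx 0$.

Predict $y = 1$ when $\theta_0 + \theta_1 f_1 + \dots + \theta_n f_n \ge 0$, with the $f_i$ in place of the $x_i$.

For each training example $x^{(i)}$, $f^{(i)}$ denotes its vector of kernel features.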
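The Gaussian similarity and the resulting feature vector can be sketched as follows. The landmark coordinates, the value of $\sigma$, and the function names are hypothetical.

```python
import math


def gaussian_kernel(x, landmark, sigma):
    """sim(x, l) = exp(-||x - l||^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def kernel_features(x, landmarks, sigma):
    """Feature vector f with f0 = 1 followed by one similarity per landmark."""
    return [1.0] + [gaussian_kernel(x, l, sigma) for l in landmarks]


landmarks = [[0.0, 0.0], [3.0, 4.0]]  # hypothetical landmarks
f = kernel_features([0.0, 0.0], landmarks, sigma=1.0)
```

The point coincides with the first landmark, so $f_1 = 1$; it lies at distance 5 from the second, so $f_2 = e^{-12.5}$, which is close to $0$.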
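SVM cost function with kernels:

$$J(\theta) = C \sum_{i=1}^{m} \left[ y^{(i)} \mathrm{Cost}_1(\theta^\top f^{(i)}) + (1 - y^{(i)}) \mathrm{Cost}_0(\theta^\top f^{(i)}) \right] + \frac{1}{2} \sum_{i=1}^{n} \theta_i^2$$

With $n$ landmarks, $f \in \mathbb{R}^{n+1}$ with $f_0 = 1$.

Implementations often replace $\theta^\top \theta$ with $\theta^\top M \theta$, where $M$ is a matrix that depends on the kernel.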
Parameters to choose: $C$, which plays the role of $1/\lambda$, and $\sigma$.
50000+ 10 − 10000
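Stochastic gradient descent updates $\theta$ using one example at a time:

$$\mathrm{cost}(\theta, (x^{(i)}, y^{(i)})) = \frac{1}{2} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2, \qquad J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(\theta, (x^{(i)}, y^{(i)}))$$

$$\theta_k := \theta_k - \alpha \frac{\partial}{\partial \theta_k} \mathrm{cost}(\theta, (x^{(i)}, y^{(i)})) = \theta_k - \alpha \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_k^{(i)}$$

Algorithm: for $i := 1, \dots, m$ and for $j := 0, \dots, n$:

$$\theta_j := \theta_j - \alpha \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x_j^{(i)}$$

Mini-batch gradient descent uses $b$ examples per update ($b = m$ is batch gradient descent, $b = 1$ is stochastic gradient descent): for $i := 1, 1 + b, 1 + 2b, \dots$ and for $j := 0, \dots, n$:

$$\theta_j := \theta_j - \alpha \frac{1}{b} \sum_{k=i}^{i+b-1} \left( h_\theta(x^{(k)}) - y^{(k)} \right) \cdot x_j^{(k)}$$

Convergence check: instead of $J(\theta)$, plot $\mathrm{cost}(\theta, (x^{(i)}, y^{(i)}))$ averaged over the last $t$ examples, for example $t = 1000$ or $t = 5000$.

To help $\theta$ settle, $\alpha$ can be decreased over time, with constants $\text{const1}$ and $\text{const2}$:

$$\alpha = \frac{\text{const1}}{\text{iterationNumber} + \text{const2}}$$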
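The mini-batch update and the decaying learning rate can be combined in one short routine. This is a sketch under stated assumptions: the function name, the toy data, and the constants are hypothetical, and the examples are visited in a fixed order (shuffling is omitted to keep the run deterministic).

```python
def minibatch_gradient_descent(X, y, b, const1, const2, epochs):
    """Mini-batch gradient descent for linear regression.

    b = 1 gives stochastic gradient descent, b = len(X) gives batch
    gradient descent. alpha = const1 / (iteration + const2).
    """
    theta = [0.0] * len(X[0])
    iteration = 0
    for _ in range(epochs):
        for start in range(0, len(X), b):
            batch_X = X[start:start + b]
            batch_y = y[start:start + b]
            alpha = const1 / (iteration + const2)
            errors = [sum(t * v for t, v in zip(theta, xi)) - yi
                      for xi, yi in zip(batch_X, batch_y)]
            # Simultaneous update of every theta_j from the same errors.
            theta = [t - alpha * sum(e * xi[j] for e, xi in zip(errors, batch_X)) / len(batch_X)
                     for j, t in enumerate(theta)]
            iteration += 1
    return theta


X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # x0 = 1 in every row
y = [1.0, 3.0, 5.0, 7.0]                               # generated from y = 1 + 2x
theta_sgd = minibatch_gradient_descent(X, y, b=1, const1=100.0, const2=1000.0, epochs=2000)
theta_mini = minibatch_gradient_descent(X, y, b=2, const1=100.0, const2=1000.0, epochs=2000)
```

With these constants $\alpha$ starts at $0.1$ and shrinks as the iteration count grows; on this noise-free data both runs approach $\theta = (1, 2)$.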
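K-means. Input: the number of clusters $K$ and the training set $\{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$ with $x^{(i)} \in \mathbb{R}^n$ (the convention $x_0 = 1$ is dropped).

Initialize $K$ centroids $\mu_1, \mu_2, \dots, \mu_K \in \mathbb{R}^n$, then repeat:

for $i = 1, \dots, m$, assign $x^{(i)}$ to the closest centroid:

$$c^{(i)} := \arg\min_k \lVert x^{(i)} - \mu_k \rVert^2$$

for $k = 1, \dots, K$, move the centroid to the mean of its points:

$$\mu_k := \frac{1}{z} \sum_{i \,:\, c^{(i)} = k} x^{(i)}, \qquad z = \mathrm{num}(c^{(i)} = k)$$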
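The two alternating steps can be sketched in plain Python. The function name, the fixed starting centroids, the iteration count, and the choice to leave an empty cluster's centroid unchanged are assumptions made for this illustration.

```python
def kmeans(points, initial_centroids, iterations=10):
    """Alternate the cluster-assignment step and the centroid-update step."""
    centroids = [list(c) for c in initial_centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: c(i) = index of the closest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: mu_k = mean of the points assigned to cluster k.
        for k in range(len(centroids)):
            members = [p for p, c in zip(points, assignments) if c == k]
            if members:  # assumed convention: an empty cluster keeps its centroid
                z = len(members)
                centroids[k] = [sum(coords) / z for coords in zip(*members)]
    return centroids, assignments


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids, assignments = kmeans(points, initial_centroids=[[0.0, 0.0], [10.0, 10.0]])
```

On these two well-separated groups the assignments settle after the first pass and each centroid ends at the mean of its group.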
1
K...