
Linear Associative Memory

Ruy Luiz Milidiú

Linear Regression
Objective: examine the linear associative memory model, its advantages and limitations

Outline: simple linear memory; multiple linear memory; multiple multiple linear memories; cross-validation

Simple Linear Memory
Examples: (x1, y1), (x2, y2), …, (xn, yn), with xi real and yi real

Linear neuron: ŷ = w0 + w1·x; w0, w1 = ?

Performance: E = Error = (ŷ1 – y1)² + … + (ŷn – yn)²

Example: [scatter plot of the data, y versus x; x from 0 to 15, y from 0 to 20]

Example: [second plot of the same data, y versus x; x from 0 to 15, y from 0 to 20]

Supervised Learning

Minimize the error: E(w0, w1) = (ŷ1 – y1)² + … + (ŷn – yn)²

E(w0, w1) = Σi (w0 + w1·xi – yi)²

Differentiating:
2·(w0 + w1·x1 – y1) + … + 2·(w0 + w1·xn – yn) = 0
2·x1·(w0 + w1·x1 – y1) + … + 2·xn·(w0 + w1·xn – yn) = 0

Normal Equations
A system of linear equations:
(w0 + w1·x1 – y1) + … + (w0 + w1·xn – yn) = 0
x1·(w0 + w1·x1 – y1) + … + xn·(w0 + w1·xn – yn) = 0

n·w0 + (x1 + … + xn)·w1 = y1 + … + yn
(x1 + … + xn)·w0 + (x1² + … + xn²)·w1 = x1·y1 + … + xn·yn

Normal Equations
Solution by substitution:
w0 = (y1 + … + yn)/n − [(x1 + … + xn)/n]·w1

(x1 + … + xn)·{(y1 + … + yn)/n − [(x1 + … + xn)/n]·w1} + (x1² + … + xn²)·w1 = x1·y1 + … + xn·yn

w1 = A / B, where
A = x1·y1 + … + xn·yn − (x1 + … + xn)·(y1 + … + yn)/n
B = (x1² + … + xn²) − (x1 + … + xn)·(x1 + … + xn)/n

w0 = (y1 + … + yn)/n − [(x1 + … + xn)/n]·(A/B)
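A minimal sketch of these closed-form formulas in Python with NumPy; the variable names w0, w1, A, B follow the slides, and the data is made up for illustration:

```python
import numpy as np

# Toy data (illustrative only)
x = np.array([1.0, 2.0, 4.0, 6.0, 9.0, 12.0])
y = np.array([2.1, 3.9, 7.2, 10.1, 15.0, 19.8])
n = len(x)

# w1 = A / B, as derived by substitution in the normal equations
A = np.sum(x * y) - np.sum(x) * np.sum(y) / n
B = np.sum(x ** 2) - np.sum(x) * np.sum(x) / n
w1 = A / B
w0 = np.mean(y) - np.mean(x) * w1

y_hat = w0 + w1 * x
error = np.sum((y_hat - y) ** 2)   # E = (ŷ1 – y1)² + … + (ŷn – yn)²
print(w0, w1, error)
```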

Normal Equations (matrix form):

\[
\begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}
\begin{pmatrix} w_0 \\ w_1 \end{pmatrix}
=
\begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix}
\]

Multiple Linear Memory
Examples: (x1, y1), (x2, y2), …, (xn, yn), with xi ∈ Rk and yi real

Linear neuron: ŷ = wᵀ·x; w = ?

Performance: E = Error = (ŷ1 – y1)² + … + (ŷn – yn)²

Supervised Learning

E(w) = Σi (wᵀ·xi – yi)²

E(w) = vᵀ·v, where vi = wᵀ·xi – yi = xiᵀ·w – yi

E(w) = Σi (xiᵀ·w – yi)ᵀ·(xiᵀ·w – yi)
E(w) = Σi yi² – 2·Σi yi·xiᵀ·w + wᵀ·(Σi xi·xiᵀ)·w

With X = [x1, …, xn]:
E(w) = yᵀ·y – 2·(X·y)ᵀ·w + wᵀ·X·Xᵀ·w

∂E/∂w = 2·X·Xᵀ·w – 2·X·y = 0

Supervised Learning

X·Xᵀ·w = X·y (normal equation)

w = (X·Xᵀ)⁻¹·X·y
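A sketch of this batch solution in Python with NumPy. Here X is k×n with one example per column, as in the slides; np.linalg.solve is used instead of forming the inverse explicitly, which is numerically preferable but equivalent to w = (XXᵀ)⁻¹Xy. The data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 50
X = rng.normal(size=(k, n))            # one example per column: X = [x1, ..., xn]
w_true = np.array([1.0, -2.0, 0.5])
y = X.T @ w_true + 0.01 * rng.normal(size=n)

# Normal equation: (X Xᵀ) w = X y
w = np.linalg.solve(X @ X.T, X @ y)
print(w)                               # close to w_true
```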

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X(n) = [x1, …, xn]
X(n)·X(n)ᵀ = [x1, …, xn]·[x1, …, xn]ᵀ
X(n)·X(n)ᵀ = Σi xi·xiᵀ
X(n)·X(n)ᵀ = X(n-1)·X(n-1)ᵀ + xn·xnᵀ

[X(n)·X(n)ᵀ]⁻¹ = [X(n-1)·X(n-1)ᵀ + xn·xnᵀ]⁻¹

Incremental Inverse
(A + x·xᵀ)⁻¹ = ?, with Aᵀ = A

A + x·xᵀ = (I + x·xᵀ·A⁻¹)·A

(I + x·xᵀ·A⁻¹)⁻¹ = ?
(I + x·vᵀ)⁻¹ = ?, where v = A⁻¹·x
(I + U)⁻¹ = I – U + U² – U³ + …
(x·vᵀ)² = c·x·vᵀ, where c = vᵀ·x
(x·vᵀ)ʳ = cʳ⁻¹·x·vᵀ, r = 1, 2, …
(I + x·vᵀ)⁻¹ = I – (1 + c)⁻¹·x·vᵀ

Incremental Inverse

(A + x·xᵀ)⁻¹ = A⁻¹ – (1 + c)⁻¹·v·vᵀ

where v = A⁻¹·x and c = vᵀ·x
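A small numerical check of this rank-one update (a special case of the Sherman-Morrison formula), assuming a symmetric positive-definite A; the matrices are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
M = rng.normal(size=(k, k))
A = M @ M.T + np.eye(k)           # symmetric positive-definite A
x = rng.normal(size=k)

A_inv = np.linalg.inv(A)
v = A_inv @ x                     # v = A⁻¹ x
c = v @ x                         # c = vᵀ x

incremental = A_inv - np.outer(v, v) / (1.0 + c)   # (A + x xᵀ)⁻¹ via the update
direct = np.linalg.inv(A + np.outer(x, x))
print(np.allclose(incremental, direct))            # True
```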

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X(n) = [x1, …, xn] = [X(n-1), xn]
X(n)·X(n)ᵀ = ε·I + Σi xi·xiᵀ (ε → 0)
X(n)·X(n)ᵀ = A + xn·xnᵀ, with vn = A⁻¹·xn

w(n) = {A⁻¹ – (1 + xnᵀ·vn)⁻¹·vn·vnᵀ}·(X(n-1)·y(n-1) + yn·xn)

A⁻¹·X(n-1)·y(n-1) = w(n-1)
A⁻¹·yn·xn = yn·vn
vnᵀ·X(n-1)·y(n-1) = xnᵀ·A⁻¹·X(n-1)·y(n-1) = xnᵀ·w(n-1)
vnᵀ·yn·xn = yn·xnᵀ·vn

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X·Xᵀ = [x1, …, xn]·[x1, …, xn]ᵀ
X·Xᵀ = ε·I + Σi xi·xiᵀ (ε → 0)
X·Xᵀ = A + xn·xnᵀ, with vn = A⁻¹·xn

w(n) = w(n-1) + (yn – w(n-1)ᵀ·xn)·(1 + xnᵀ·vn)⁻¹·vn

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X·Xᵀ = [x1, …, xn]·[x1, …, xn]ᵀ
X·Xᵀ = ε·I + Σi xi·xiᵀ (ε → 0)
X·Xᵀ = A + xn·xnᵀ, with vn = A⁻¹·xn

w ← w + (yn – wᵀ·xn)·(1 + xnᵀ·vn)⁻¹·vn
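A sketch of this fast (recursive) update in Python, maintaining A⁻¹ with the incremental-inverse formula so each new example costs O(k²) instead of a full re-solve. The epsilon initializer and the function name are assumptions for the sketch; the data is synthetic:

```python
import numpy as np

def fast_adaptation(X, y, epsilon=1e-6):
    """Recursive update of w = (X Xᵀ)⁻¹ X y, one example (column of X) at a time."""
    k = X.shape[0]
    A_inv = np.eye(k) / epsilon                   # inverse of A = ε·I
    w = np.zeros(k)
    for xn, yn in zip(X.T, y):
        vn = A_inv @ xn                           # vn = A⁻¹ xn
        gain = vn / (1.0 + xn @ vn)               # (1 + xnᵀ vn)⁻¹ · vn
        w = w + (yn - w @ xn) * gain              # w ← w + (yn – wᵀxn)·gain
        A_inv = A_inv - np.outer(gain, vn)        # incremental inverse of A + xn xnᵀ
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 200))
y = X.T @ np.array([1.0, -2.0, 0.5])
print(fast_adaptation(X, y))   # close to the batch solution (X Xᵀ)⁻¹ X y
```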

Slow Adaptation
E(w) = Σi (wᵀ·xi – yi)²

E(n)(w) = E(n-1)(w) + (wᵀ·xn – yn)²

∂E(n)/∂w = ∂E(n-1)/∂w + 2·(wᵀ·xn – yn)·xn
∂E(n)/∂w ≈ 0 + 2·(wn-1ᵀ·xn – yn)·xn
∂E(n)/∂w ≈ 2·(wn-1ᵀ·xn – yn)·xn

wn = wn-1 + η·(yn – wn-1ᵀ·xn)·xn

ADALINE
E(w) = Σi (wᵀ·xi – yi)²

ADAptive LInear NEuron: slow adaptation; gradient method; online learning, one example at a time; learning distributed over the vectors

wn = wn-1 + η·(yn – wn-1ᵀ·xn)·xn
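A sketch of the ADALINE (LMS) rule in Python; eta is the learning rate η, whose value the slides do not fix, and the data is synthetic:

```python
import numpy as np

def adaline(X, y, eta=0.01, epochs=20):
    """Online gradient (LMS) updates: wn = wn-1 + η·(yn – wn-1ᵀxn)·xn."""
    w = np.zeros(X.shape[0])
    for _ in range(epochs):
        for xn, yn in zip(X.T, y):
            w = w + eta * (yn - w @ xn) * xn
    return w

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 200))
y = X.T @ np.array([1.0, -2.0, 0.5])
print(adaline(X, y))   # slowly approaches the least-squares solution
```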

Multiple Multiple Linear Memories
Examples: (x1, y1), (x2, y2), …, (xn, yn), with xi ∈ Rk and yi ∈ Rs

Linear neuron: ŷ = W·x; W = ?

Optimal Memory

W = Y·Xᵀ·(X·Xᵀ)⁻¹
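A sketch of the multi-output solution in Python, assuming Y holds one target vector per column, mirroring X; the data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)
k, s, n = 3, 2, 100
X = rng.normal(size=(k, n))                        # inputs, one example per column
W_true = rng.normal(size=(s, k))
Y = W_true @ X + 0.01 * rng.normal(size=(s, n))    # targets, one per column

# Optimal memory: W = Y Xᵀ (X Xᵀ)⁻¹
W = Y @ X.T @ np.linalg.inv(X @ X.T)
print(np.allclose(W, W_true, atol=0.05))           # True
```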

Polynomial regression as a linear problem: fit

\[
f : \mathbb{R} \to \mathbb{R}, \qquad f(x; w) = w_0 + w_1 x + w_2 x^2 + \dots + w_m x^m
\]

by least squares,

\[
w = (X^{T} X)^{-1} X^{T} y,
\]

where

\[
X = \begin{pmatrix}
1 & x^{(1)} & (x^{(1)})^2 & \cdots & (x^{(1)})^m \\
1 & x^{(2)} & (x^{(2)})^2 & \cdots & (x^{(2)})^m \\
\vdots & & & & \vdots \\
1 & x^{(n)} & (x^{(n)})^2 & \cdots & (x^{(n)})^m
\end{pmatrix},
\qquad
y = \begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(n)} \end{pmatrix},
\qquad
w = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_m \end{pmatrix}.
\]
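A sketch of building this design matrix and fitting the polynomial in Python; np.vander builds the powers, and np.linalg.lstsq plays the role of (XᵀX)⁻¹Xᵀy. The helper name and the data are illustrative assumptions:

```python
import numpy as np

def fit_polynomial(x, y, m):
    """Least-squares fit of f(x; w) = w0 + w1·x + ... + wm·x^m."""
    X = np.vander(x, m + 1, increasing=True)      # columns 1, x, x², ..., x^m
    w, *_ = np.linalg.lstsq(X, y, rcond=None)     # solves min ||X w - y||²
    return w

x = np.linspace(-1.5, 1.5, 20)
y = 2.0 + 1.0 * x - 3.0 * x ** 2 + 0.3 * np.random.default_rng(5).normal(size=x.size)
print(fit_polynomial(x, y, 2))                    # ≈ [2.0, 1.0, -3.0]
```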

Polynomial Regression

f(x; w) = w0 + w1·x
f(x; w) = w0 + w1·x + w2·x² + w3·x³
f(x; w) = w0 + w1·x + w2·x² + … + w5·x⁵
f(x; w) = w0 + w1·x + w2·x² + … + w10·x¹⁰

[Figure: fits of increasing polynomial order to the same data; x from −1.5 to 1.5, y from 0 to 8]

Regression with polynomials: fit improves with increased order.

We want to fit the training set, but as model complexity increases, we run the risk of over-fitting.

Training-set mean-squared error:

\[
\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - f\!\left(x^{(i)};\hat{w}\right)\right)^{2} \;\to\; 0 \quad \text{as the model order increases.}
\]

[Figure: two panels, "Train set" and "Leave out", comparing the fit with and without one data point; x from −1.5 to 1.5, y from 0 to 8]

When the model order is high enough to over-fit, leaving a single data point out of the training set can drastically change the fit.

Over-fitting

We want to fit the training set, but we also want to generalize correctly. To measure generalization, we leave out a data point (the test point), fit the remaining data, and then measure the error on the test point. The average error over all possible test points is the cross-validation error.

\[
CV = \frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - f\!\left(x^{(i)}; w^{(!i)}\right)\right)^{2}
\]

where w^(!i) are the weights estimated from a training set that does not include the i-th data point.
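A sketch of leave-one-out cross-validation over polynomial orders in Python, with data generated by a 2nd-order process as in the slides; the helper names are assumptions of the sketch:

```python
import numpy as np

def fit_polynomial(x, y, m):
    # least-squares polynomial fit with columns 1, x, ..., x^m (as in the earlier sketch)
    return np.linalg.lstsq(np.vander(x, m + 1, increasing=True), y, rcond=None)[0]

def loo_cv_error(x, y, m):
    """Leave-one-out cross-validation error for a polynomial of order m."""
    errors = []
    for i in range(len(x)):
        mask = np.arange(len(x)) != i            # training set without the i-th point
        w = fit_polynomial(x[mask], y[mask], m)  # weights w^(!i)
        pred = np.polyval(w[::-1], x[i])         # w0 + w1·x + ... + wm·x^m at x[i]
        errors.append((y[i] - pred) ** 2)
    return np.mean(errors)

rng = np.random.default_rng(6)
x = np.linspace(-1.5, 1.5, 25)
y = 1.0 + 0.5 * x + 2.0 * x ** 2 + 0.3 * rng.normal(size=x.size)
for m in range(1, 8):
    print(m, round(loo_cv_error(x, y, m), 3))    # minimum expected near m = 2
```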

Cross-Validation

[Figure: mean-squared error on the training set versus model order, and cross-validation error versus model order. The actual data was generated by a 2nd-order polynomial process.]

Model Selection
