
Linear Associative Memory

Ruy Luiz Milidiú

Linear Regression
Objective: examine the linear associative memory model, its advantages and limitations

Outline: simple linear memory; multiple linear memory; multiple multiple linear memories; cross-validation

Simple Linear Memory
Examples: (x1, y1), (x2, y2), …, (xn, yn), with xi real and yi real

Linear neuron: ŷ = w0 + w1·x; w0, w1 = ?

Performance: E = Error = (ŷ1 – y1)² + … + (ŷn – yn)²

Example: [scatter plot of the data, y versus x; x from 0 to 15, y from 0 to 20]

Example: [second plot of the same data, y versus x; x from 0 to 15, y from 0 to 20]

Supervised Learning

Minimize the error: E(w0, w1) = (ŷ1 – y1)² + … + (ŷn – yn)²

E(w0, w1) = Σi (w0 + w1·xi – yi)²

Differentiating:
2·(w0 + w1·x1 – y1) + … + 2·(w0 + w1·xn – yn) = 0
2·x1·(w0 + w1·x1 – y1) + … + 2·xn·(w0 + w1·xn – yn) = 0

Normal Equations
A system of linear equations:
(w0 + w1·x1 – y1) + … + (w0 + w1·xn – yn) = 0
x1·(w0 + w1·x1 – y1) + … + xn·(w0 + w1·xn – yn) = 0

n·w0 + (x1 + … + xn)·w1 = y1 + … + yn
(x1 + … + xn)·w0 + (x1² + … + xn²)·w1 = x1·y1 + … + xn·yn

Normal Equations
Solution by substitution:
w0 = (y1 + … + yn)/n − [(x1 + … + xn)/n]·w1

(x1 + … + xn)·{(y1 + … + yn)/n − [(x1 + … + xn)/n]·w1} + (x1² + … + xn²)·w1 = x1·y1 + … + xn·yn

w1 = A / B, where
A = x1·y1 + … + xn·yn − (x1 + … + xn)·(y1 + … + yn)/n
B = (x1² + … + xn²) − (x1 + … + xn)·(x1 + … + xn)/n

w0 = (y1 + … + yn)/n − [(x1 + … + xn)/n]·(A/B)
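A minimal sketch of these closed-form formulas in Python with NumPy; the variable names w0, w1, A, B follow the slides, and the data is made up for illustration:

```python
import numpy as np

# Toy data (illustrative only)
x = np.array([1.0, 2.0, 4.0, 6.0, 9.0, 12.0])
y = np.array([2.1, 3.9, 7.2, 10.1, 15.0, 19.8])
n = len(x)

# w1 = A / B, as derived by substitution in the normal equations
A = np.sum(x * y) - np.sum(x) * np.sum(y) / n
B = np.sum(x ** 2) - np.sum(x) * np.sum(x) / n
w1 = A / B
w0 = np.mean(y) - np.mean(x) * w1

y_hat = w0 + w1 * x
error = np.sum((y_hat - y) ** 2)   # E = (ŷ1 – y1)² + … + (ŷn – yn)²
print(w0, w1, error)
```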

Normal Equations (matrix form):

\[
\begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}
\begin{pmatrix} w_0 \\ w_1 \end{pmatrix}
=
\begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix}
\]

Multiple Linear Memory
Examples: (x1, y1), (x2, y2), …, (xn, yn), with xi ∈ Rk and yi real

Linear neuron: ŷ = wᵀ·x; w = ?

Performance: E = Error = (ŷ1 – y1)² + … + (ŷn – yn)²

Supervised Learning

E(w) = Σi (wᵀ·xi – yi)²

E(w) = vᵀ·v, where vi = wᵀ·xi – yi = xiᵀ·w – yi

E(w) = Σi (xiᵀ·w – yi)ᵀ·(xiᵀ·w – yi)
E(w) = Σi yi² – 2·Σi yi·xiᵀ·w + wᵀ·(Σi xi·xiᵀ)·w

With X = [x1, …, xn]:
E(w) = yᵀ·y – 2·(X·y)ᵀ·w + wᵀ·X·Xᵀ·w

∂E/∂w = 2·X·Xᵀ·w – 2·X·y = 0

Supervised Learning

X·Xᵀ·w = X·y (normal equation)

w = (X·Xᵀ)⁻¹·X·y
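A sketch of this batch solution in Python with NumPy. Here X is k×n with one example per column, as in the slides; np.linalg.solve is used instead of forming the inverse explicitly, which is numerically preferable but equivalent to w = (XXᵀ)⁻¹Xy. The data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 50
X = rng.normal(size=(k, n))            # one example per column: X = [x1, ..., xn]
w_true = np.array([1.0, -2.0, 0.5])
y = X.T @ w_true + 0.01 * rng.normal(size=n)

# Normal equation: (X Xᵀ) w = X y
w = np.linalg.solve(X @ X.T, X @ y)
print(w)                               # close to w_true
```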

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X(n) = [x1, …, xn]
X(n)·X(n)ᵀ = [x1, …, xn]·[x1, …, xn]ᵀ
X(n)·X(n)ᵀ = Σi xi·xiᵀ
X(n)·X(n)ᵀ = X(n-1)·X(n-1)ᵀ + xn·xnᵀ

[X(n)·X(n)ᵀ]⁻¹ = [X(n-1)·X(n-1)ᵀ + xn·xnᵀ]⁻¹

Incremental Inverse
(A + x·xᵀ)⁻¹ = ?, with Aᵀ = A

A + x·xᵀ = (I + x·xᵀ·A⁻¹)·A

(I + x·xᵀ·A⁻¹)⁻¹ = ?
(I + x·vᵀ)⁻¹ = ?, where v = A⁻¹·x
(I + U)⁻¹ = I – U + U² – U³ + …
(x·vᵀ)² = c·x·vᵀ, where c = vᵀ·x
(x·vᵀ)ʳ = cʳ⁻¹·x·vᵀ, r = 1, 2, …
(I + x·vᵀ)⁻¹ = I – (1 + c)⁻¹·x·vᵀ

Incremental Inverse

(A + x·xᵀ)⁻¹ = A⁻¹ – (1 + c)⁻¹·v·vᵀ

where v = A⁻¹·x and c = vᵀ·x
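A small numerical check of this rank-one update (a special case of the Sherman-Morrison formula), assuming a symmetric positive-definite A; the matrices are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
M = rng.normal(size=(k, k))
A = M @ M.T + np.eye(k)           # symmetric positive-definite A
x = rng.normal(size=k)

A_inv = np.linalg.inv(A)
v = A_inv @ x                     # v = A⁻¹ x
c = v @ x                         # c = vᵀ x

incremental = A_inv - np.outer(v, v) / (1.0 + c)   # (A + x xᵀ)⁻¹ via the update
direct = np.linalg.inv(A + np.outer(x, x))
print(np.allclose(incremental, direct))            # True
```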

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X(n) = [x1, …, xn] = [X(n-1), xn]
X(n)·X(n)ᵀ = ε·I + Σi xi·xiᵀ (ε → 0)
X(n)·X(n)ᵀ = A + xn·xnᵀ, with vn = A⁻¹·xn

w(n) = {A⁻¹ – (1 + xnᵀ·vn)⁻¹·vn·vnᵀ}·(X(n-1)·y(n-1) + yn·xn)

A⁻¹·X(n-1)·y(n-1) = w(n-1)
A⁻¹·yn·xn = yn·vn
vnᵀ·X(n-1)·y(n-1) = xnᵀ·A⁻¹·X(n-1)·y(n-1) = xnᵀ·w(n-1)
vnᵀ·yn·xn = yn·xnᵀ·vn

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X·Xᵀ = [x1, …, xn]·[x1, …, xn]ᵀ
X·Xᵀ = ε·I + Σi xi·xiᵀ (ε → 0)
X·Xᵀ = A + xn·xnᵀ, with vn = A⁻¹·xn

w(n) = w(n-1) + (yn – w(n-1)ᵀ·xn)·(1 + xnᵀ·vn)⁻¹·vn

Fast Adaptation
w = (X·Xᵀ)⁻¹·X·y

X·Xᵀ = [x1, …, xn]·[x1, …, xn]ᵀ
X·Xᵀ = ε·I + Σi xi·xiᵀ (ε → 0)
X·Xᵀ = A + xn·xnᵀ, with vn = A⁻¹·xn

w ← w + (yn – wᵀ·xn)·(1 + xnᵀ·vn)⁻¹·vn
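A sketch of this fast (recursive) update in Python, maintaining A⁻¹ with the incremental-inverse formula so each new example costs O(k²) instead of a full re-solve. The epsilon initializer and the function name are assumptions for the sketch; the data is synthetic:

```python
import numpy as np

def fast_adaptation(X, y, epsilon=1e-6):
    """Recursive update of w = (X Xᵀ)⁻¹ X y, one example (column of X) at a time."""
    k = X.shape[0]
    A_inv = np.eye(k) / epsilon                   # inverse of A = ε·I
    w = np.zeros(k)
    for xn, yn in zip(X.T, y):
        vn = A_inv @ xn                           # vn = A⁻¹ xn
        gain = vn / (1.0 + xn @ vn)               # (1 + xnᵀ vn)⁻¹ · vn
        w = w + (yn - w @ xn) * gain              # w ← w + (yn – wᵀxn)·gain
        A_inv = A_inv - np.outer(gain, vn)        # incremental inverse of A + xn xnᵀ
    return w

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 200))
y = X.T @ np.array([1.0, -2.0, 0.5])
print(fast_adaptation(X, y))   # close to the batch solution (X Xᵀ)⁻¹ X y
```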

Slow Adaptation
E(w) = Σi (wᵀ·xi – yi)²

E(n)(w) = E(n-1)(w) + (wᵀ·xn – yn)²

∂E(n)/∂w = ∂E(n-1)/∂w + 2·(wᵀ·xn – yn)·xn
∂E(n)/∂w ≈ 0 + 2·(wn-1ᵀ·xn – yn)·xn
∂E(n)/∂w ≈ 2·(wn-1ᵀ·xn – yn)·xn

wn = wn-1 + η·(yn – wn-1ᵀ·xn)·xn

ADALINE
E(w) = Σi (wᵀ·xi – yi)²

ADAptive LInear NEuron: slow adaptation; gradient method; online learning, one example at a time; learning distributed over the vectors

wn = wn-1 + η·(yn – wn-1ᵀ·xn)·xn
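A sketch of the ADALINE (LMS) rule in Python; eta is the learning rate η, whose value the slides do not fix, and the data is synthetic:

```python
import numpy as np

def adaline(X, y, eta=0.01, epochs=20):
    """Online gradient (LMS) updates: wn = wn-1 + η·(yn – wn-1ᵀxn)·xn."""
    w = np.zeros(X.shape[0])
    for _ in range(epochs):
        for xn, yn in zip(X.T, y):
            w = w + eta * (yn - w @ xn) * xn
    return w

rng = np.random.default_rng(3)
X = rng.normal(size=(3, 200))
y = X.T @ np.array([1.0, -2.0, 0.5])
print(adaline(X, y))   # slowly approaches the least-squares solution
```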

Multiple Multiple Linear Memories
Examples: (x1, y1), (x2, y2), …, (xn, yn), with xi ∈ Rk and yi ∈ Rs

Linear neuron: ŷ = W·x; W = ?

Optimal Memory

W = Y·Xᵀ·(X·Xᵀ)⁻¹
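A sketch of the multi-output solution in Python, assuming Y holds one target vector per column, mirroring X; the data is synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)
k, s, n = 3, 2, 100
X = rng.normal(size=(k, n))                        # inputs, one example per column
W_true = rng.normal(size=(s, k))
Y = W_true @ X + 0.01 * rng.normal(size=(s, n))    # targets, one per column

# Optimal memory: W = Y Xᵀ (X Xᵀ)⁻¹
W = Y @ X.T @ np.linalg.inv(X @ X.T)
print(np.allclose(W, W_true, atol=0.05))           # True
```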

Polynomial regression as a linear problem: fit

\[
f : \mathbb{R} \to \mathbb{R}, \qquad f(x; w) = w_0 + w_1 x + w_2 x^2 + \dots + w_m x^m
\]

by least squares,

\[
w = (X^{T} X)^{-1} X^{T} y,
\]

where

\[
X = \begin{pmatrix}
1 & x^{(1)} & (x^{(1)})^2 & \cdots & (x^{(1)})^m \\
1 & x^{(2)} & (x^{(2)})^2 & \cdots & (x^{(2)})^m \\
\vdots & & & & \vdots \\
1 & x^{(n)} & (x^{(n)})^2 & \cdots & (x^{(n)})^m
\end{pmatrix},
\qquad
y = \begin{pmatrix} y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(n)} \end{pmatrix},
\qquad
w = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_m \end{pmatrix}.
\]
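A sketch of building this design matrix and fitting the polynomial in Python; np.vander builds the powers, and np.linalg.lstsq plays the role of (XᵀX)⁻¹Xᵀy. The helper name and the data are illustrative assumptions:

```python
import numpy as np

def fit_polynomial(x, y, m):
    """Least-squares fit of f(x; w) = w0 + w1·x + ... + wm·x^m."""
    X = np.vander(x, m + 1, increasing=True)      # columns 1, x, x², ..., x^m
    w, *_ = np.linalg.lstsq(X, y, rcond=None)     # solves min ||X w - y||²
    return w

x = np.linspace(-1.5, 1.5, 20)
y = 2.0 + 1.0 * x - 3.0 * x ** 2 + 0.3 * np.random.default_rng(5).normal(size=x.size)
print(fit_polynomial(x, y, 2))                    # ≈ [2.0, 1.0, -3.0]
```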

Polynomial Regression

f(x; w) = w0 + w1·x
f(x; w) = w0 + w1·x + w2·x² + w3·x³
f(x; w) = w0 + w1·x + w2·x² + … + w5·x⁵
f(x; w) = w0 + w1·x + w2·x² + … + w10·x¹⁰

[Figure: fits of increasing polynomial order to the same data; x from −1.5 to 1.5, y from 0 to 8]

Regression with polynomials: fit improves with increased order.

We want to fit the training set, but as model complexity increases, we run the risk of over-fitting.

Training-set mean-squared error:

\[
\frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - f\!\left(x^{(i)};\hat{w}\right)\right)^{2} \;\to\; 0 \quad \text{as the model order increases.}
\]

[Figure: two panels, "Train set" and "Leave out", comparing the fit with and without one data point; x from −1.5 to 1.5, y from 0 to 8]

When the model order is high enough to over-fit, leaving a single data point out of the training set can drastically change the fit.

Over-fitting

We want to fit the training set, but we also want to generalize correctly. To measure generalization, we leave out a data point (the test point), fit the remaining data, and then measure the error on the test point. The average error over all possible test points is the cross-validation error.

\[
CV = \frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - f\!\left(x^{(i)}; w^{(!i)}\right)\right)^{2}
\]

where w^(!i) are the weights estimated from a training set that does not include the i-th data point.
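A sketch of leave-one-out cross-validation over polynomial orders in Python, with data generated by a 2nd-order process as in the slides; the helper names are assumptions of the sketch:

```python
import numpy as np

def fit_polynomial(x, y, m):
    # least-squares polynomial fit with columns 1, x, ..., x^m (as in the earlier sketch)
    return np.linalg.lstsq(np.vander(x, m + 1, increasing=True), y, rcond=None)[0]

def loo_cv_error(x, y, m):
    """Leave-one-out cross-validation error for a polynomial of order m."""
    errors = []
    for i in range(len(x)):
        mask = np.arange(len(x)) != i            # training set without the i-th point
        w = fit_polynomial(x[mask], y[mask], m)  # weights w^(!i)
        pred = np.polyval(w[::-1], x[i])         # w0 + w1·x + ... + wm·x^m at x[i]
        errors.append((y[i] - pred) ** 2)
    return np.mean(errors)

rng = np.random.default_rng(6)
x = np.linspace(-1.5, 1.5, 25)
y = 1.0 + 0.5 * x + 2.0 * x ** 2 + 0.3 * rng.normal(size=x.size)
for m in range(1, 8):
    print(m, round(loo_cv_error(x, y, m), 3))    # minimum expected near m = 2
```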

Cross-Validation

[Figure: mean-squared error on the training set versus model order, and cross-validation error versus model order. The actual data was generated by a 2nd-order polynomial process.]

Model Selection
