Multi-Layer Perceptron in MATLAB NN Toolbox [Part 1]


Yousof Koohmaskan, Behzad Bahrami, Seyyed Mahdi Akrami, Mahyar AbdeEtedal
Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic)
Advisor: Dr. F. Abdollahi
February 2011


Introduction

- Rosenblatt in 1961 created many variations of the perceptron.
- One of the simplest was a single-layer network whose weights and biases could be trained to produce a correct target vector when presented with the corresponding input vector.

Neuron Model


A perceptron neuron, which uses the hard-limit transfer function hardlim, is shown below.

Each external input is weighted with an appropriate weight w1,j, and the sum of the weighted inputs is sent to the hard-limit transfer function, which also has an input of 1 transmitted to it through the bias. The hard-limit transfer function, which returns a 0 or a 1, is shown below.

The perceptron neuron produces a 1 if the net input into the transfer function is equal to or greater than 0; otherwise it produces a 0.

The hard-limit transfer function gives a perceptron the ability to classify input vectors by dividing the input space into two regions. Specifically, outputs will be 0 if the net input n is less than 0, or 1 if the net input n is 0 or greater. The following figure shows the input space of a two-input hard-limit neuron with the weights w1,1 = −1, w1,2 = 1 and a bias b = 1.

Figure: Perceptron Representation [1]. (A perceptron neuron with inputs p1 ... pR, weights w1,1 ... w1,R and bias b, where R is the number of elements in the input vector; the output is a = hardlim(Wp + b), and the hard-limit transfer function a = hardlim(n) steps from 0 to +1 at n = 0.)


Introduction

- There are many transfer functions that can be used in the perceptron structure, e.g. hardlim, purelin and logsig:

A perceptron neuron uses the hard-limit transfer function hardlim, which returns 0 when the net input n is less than 0 and 1 otherwise, so its output is a = hardlim(Wp + b).

A linear neuron has the same basic structure as the perceptron, but uses the linear transfer function purelin, which simply returns the value passed to it: a = purelin(Wp + b) = Wp + b. Such a neuron can be trained to learn an affine function of its inputs, or to find a linear approximation to a nonlinear function, but it cannot perform a nonlinear computation.

Multilayer feedforward networks, the architecture most commonly used with backpropagation, can use any differentiable transfer function f; a common choice is the log-sigmoid logsig, which generates outputs between 0 and 1 as the neuron's net input goes from negative to positive infinity.

Figure: hardlim, purelin, logsig [1]

- Usually hardlim is selected as the default (the three functions are compared in the short sketch below).
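As an illustrative aside, not taken from the original slides, the three transfer functions can be compared by evaluating them over a range of net inputs. A minimal sketch, assuming the Neural Network Toolbox functions hardlim, purelin and logsig are on the path:

n = -5:0.1:5;              % a range of net input values
a_hard = hardlim(n);       % 0 for n < 0, 1 for n >= 0
a_lin  = purelin(n);       % returns n unchanged
a_log  = logsig(n);        % squashes n into the interval (0,1)
plot(n, a_hard, n, a_lin, n, a_log)
legend('hardlim','purelin','logsig'), xlabel('net input n'), ylabel('output a')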


Perceptron Architecture

- The perceptron network consists of a single layer
- An R-element input vector
- S output scalars (one per neuron)

Here, we consider just a one-layer perceptron.

The perceptron network consists of a single layer of S perceptron neurons connected to R inputs through a set of weights wi,j, as shown below in two forms. As before, the network indices i and j indicate that wi,j is the strength of the connection from the jth input to the ith neuron.

The perceptron learning rule described shortly is capable of training only a single layer; thus only one-layer networks are considered here. This restriction places limitations on the computation a perceptron can perform (the types of problems perceptrons can solve are discussed in the toolbox's "Limitations and Cautions" section).

Figure: Perceptron Architecture [1]. (A layer of S perceptron neurons connected to an R-element input vector p, shown in detailed and abbreviated forms: W is S x R, b, n and a are S x 1, and a = hardlim(Wp + b), where R is the number of elements in the input vector and S the number of neurons in the layer.)
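As a small illustration that is not part of the original slides, these dimensions can be checked on a freshly created perceptron. A sketch assuming the newp(P,T) calling form used later in these slides, with hypothetical example data for R = 3 inputs and S = 2 neurons:

P = rand(3,10);                 % 10 example input vectors (R x Q, here R = 3)
T = double(rand(2,10) > 0.5);   % 10 example binary targets (S x Q, here S = 2)
net = newp(P,T);                % create the perceptron layer
size(net.IW{1,1})               % weight matrix W is S x R, i.e. 2 x 3
size(net.b{1})                  % bias vector b is S x 1, i.e. 2 x 1
a = sim(net,P);                 % a = hardlim(W*P + b), an S x Q matrix of 0s and 1s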


Mathematical Notation

- A single superscript is used to identify elements of a layer, e.g. n3 denotes the net input to layer 3.
- The weight matrix from input k to layer l is written IWl,k; similarly, the layer weight matrix from layer k to layer l is written LWl,k.

A single superscript is used to identify elements of a layer. For instance, the net input of layer 3 would be shown as n3.

Superscripts k, l are used to identify the source (l) connection and the destination (k) connection of layer weight matrices and input weight matrices. For instance, the layer weight matrix from layer 2 to layer 4 would be shown as LW4,2.

The following figure, taken from the "Advanced Topics" chapter of the toolbox documentation, illustrates the notation used in such advanced figures.

Figure: Example of a perceptron [1]. The figure, from the toolbox notation appendix, shows a network with input weight matrices IWk,l, layer weight matrices LWk,l, and tapped delay lines (TDL); its layer outputs are

a1(k) = tansig(IW1,1 p1(k) + b1)
a2(k) = logsig(IW2,1 [p1(k); p1(k-1)] + IW2,2 p2(k-1))
a3(k) = purelin(LW3,3 a3(k-1) + LW3,1 a1(k) + b3 + LW3,2 a2(k))

MATLAB Notation Considerations

- Superscripts ⇔ cell array indices, e.g. p^1 → p{1}
- Subscripts ⇔ indices within parentheses, e.g. p_2 → p(2)
- Superscripts and subscripts combine, e.g. p^1_2 → p{1}(2)
- Indices within parentheses ⇔ a second cell array index¹, e.g. p^1(k − 1) → p{1, k − 1} (illustrated in the sketch below)

¹ Also in the MATLAB help this typo should be corrected: p{1, k − 1] → p{1, k − 1}.
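This mapping can be illustrated with a small hypothetical example, not taken from the original slides: for a network with two inputs observed over three time steps, the toolbox stores the data in a cell array p whose first index is the input number and whose second index is the time step k.

p = cell(2,3);                                        % rows: input number, columns: time step k
p{1,1} = [0; 1];  p{1,2} = [1; 1];  p{1,3} = [1; 0];  % input 1: 2-element vectors
p{2,1} = 0.5;     p{2,2} = -0.2;    p{2,3} = 0.9;     % input 2: scalars
p{1,2}                                                % p^1(k-1) for k = 3, i.e. p{1, 3-1}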


Learning Rules

- Learning breaks into two parts, supervised and unsupervised; we consider the first.
- The objective is to reduce the error e = t − a, the difference between the target vector t and the neuron response a.
- There are many algorithms that do this, and we introduce them below.
- All of the algorithms compute the weight update Δw by some method.


Learning Algorithms

- Hebb weight learning rule²

  Δwk = c ak pk

  MATLAB command learnh

- Perceptron weight and bias learning function (a hand-coded sketch follows below)

  Δwk = ek (pk)T

  MATLAB command learnp

- Normalized perceptron weight and bias learning function

  pn = p / √(1 + ‖p‖²),   Δwk = ek (pnk)T

  MATLAB command learnpn

² In all rules, e is the error, p the input and a the output.
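To make the perceptron rule concrete, here is a minimal hand-coded sketch of the update Δwk = ek (pk)T for a single hard-limit neuron on a small hypothetical, linearly separable data set; the toolbox applies the same update through learnp when train is called:

p = [2 1 -2 -1; 2 -2 2 1];          % four 2-element input vectors
t = [0 1 0 1];                      % corresponding targets
w = zeros(1,2); b = 0;              % initial weights and bias
for pass = 1:10                     % a few passes over the data
    for k = 1:size(p,2)
        a = hardlim(w*p(:,k) + b);  % neuron response
        e = t(k) - a;               % error e = t - a
        w = w + e*p(:,k)';          % dW = e*p'
        b = b + e;                  % db = e (the bias acts as a weight with input 1)
    end
end
hardlim(w*p + b)                    % reproduces the targets [0 1 0 1]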


Learning Algorithms

- Widrow-Hoff weight/bias learning function (a one-step sketch follows below)

  Δwk = c ek (pk)T

  with learning rate c; MATLAB command learnwh
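For comparison, a single hand-coded Widrow-Hoff (LMS) step on a linear (purelin) neuron might look as follows; this is only a sketch with hypothetical numbers, while the toolbox performs the same update through learnwh:

c = 0.05;                 % learning rate
p = [1; 2];  t = 1.5;     % one hypothetical input vector and its target
w = [0 0];  b = 0;        % initial weights and bias
a = purelin(w*p + b);     % linear neuron output
e = t - a;                % error
w = w + c*e*p';           % dW = c*e*p'
b = b + c*e;              % db = c*e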


Training

- Training means using the learning rule repeatedly to update the weights of the network:

  wk(new) = wk(old) + Δwk

  MATLAB command train


Utility Functions

- MATLAB command init: initialize a neural network
- MATLAB command sim: simulate a neural network
- MATLAB command adapt: allow a neural network to adapt its weights and biases on each presentation of the inputs
- MATLAB command newp: create a perceptron

A short sketch tying these four commands together follows below.
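A minimal sketch, not from the original slides, using the four commands on a hypothetical OR-gate problem:

P = [0 0 1 1; 0 1 0 1];        % inputs
T = [0 1 1 1];                 % OR targets
net = newp(P,T);               % newp : create the perceptron
net = init(net);               % init : (re)initialize weights and biases
[net,a,e] = adapt(net,P,T);    % adapt: one incremental pass over the data
a = sim(net,P)                 % sim  : simulate the (partially trained) network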


Perceptron problem, step by step

- Creating a perceptron: >> net = newp(P,T);
- Assigning initial weights and bias: >> net.IW{1,1} = [w1 w2]; >> net.b{1} = [b];
- Determining the number of passes: >> net.trainParam.epochs = pass;
- Training the network: >> net = train(net,p,t);
- Testing and verifying: >> a = sim(net,p);

These steps are assembled into a small example below.
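As a hedged, self-contained illustration, the steps above might be assembled as follows for a hypothetical AND-gate problem (the initial weights, bias and number of passes are arbitrary choices):

P = [0 0 1 1; 0 1 0 1];          % input vectors
T = [0 0 0 1];                   % AND targets
net = newp(P,T);                 % create the perceptron
net.IW{1,1} = [0 0];             % initial weights [w1 w2]
net.b{1}    = 0;                 % initial bias b
net.trainParam.epochs = 20;      % number of passes
net = train(net,P,T);            % train the network
a = sim(net,P)                   % test and verify: a should equal T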


Perceptron problem, step by step

Example 1: Classifying displayed data

Figure: Vectors to be Classified (input samples plotted against P(1) and P(2))


Perceptron problem, step by step

Example 1; Solution

% number of samples of each class
N = 20;
% define inputs and outputs
offset = 5;                           % offset for second class
x = [randn(2,N) randn(2,N)+offset];   % inputs
y = [zeros(1,N) ones(1,N)];           % outputs
% plot input samples with PLOTPV (plot perceptron input/target vectors)
plotpv(x,y);
net = newp(x,y);
net = train(net,x,y);                 % train network
figure(1)
plotpc(net.IW{1},net.b{1});
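A possible follow-up, not in the original slides: classify a hypothetical new test point with the trained network and add it to the plot.

pt = [2; 3];                     % hypothetical test vector
at = sim(net,pt);                % 0 -> first class, 1 -> second class
plotpv([x pt],[y at]);           % replot the data together with the test point
plotpc(net.IW{1},net.b{1});      % redraw the decision boundary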


Perceptron problem, step by step

Example 1; Solution

Figure: Vectors to be Classified, with the decision boundary drawn by plotpc (axes P(1), P(2))


Note

- net.trainParam.epochs sets the maximum number of training epochs.
- net.trainParam.goal sets the performance goal at which training stops.
- Random weight and bias initialization:

  net.inputWeights{1,1}.initFcn = 'rands';
  net.biases{1}.initFcn = 'rands';
  net = init(net);


Perceptron problem

Example 2: Classifying displayed data

Figure: data points from four classes (Class A, Class B, Class C, Class D) to be classified


Perceptron problem

Example 2; Solution

K = 30;                              % number of samples of each class
% define classes
q = .6;                              % offset of classes
A = [rand(1,K)-q; rand(1,K)+q];
B = [rand(1,K)+q; rand(1,K)+q];
C = [rand(1,K)+q; rand(1,K)-q];
D = [rand(1,K)-q; rand(1,K)-q];
% plot classes
plot(A(1,:),A(2,:),'bs')
hold on; grid on
plot(B(1,:),B(2,:),'r+');
plot(C(1,:),C(2,:),'go');
plot(D(1,:),D(2,:),'m*');
% text labels for classes
text(.5-q,.5+2*q,'Class A');
text(.5+q,.5+2*q,'Class B')


Perceptron problem

Example 2; Solution, cont'd

text(.5+q,.5-2*q,'Class C');
text(.5-q,.5-2*q,'Class D')
a = [0 1]'; b = [1 1]';
c = [1 0]'; d = [0 0]';
P = [A B C D];
% define targets
T = [repmat(a,1,length(A)) repmat(b,1,length(B)) ...
     repmat(c,1,length(C)) repmat(d,1,length(D))];
net = newp(P,T);
E = 1;
net.adaptParam.passes = 1;
linehandle = plotpc(net.IW{1},net.b{1});
n = 0;
while (sse(E) & n < 1000)
    n = n + 1;
    [net,Y,E] = adapt(net,P,T);
    linehandle = plotpc(net.IW{1},net.b{1},linehandle);
    drawnow;
end
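A small follow-up sketch, not in the original slides: once the loop has converged, a new point can be classified and its two-bit output compared with the class codes a, b, c and d defined above.

pnew = [0.3; 1.2];               % hypothetical new point
anew = sim(net,pnew)             % e.g. [0;1] matches the code a assigned to Class A
% compare anew with a, b, c, d to read off the predicted class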


Perceptron problem

Example 2; Solution, cont'd

Figure: the four classes with the decision boundaries found by adapt


Thank You


References

[1] MATLAB 7.9 (R2009b), Neural Network Toolbox Help, The MathWorks, Inc.

[2] LASIN - Laboratory of Synergetics website, Slovenia.

[3] F. Abdollahi, Lecture Notes on Neural Networks, Winter 2011.