trabalho proposto 1
TRANSCRIPT
-
7/27/2019 Trabalho Proposto 1
1/85
DUBLIN CITY UNIVERSITY
SCHOOL OF ELECTRONIC ENGINEERING
Artificial Neural Network Identification &
Control of an Inverted Pendulum
Barry N. Sweeney
August 2004
MASTER OF ENGINEERING IN
ELECTRONIC SYSTEMS
Supervised by Ms. Jennifer Bruton
Declaration

I hereby declare that, except where otherwise indicated, this document is entirely my own work and has not been submitted in whole or in part to any other university.
Signed: ...................................................................... Date: ...............................
Abstract

The purpose of this project is to illustrate the use of artificial neural networks (ANNs) in the identification and control of a non-linear system. Non-linear systems are investigated with respect to the dynamics of the inverted pendulum. The inverted pendulum is a classic example of an unstable non-linear dynamic system. Consequently it has received much
attention, as it is an extremely complex and challenging control problem. The interesting feature of neural networks is that they can learn from the environment in which the system is being operated. The potential of ANNs for system identification and control is examined.
Subsequently feed-forward and recurrent neural networks are used to identify a robust
model of the inverted pendulum. Finally a neuro-controller is developed and implemented
using Borland C++, for control of the physical system.
Acknowledgements

I would like to take this opportunity to thank all those who supported and helped me throughout the development and research of this project. Firstly, I would like to thank my project supervisor Ms. Jennifer Bruton, whose guidance and vast knowledge proved invaluable, and all the staff in the Faculty of Engineering, with special thanks to Conor Maguire for the setting up of the physical rig. Finally I would like to thank my family and girlfriend
for their support and patience throughout.
Table of Contents
Declaration .......... ii
Abstract .......... iii
Acknowledgements .......... iv
Table of Contents .......... v
Table of Figures .......... vii
Chapter 1 Introduction .......... 1
1.1 Motivation .......... 2
1.2 Outline of Report .......... 2
Chapter 2 Inverted Pendulum .......... 4
2.1 Mathematical Equations .......... 5
2.2 Modelling of the Inverted Pendulum .......... 7
2.3 Closed-loop Control .......... 9
2.4 Summary .......... 12
Chapter 3 Neural Networks .......... 13
3.1 Artificial neuron model .......... 14
3.2 Activation functions .......... 16
3.3 Neural network architecture .......... 16
3.4 Learning algorithms .......... 19
3.5 Learning rules .......... 20
3.6 Neural Network Limitations .......... 20
3.7 Applications .......... 21
3.8 Summary .......... 22
Chapter 4 System Identification .......... 23
4.1 System Identification Procedure .......... 24
4.2 Conventional linear system identification .......... 25
4.3 Non-linear System Identification using NARMAX .......... 29
4.4 System Identification using Neural Networks .......... 30
4.5 Javier's Linearised Model .......... 39
4.6 Summary .......... 42
Chapter 5 Real-Time Identification .......... 43
5.1 Closed-loop controller .......... 44
5.2 Identification of physical system .......... 45
5.3 Summary .......... 50
Chapter 6 Neuro Control .......... 51
6.1 Supervised Control .......... 51
6.2 Unsupervised Control .......... 52
6.3 Adaptive neuro control .......... 52
6.4 Model Reference Control .......... 53
6.5 Direct inverse control .......... 54
6.6 Neuro Control in Simulink .......... 55
6.7 Real-time neuro-control .......... 56
6.8 Project Plan .......... 62
6.9 Summary .......... 62
Chapter 7 Conclusions .......... 63
7.1 Future Recommendations .......... 65
References .......... 67
Appendix 1 .......... 70
Appendix 2 .......... 77
Table of Figures
Figure 2.1 Inverted Pendulum .......... 4
Figure 2.2 Simulink model of linear pendulum .......... 7
Figure 2.3 Subsystem block of linear pendulum .......... 7
Figure 2.4 Simulink model of non-linear pendulum .......... 8
Figure 2.5 Subsystem block of non-linear pendulum .......... 8
Figure 2.6 Open loop response of inverted pendulum .......... 9
Figure 2.7 Simulink model of linear pendulum and controller .......... 10
Figure 2.8 Closed loop response of linear pendulum with controller .......... 10
Figure 2.9 Simulink model of non-linear pendulum and controller .......... 11
Figure 2.10 Closed loop response of non-linear pendulum with controller .......... 11
Figure 2.11 Inverted Pendulum, Time (t) = 0.0 secs .......... 12
Figure 3.1 Biological Neuron .......... 13
Figure 3.2 Artificial Neuron, McCulloch & Pitts (1943) .......... 14
Figure 3.3 Perceptron Model .......... 15
Figure 3.4 Activation functions .......... 16
Figure 3.5 Multi-layer Feed-forward Network structure .......... 16
Figure 3.6 Multi-layer Recurrent Network structure .......... 18
Figure 3.7 Supervised Learning .......... 19
Figure 3.8 Unsupervised Learning .......... 20
Figure 3.9 Local & global minimum .......... 21
Figure 4.1 Input, Output, Disturbance of a System .......... 23
Figure 4.2 System Identification Procedure .......... 24
Figure 4.3 ARX model output with measured output .......... 26
Figure 4.4 ARX [4 3 1] model output and measured output .......... 27
Figure 4.5 RARMAX [3 3 3 1] model output and process output .......... 28
Figure 4.6 RARMAX [4 3 3 1] model output and validation data .......... 29
Figure 4.7 Forward modelling of inverted pendulum using neural networks .......... 30
Figure 4.8 Neural Network training .......... 31
Figure 4.9 Model validation set-up in Simulink .......... 32
Figure 4.10 Feed-forward network, 1 hidden layer, 4 hidden neurons .......... 33
Figure 4.11 Feed-forward network, 2 hidden layers, 4 and 2 hidden neurons respectively .......... 33
Figure 4.12 Elman network, 2 hidden layers, 4 and 2 neurons respectively .......... 35
Figure 4.13 Elman network, 2 hidden layers, 15 and 10 neurons respectively .......... 35
Figure 4.14 Feed-forward network, 1 hidden layer with 50 neurons .......... 36
Figure 4.15 Feed-forward network, 2 hidden layers, 30 & 20 neurons respectively .......... 37
Figure 4.16 Difference between network trained with data scaled .......... 37
Figure 4.17 Elman network, 2 hidden layers, 15 and 10 neurons respectively .......... 38
Figure 4.18 Javier's Linearised Model .......... 40
Figure 4.19 A comparison of Javier's Model & Non-linear Model .......... 40
Figure 4.20 Feed-forward NN, 2 hidden layers, 30 and 20 neurons respectively .......... 41
Figure 5.1 Set-up of pendulum rig .......... 43
Figure 5.2 Real Time Task in Simulink environment .......... 44
Figure 5.3 Zones of Control Algorithms .......... 44
Figure 5.4 NN non-linear model output & physical system output .......... 46
Figure 5.5 Pendulum angle of Real System .......... 46
Figure 5.6 Validation set-up .......... 47
Figure 5.7 Feed-forward NN, 1 hidden layer with 75 neurons .......... 47
Figure 5.8 Feed-forward NN, 1 hidden layer with 75 neurons .......... 48
Figure 5.9 System set-up with disturbance .......... 49
Figure 5.10 Pendulum angle during disturbance .......... 49
Figure 5.11 Pendulum angle, with large excitation signal .......... 50
Figure 6.1 Supervised learning using existing controller .......... 52
Figure 6.2 Adaptive neuro control .......... 53
Figure 6.3 Model Reference Control .......... 53
Figure 6.4 Direct inverse control .......... 54
Figure 6.5 Neuro Controller in Simulink .......... 55
Figure 6.6 Pendulum angle using neuro controller in Simulink .......... 55
Figure 6.7 Neuro control of non-linear pendulum model .......... 56
Figure 6.8 Neuro control of non-linear model .......... 56
Figure 6.9 Validation set-up .......... 58
Figure 6.10 Neuro-Controller output .......... 58
Figure 6.11 Neuro-Controller Structure .......... 59
Figure 6.12 Pendulum Angle .......... 60
Figure 6.13 Pendulum Angle .......... 61
Chapter 1
Introduction
Increasing technological demands and ever more complex systems require highly sophisticated controllers to ensure that high performance can be achieved and maintained under adverse conditions. There is therefore a demand for an alternative form of control, as conventional approaches do not meet the requirements of these complex systems.
To achieve such highly autonomous behaviour for complex systems one can enhance
today's control methods using intelligent control systems and techniques. It is for this reason
that neural networks are of significant importance in the design and construction of the
overall intelligent controller for complex non-linear systems. Currently neural networks are
established in many application areas (expert systems, pattern recognition, system control,
etc.). These methods have received a lot of criticism during their existence (for example, see
Cheeseman, 1986). However this criticism has weakened as artificial neural networks have
been successfully applied to practical problems.
Artificial neural networks attempt to simulate the human brain. This simulation is based on the present knowledge of the brain, and this knowledge is, even at its best, primitive. The operation of the brain is believed to be based on simple basic elements called neurons, which are connected to each other with transmission lines, called axons, and receptive lines, called dendrites. Learning may be based on two mechanisms: the creation of new connections, and the modification of connections. Each neuron has an activation level which, in contrast to Boolean logic, ranges between some minimum and maximum value. Neural networks have several important characteristics which make them suitable for the identification and control of a non-linear system; their features include:
- No need to know data relationships.
- Self-learning capability.
- Self-tuning capability.
- Applicability to modelling various systems.
Further to this, neural networks contain non-linear elements that enable them to model and control complex non-linear systems.
From a given transfer function the system response can be predicted. The reverse of this process, i.e. calculating the transfer function from a measured response, is called system identification. It is essentially a process of sampling the input and output signals of a
system, and subsequently using the respective data to generate a mathematical model of the
system to be controlled. System identification enables the real system to be altered without
the need to calculate the dynamic equations and remodel the parameters again. Knowledge
of the dynamics of the system is useful in the determination of the neural network
architecture, its inputs, outputs and training process for dynamic model identification
purposes [1].
1.1 Motivation
The inverted pendulum problem is a classic example of an unstable non-linear dynamic
system [2]. Consequently it has received much attention, as it is an extremely complex and
challenging control problem. The dynamics of the inverted pendulum constitute great
difficulty in system identification and control. Conventional control systems have been
found wanting given raised demands for high performance, due to their inability to adapt to
new or unusual circumstances. Conventional control systems do not incorporate the
desirable control features, such as, non-linear capability, adaptation, flexible control
objectives and multivariable capability [3]. Thus there is a need for a control method which
addresses the non-linearities of an operating system, incorporating an adaptation capability.
Considering these control issues, artificial neural networks emerge as a solution. ANNs have
several important characteristics that identify them as suitable for the identification and
control of non-linear systems: their ability to learn [4], their ability for the approximation of
non-linear functions and their inherent parallelism. The predominant goal of this project is
to identify an accurate model of the physical inverted pendulum; the majority of modelling
is first simulated using Matlab Simulink. Subsequently a suitable neuro controller is
developed and implemented using Borland C++.
1.2 Outline of Report
Chapter 2 investigates non-linear systems with particular respect to the inverted pendulum
and its associated control difficulties. The dynamic equations are derived and subsequently
models developed for both the linear and non-linear model. The development of feedback
controllers is also detailed in this section. Chapter 3 introduces Artificial Neural Networks
detailing their basic components, structure, architecture and application in system
identification and control. Chapter 4 discusses the area of system identification and its
subsequent procedure. Traditional identification techniques are examined first with respect
to the linear pendulum. Non-linear identification is subsequently performed using neural
networks. Chapter 5 details the set-up of the physical inverted pendulum; following on from
this real-time identification is performed. Chapter 6 details different neuro-control
techniques; subsequently a neuro-controller is developed and implemented using Borland
C++. Finally in Chapter 7, conclusions are drawn and future recommendations detailed.
Chapter 2
Inverted Pendulum
A dynamic system is a system that changes over time. The starting point for the system is the initial state and the final point is the equilibrium. Most often a dynamic system is described by differential or difference equations, where the rate of change is a function of time or some parameter. Essentially all real systems are dynamic systems, and most real-world dynamic processes are non-linear. Thus, non-linear mathematical models are the most desired
ones [5]. The inverted pendulum is an example of a highly nonlinear and unstable dynamic
system. Pole-balancing is the task of keeping a rigid pole, hinged to a cart and free to fall in
a plane, in a roughly vertical orientation by moving the cart horizontally in the plane while keeping the cart within some maximum distance of its starting position (see Figure 2.1).
Despite the dynamics being well understood it is still a difficult process to accomplish, as
most people who have experimented with such devices will appreciate. Further still, if the
system parameters are not known precisely, then the task of constructing a suitable
controller is, accordingly, more difficult. Many researchers have restricted their control-learning systems to simulations of the inverted pendulum, which can be attributed to the
level of difficulty associated with the control problem. In this chapter, the dynamic
equations will be derived for both the linear and non-linear pendulum. Using a mathematical
model of this form for a system, computer simulation is possible and subsequently the
respective models can be developed using Matlab simulink. The underlying aim of
modelling is that the developed models will have the same characteristics as the actual
process.
Figure 2.1 Inverted Pendulum (L = length of pole, m = mass of pole, M = mass of cart, g = gravity)
2.1 Mathematical Equations
Lagrange's equations of motion can be used for the analysis of mechanical systems:

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = F, \qquad (2.1)$$

with $y(t)$ the generalised position vector, $\dot{y}(t)$ the generalised velocity vector and $F(t)$ the generalised force vector. The Lagrangian is $L = K - U$, the kinetic energy minus the potential energy.

The kinetic energy of the cart is

$$K_1 = \tfrac{1}{2}M\dot{y}^2, \qquad (2.2)$$

The pole can move in both horizontal and vertical directions, therefore the kinetic energy of the pole is

$$K_2 = \tfrac{1}{2}m\left(\dot{y}_2^2 + \dot{z}_2^2\right), \qquad (2.3)$$

where $y_2$ and $z_2$ are equal to

$$y_2 = y + L\sin\theta, \qquad z_2 = L\cos\theta, \qquad (2.4)$$

giving

$$\dot{y}_2 = \dot{y} + L\dot{\theta}\cos\theta, \qquad \dot{z}_2 = -L\dot{\theta}\sin\theta, \qquad (2.5)$$

Therefore the total kinetic energy $K$ of the system is

$$K = K_1 + K_2 = \tfrac{1}{2}(M+m)\dot{y}^2 + mL\dot{y}\dot{\theta}\cos\theta + \tfrac{1}{2}mL^2\dot{\theta}^2, \qquad (2.6)$$

The potential energy due to the pendulum, $U$, is

$$U = mgz_2 = mgL\cos\theta, \qquad (2.7)$$

The Lagrangian function is

$$L = K - U = \tfrac{1}{2}(M+m)\dot{y}^2 + mL\dot{y}\dot{\theta}\cos\theta + \tfrac{1}{2}mL^2\dot{\theta}^2 - mgL\cos\theta, \qquad (2.8)$$
The state-space variables of the system are $y$ and $\theta$, thus the Lagrange equations are

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = f, \qquad (2.9)$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0, \qquad (2.10)$$

Substituting for $L$ and performing the partial differentiation produces

$$(M+m)\ddot{y} + mL\ddot{\theta}\cos\theta - mL\dot{\theta}^2\sin\theta = f, \qquad (2.11)$$

$$mL\ddot{y}\cos\theta + mL^2\ddot{\theta} - mgL\sin\theta = 0, \qquad (2.12)$$

The above dynamic equations can be placed into state-space form. This is achieved by expressing the Lagrange equations in terms of matrices:

$$\begin{bmatrix} M+m & mL\cos\theta \\ mL\cos\theta & mL^2 \end{bmatrix}
\begin{bmatrix} \ddot{y} \\ \ddot{\theta} \end{bmatrix}
= \begin{bmatrix} f + mL\dot{\theta}^2\sin\theta \\ mgL\sin\theta \end{bmatrix}, \qquad (2.13)$$

This gives a mechanical system in typical Lagrangian form, i.e. the inertia matrix multiplying the acceleration vector. Inverting the inertia matrix and simplifying, the following non-linear equations describing the inverted pendulum are derived [7]:

$$\ddot{y} = \frac{f + mL\dot{\theta}^2\sin\theta - mg\sin\theta\cos\theta}{M + m - m\cos^2\theta}, \qquad (2.14)$$

$$\ddot{\theta} = \frac{(M+m)g\sin\theta - f\cos\theta - mL\dot{\theta}^2\sin\theta\cos\theta}{L\left(M + m - m\cos^2\theta\right)}, \qquad (2.15)$$

As some of the modelling involved in this project is linear, these equations must be linearised. The simplest approach is to approximate $\cos\theta \approx 1$ and $\sin\theta \approx \theta$. In addition, the quadratic terms in $\dot{\theta}$ are extremely small, so they are set to zero. This yields the linear system equations

$$\ddot{y} = \frac{f - mg\theta}{M}, \qquad \ddot{\theta} = \frac{(M+m)g\theta - f}{ML}, \qquad (2.16)$$
2.2 Modelling of the Inverted Pendulum
Given the set of equations describing the linear and non-linear inverted pendulum, the next
step is modelling. The models are developed using Matlab simulink, which provides an
environment for computer simulation. The first model developed is the linear model of the inverted pendulum, which for convenience is encapsulated in a subsystem block, see Figure
2.2 and Figure 2.3 respectively.
Figure 2.2 Simulink model of linear pendulum
Figure 2.3 Subsystem block of linear pendulum
The subsystem block is set up using a mask. This enables the parameters m, l, g and M to be
altered for different simulations. As it is desired to accurately model the physical rig, the parameters are taken from this system: the mass of the pendulum, m, is set to 0.11 kg; the mass of the cart, M, is set to 1.2 kg; the length of the pendulum, L, is set to 0.4 metres; and gravity, g, is set to 9.8 m/s². The non-linear model is developed in a similar manner to the linear
system and the parameter values remain the same, see Figure 2.4 and Figure 2.5 on the
following page.
Figure 2.4 Simulink model of the non-linear pendulum
Figure 2.5 Subsystem block of the non-linear pendulum
Both pendulum models are simulated with a step input to check the stability of the systems. A system is input-output stable if it responds to every bounded input variable with a bounded output variable [7]. Conversely, a system is input-output unstable if for some bounded input variable the output variable grows unbounded. The
angle of the pendulum is shown in Figure 2.6, on the following page. The simulation shows
that the pendulum is open loop unstable i.e. the pendulum falls over.
Figure 2.6 Open loop response of the inverted pendulum
2.3 Closed-loop Control
System identification requires the collection of interesting data. In order to accurately
model the inverted pendulum and to generate suitable input/output data for system
identification it is necessary to stabilise it. This is achieved using a PID controller. Other
methods could have been used to stabilise the linear pendulum such as full state feedback.
However PID control is chosen for its ease of implementation. PID controllers have a long
history in control engineering as they have been proven to be stable, simple and robust for
many real-life applications. The P action is related to the present error, the I action is based on the past history of the error, while the D action relates to the future behaviour of the error. These actions correspond roughly to filtering, smoothing and prediction problems respectively.
The equation of a PID controller is given by
$$u(t) = K_p e(t) + K_I \int_0^t e(\tau)\,d\tau + K_D \frac{de(t)}{dt}, \qquad (2.17)$$
There are several methods to design PID controllers. Initially the Ziegler-Nichols method was tested; however, the parameters obtained from this method produced a poor response. Subsequently the parameters were chosen heuristically and subjectively, that is, by trial-and-error testing. The Simulink model with controller is shown in Figure 2.7 on the following page. The pendulum is now in a closed loop, thus the PID controller is de-tuned so that the dynamics of the pendulum are emphasised, and a band-limited white noise signal is
used as the input to the system to emulate the disturbances the physical rig would be
subjected to. In the closed loop model in Figure 2.7 there is a graphic visualisation of the
inverted pendulum when compiled, which has been adapted from the Matlab demonstration
slcp.mdl.
Figure 2.7 Simulink model of linear pendulum and controller
The linear pendulum is simulated, see Figure 2.8 for the angle of the pendulum. The closed
loop system with PID controller keeps the pendulum angle stable. This allows for longer
simulation of the pendulum and more importantly the generation of information rich data for
system identification purposes.
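As a rough illustration of this closed-loop scheme, the sketch below regulates the pendulum angle of the non-linear model with a discrete PID law of the form of equation (2.17). It is written in Python rather than Simulink, the gains are illustrative (not those used on the rig), and the cart-position loop is ignored since equation (2.15) does not involve y:

```python
import math

m, M, Lp, g = 0.11, 1.2, 0.4, 9.8     # rig parameters from Section 2.2
Kp, Ki, Kd = 40.0, 1.0, 4.0           # illustrative gains, found by trial and error
dt = 0.005                            # integration / sampling step (s)

def step_theta(theta, theta_dot, f):
    # pendulum angle dynamics, equation (2.15), Euler-integrated
    s, c = math.sin(theta), math.cos(theta)
    theta_ddot = ((M + m)*g*s - f*c - m*Lp*theta_dot**2*s*c) / (Lp*(M + m - m*c*c))
    return theta + theta_dot*dt, theta_dot + theta_ddot*dt

theta, theta_dot = 0.05, 0.0          # small initial lean from upright
integral, prev_err = 0.0, 0.05
for _ in range(2000):                 # 10 s of simulated time
    err = theta - 0.0                 # deviation from the upright set-point
    integral += err * dt
    deriv = (err - prev_err) / dt
    f = Kp*err + Ki*integral + Kd*deriv   # PID law, equation (2.17)
    prev_err = err
    theta, theta_dot = step_theta(theta, theta_dot, f)
```

The sign convention makes a positive force push the cart under a pole leaning in the positive direction, which is what stabilises the angle.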
Figure 2.8 Closed loop response of linear pendulum with controller
The next step is to develop a controller for the non-linear pendulum. Similar to the control
of the linear pendulum, a PID controller was developed, see Figure 2.9. PID control of the
non-linear pendulum is possible because the simulation starts with the pendulum in a linearised region, that is, with the pendulum in an upright position, see Figure 2.11.
Figure 2.9 Simulink model of non-linear pendulum and controller
The closed loop response of the pendulum angle is shown in Figure 2.10. The closed loop response with PID control is stable, thus suitable input-output data for system identification purposes is obtained. As with the linear closed loop system, the PID controller has been de-tuned so that the full dynamics of the pendulum are emphasised and not the controller dynamics.
Figure 2.10 Closed loop response of non-linear pendulum with controller
Figure 2.11 Inverted Pendulum, Time (t) = 0.0 secs.
2.4 Summary
The dynamic equations for both the linear and non-linear pendulum have been derived and
subsequently models for each were developed using Matlab Simulink. It was evident that the system is open loop unstable, i.e. for a bounded input variable the output variable grows unbounded. However, a criterion for accurate system identification is that the process must be stable. Consequently simple PID controllers were developed which stabilised the system. They were de-tuned so that the dynamics of the pendulum are emphasised and interesting data would be generated for system identification purposes. The subsequent chapter details
the theory, operation and structure of artificial neural networks.
Chapter 3
Artificial Neural Networks
A neural network is an information-processing paradigm inspired by the way the brain processes information [8]. It is composed of a large number of highly interconnected processing elements (neurons) working in parallel to solve a specific problem. ANNs learn by example: they are trained using known input/output data sets to adjust the synaptic connections that exist between the neurons. The biological brain is likewise composed of a large number of highly interconnected processing elements, called neurons, tied together with weighted connections, or synapses. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. These connections store the knowledge necessary to solve specific problems.
Figure 3.1 Biological Neuron [9]
One of the most interesting features of NNs is their learning ability. This is achieved by presenting a training set of different examples to the network and using learning algorithms, which change the weights (or the parameters of the activation functions) in such a way that the network will reproduce the correct output for the corresponding input values. One encountered difficulty is how to guarantee generalisation and to determine when the network is sufficiently trained. Neural networks offer non-linearity, input-output mapping, adaptivity
and fault tolerance. Non-linearity is a desired property if the generator of the input signal is
inherently non-linear [10]. The high connectivity of the network ensures that the influence
of errors in a few terms will be minor, which ideally gives a high fault tolerance.
3.1 Artificial neuron model
In ANNs the inputs are combined in a linear way with different weights. McCulloch and Pitts developed the first model of an elementary computing neuron.
Figure 3.2 Artificial Neuron, McCulloch & Pitts (1943)
Each neuron consists of a processing element with synaptic input connections and a single output. The initial step is a process whereby the inputs $x_1, x_2, x_3, \ldots, x_n$ are multiplied by their respective weights $w_1, w_2, w_3, \ldots, w_n$ and then summed by the neuron. The summation process may be defined as

$$net = \sum_{i=1}^{n} w_i x_i, \qquad (3.1)$$

Further to this a threshold value or bias may be included; the summation process may then be rewritten as

$$net = \sum_{i=1}^{n} w_i x_i + b, \qquad (3.2)$$

A non-linear activation function $f$ is generally included in the neuron arrangement to introduce non-linearities into the model. The output of the neuron can now be expressed as (see Figure 3.3)

$$y = f(net), \qquad (3.3)$$
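As a minimal sketch, equations (3.1)-(3.3) amount to a weighted sum plus bias passed through an activation function:

```python
def neuron_output(x, w, b, f):
    """Single neuron: net = sum_i(w_i * x_i) + b (3.2), output y = f(net) (3.3)."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(net)
```

With a linear activation such as `lambda net: net` the neuron reduces to a plain weighted sum.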
Figure 3.3 Perceptron Model
Using the back-propagation algorithm the weights are dynamically updated. The error between the target output and the actual output is calculated as

$$e(k) = t(k) - y(k), \qquad (3.4)$$

The error is then back-propagated through the layers and the weights adjusted accordingly by the formula

$$w(k+1) = w(k) + \eta\, e(k)\, x(k), \qquad (3.5)$$

where $\eta$ is the learning rate. The feed-forward process is subsequently repeated. The weights are updated and adjusted on each pass until the error between the target and the actual output is low, i.e. the model has been sufficiently trained.
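The update rule of equations (3.4)-(3.5) can be illustrated by training a single neuron with a hard-limiting activation on the logical AND function; this is only a sketch, with the learning rate and epoch count chosen arbitrarily:

```python
def step(net):
    # hard-limiting activation used by the classic perceptron
    return 1 if net >= 0 else 0

# Logical AND; the bias is folded in as a weight on a constant input of 1
data = [((0, 0, 1), 0), ((0, 1, 1), 0), ((1, 0, 1), 0), ((1, 1, 1), 1)]
w = [0.0, 0.0, 0.0]
eta = 0.25                                    # learning rate (arbitrary choice)
for _ in range(20):                           # repeated passes over the training set
    for x, t in data:
        y = step(sum(wi * xi for wi, xi in zip(w, x)))
        e = t - y                             # error, equation (3.4)
        w = [wi + eta * e * xi for wi, xi in zip(w, x)]   # update, equation (3.5)
```

After a few passes the weights stop changing and all four patterns are classified correctly.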
3.2 Activation functions
There are a number of different types of activation functions, such as step, ramp and sigmoid. However, the most commonly used activation functions are tan-sigmoid, log-sigmoid and linear. The effect of the linear function is to multiply the input by a constant factor. The sigmoid function has an S-shaped curve, see Figure 3.4.
Figure 3.4 Log-sigmoid function Tan-sigmoid function Linear function
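The three functions of Figure 3.4 can be written directly; the scaling factor of the linear function is a free parameter:

```python
import math

def log_sigmoid(net):        # S-shaped, output in (0, 1)
    return 1.0 / (1.0 + math.exp(-net))

def tan_sigmoid(net):        # S-shaped, output in (-1, 1)
    return math.tanh(net)

def linear(net, a=1.0):      # multiplies the input by a constant factor a
    return a * net
```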
3.3 Neural Network architecture
Network architectures can be categorised into two main types according to their connectivity: feed-forward networks and recurrent networks (feedback networks). A network is feed-forward if all of the hidden and output neurons receive inputs from the preceding layer only. The input is presented to the input layer and is propagated forwards through the network. The output never forms a part of its own input, see Figure 3.5.
Figure 3.5 Multi-layer Feed-forward Network structure
In a multi-layer feed-forward network each layer has a weight matrix w, a bias vector b, and
an output vector $y$. Thus the network in Figure 3.5 has an input vector

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad (3.6)$$

The weights of the network are the weight matrices

$${}^{1}W = \begin{bmatrix} {}^{1}w_{1,1} & {}^{1}w_{1,2} \\ {}^{1}w_{2,1} & {}^{1}w_{2,2} \\ {}^{1}w_{3,1} & {}^{1}w_{3,2} \end{bmatrix}; \qquad
{}^{2}W = \begin{bmatrix} {}^{2}w_{1,1} & {}^{2}w_{1,2} & {}^{2}w_{1,3} \\ {}^{2}w_{2,1} & {}^{2}w_{2,2} & {}^{2}w_{2,3} \\ {}^{2}w_{3,1} & {}^{2}w_{3,2} & {}^{2}w_{3,3} \end{bmatrix}, \qquad (3.7)$$

The biases are the bias vectors

$${}^{1}b = \begin{bmatrix} {}^{1}b_1 \\ {}^{1}b_2 \\ {}^{1}b_3 \end{bmatrix}; \qquad
{}^{2}b = \begin{bmatrix} {}^{2}b_1 \\ {}^{2}b_2 \\ {}^{2}b_3 \end{bmatrix}, \qquad (3.8)$$

The output of the network can now be written, with ${}^{1}w_i$ and ${}^{2}w_i$ denoting the $i$-th rows of ${}^{1}W$ and ${}^{2}W$, as

$${}^{1}y_1 = f^1\!\left({}^{1}w_1 x + {}^{1}b_1\right), \qquad {}^{2}y_1 = f^2\!\left({}^{2}w_1\,{}^{1}y + {}^{2}b_1\right), \qquad (3.9)$$

$${}^{1}y_2 = f^1\!\left({}^{1}w_2 x + {}^{1}b_2\right), \qquad {}^{2}y_2 = f^2\!\left({}^{2}w_2\,{}^{1}y + {}^{2}b_2\right), \qquad (3.10)$$

$${}^{1}y_3 = f^1\!\left({}^{1}w_3 x + {}^{1}b_3\right), \qquad {}^{2}y_3 = f^2\!\left({}^{2}w_3\,{}^{1}y + {}^{2}b_3\right), \qquad (3.11)$$
The addition of extra layers or nodes enables the network to deal with more complex problems and extract higher-order statistics. Cybenko proved that a feed-forward network with a sufficient number of hidden neurons with continuous and differentiable transfer functions can approximate any continuous function over a closed interval [11]. There is no limit to the number of hidden layers. These hidden layers increase the non-linear complexity of a network; however, Hornik and other researchers have shown that even a two-layer network with a suitable number of nodes in the hidden layer can approximate any continuous function over a compact subset [12]. Thus, generally just one or two hidden layers are used. It follows that a feed-forward neural network with just one hidden layer is suitable for the purpose of identification.
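The layer equations (3.9)-(3.11) generalise to the following sketch of a two-layer feed-forward pass, here assuming a tan-sigmoid hidden layer and a linear output layer:

```python
import math

def layer(x, W, b, f):
    """One layer: y_i = f(sum_j W[i][j] * x[j] + b[i]), as in (3.9)-(3.11)."""
    return [f(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def feedforward(x, W1, b1, W2, b2):
    y1 = layer(x, W1, b1, math.tanh)          # tan-sigmoid hidden layer
    return layer(y1, W2, b2, lambda n: n)     # linear output layer
```

With `W1` of shape 3x2 and `W2` of shape 3x3 this matches the network of Figure 3.5 (two inputs, three hidden neurons, three outputs).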
Recurrent networks have at least one feedback loop, i.e. a cyclic connection, which means that at least one of the neurons feeds its signal back to the inputs of other neurons. The behaviour of such networks may be extremely complex [13].
Figure 3.6 Multi-layer Recurrent Network structure
The effect of the feedback loop is to enable the control of outputs through outputs, thus giving recurrent networks "memory". This is especially meaningful if the network is approximating functions dependent on time. Considering the dynamics of the inverted pendulum, this is particularly applicable, i.e. there are several feedback loops in the developed model of the pendulum. There are two main types of recurrent networks widely used: Elman and Hopfield. The feedback loop in Elman networks enables them to learn to recognise and generate spatial as well as temporal patterns [14]. An Elman network can approximate any function (with a finite number of discontinuities) with arbitrary accuracy, if the hidden layer has a sufficient number of neurons [15]. The Hopfield network is generally
used for classification of feature vectors [16]. Subsequently Elman networks will be chosen as the recurrent network architecture for identification purposes.
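A single time step of an Elman network can be sketched as follows; the context units hold the previous hidden activations (the z⁻¹ blocks of Figure 3.6), and the weight shapes here are illustrative:

```python
import math

def elman_step(x, context, Wx, Wc, bh, Wo, bo):
    """One time step of an Elman network: the previous hidden activations
    (context units) feed back into the hidden layer alongside the inputs."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(wx, x))
                        + sum(w * ci for w, ci in zip(wc, context)) + b)
              for wx, wc, b in zip(Wx, Wc, bh)]
    output = [sum(w * h for w, h in zip(wo, hidden)) + b
              for wo, b in zip(Wo, bo)]
    return output, hidden   # the new hidden state becomes the next context
```

Repeated calls that thread `hidden` back in as `context` are what give the network its "memory" of past inputs.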
3.4 Learning Algorithms
In neural networks, learning is achieved by presenting a training set of different examples to the network and using a learning algorithm to change the weights (or the parameters of the activation functions) in such a way that the network will reproduce the correct output for the corresponding input values. There are three main classes of learning: reinforced, supervised and unsupervised. The latter two will be considered here.

In the supervised learning procedure a set of pairs of input-output patterns is presented. The network propagates the pattern inputs to produce its own output pattern and then compares this with the desired output. The difference is the error; if the error is absent, learning is stopped, while if it is present, it is back-propagated and the weights and biases changed. A supervised learning scheme is illustrated in Figure 3.7.
Figure 3.7 Supervised Learning
In unsupervised learning there are no external learning signals to adjust the network's weights. The approach adopted is for the network to internally monitor its performance, proceeding by seeking trends in the input signals and making adaptations according to the network function. At present unsupervised learning is not fully understood and is still the subject of much research. An unsupervised learning scheme is illustrated in Figure 3.8.
Figure 3.8 Unsupervised Learning
3.5 Learning rules
Hebb's Rule was the first rule developed. The rule states that when a neuron receives an
input from another neuron, and both are highly active, the weight between the
neurons should be strengthened. Kohonen's learning rule is a procedure whereby
competing processing elements contend for the opportunity to learn. The only permitted
output is from the winning element; furthermore this element plus its adjacent neighbours
are permitted to adjust their connection weights. It should also be noted that the size of the
neighbourhood may change during training. The back-propagation learning
algorithm is perhaps the most popular learning algorithm. The network propagates the
pattern inputs to produce its own output pattern and compares this with the desired output,
the difference being the error. If no error is present learning stops; if error is
present, it is back-propagated to change the weights and biases, and this recurs until no error is
present.
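Hebb's rule can be illustrated numerically: strengthen a weight when the neurons on both ends of it are active together. A hypothetical Python sketch; the activity vectors and learning rate are invented for illustration.

```python
import numpy as np

# Sketch of Hebb's rule: when pre- and post-synaptic neurons are both
# active, strengthen the weight between them (outer product of the
# activities, scaled by a learning rate). All values are illustrative.
def hebbian_update(w, pre, post, lr=0.1):
    """w[i, j] connects pre-neuron j to post-neuron i."""
    return w + lr * np.outer(post, pre)

w = np.zeros((2, 3))
pre = np.array([1.0, 0.0, 1.0])    # active, inactive, active
post = np.array([1.0, 0.0])        # first output neuron active
w = hebbian_update(w, pre, post)
print(w)  # only weights between co-active neuron pairs grow
```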
3.6 Neural Network Limitations
Neural networks do have a number of limiting factors including training times, opacity and
local minima. Neural networks may require exhaustive training times, especially for high-
dimensional problems, due to the increased number of synaptic weights that must be
adjusted. However this problem is gradually diminishing with ever-increasing computer
processing capabilities. Opacity is associated with neural networks that operate as black
boxes. In a neural network that operates as a black box, only the inputs and outputs are
visible, so it is difficult to relate the parameters of the system under consideration to the
internals of the network. Consequently it is difficult to obtain an intuitive feel for its
operation, as performance can only be measured statistically. A major concern
associated with training is the possibility of becoming trapped in a local minimum, see Figure
3.9 below.
Figure 3.9 Local & global minimum
The global minimum represents the lowest point on the graph, which is the optimal solution
for the problem. The majority of training algorithms operate by travelling down these slopes
until they find the lowest point. The training algorithm may become trapped in a non-optimal
solution, i.e. a local minimum. There are a number of ways to overcome this
issue, such as the use of momentum terms or Boltzmann annealing. However there is a trade-off
with increased training time.
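A momentum term, one of the escape mechanisms mentioned above, re-applies part of the previous update so the search can coast through shallow local minima. A Python sketch on an illustrative one-dimensional cost function (not the network cost surface itself); the coefficients are invented:

```python
# Momentum sketch on f(x) = x**4 - 3*x**2 + x, which has a shallow local
# minimum near x = 1.13 and a global minimum near x = -1.30. Plain
# gradient descent from x = 2 would settle in the local minimum; the
# momentum term carries the search past it. All constants illustrative.
def grad(x):
    return 4 * x**3 - 6 * x + 1   # derivative of the cost

x, velocity = 2.0, 0.0
lr, momentum = 0.01, 0.9
for _ in range(500):
    velocity = momentum * velocity - lr * grad(x)
    x += velocity                 # previous update partly re-applied
print(round(x, 3))                # settles near the global minimum
```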
3.7 Applications
Neural networks have the capability of adaptively controlling and modelling a wide range of
non-linear processes at a high level of performance, in particular obtaining models which
describe system behaviour, i.e. system identification. Modelling is extremely important as it provides a tool for simulating the effects of different control strategies.
back propagation algorithm a feed-forward network can be trained to approximate arbitrary
non-linear input-output mappings from a collection of example input and output patterns.
This learning technique has been applied in a wide variety of pattern classification and
modelling tasks. Neural networks have found several fields of application including
medicine, finance, management and other signal processing applications.
3.8 Summary
Undoubtedly neural networks provide an extremely powerful information-processing tool.
They are ideal for control systems because of their non-linear approximation capabilities,
adaptability, and computational efficiency due to their parallel architecture. Their ability
to learn by example makes them both flexible and powerful. They also remove the need to
explicitly know the internal structure of a specific task. Considering the specific control
problem of the inverted pendulum, it is apparent that Neural Networks have certain features
which make them extremely suitable for identification and control of such a challenging
control problem.
Chapter 4
System Identification
The modelling and identification of linear and non-linear dynamic systems through the use
of measured experimental data is a problem of considerable importance in engineering [17],
and has duly received much attention. System identification is essentially a process of
sampling the input and output signals of a system, then using the respective data to
generate a mathematical model of the system to be controlled, i.e. it is a procedure whereby a
model is developed. Figure 4.1 depicts a system's inputs and outputs. The motivation behind
system identification is to obtain a model that enables the design and implementation of a
high level performance control system, while providing an insight into system behaviour,
prediction, state estimation, simulation etc. [18]
Figure 4.1 Input, Output, Disturbances of a System
Identification of multivariable systems is an extremely difficult problem due to the coupling
between various inputs and outputs, further complicated when systems are non-linear [19].
Neural networks are becoming increasingly recognised for this purpose due to their
parallelism, adaptability, robustness and their inherent capability to handle non-linear
systems. Intuitively the inverted pendulum is extremely unstable; further, from the
modelling of the pendulum in Chapter 2 the system is open-loop unstable. However
stability is a necessary criterion for system identification. Accordingly the system was placed in a
closed loop and stabilised using PID control. As the system is in a closed loop, it is
desirable that little of the controller's dynamics be seen at the system output. To achieve
this the controller was de-tuned, i.e. the control was left loose; this ensures that the
input/output data generated emphasises the dynamics of the pendulum and is thus suitable
for system identification.
Before neural networks are directly used for system identification, conventional linear
techniques such as auto regressive with exogenous input (ARX) and recursive auto
regressive moving average with exogenous input (RARMAX) will be investigated and
applied to the linear pendulum.
4.1 System Identification Procedure
System identification is essentially the process of adjusting the parameters of the model
until the model output resembles the output of the real system. The procedure for system
identification can be viewed graphically in Figure 4.2. The procedure can be categorised
into three main stages [20]:
Experimental input/output data from the process that is being modelled is required. With
respect to the inverted pendulum system this would consist of the input force on the cart
and the pendulum angle.
The second stage is to choose which model structure to use.
Subsequently the parameters of the model will be adjusted until the model output
resembles the system output.
Figure 4.2 System Identification Procedure
4.2 Conventional linear system identification
The ARX model structure is a simple linear difference equation which relates the current
output y(t) to a finite number of past outputs y(t-k) and inputs u(t-k).

y(t) + a1 y(t-1) + ... + a_na y(t-na) = b1 u(t-nk) + ... + b_nb u(t-nk-nb+1) + e(t), (4.1)
or in more compact form
y(t) = [B(q)/A(q)] u(t-nk) + [1/A(q)] e(t), (4.2)
Thus the ARX structure is defined by the three integers na, nb, and nk. na is the number of
poles and nb-1 is the number of zeros, while nk is the pure time-delay in the system. For a
system under sampled-data control, nk is generally equal to 1. The main method used to
estimate the a and b coefficients in the ARX model structure is the Least Squares method. It
proceeds by minimising the sum of squares of the right-hand side minus the left-hand side
of the expression above, with respect to a and b [21].
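The Least Squares estimation of the a and b coefficients can be sketched outside Matlab as follows. This is a Python/numpy sketch on an invented first-order system (na = nb = nk = 1), not the pendulum data:

```python
import numpy as np

# Least-squares ARX fit sketched in Python/numpy rather than Matlab's
# arx(). The simulated system y(t) = -a1*y(t-1) + b1*u(t-1) + e(t) and
# its coefficients are illustrative (na = nb = nk = 1).
rng = np.random.default_rng(1)
a1, b1 = -0.8, 0.5
u = rng.normal(size=500)                 # excitation input
e = 0.01 * rng.normal(size=500)          # small disturbance
y = np.zeros(500)
for t in range(1, 500):
    y[t] = -a1 * y[t - 1] + b1 * u[t - 1] + e[t]

# Regressors phi(t) = [-y(t-1), u(t-1)]; solve min ||y - phi @ theta||^2
phi = np.column_stack([-y[:-1], u[:-1]])
theta, *_ = np.linalg.lstsq(phi, y[1:], rcond=None)
print(np.round(theta, 2))                # estimates of [a1, b1]
```

With this much data and little noise, the estimates land very close to the true coefficients.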
The RARMAX is an extension of the ARMAX structure, which in turn is an extension
of the ARX model structure. RARMAX recursively estimates the a and b
coefficients in the ARMAX model structure. However, the ARMAX structure also includes
an extra C parameter in the noise spectrum model. Consequently RARMAX provides
greater accuracy.
y(t) = [B(q)/A(q)] u(t-nk) + [C(q)/A(q)] e(t), (4.3)
The data from the model of the linear pendulum is exported to the Matlab workspace and
subsequently split into estimation and validation data. The validation data is created by
changing the initial seed of the excitation signal in the simulation. The ARX model is
implemented using Matlab functions as follows:
input_estim = force_1; %input data for estimation
output_estim= theta_1; %output data for estimation
input_val= force_2; %input data for validation
output_val= theta_2; %output data for validation
orders = [4 5 1]; % defines model structure
arx_model = arx([output_estim input_estim],orders); %arx function
compare([output_val input_val],arx_model); %compare function
The parameters of the ARX model are chosen heuristically and subjectively. The compare
function is used to compare the model output with the validation data. Initially the open-loop
unstable model of the inverted pendulum is identified. A criterion for system
identification is that the system is stable; as the open-loop system is inherently unstable, the
ARX model completely fails to identify the pendulum, as expected, see Figure 4.3.
Figure 4.3 ARX model output with measured output
Subsequently input/output data is generated from the controlled closed-loop model of the
inverted pendulum. It is anticipated that the ARX model should identify the stable model.
Further, as the complexity of the models increases, the accuracy should also increase;
the results obtained prove this to be correct. Table 4.1 shows the different
parameters tested and the resulting performance from these models.
ARX [na nb nk] ARX performance
1 1 1 21.94%
2 2 1 93.28%
3 2 1 93.28%
3 3 1 99.90%
4 2 1 93%
4 3 1 100%
Table 4.1 ARX model performance
Figure 4.4 shows the best model's performance, plotting both the actual output from the system
and the model output. From the results the ARX model identifies the pendulum dynamics
with extremely good accuracy.
Figure 4.4 ARX [4 3 1] model output and measured output
Having identified the linear system using the arx model, the rarmax is subsequently tested.
The results obtained are marginally better than those using the arx method, see Table 4.2 and
Figure 4.5. This is expected, as this method includes an additional C parameter in the noise
spectrum model. The Matlab script for this is implemented as follows

orders = [3 3 3 1]; %model structure
lambda = 0.99; %forgetting factor (illustrative value)
[rarmax_model,yhat] = rarmax([output_estim input_estim],orders,'ff',lambda); % function
RARMAX [na nb nc nk ] RARMAX performance
2 1 1 1 98.98%
2 2 2 1 99.01%
3 2 2 1 99.34%
3 3 2 1 99.58%
3 3 3 1 100%
Table 4.2 RARMAX model performance
Figure 4.5 RARMAX [3 3 3 1] model output and process output
The results using the linear identification techniques (arx, rarmax) verify that the system must
be stabilised before identification can be performed. These conventional linear
identification techniques performed extremely well in modelling the dynamics of the linear
pendulum. However their ability to represent non-linear systems is restricted. For
completeness the arx and rarmax are tested on the non-linear pendulum to verify this. The input-output data for the non-
linear pendulum is generated in a similar manner as previously for the linear model. Again
parameters for both models are chosen heuristically and subjectively. Figure 4.6 shows the
best model identified. As expected these conventional identification techniques cannot
identify the full dynamics of the non-linear pendulum.
Figure 4.6 RARMAX [4 3 3 1] model output and validation data
4.3 Non-linear System Identification using NARMAX
Linear identification techniques are well established. However their ability to represent
non-linear processes is clearly limited. Subsequently, non-linear black-box model structures
have been developed, however they are still the subject of much debate. One such technique
is non-linear auto regressive moving average with exogenous inputs or narmax. The narmax
model can be described by
y(t+1) = h(y(t), ..., y(t-ny+1), u(t), ..., u(t-nu+1)) + e(t), (4.4)
The main problem with narmax is how to construct a model that is easily estimated and used
to capture a system's dynamics in practical terms [22]. The main disadvantage in the
narmax estimation procedure is the need to select the most useful terms to be included in the
model, which are chosen from a large number of available model terms usually running into
thousands. This presents the most challenging procedure in the estimation of narmax
structures since it is dependent on factors like the sampling frequency and prior knowledge
about the system orders [23]. As such non-linear identification techniques such as narmax
do not offer a suitable solution for system identification in practical terms.
4.4 System Identification using Neural Networks
In this section neural networks are implemented for the identification of both the linear and
non-linear inverted pendulum models. In Chapter 3 different neural network structures
were examined; consequently two structures were identified as suitable for the system
identification of the inverted pendulum: feed-forward and recurrent (Elman) networks. A
common structure for achieving system identification using neural networks is forward
modelling, Figure 4.7. This form of learning structure is a classic example of supervised
learning. The neural network model is placed in parallel with the system both receiving the
same input, the error between the system and network outputs is calculated and
subsequently used as the network training signal.
Figure 4.7 Forward modelling of inverted pendulum using neural networks
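The forward-modelling loop of Figure 4.7 can be sketched as follows. This is a hypothetical Python sketch: the plant is replaced by a simple sin() stand-in, and the layer sizes, learning rate and step count are illustrative, not the networks used in the project.

```python
import numpy as np

# Forward-modelling sketch: a small one-hidden-layer network is placed
# in parallel with the plant, both receive the same input, and the
# output error is used as the training signal. The stand-in plant
# (sin), sizes and rates are illustrative.
def plant(u):
    return np.sin(u)                      # stand-in for the pendulum

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(8, 1)); b1 = np.zeros((8, 1))
W2 = rng.normal(scale=0.5, size=(1, 8)); b2 = 0.0
lr = 0.05

for step in range(5000):
    u = rng.uniform(-2, 2, size=(1, 1))   # same input to plant and model
    h = np.tanh(W1 @ u + b1)              # hidden layer (tansig)
    y_hat = W2 @ h + b2                   # linear output (purelin)
    err = plant(u) - y_hat                # training signal = plant - model
    dh = (W2.T @ err) * (1 - h ** 2)      # back-propagate the error
    W2 += lr * err @ h.T; b2 += lr * err.item()
    W1 += lr * dh @ u.T;  b1 += lr * dh

u = np.array([[0.5]])
model_out = (W2 @ np.tanh(W1 @ u + b1) + b2).item()
print(round(model_out, 2), round(np.sin(0.5), 2))
```

After training, the model output tracks the plant output over the excited range, which is exactly the parallel-model arrangement the text describes.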
In order to provide targets for the network, the previously developed simulink models of the
inverted pendulum with feedback control are used. The control force is used as the input to
the neural network while the target for the network is the angle of the pendulum theta
(radians). For completeness the linear model of the inverted pendulum shall be identified
first. The first type of neural network tested is the feed-forward. Using Matlab it is possible
to develop multi-layer perceptrons. However this shall be restricted to either one or two
hidden layers, as previous research has shown this to be sufficient for the identification of
non-linear systems. It is also expected that increasing the number of neurons in the hidden
layer will improve the model's accuracy.
A feed-forward back propagation network is created using Matlab script as follows
net = newff([-10 10],[10 1], {'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 400;
net.trainParam.lr = 0.0001;
net = train(net,in(1:2000)',theta(1:2000)'); % input and target lengths must match
In the example above a two-layer feed-forward network is created. The network's input
ranges from -10 to 10. The first layer has ten tansig neurons, the second layer has one
purelin neuron. This is the standard set-up of activation functions for multi-layer perceptrons:
the hidden layer has non-linear functions while the output layer always has a linear
activation function. The trainlm network training function is used. Back-propagation
updates the weights. The number of epochs and learning rate can be set and adjusted. By
examining the training diagram it can be determined when the network is sufficiently
trained, and whether the convergence is too fast; this can sometimes account for getting
stuck in a local minimum, see Figure 4.8.
Figure 4.8 Neural network training
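For reference, Matlab's tansig activation is the hyperbolic tangent (documented as 2/(1+exp(-2n))-1) and purelin is the identity. This equivalence can be checked with a short Python sketch:

```python
import numpy as np

# Matlab's tansig is the hyperbolic tangent, computed there as
# 2/(1+exp(-2n))-1; purelin is the identity. Quick check in Python.
def tansig(n):
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def purelin(n):
    return n

n = np.linspace(-3, 3, 7)
print(np.allclose(tansig(n), np.tanh(n)))  # True: same function
```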
When the network is sufficiently trained, it is exported to the simulink
environment using the gensim command. To ensure the network is adequately validated,
the initial seed of the input signal must be changed. The quality of each model is assessed
using the mean squared error and a comparison of the system and network outputs. The
mean squared error alone is not sufficient to determine the quality of the model, as there
could be a low mean squared error and yet a poor prediction of the dynamics of the system.
The simulink set-up for model validation may be viewed in Figure 4.9.
Figure 4.9 Model validation set-up in simulink
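The mean squared error used in this validation can be computed directly. The two short signals below are illustrative stand-ins for the system and network outputs, not recorded data:

```python
import numpy as np

# MSE sketch: mean of the squared difference between the system output
# and the network output. The signals here are illustrative stand-ins.
theta_system = np.array([0.02, -0.01, 0.03, 0.00, -0.02])
theta_model = np.array([0.019, -0.012, 0.028, 0.001, -0.021])
mse = np.mean((theta_system - theta_model) ** 2)
print(mse)
```

As the text notes, a low MSE on its own does not guarantee the dynamics are captured, which is why the outputs are also compared visually.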
The following table 4.3 summarises the results obtained.
NN Architecture Hidden Layers Neurons in Hidden Layer Training Epochs MSE
feed-forward 1 4 400 0.000061404
feed-forward 1 10 400 0.0029
feed-forward 1 20 400 0.00007839
feed-forward 1 35 400 0.001
feed-forward 1 50 400 0.0013
feed-forward 2 [4 2] 400 0.000059481
feed-forward 2 [10 4] 400 0.000060253
feed-forward 2 [20 10] 400 0.00040555
feed-forward 2 [30 20] 400 0.0002063
Table 4.3 Summary of feed-forward neural network performance
A nominal number of neurons in the hidden layer was sufficient to identify the model
extremely well; above this threshold the performance decreased. Tim Callinan's work
on identification of the inverted pendulum using feed-forward neural networks attributed
this to the fact that the system is in a closed loop; as such, de-tuning the controller has a greater effect on
the model's performance than an increase in the number of neurons [24]. The lowest mean
squared error and overall best performance was achieved using two hidden layers with four
and two neurons respectively. This is expected as, as previously discussed, an increase in
hidden layers should improve the model's performance. Figures 4.10 and 4.11 show the
optimum performance obtained using one- and two-hidden-layer feed-forward networks
respectively, plotting the process output against the model output. Overall the feed-forward
networks model the process well, predicting the pendulum angle with a low MSE.
Figure 4.10 feed-forward network, 1 hidden layer, 4 hidden neurons
Figure 4.11 feed-forward network, 2 hidden layers, 4 and 2 hidden neurons respectively
The next type of ANN tested are Elman networks. As discussed in Chapter 3, Elman
networks are expected to perform well, due to their feedback loop providing dynamic
memory, subsequently making them suitable in the prediction of dynamic systems. The
Elman network is implemented using Matlab functions as follows:
net = newelm([-10 10],[10 5 1], {'tansig' 'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 400;
net.trainParam.lr = 0.0001;
net = train(net,in(1:2000)',theta(1:2000)');
In the example above, an Elman network with two hidden layers is created. The network's
input ranges from [-10 to 10]. The first layer has ten tansig neurons, the second layer has
five tansig neurons and finally the output layer has a single linear purelin neuron.
After initial testing it is found that the Elman models fail to predict the angle of the pendulum
and the model predictions are completely out of range. Consequently several signal pre-processing
techniques for neural networks were implemented. It is found that scaling the
input/output data for training has a small filtering effect, subsequently improving the
network's performance during training. A scaling factor of ten was used, and all validation
data is also scaled accordingly. Filtering the data in this way decreases the possibility
of the network getting caught in a local minimum. The improvement in results was
extremely good and the models successfully predicted the pendulum angle. Table 4.4
summarises the results obtained.
NN Architecture Hidden Layers Neurons in Hidden Layer Training Epochs MSE
Elman (Recurrent) 1 4 400 0.0334
Elman (Recurrent) 1 10 400 0.0033
Elman (Recurrent) 1 20 400 0.0014
Elman (Recurrent) 1 25 400 0.0011
Elman (Recurrent) 2 [4 2] 400 0.0023
Elman (Recurrent) 2 [10 4] 400 0.00359
Elman (Recurrent) 2 [15 10] 400 0.003
Table 4.4 Elman network performance
Despite some Elman models having a low MSE, Figure 4.12 shows that they still
failed to adequately identify the pendulum angle. It was found, however, that an increase in the
number of neurons and hidden layers significantly improved their performance, see Figure
4.13.
Figure 4.12 Elman network, 2 hidden layers, 4 and 2 neurons respectively
Figure 4.13 Elman network, 2 hidden layers, 15 and 10 neurons respectively
Overall both networks performed well. The feed-forward network slightly outperformed the
recurrent network with a lower mean squared error. However both models identified the
pendulum dynamics well. Having successfully identified the linear pendulum, the next step
is the identification of the non-linear model. Identification of the non-linear model proceeds
in the same manner as that for the linear model: input-output data is generated, the network is trained and imported into simulink, and validation is performed.
The first network tested is the feed-forward network. Table 4.5 summarises the models
tested and their performance. The mean squared error is low, and an increase in neurons in
the hidden layer does improve performance as expected. This was not the case when identifying
the linear model, as there the controlled closed-loop model had a greater impact on the dynamics
of the pendulum seen at the output. Figures 4.14 and 4.15 show the best models identified
using one, and two hidden layer feed-forward neural networks respectively; plotting the
process output against the model output. From the graphs the models identify the pendulum
dynamics extremely well.
NN Architecture Hidden Layers Neurons in Hidden Layer Training Epochs MSE
feed-forward 1 4 400 0.000051028
feed-forward 1 10 400 0.000050314
feed-forward 1 20 400 0.000048409
feed-forward 1 35 400 0.00004851
feed-forward 1 50 400 0.000046509
feed-forward 2 [4 2] 400 0.000049856
feed-forward 2 [10 4] 400 0.00005027
feed-forward 2 [20 10] 400 0.000048053
feed-forward 2 [30 20] 400 0.000046268
Table 4.5 Summary of feed-forward network performance
Figure 4.14 Feed-forward network, 1 hidden layer with 50 neurons
Figure 4.15 Feed-forward network, 2 hidden layers, 30 and 20 neurons respectively
The next step is to identify the non-linear pendulum model using the Elman recurrent
network. It is found that, as with the identification of the linear pendulum, a scaling factor is
required for the training data. This value is chosen heuristically and subjectively; the ideal
scaling factor is determined to be 1000. This improves results dramatically; however the
network still performs poorly. Figure 4.16 shows the difference made by the scaling factor. The
models do not identify the dynamics with the same degree of accuracy as the feed-forward
models.
Figure 4.16 Difference between network trained with data scaled
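The scaling pre-processing can be sketched as follows: the training data are multiplied by the factor before training and the network outputs divided by it afterwards. The signals below are invented for illustration (the report uses a factor of 10 for the linear pendulum and 1000 for the non-linear one):

```python
import numpy as np

# Scaling pre-processing sketch for the Elman training data. The raw
# angle values here are invented; the factor of 1000 is the one the
# report settled on for the non-linear pendulum.
scale = 1000.0
theta_train = np.array([0.001, -0.002, 0.0015])   # raw pendulum angles
theta_scaled = theta_train * scale                # data seen by the net

net_output = np.array([1.1, -1.9, 1.4])           # net predicts scaled angles
theta_pred = net_output / scale                   # undo scaling for validation
print(theta_scaled, theta_pred)
```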
Clearly the scaling of the training data vastly improves the model performance; yet on
examination of the best model identified, the Elman network performance is still inferior to
the feed-forward networks. This is unexpected, as the presence of a feedback loop providing
dynamic memory in the Elman network, should enhance their prediction of non-linear
systems. However in cases where the required depth of memory is much larger than the size
of the tapped delay line, a recurrent network may operate poorly; essentially the information
needed to predict the future is not concentrated in the current sample neighbourhood [25].
Thus the network is unable to fully identify the dynamics of the pendulum. Table 4.6
summarises the models tested and their performance, and Figure 4.17 shows the best model
identified.
NN Architecture Hidden Layers Neurons in Hidden Layer Training Epochs MSE
Elman 1 4 400 0.1243
Elman 1 10 400 0.0535
Elman 1 20 400 0.3049
Elman 1 25 400 0.3868
Elman 2 [4 2] 400 0.6928
Elman 2 [10 4] 400 0.1017
Elman 2 [15 10] 400 0.05
Table 4.6 Elman network performance
Figure 4.17 Elman network, 2 hidden layers, 15 and 10 neurons respectively
The overall results show that the feed-forward network outperforms the recurrent Elman
network in identifying the non-linear model's dynamics. Subsequently the report shall proceed
using primarily feed-forward networks.
4.5 Javier's Linearised Model
Javier's linear model of the inverted pendulum is of considerable interest, as it is upon this
model that controllers for the physical pendulum were developed, thus giving an intuitive feel
for how accurate the models developed of the linear and non-linear inverted pendulum are.
The following is the transfer function of the system [26].
G(s) = G1 G2 = [ (1/M) / (s(s + F/M)) ] [ (F/(ML)) s / (s^2 - g/L) ]

using the following parameters
F = 0.303 Kg/s
M = 0.091 Kg
L = 0.32898 m
g = 9.8 m/s^2

Thus giving the following linearised model

G(s) = G1 G2 = [ k1 / (s(s + a)) ] [ k2 s / ((s + b)(s - b)) ]

where
a = 3.33
b = 5.46
k1 = 11
k2 = 30
This linearised system of the inverted pendulum is modelled using Matlab simulink. As this
model proved extremely robust in the design of controllers for the physical system, it
provides a good insight into the behaviour of the physical rig. Figure 4.18 shows the
simulink set-up. The system is stabilised using a PID controller.
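The stabilising PID loop can be sketched in discrete time as below. This is a Python sketch; the gains and sample time are illustrative placeholders, not the values tuned for the rig:

```python
# Discrete PID sketch of the kind used to stabilise the linearised
# model. The gains and sample time are illustrative placeholders.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err):
        self.integral += err * self.dt          # accumulate error
        deriv = (err - self.prev_err) / self.dt  # rate of change of error
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

pid = PID(kp=20.0, ki=1.0, kd=2.0, dt=0.01)
u = pid.step(0.05)   # err = desired theta - measured theta
print(round(u, 2))
```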
Figure 4.18 Javier's Linearised Model
The input to the system is a dither signal in the form of a Pseudo-Random Binary Sequence
(PRBS). This signal has a spectral content rich in frequencies [27]. The objective of this
excitation signal is to generate input-output data which contains the process dynamics over
the entire operating range. Figure 4.19 shows the pendulum angle theta.
Figure 4.19 A comparison of Javier's Model & Non-linear Model
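A PRBS is conventionally generated with a linear-feedback shift register. The sketch below uses the standard 7-bit PRBS7 polynomial (x^7 + x^6 + 1); this is one common choice and not necessarily the generator used by the rig software:

```python
import numpy as np

# PRBS sketch: 7-bit Fibonacci LFSR with the standard PRBS7 polynomial
# x^7 + x^6 + 1, giving a maximal-length sequence of period 127. The
# register length and seed are illustrative choices.
def prbs7(length, seed=0x01):
    state = seed & 0x7F
    out = []
    for _ in range(length):
        newbit = ((state >> 6) ^ (state >> 5)) & 1  # XOR of bits 7 and 6
        state = ((state << 1) | newbit) & 0x7F      # shift in the new bit
        out.append(newbit)
    return 2 * np.array(out) - 1                    # map {0,1} -> {-1,+1}

signal = prbs7(127)
print(signal[:10], signal.sum())
```

Over one full period the sequence is almost exactly zero-mean (64 highs against 63 lows), which is what makes it a good broadband excitation.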
Comparing the two models, although they do differ, they display the same dynamics.
Thus the next step is to identify Javier's model using neural networks. The same procedure
that is used to identify the non-linear pendulum is adopted, and input-output data for
training and validation is generated using the same method. For completeness recurrent
networks in the form of Elman are tested; a similar pattern emerges as with previous testing:
the training data had to be scaled and the feed-forward networks out-performed them. Table
4.7 shows the different feed-forward networks tested and their respective performance.
NN Architecture Hidden Layers Neurons in Hidden Layer Training Epochs MSE
Feed-forward 1 4 400 1.296E-08
Feed-forward 1 10 400 2.3718E-08
Feed-forward 1 20 400 1.2917E-08
Feed-forward 1 35 400 1.2901E-08
Feed-forward 1 50 400 1.2892E-08
Feed-forward 2 [4 2] 400 1.2931E-08
Feed-forward 2 [10 4] 400 1.2942E-08
Feed-forward 2 [20 10] 400 1.2917E-08
Feed-forward 2 [30 20] 400 1.2916E-08
Table 4.7 Summary of feed-forward network performance
Table 4.7 clearly shows the mean square error is extremely small. An increase in the number
of hidden layers and the number of neurons in the hidden layer does improve performance
slightly. However, even one hidden layer with four neurons gives an
extremely small MSE. Figure 4.20 shows the best model identified, plotting the process
output against the model output. The model identifies the system dynamics extremely well.
Figure 4.20 Feed-forward NN, 2 hidden layers, 30 and 20 neurons respectively
4.6 Summary
In this chapter an overall feel for system identification has been presented and its associated
procedure explained. Conventional identification techniques were first applied to the linear
model of the inverted pendulum. These traditional techniques modelled the linear pendulum
with good accuracy. However such traditional techniques cannot identify the complexity of
the non-linear pendulum. Extensions have been made to these linear identification
techniques in the form of non-linear armax. The main problem with non-linear armax is how
to construct a model that is easily estimated and used to capture a system's dynamics in
practical terms. As such system identification proceeded using neural networks.
Two main network structures were used: feed-forward and recurrent. Initially the linear
pendulum model was identified. Both structures performed well. However it was found
necessary to pre-process the data for the recurrent network using a scaling factor,
which has a filtering effect on the data. This helps prevent the network getting
stuck in a local minimum during training. The next step was identification of the non-linear
pendulum. Again both network architectures were tested. Similar results to the linear
identification were obtained, with pre-processing of training data required for the recurrent
Elman network. Overall the feed-forward networks out-performed the recurrent networks;
this is unexpected, as recurrent networks with their dynamic memory should identify a
dynamic system with good accuracy. This can be attributed to the fact that in some cases
recurrent networks perform poorly if the length of the fixed delay line is smaller than the
required length of memory to predict the next sample. Consequently the remainder of the
report will use feed-forward networks exclusively. Finally in this chapter Javier's linearised
model of the inverted pendulum was examined. This model is important because it is upon
this model that controllers for the physical pendulum were developed. As such it gives an intuitive feel
for the accuracy of the models identified. It is seen that overall the models contain mainly the
same dynamics. For completeness Javier's model is also identified using both feed-forward
and recurrent networks. Subsequently, in the next chapter, identification of the physical
pendulum is examined and implemented.
Chapter 5
Real Time Identification
The pendulum rig comprises a pole mounted on a cart, free to swing only in a vertical
plane. The cart is driven by a DC motor and moves on a rail of limited length.
Two optical encoders are used to detect pendulum angle and cart position. The two output
signals are received by a control algorithm via the interface card, which subsequently
determines the control action necessary to keep the pendulum upright. The control signal is
limited to a normalised range from -1 to 1. Figure 5.1 shows the pendulum control
system.
Figure 5.1 Set-up of pendulum rig
The control algorithm is implemented in Matlab. Figure 5.2 shows the real time kernel (RTK) in the
Matlab environment. The RTK is an encapsulated block implementing the control tasks.
The input to the RTK block can be in the form of an excitation signal or desired cart
position. The outputs from the RTK contain all data regarding pendulum angle, angular
velocity, cart position, cart velocity and the control value. There is however no visible feedback control loop, because the controller is embedded in the RTK.
Figure 5.2 Real Time Task in simulink environment
5.1 Closed-loop controller
The experiment starts with the pendulum in a downward position. The pendulum is steered
to its upright unstable position and subsequently kept erect by the Linear-Quadratic (LQ)
controller. As such two independent control algorithms are required:
Swinging algorithm
Stabilising algorithm
Only one control algorithm is active in each control zone. Figure 5.3 shows these zones.
Figure 5.3 Zones of Control Algorithms
The swinging control algorithm is a heuristic one, based on energy rules and has the form
Frictionusignuu oldold )(+= , (5.1)
where control uis the a normalised value 1 to 1.
The linear-quadratic controller used to keep the inverted pendulum stabilised has the form

    u = -(K1*e1 + K2*e2 + K3*e3 + K4*e4),    (5.2)

where

    e1 = desired position of the cart - measured position of the cart,
    e2 = desired angle of the pendulum - measured angle of the pendulum,
    e3 = desired velocity of the cart - observed velocity of the cart,
    e4 = desired angular velocity of the pendulum - observed angular velocity of the pendulum,

and K1...K4 are positive constants. The optimal feedback gain vector K = [K1 ... K4] is calculated such that the feedback law

    u = -K*e,    (5.3)

where e = [e1 ... e4], minimises the cost function

    J = Integral (x'Qx + u'Ru) dt,    (5.4)

where Q and R are weighting matrices [28].
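A minimal C sketch of the stabilising law (5.2), with saturation to the normalised actuator range; the gains come from the LQ design minimising (5.4), so any values used with this function are placeholders:

```c
#include <math.h>

/* Stabilising control law (5.2): u = -(K1*e1 + K2*e2 + K3*e3 + K4*e4),
   saturated to the normalised actuator range [-1, 1]. The gain values
   passed in are illustrative, not the gains identified for the rig. */
double lq_control(const double e[4], const double K[4])
{
    double u = 0.0;
    for (int i = 0; i < 4; i++)
        u -= K[i] * e[i];          /* leading minus sign of (5.2) */
    if (u > 1.0)  u = 1.0;
    if (u < -1.0) u = -1.0;
    return u;
}
```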
5.2 Identification of physical system
At this stage there is a closed loop stabilised system, this is necessary for identification of an
open-loop unstable system. The non-linear model identified using neural networks is first
compared with the physical rig. The input to both the physical system and the model is a
small dither signal is to generate input-output data, which contains the non-linear process
dynamics over the entire operating range. Both systems are in a controlled closed loop.
Comparing the pendulum angle of each the dynamics are similar thus the modelling of the
system in simulink has been accurate, see Figure 5.4
Figure 5.4 NN non-linear model output & physical system output
Having confirmed that the two systems' dynamics are similar, the next step is the
identification of the real rig. The data for this is generated in a similar manner; the
experiment starts with the pendulum in a downward position. Figure 5.5 shows the pendulum
angle during the test.
Figure 5.5 Pendulum angle of Real System
Having generated input-output data for the physical system, the next stage is to develop a
neural network which identifies the pendulum angle of the physical rig. A feed-forward
network is used to identify the pendulum angle. The network is trained and
subsequently imported into the Simulink environment. Validation of the network is
performed on-line, see Figure 5.6. A single hidden layer is adopted,
and different numbers of neurons in the hidden layer are tested.
Figure 5.6 Validation set-up
Several different models were tested but were unable to identify the pendulum angle, see Figure
5.7. The problem is that two different controllers are used to swing up and stabilise the
pendulum; depending on what zone the pendulum is in, the output is calculated in a
different manner, thus affecting the data for training of the neural network and the subsequent
identification.
Figure 5.7 feed-forward NN, 1 hidden layer with 75 neurons
Given that a neural network can approximate data on which it has not been trained, it is
decided to train the neural network using only the data from the stabilised zone and see if it can
approximate the swing-up action. Thus the neural network is re-trained using only the
input-output data gathered while the pendulum is in the stabilised zone under linear-quadratic
control, i.e. it is not trained on the swing-up action produced by the swinging algorithm.
Figure 5.8 shows the best model obtained.
Figure 5.8 feed-forward neural network, 1 hidden layer with 75 neurons
From the diagram a vast improvement in the model is observed: it identifies the pendulum angle
in the stabilised region extremely well and also models well the pendulum swing-up
motion, on which it has not been trained. This is possible because of neural networks' ability
to approximate functions on which they have not been trained. At this stage a suitable model has
been developed for the physical rig. However, this must be subjected to further tests: the
model identified must be robust in order to achieve neuro control. Subsequently a
disturbance is added on-line to see how the model responds. See Figure 5.9 for the set-up.
Figure 5.9 System set-up with Disturbance
From Figure 5.10 it can be seen that the model correctly identifies the pendulum angle
during the disturbance.
Figure 5.10 Pendulum angle during disturbance
To further test the model, the dither signal is increased so that the system is unstable
throughout the experiment. Figure 5.11 shows that the model still
identifies the pendulum angle. Thus an accurate model of the inverted pendulum has been
developed.
Figure 5.11 Pendulum Angle, with large excitation signal
5.3 Summary
In this chapter the physical rig set-up and its closed-loop controllers are discussed. The
system comprises two controllers, one to swing up the pendulum and one to maintain it
in the stabilised region. Neural networks had difficulty in identifying the process because,
depending on what zone the pendulum is in, the output is calculated in a different manner;
as such the outputs from the two controllers do not relate to each other. Neural networks can
approximate data on which they have not been trained; consequently the network is trained
using only the data from the pendulum in the stabilised region. The neural network
subsequently identifies the pendulum angle with good accuracy and successfully predicts the
swing-up action. The model is then subjected to further experiments to test its robustness.
It successfully identifies the process when subjected to a disturbance and to a larger
excitation signal. The next stage is neuro-control, which is dealt with in the
following chapter.
Chapter 6
Neuro Control
The inverted pendulum is open-loop unstable, non-linear and a multi-output system. The
physical rig has one input, a normalised control value between -1 and 1, and two
outputs, the pendulum angle theta and the cart position. A model of the physical rig has been
identified using static feed-forward networks modelling the pendulum angle. Thus the next
step is neuro control. However, before a neuro-controller is developed, a comparison
between standard linear control techniques such as PID and neuro control is made.
Subsequently the different techniques of neuro control are discussed and a controller
developed.
Standard linear control techniques such as PID cannot map the complex non-
linearities of the pendulum system. They have been used to control the physical rig, but
only on the condition that the experiment starts with the pendulum in the stabilised zone; even
then, their control of the system is extremely limited. ANNs have the capability of adaptively
controlling and modelling a wide range of non-linear processes at a high level of performance.
The inverted pendulum is a SIMO system; in order to have full-state feedback control,
several PID controllers would be necessary. However, due to neural networks' parallel
nature, a single neural network is sufficient.
Before a neuro-controller is developed, the merits of the main types of neuro-control
are discussed. The main types of neuro control include supervised, unsupervised, model
reference, direct inverse and adaptive.
6.1 Supervised Control
In supervised control the neural network uses an existing controller to learn the control
action. One may question why mimic an existing controller if it performs satisfactorily. The
problem is that traditional controllers may operate well around a specific operating
point; however, if a disturbance or uncertainty occurs, these traditional controllers fail. The
advantage of neuro-control is that the network can adjust and update its weights. A neuro-
controller can also approximate data on which it has never been trained. Supervised
control proceeds with a teacher providing the control output for the neural network to learn.
The simplest approach to this method is to teach the network off-line; subsequently the
neural network is placed in the feedback loop, see Figure 6.1.
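The teacher-student idea can be illustrated with a toy C example in which a single-weight student controller learns, by repeated corrections, to mimic a proportional teacher; the teacher gain, learning rate and sample angles below are all invented for illustration:

```c
#include <math.h>

/* Toy supervised learning: a one-weight "student" controller learns by
   LMS updates to mimic a teacher controller u = k_teacher * theta.
   The teacher gain, learning rate and sample angles are invented. */
double learn_teacher_gain(double k_teacher, double eta, int epochs)
{
    double w = 0.0;                                  /* untrained student weight */
    const double thetas[4] = {0.1, -0.2, 0.3, -0.1}; /* sample pendulum angles */
    for (int e = 0; e < epochs; e++)
        for (int i = 0; i < 4; i++) {
            double target = k_teacher * thetas[i];   /* teacher's control action */
            double output = w * thetas[i];           /* student's control action */
            w += eta * (target - output) * thetas[i]; /* LMS weight update */
        }
    return w;                                        /* moves towards k_teacher */
}
```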
Figure 6.1 Supervised learning using existing controller
6.2 Unsupervised Control
Unsupervised control does not require prior knowledge. However, the behaviour of such a
network may be extremely complex; at best, unsupervised control is still not fully
understood. In unsupervised learning the neural network tests different states, determining
which produces the correct control action. The learning time is computationally inefficient.
However, an unsupervised neuro-controller can deal with complex non-linear control.
Anderson et al [29] developed an unsupervised controller for the inverted pendulum;
however, a certain amount of prior knowledge is incorporated, in that a failure signal is
supplied to the neural network based on the pendulum angle and cart position.
6.3 Adaptive neuro control
The main advantage of adaptive control is the ability to adapt on-line. This is achieved by
presenting the neuro-controller with an error signal, calculated by subtracting the
actual output from the desired output. The error is then used to adjust the weights
on-line, see Figure 6.2.
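The weight adjustment described above can be sketched as a delta-rule step for a single linear neuron; this is an illustrative update, not the exact adaptation law used by any particular network in this project:

```c
/* One on-line adaptation step for a single linear neuron: the error is
   the desired output minus the actual output, and each weight moves in
   proportion to error * input (a delta-rule step with learning rate eta). */
void adapt_weights(double w[], const double x[], int n,
                   double desired, double actual, double eta)
{
    double error = desired - actual;     /* error signal of Figure 6.2 */
    for (int i = 0; i < n; i++)
        w[i] += eta * error * x[i];
}
```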
Figure 6.2 Adaptive neuro control
6.4 Model Reference Control
Model reference control differs from adaptive neuro control in that the desired closed-loop
response is specified through a reference model. Thus, the error signal is calculated using the
reference model. The neuro-controller forces the plant output to follow the reference model
output, see Figure 6.3.
Figure 6.3 Model Reference Control
6.5 Direct inverse control
In direct inverse control the neural network is trained to model the inverse of the plant. The
plant output is fed to the neuro controller; subsequently the neuro controller's output is
compared with the plant input and the network trained, see Figure 6.4. The main difficulty with
this method is that the inverse model must be extremely accurate; as such, the method is
limited to open-loop stable systems [30]. This can be attributed to the fact that in a closed loop
too much of the plant dynamics is removed, so an accurate model cannot be
identified.
Figure 6.4 Direct inverse control
Considering the different control techniques possible with respect to the inverted pendulum,
supervised control emerges as a suitable solution. Inverse control is not possible because, as previously
stated, an extremely accurate inverse model of the plant is required; this is
unobtainable, as the inverted pendulum is open-loop unstable and the closed-loop model
contains some of the dynamics of the controller. Anderson has proved unsupervised control
of the inverted pendulum possible; however, this is an extremely complex method not yet
fully understood, and given the time constraints of the project it is not a viable technique.
Similar to inverse model control, model reference control also requires an accurate model of
the plant, and as such is limited to open-loop stable systems. Given that there is an
existing controller for the inverted pendulum rig, based on a swinging algorithm and a
stabilising algorithm, supervised control offers the best solution. Consequently, neuro-
control proceeds using supervised learning.
6.6 Neuro Control in Simulink
It is decided to develop a neuro-controller using supervised learning. Using the existing
feedback controller for the non-linear pendulum, a feed-forward network is trained to model
the controller. The model is developed and trained using the same techniques covered in
system identification, with the exception that the input used to train the network is the angle theta
and the target output is the control signal applied to the plant. When training is complete the model is imported into the
Simulink environment and placed in the feedback loop, replacing the existing controller, see
Figure 6.5.
Figure 6.5 Neuro Controller in Simulink
Figure 6.6 shows the pendulum angle controlled using the neuro controller; clearly the
controller maintains stability, keeping the pendulum upright.
Figure 6.6 Pendulum angle using neuro controller in Simulink
The next step is placing the controller in closed-loop with the non-linear pendulum model
identified using neural networks, see Figure 6.7.
Figure 6.7 neuro control of non-linear pendulum model
Figure 6.8 shows that the neuro-controller also stabilises the model identified using neural
networks.
Figure 6.8 neuro control of non-linear model
6.7 Real time neuro-control
At this stage, neuro-control of the non-linear pendulum model and of the model identified using
neural networks has been carried out in the Simulink environment. The next
step is neuro control of the physical system. It is possible to test and develop controllers for
the physical rig using the external controller function. The external controller is a file
containing the control routine, which is accessed at the interrupt time. The control algorithm
must be written in C code and a dynamic link library created. Thus the steps to develop the
neuro-controller can be categorised into four main stages:
- Train the neural network offline and import it into Simulink.
- Validate the model online.
- Code the neuro-controller in C.
- Test online.
Up until this point, system identification has primarily focused on the pendulum
angle. In order to achieve control of the physical system, both the pendulum angle theta and
the cart position must be taken into account, due to the limited length of the rail. The neural
network training procedure is carried out in a similar manner as previously, with the exception
that the inputs for training are the pendulum angle and the cart position, and the target output
is the control output. The Matlab script is as follows:
tempP = [angle';position'];  % stack the two training inputs
% 2 inputs with the given ranges; 20 tansig hidden neurons and 1 purelin
% output neuron; trained with Levenberg-Marquardt (trainlm)
net = newff([-3.5 3.5; -1 1],[20 1], {'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 1000;  % maximum number of training epochs
net.trainParam.lr = 0.0001;    % learning rate
net = train(net,tempP,theta'); % train against the recorded target vector
Feed-forward networks are used and different parameters are tested, i.e. the number of
hidden layers and the number of neurons per hidden layer. These are kept to a minimum so
that the subsequent C coding of the model is easier. Similar to the identification of the physical
rig, the network is trained using only data from the stabilised zone, and the network is
allowed to approximate the swing-up action. The next step is to validate the model before
implementing it in C; Figure 6.9 shows the validation set-up. It is found that the optimum
structure is one hidden layer with 20 neurons.
Figure 6.9 Validation set-up
The control output from the existing controller is compared with the output from the neuro-
controller; it is found that the two are similar, with a low mean squared error, see Figure
6.10.
Figure 6.10 Neuro Controller output
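The mean squared error used in this comparison is simply the average of the squared differences between the two output sequences; a C sketch:

```c
/* Mean squared error between the existing controller's output sequence
   and the neuro-controller's output sequence over n samples. */
double mse(const double u_ref[], const double u_nn[], int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        double d = u_ref[i] - u_nn[i];
        sum += d * d;
    }
    return sum / n;
}
```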
The neuro-controller structure is shown in Figure 6.11, on the following page.
Figure 6.11 Neuro Controller Structure