

    DUBLIN CITY UNIVERSITY

    SCHOOL OF ELECTRONIC ENGINEERING

    Artificial Neural Network Identification &

    Control of an Inverted Pendulum

    Barry N. Sweeney

    August 2004

MASTER OF ENGINEERING IN

ELECTRONIC SYSTEMS

    Supervised by Ms. Jennifer Bruton


Declaration

I hereby declare that, except where otherwise indicated, this document is entirely my own work and has not been submitted in whole or in part to any other university.

    Signed: ...................................................................... Date: ...............................


Abstract

The purpose of this project is to illustrate the use of artificial neural networks (ANNs) in the identification and control of a non-linear system. Non-linear systems are investigated with respect to the dynamics of the inverted pendulum. The inverted pendulum is a classic example of an unstable non-linear dynamic system. Consequently it has received much attention, as it is an extremely complex and challenging control problem. The interesting feature of neural networks is that they can learn from the environment in which the system is being operated. The potential of ANNs for system identification and control is examined. Subsequently feed-forward and recurrent neural networks are used to identify a robust model of the inverted pendulum. Finally a neuro-controller is developed and implemented using Borland C++ for control of the physical system.


Acknowledgements

I would like to take this opportunity to thank all those who supported and helped me throughout the development and research of this project. Firstly, I would like to thank my project supervisor Ms. Jennifer Bruton, whose guidance and vast knowledge proved invaluable, and all the staff in the Faculty of Engineering, with special thanks to Conor Maguire for setting up the physical rig. Finally I would like to thank my family and girlfriend for their support and patience throughout.


    Table of Contents

Declaration
Abstract
Acknowledgements
Table of Contents
Table of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Outline of Report
Chapter 2 Inverted Pendulum
2.1 Mathematical Equations
2.2 Modelling of the Inverted Pendulum
2.3 Closed-loop Control
2.4 Summary
Chapter 3 Neural Networks
3.1 Artificial neuron model
3.2 Activation functions
3.3 Neural network architecture
3.4 Learning algorithms
3.5 Learning rules
3.6 Neural Network Limitations
3.7 Applications
3.8 Summary
Chapter 4 System Identification
4.1 System Identification Procedure
4.2 Conventional linear system identification
4.3 Non-linear System Identification using NARMAX
4.4 System Identification using Neural Networks
4.5 Javier's Linearised Model
4.6 Summary
Chapter 5 Real-Time Identification
5.1 Closed-loop controller
5.2 Identification of physical system
5.3 Summary
Chapter 6 Neuro Control
6.1 Supervised Control
6.2 Unsupervised Control
6.3 Adaptive neuro control
6.4 Model Reference Control
6.5 Direct inverse control
6.6 Neuro Control in Simulink
6.7 Real time neuro-control
6.8 Project Plan
6.9 Summary
Chapter 7 Conclusions
7.1 Future Recommendations
References
Appendix 1
Appendix 2


Table of Figures

Figure 2.1 Inverted Pendulum
Figure 2.2 Simulink model of linear pendulum
Figure 2.3 Subsystem block of linear pendulum
Figure 2.4 Simulink model of non-linear pendulum
Figure 2.5 Subsystem block of non-linear pendulum
Figure 2.6 Open loop response of inverted pendulum
Figure 2.7 Simulink model of linear pendulum and controller
Figure 2.8 Closed loop response of linear pendulum with controller
Figure 2.9 Simulink model of non-linear pendulum and controller
Figure 2.10 Closed loop response of non-linear pendulum with controller
Figure 2.11 Inverted Pendulum, Time (t) = 0.0 secs
Figure 3.1 Biological Neuron
Figure 3.2 Artificial Neuron, McCulloch & Pitts (1943)
Figure 3.3 Perceptron Model
Figure 3.4 Activation functions
Figure 3.5 Multi-layer Feed-forward Network structure
Figure 3.6 Multi-layer Recurrent Network structure
Figure 3.7 Supervised Learning
Figure 3.8 Unsupervised Learning
Figure 3.9 Local & global minimum
Figure 4.1 Input, Output, Disturbance of a System
Figure 4.2 System Identification Procedure
Figure 4.3 ARX model output with measured output
Figure 4.4 ARX [4 3 1] model output and measured output
Figure 4.5 RARMAX [3 3 3 1] model output and process output
Figure 4.6 RARMAX [4 3 3 1] model output and validation data
Figure 4.7 Forward modelling of inverted pendulum using neural networks
Figure 4.8 Neural Network training
Figure 4.9 Model validation set-up in Simulink
Figure 4.10 Feed-forward network, 1 hidden layer, 4 hidden neurons
Figure 4.11 Feed-forward network, 2 hidden layers, 4 and 2 hidden neurons respectively
Figure 4.12 Elman network, 2 hidden layers, 4 and 2 neurons respectively
Figure 4.13 Elman network, 2 hidden layers, 15 and 10 neurons respectively
Figure 4.14 Feed-forward network, 1 hidden layer with 50 neurons
Figure 4.15 Feed-forward networks, 2 hidden layers, 30 & 20 neurons respectively
Figure 4.16 Difference between network trained with data scaled
Figure 4.17 Elman network, 2 hidden layers, 15 and 10 neurons respectively
Figure 4.18 Javier's Linearised Model
Figure 4.19 A comparison of Javier's Model & Non-linear Model
Figure 4.20 Feed-forward NN, 2 hidden layers, 30 and 20 neurons respectively
Figure 5.1 Set-up of pendulum rig
Figure 5.2 Real Time Task in Simulink environment
Figure 5.3 Zones of Control Algorithms
Figure 5.4 NN non-linear model output & physical system output
Figure 5.5 Pendulum angle of Real System
Figure 5.6 Validation set-up
Figure 5.7 Feed-forward NN, 1 hidden layer with 75 neurons
Figure 5.8 Feed-forward NN, 1 hidden layer with 75 neurons
Figure 5.9 System set-up with disturbance
Figure 5.10 Pendulum angle during disturbance
Figure 5.11 Pendulum angle, with large excitation signal
Figure 6.1 Supervised learning using existing controller
Figure 6.2 Adaptive neuro control
Figure 6.3 Model Reference Control
Figure 6.4 Direct inverse control
Figure 6.5 Neuro Controller in Simulink
Figure 6.6 Pendulum angle using neuro controller in Simulink
Figure 6.7 Neuro control of non-linear pendulum model
Figure 6.8 Neuro control of non-linear model
Figure 6.9 Validation set-up
Figure 6.10 Neuro-Controller output
Figure 6.11 Neuro-Controller Structure
Figure 6.12 Pendulum Angle
Figure 6.13 Pendulum Angle


    Chapter 1

    Introduction

Due to increasing technological demands and increasingly complex systems requiring highly sophisticated controllers to ensure that high performance can be achieved and maintained under adverse conditions, there is a demand for an alternative form of control, as conventional approaches do not meet the requirements of these complex systems.

    To achieve such highly autonomous behaviour for complex systems one can enhance

    today's control methods using intelligent control systems and techniques. It is for this reason

    that neural networks are of significant importance in the design and construction of the

    overall intelligent controller for complex non-linear systems. Currently neural networks are

    established in many application areas (expert systems, pattern recognition, system control,

    etc.). These methods have received a lot of criticism during their existence (for example, see

    Cheeseman, 1986). However this criticism has weakened as artificial neural networks have

    been successfully applied to practical problems.

    Artificial neural networks attempt to simulate the human brain. This simulation is

    based on the present knowledge of the brain, and this knowledge is even at its best

    primitive. The operation of the brain is believed to be based on simple basic elements called

    neurons, which are connected to each other with transmission lines, called axons and

    receptive lines called dendrites. The learning may be based on two mechanisms: the creation

    of new connections, and the modification of connections. Each neuron has an activation

    level, which, in contrast to Boolean Logic, ranges between some minimum and maximum

    value. Neural network have several important characteristics which make them suitable for

    the identification and control of a non-linear system, their features include

    No need to know data relationships. Self-learning capability.

    Self-tuning capability.

    Applicable to model various systems.

    Further to this neural networks contain non-linear elements that enables them to model and

    control complex non-linear systems.

    From a given transfer function the system response can be predicted. The reverse of

    this process i.e. calculating the transfer function from a measured response is called system

    identification. It is essentially a process of sampling the input and output signals of a


    system, and subsequently using the respective data to generate a mathematical model of the

    system to be controlled. System identification enables the real system to be altered without

    the need to calculate the dynamic equations and remodel the parameters again. Knowledge

    of the dynamics of the system is useful in the determination of the neural network

    architecture, its inputs, outputs and training process for dynamic model identification

    purposes [1].

    1.1 Motivation

    The inverted pendulum problem is a classic example of an unstable non-linear dynamic

    system [2]. Consequently it has received much attention, as it is an extremely complex and

challenging control problem. The dynamics of the inverted pendulum pose great difficulty in system identification and control. Conventional control systems have been found wanting given the raised demands for high performance, due to their inability to adapt to

    new or unusual circumstances. Conventional control systems do not incorporate the

    desirable control features, such as, non-linear capability, adaptation, flexible control

    objectives and multivariable capability [3]. Thus there is a need for a control method which

    addresses the non-linearities of an operating system, incorporating an adaptation capability.

Considering these control issues, artificial neural networks emerge as a solution. ANNs have several important characteristics that identify them as suitable for the identification and control of non-linear systems: their ability to learn [4], their ability to approximate non-linear functions and their inherent parallelism. The predominant goal of this project is

    to identify an accurate model of the physical inverted pendulum; the majority of modelling

    is first simulated using Matlab simulink. Subsequently a suitable neuro controller is

    developed and implemented using Borland C++.

    1.2 Outline of Report

    Chapter 2 investigates non-linear systems with particular respect to the inverted pendulum

and its associated control difficulties. The dynamic equations are derived and subsequently models are developed for both the linear and non-linear cases. The development of feedback

    controllers is also detailed in this section. Chapter 3 introduces Artificial Neural Networks

    detailing their basic components, structure, architecture and application in system

    identification and control. Chapter 4 discusses the area of system identification and its

subsequent procedure. Traditional identification techniques are examined first with respect


    to the linear pendulum. Non-linear identification is subsequently performed using neural

    networks. Chapter 5 details the set-up of the physical inverted pendulum; following on from

    this real-time identification is performed. Chapter 6 details different neuro-control

    techniques; subsequently a neuro-controller is developed and implemented using Borland

    C++. Finally in Chapter 7, conclusions are drawn and future recommendations detailed.


    Chapter 2

    Inverted Pendulum

A dynamic system is a system that changes over time. The starting point for the system is the initial state and the final point is the equilibrium. Most often a dynamic system is described by differential or difference equations, where the rate of change is a function of time or some parameter. Basically all real systems are dynamic systems. Most real-world dynamic processes are nonlinear. Thus, nonlinear mathematical models are the most desired ones [5]. The inverted pendulum is an example of a highly nonlinear and unstable dynamic system. Pole-balancing is the task of keeping a rigid pole, hinged to a cart and free to fall in a plane, in a roughly vertical orientation by moving the cart horizontally in the plane while keeping the cart within some maximum distance of its starting position (see Figure 2.1).

    Despite the dynamics being well understood it is still a difficult process to accomplish, as

    most people who have experimented with such devices will appreciate. Further still, if the

    system parameters are not known precisely, then the task of constructing a suitable

    controller is, accordingly, more difficult. Many researchers have restrained their control-

    learning systems to simulations of the inverted pendulum, which can be accredited to the

    level of difficulty associated with the control problem. In this chapter, the dynamic

    equations will be derived for both the linear and non-linear pendulum. Using a mathematical

    model of this form for a system, computer simulation is possible and subsequently the

    respective models can be developed using Matlab simulink. The underlying aim of

    modelling is that the developed models will have the same characteristics as the actual

    process.

Figure 2.1 Inverted Pendulum (L = length of pole, m = mass of pole, M = mass of cart, g = gravity)


    2.1 Mathematical Equations

Lagrange's equations of motion can be used for the analysis of mechanical systems:

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = F , (2.1)

with y(t) the generalised position vector, \dot{y}(t) the generalised velocity vector, and F(t) the generalised force vector. The Lagrangian is L = K - U, the kinetic energy minus the potential energy.

The kinetic energy of the cart is

K_1 = \frac{1}{2} M \dot{y}^2 , (2.2)

The pole can move in both horizontal and vertical directions, therefore the kinetic energy of the pole is

K_2 = \frac{1}{2} m \left( \dot{y}_2^2 + \dot{z}_2^2 \right) , (2.3)

where y_2 and z_2 are equal to

y_2 = y + L \sin\theta , \qquad z_2 = L \cos\theta , (2.4)

giving

\dot{y}_2 = \dot{y} + L \dot{\theta} \cos\theta , \qquad \dot{z}_2 = -L \dot{\theta} \sin\theta , (2.5)

Therefore the total kinetic energy, K, of the system is

K = K_1 + K_2 = \frac{1}{2}(M+m)\dot{y}^2 + m L \dot{y}\dot{\theta}\cos\theta + \frac{1}{2} m L^2 \dot{\theta}^2 , (2.6)

The potential energy due to the pendulum, U, is

U = m g z_2 = m g L \cos\theta , (2.7)

The Lagrangian function is

L = K - U = \frac{1}{2}(M+m)\dot{y}^2 + m L \dot{y}\dot{\theta}\cos\theta + \frac{1}{2} m L^2 \dot{\theta}^2 - m g L \cos\theta , (2.8)


The state-space variables of the system are y and \theta, thus the Lagrange equations are

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y} = f , (2.9)

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\theta}}\right) - \frac{\partial L}{\partial \theta} = 0 , (2.10)

By substituting for L and performing the partial differentiation this produces

(M+m)\ddot{y} + m L \ddot{\theta}\cos\theta - m L \dot{\theta}^2 \sin\theta = f , (2.11)

m L \ddot{y}\cos\theta + m L^2 \ddot{\theta} - m g L \sin\theta = 0 , (2.12)

The above dynamic equations can be placed into state-space form; this is achieved by expressing the Lagrange equations in terms of matrices:

\begin{bmatrix} M+m & m L \cos\theta \\ m L \cos\theta & m L^2 \end{bmatrix} \begin{bmatrix} \ddot{y} \\ \ddot{\theta} \end{bmatrix} = \begin{bmatrix} f + m L \dot{\theta}^2 \sin\theta \\ m g L \sin\theta \end{bmatrix} , (2.13)

This gives a mechanical system in typical Lagrangian form, i.e. the inertia matrix multiplying the acceleration vector. Inverting the inertia matrix and simplifying, the following non-linear equations describing the inverted pendulum are derived [7]:

\ddot{y} = \frac{m g \sin\theta\cos\theta - m L \dot{\theta}^2 \sin\theta - f}{m\cos^2\theta - (M+m)} , (2.14)

\ddot{\theta} = \frac{-(M+m) g \sin\theta + m L \dot{\theta}^2 \sin\theta\cos\theta + f \cos\theta}{m L \cos^2\theta - (M+m) L} , (2.15)

As some of the modelling involved in this project is linear, these equations must be linearised. The simplest approach is to approximate \cos\theta \approx 1 and \sin\theta \approx \theta. In addition to this, the quadratic terms are extremely small; consequently they are set to zero. This yields the linear system equations

\ddot{y} = \frac{f - m g \theta}{M} , \qquad \ddot{\theta} = \frac{(M+m) g \theta - f}{M L} , (2.16)
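Equation (2.16) can be checked quickly in MATLAB. A minimal sketch, using the rig parameters quoted in Section 2.2 and an assumed state vector [y, ydot, theta, thetadot] (the state ordering is an assumption of this example, not taken from the report), builds the linearised state-space matrices and inspects the eigenvalues of A; the positive real eigenvalue confirms the open-loop instability discussed in the following sections.

% Minimal sketch: linearised cart-pendulum of equation (2.16).
% State x = [y; ydot; theta; thetadot], input u = f (force on the cart).
m = 0.11;  M = 1.2;  L = 0.4;  g = 9.8;     % rig parameters from Section 2.2

A = [0  1   0              0;
     0  0  -m*g/M          0;
     0  0   0              1;
     0  0  (M+m)*g/(M*L)   0];
B = [0; 1/M; 0; -1/(M*L)];

disp(eig(A))    % one eigenvalue lies in the right half-plane, so the
                % upright equilibrium is open-loop unstable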


    2.2 Modelling of the Inverted Pendulum

    Given the set of equations describing the linear and non-linear inverted pendulum, the next

    step is modelling. The models are developed using Matlab simulink, which provides an

environment for computer simulation. The first model developed is the linear model of the inverted pendulum, which for convenience is encapsulated in a subsystem block, see Figure 2.2 and Figure 2.3 respectively.

    Figure 2.2 Simulink model of linear pendulum

    Figure 2.3 Subsystem block of linear pendulum

The subsystem block is set up using a mask. This enables the parameters m, L, g and M to be altered for different simulations. As it is desired to model the physical rig accurately, the parameters are taken from this system: the mass of the pendulum, m, is set to 0.11 kg; the mass of the cart, M, is set to 1.2 kg; the length of the pendulum, L, is set to 0.4 metres; and gravity, g, is set to 9.8 m/s^2. The non-linear model is developed in a similar manner to the linear

    system and the parameter values remain the same, see Figure 2.4 and Figure 2.5 on the

    following page.


    Figure 2.4 Simulink model of the non-linear pendulum

Figure 2.5 Subsystem block of the non-linear pendulum

Both pendulum models are simulated with a step input to check the stability of the systems. A system can be said to be input-output stable if it responds to every bounded input variable with a bounded output variable [7]. Conversely, a system is input-output unstable if some bounded input produces an unbounded output. The

    angle of the pendulum is shown in Figure 2.6, on the following page. The simulation shows

    that the pendulum is open loop unstable i.e. the pendulum falls over.


    Figure 2.6 Open loop response of the inverted pendulum
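The open-loop behaviour shown in Figure 2.6 can also be reproduced directly from equations (2.14) and (2.15) with a minimal MATLAB sketch; the function handle and initial condition below are illustrative choices, not taken from the project files. Starting slightly off the upright position with no applied force, the angle diverges and the pendulum falls over.

% Minimal sketch: open-loop simulation of the non-linear pendulum,
% equations (2.14)-(2.15), with zero input force.
m = 0.11;  M = 1.2;  L = 0.4;  g = 9.8;          % rig parameters
f = 0;                                           % no control force applied

% State x = [y; ydot; theta; thetadot]
pend = @(t, x) [ x(2);
    (m*g*sin(x(3))*cos(x(3)) - m*L*x(4)^2*sin(x(3)) - f) / ...
        (m*cos(x(3))^2 - (M+m));
    x(4);
    (-(M+m)*g*sin(x(3)) + m*L*x(4)^2*sin(x(3))*cos(x(3)) + f*cos(x(3))) / ...
        (m*L*cos(x(3))^2 - (M+m)*L) ];

[t, x] = ode45(pend, [0 3], [0; 0; 0.05; 0]);    % start 0.05 rad from upright
plot(t, x(:,3)), xlabel('time (s)'), ylabel('pendulum angle (rad)')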

    2.3 Closed-loop Control

    System identification requires the collection of interesting data. In order to accurately

    model the inverted pendulum and to generate suitable input/output data for system

    identification it is necessary to stabilise it. This is achieved using a PID controller. Other

    methods could have been used to stabilise the linear pendulum such as full state feedback.

    However PID control is chosen for its ease of implementation. PID controllers have a long

    history in control engineering as they have been proven to be stable, simple and robust for

many real-life applications. The P action is related to the present error, the I action is based on the past history of the error, while the D action relates to the future behaviour of the error. These actions correspond roughly to filtering, smoothing and prediction problems respectively.

The equation of a PID controller is given by

u(t) = K_p e(t) + K_i \int_0^t e(\tau)\, d\tau + K_d \frac{de(t)}{dt} , (2.17)
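A minimal discrete-time MATLAB sketch of equation (2.17) is given below; the gains, sample time and example error signal are illustrative placeholders, not the de-tuned values actually used in the project.

% Minimal sketch: discrete PID controller implementing equation (2.17).
% Gains, sample time and the error signal are illustrative assumptions.
Kp = 20;  Ki = 1;  Kd = 2;  Ts = 0.01;

t = 0:Ts:2;
e = 0.05*exp(-t);                      % example error signal (rad), for illustration

integral_e = 0;  prev_e = e(1);
u = zeros(size(e));
for k = 1:numel(e)
    integral_e   = integral_e + e(k)*Ts;       % I: accumulated past error
    derivative_e = (e(k) - prev_e)/Ts;         % D: trend of the error
    u(k) = Kp*e(k) + Ki*integral_e + Kd*derivative_e;
    prev_e = e(k);
end
plot(t, u), xlabel('time (s)'), ylabel('control force u')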

There are several methods to design PID controllers. Initially the Ziegler-Nichols method was tested; however the optimum parameters obtained from this method offered a poor response. Subsequently the parameters are chosen heuristically and subjectively, that is, by trial-and-error testing. The simulink model with controller is shown in Figure 2.7 on the following page. The pendulum is now in a closed loop, thus the PID controller is de-tuned so that the dynamics of the pendulum are emphasised, and a band-limited white noise signal is


    used as the input to the system to emulate the disturbances the physical rig would be

    subjected to. In the closed loop model in Figure 2.7 there is a graphic visualisation of the

    inverted pendulum when compiled, which has been adapted from the Matlab demonstration

    slcp.mdl.

    Figure 2.7 Simulink model of linear pendulum and controller

    The linear pendulum is simulated, see Figure 2.8 for the angle of the pendulum. The closed

    loop system with PID controller keeps the pendulum angle stable. This allows for longer

    simulation of the pendulum and more importantly the generation of information rich data for

    system identification purposes.

    Figure 2.8 Closed loop response of linear pendulum with controller


    The next step is to develop a controller for the non-linear pendulum. Similar to the control

    of the linear pendulum, a PID controller was developed, see Figure 2.9. PID control of the

non-linear pendulum is possible because the simulation starts with the pendulum in a linearised region, that is, with the pendulum in an upright position, see Figure 2.11.

    Figure 2.9 Simulink model of non-linear pendulum and controller

    The closed loop response of the pendulum angle is shown in Figure 2.10. The closed loop

response with PID control is stable, thus suitable input-output data for system identification purposes is obtained. Similar to the linear closed-loop system, the PID controller has been de-tuned so that the full dynamics of the pendulum are emphasised and not the controller dynamics.

    Figure 2.10 Closed loop response of non-linear pendulum with controller


    Figure 2.11 Inverted Pendulum, Time (t) = 0.0 secs.

    2.4 Summary

    The dynamic equations for both the linear and non-linear pendulum have been derived and

    subsequently models for each developed using Matlab simulink. It was evident that the

    system is open loop unstable, i.e. for a bounded input variable the output variable goes

    unbounded. However a criterion for accurate system identification is that the process must

    be stable. Consequently simple PID controllers were developed which stabilised the system.

They were de-tuned so that the dynamics of the pendulum are emphasised and interesting data would be generated for system identification purposes. The subsequent chapter details

    the theory, operation and structure of artificial neural networks.



    Chapter 3

    Artificial Neural Networks

A neural network is an information-processing paradigm inspired by the way the brain processes information [8]. It is composed of a large number of highly interconnected processing elements (neurons) working in parallel to solve a specific problem. ANNs learn by example, trained using known input/output data sets to adjust the synaptic connections that exist between the neurons. The biological brain is likewise composed of a large number of highly interconnected processing elements called neurons, tied together with weighted connections or synapses. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. These connections store the knowledge necessary to solve specific problems.

    Figure 3.1 Biological Neuron [9]

    One of the most interesting features of NNs is their learning ability. This is achieved by

presenting a training set of different examples to the network and using learning algorithms, which change the weights (or parameters of activation functions) in such a way that the network will reproduce the correct output for the corresponding input values. One encountered difficulty is how to guarantee generalisation and to determine when the network is

    sufficiently trained. Neural networks offer non-linearity, input-output mapping, adaptivity

    and fault tolerance. Non-linearity is a desired property if the generator of the input signal is

    inherently non-linear [10]. The high connectivity of the network ensures that the influence

    of errors in a few terms will be minor, which ideally gives a high fault tolerance.


    3.1 Artificial neuron model

In ANNs the inputs are combined in a linear way with different weights. McCulloch and Pitts developed the first model of an elementary computing neuron.

Figure 3.2 Artificial Neuron, McCulloch & Pitts (1943)

Each neuron consists of a processing element with synaptic input connections and a single output. The initial step is a process whereby the inputs x_1, x_2, x_3, \ldots, x_n are multiplied by their respective weights w_1, w_2, w_3, \ldots, w_n and then summed by the neuron. The summation process may be defined as

net = \sum_{i=1}^{n} w_i x_i , (3.1)

Further to this a threshold value or bias may be included; subsequently the summation process may be rewritten as

net = \sum_{i=1}^{n} w_i x_i + b , (3.2)

A non-linear activation function f is generally included in the neuron arrangement; this is added to introduce non-linearities into the model. The output of the neuron can now be expressed as (see Figure 3.3)

y = f(net) , (3.3)



    Figure 3.3 Perceptron Model

Using the back-propagation algorithm the weights are dynamically updated. The error between the target output and the actual output is calculated as

e(k) = t(k) - y(k) , (3.4)

The error is then back-propagated through the layers and the weights adjusted accordingly by the formula

w(k+1) = w(k) + \eta \, e(k) \, x(k) , (3.5)

The feed-forward process is subsequently repeated. The weights are updated and adjusted on each pass until the error between the target and the actual output is low, i.e. the model has been sufficiently trained.
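A minimal MATLAB sketch of equations (3.1)-(3.5) for a single neuron is shown below; the training data, learning rate and activation function are illustrative choices, not taken from the project.

% Minimal sketch: one neuron trained with the delta rule, eqs (3.1)-(3.5).
% Data, learning rate and activation function are illustrative assumptions.
X = [0 0 1 1; 0 1 0 1];          % inputs, one column per pattern
T = [0 0 0 1];                   % targets (logical AND, for illustration)

w = zeros(2,1);  b = 0;  eta = 0.5;
f = @(net) 1./(1 + exp(-net));   % log-sigmoid activation

for epoch = 1:500
    for k = 1:size(X,2)
        net = w'*X(:,k) + b;     % eqs (3.1)-(3.2): weighted sum plus bias
        y   = f(net);            % eq (3.3): neuron output
        e   = T(k) - y;          % eq (3.4): error
        w   = w + eta*e*X(:,k);  % eq (3.5): weight update
        b   = b + eta*e;
    end
end
disp(f(w'*X + b))                % outputs approach the targets after training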



    3.2 Activation functions

There are a number of different types of activation functions such as step, ramp, sigmoid etc. However the most commonly used activation functions are tan-sigmoid, log-sigmoid and linear. The effect of the linear function is to multiply the input by a constant factor. The sigmoid functions have an S-shaped curve, see Figure 3.4.

Figure 3.4 Activation functions: log-sigmoid, tan-sigmoid and linear
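As a brief illustration, the three functions shown in Figure 3.4 can be written directly in MATLAB; the expressions below are the standard definitions, equivalent to the toolbox functions logsig, tansig and purelin.

% Minimal sketch: the three common activation functions of Figure 3.4.
net = -5:0.1:5;
logsig_f  = 1./(1 + exp(-net));   % log-sigmoid, output in (0, 1)
tansig_f  = tanh(net);            % tan-sigmoid, output in (-1, 1)
purelin_f = net;                  % linear: passes net through (a constant scaling in general)

plot(net, logsig_f, net, tansig_f, net, purelin_f)
legend('log-sigmoid', 'tan-sigmoid', 'linear'), xlabel('net')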

    3.3 Neural Network architecture

Network architectures can be categorised into two main types according to their connectivity: feed-forward networks and recurrent networks (feedback networks). A network is feed-

    forward if all of the hidden and output neurons receive inputs from the preceding layer only.

    The input is presented to the input layer and it is propagated forwards through the network.

    Output never forms a part of its own input, see Figure 3.5.

    Figure 3.5 Multi-layer Feed-forward Network structure



In a multi-layer feed-forward network each layer has a weight matrix W, a bias vector b, and an output vector Y. Thus the network in Figure 3.5 has an input vector

x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} , (3.6)

The weights of the network are the weight matrices

W^1 = \begin{bmatrix} w^1_{1,1} & w^1_{1,2} \\ w^1_{2,1} & w^1_{2,2} \\ w^1_{3,1} & w^1_{3,2} \end{bmatrix} ; \qquad W^2 = \begin{bmatrix} w^2_{1,1} & w^2_{1,2} & w^2_{1,3} \\ w^2_{2,1} & w^2_{2,2} & w^2_{2,3} \\ w^2_{3,1} & w^2_{3,2} & w^2_{3,3} \end{bmatrix} , (3.7)

The biases are the bias vectors

b^1 = \begin{bmatrix} b^1_1 \\ b^1_2 \\ b^1_3 \end{bmatrix} ; \qquad b^2 = \begin{bmatrix} b^2_1 \\ b^2_2 \\ b^2_3 \end{bmatrix} , (3.8)

The output of the network can now be written as

y^1_1 = f_1(W^1_1 x + b^1_1) , \qquad y^2_1 = f_2(W^2_1 y^1 + b^2_1) , (3.9)

y^1_2 = f_1(W^1_2 x + b^1_2) , \qquad y^2_2 = f_2(W^2_2 y^1 + b^2_2) , (3.10)

y^1_3 = f_1(W^1_3 x + b^1_3) , \qquad y^2_3 = f_2(W^2_3 y^1 + b^2_3) , (3.11)

where W^1_i and W^2_i denote the i-th rows of W^1 and W^2, and y^1 = [y^1_1, y^1_2, y^1_3]^T is the hidden-layer output vector.
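A minimal MATLAB sketch of this forward pass, equations (3.6)-(3.11), is given below; the weight values, layer sizes and activation choices are illustrative assumptions only.

% Minimal sketch: forward pass of a two-layer feed-forward network,
% equations (3.6)-(3.11). Weights, biases and activations are illustrative.
x  = [0.5; -1.0];                      % input vector, eq (3.6)

W1 = randn(3, 2);  b1 = randn(3, 1);   % hidden layer: 3 neurons, 2 inputs
W2 = randn(3, 3);  b2 = randn(3, 1);   % output layer: 3 outputs

f1 = @(net) tanh(net);                 % hidden-layer activation (tan-sigmoid)
f2 = @(net) net;                       % output-layer activation (linear)

y1 = f1(W1*x  + b1);                   % hidden-layer outputs, eqs (3.9)-(3.11)
y2 = f2(W2*y1 + b2);                   % network outputs, eqs (3.9)-(3.11)
disp(y2)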

The premise behind the addition of extra layers or nodes is that they enable the network to deal with more complex problems and extract higher-order statistics. Cybenko proved that a feed-

    forward network with a sufficient number of hidden neurons with continuous and

    differentiable transfer functions could approximate any continuous function over a closed

    interval [11]. There is not a limit to the number of hidden layers. These hidden layers

    increase the non-linear complexity of a network, however, Hornik and indeed other

    researchers have shown that even a two-layer network with a suitable number of nodes in

    the hidden layer can approximate any continuous function over a compact subset [12]. Thus,

    generally just one or two hidden layers are used. Subsequently it is implied that a feed-

    forward neural network with just one hidden layer is suitable for the purpose of

    identification.


Recurrent networks have at least one feedback loop, i.e. a cyclic connection, which means that at least one of the neurons feeds its signal back to the inputs of all the other neurons. The behaviour of such networks may be extremely complex [13].

    Figure 3.6 Multi-layer Recurrent Network structure

The effect of the feedback loop is that outputs influence subsequent outputs, thus giving recurrent networks memory. This is especially meaningful if the network is approximating functions dependent on time. Considering the dynamics of the inverted pendulum, this is particularly applicable, i.e. there are several feedback loops in the developed model of the pendulum. There are two main types of recurrent networks widely used,

    Elman and Hopfield. The feedback loop in Elman networks enables them to learn to

    recognise and generate spatial as well as temporal patterns [14]. An Elman network can

    approximate any function (with a finite number of discontinuities) with arbitrary

    accuracy, if the hidden layer has a sufficient number of neurons [15]. Hopfield is generally



used for classification of feature vectors [16]. Subsequently Elman networks will be chosen as the recurrent network architecture for identification purposes.

    3.4 Learning Algorithms

In neural networks learning ability is achieved by presenting a training set of different examples to the network and using a learning algorithm to change the weights (or the parameters of activation functions) in such a way that the network will reproduce the correct output for the corresponding input values. There are three main classes of learning: reinforced, supervised, and unsupervised. The latter two will be considered here.

In the supervised learning procedure a set of pairs of input-output patterns is presented to the network. The network propagates the pattern inputs to produce its own output pattern and then compares this with the desired output. The difference is the error; if the error is absent, learning is stopped, and if present it is back-propagated and the weights and biases are changed. A supervised learning scheme is illustrated in Figure 3.7.

    Figure 3.7 Supervised Learning

In unsupervised learning there are no external learning signals to adjust the network's weights. The approach adopted is for the network to internally monitor its own performance. It proceeds by seeking trends in the input signals and making adaptations according to the network function. At present unsupervised learning is not fully understood and is still the subject of much research.

    An unsupervised learning scheme is illustrated in Figure 3.8.



    Figure 3.8 Unsupervised Learning

    3.5 Learning rules

Hebb's Rule was the first rule developed. The rule declares that when a neuron receives an input from another neuron, and if both are highly active, then the weight between the neurons should be strengthened. Kohonen's learning rule is a procedure whereby competing processing elements contend for the opportunity to learn. The only permitted output is from the winning element; furthermore, this element plus its adjacent neighbours are permitted to adjust their connection weights. It should also be noted that the size of the neighbourhood may adjust during this training period. The back-propagation learning algorithm is perhaps the most popular learning algorithm. The network simply propagates the pattern inputs through to the outputs to produce its own output pattern and compares this with the desired output, the difference being the error. If no error is present learning stops; however if error is present, it is back-propagated to change the weights and biases, and this recurs until no error is present.
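A minimal MATLAB sketch of Hebb's rule as described above is given for illustration; the patterns, initial weights and learning rate are arbitrary assumptions.

% Minimal sketch: Hebb's rule, strengthening weights between co-active units.
% Input patterns, initial weights and learning rate are illustrative.
X   = [1 0 1; 1 1 0; 0 1 1];       % three input patterns, one per column
w   = 0.5*ones(3, 1);              % weights of a single linear neuron
eta = 0.1;                         % learning rate

for k = 1:size(X, 2)
    y = w' * X(:, k);              % neuron activation for this pattern
    w = w + eta * y * X(:, k);     % strengthen weights of co-active input/output pairs
end
disp(w)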

    3.6 Neural Network Limitations

    Neural Networks do have a number of limiting factors including training times, opacity and

local minima. Neural networks may require exhaustive training times, especially with large-dimensional problems, due to the increased number of synaptic weights required to be adjusted. However this problem is gradually diminishing with ever-increasing computer processing capabilities. Opacity is associated with Neural Networks that operate as black boxes.


In a Neural Network that operates as a black box, only the inputs and outputs are visible, thus it is difficult to relate the parameters of the system under consideration to the internals of the network. Subsequently it is difficult to obtain an intuitive feel for its operation, as its performance can only be measured statistically. A major concern associated with training is the possibility of becoming trapped in a local minimum, see Figure 3.9 below.

    Figure 3.9 Local & global minimum

    The global minimum represents the lowest point on the graph, which is the optimal solution

    for the problem. The majority of training algorithms operate by travelling down these slopes

    until they find the lowest point. The training algorithm may become trapped in a non-

    optimal solution i.e. local minimum. There are a number of possibilities to overcome this

issue, such as the use of momentum terms or Boltzmann annealing. However there is a trade-

    off with increased training time.

    3.7 Applications

Neural networks have the capability of adaptively controlling and modelling a wide range of non-linear processes at a high level of performance, and in particular of obtaining models which describe system behaviour, i.e. system identification. Modelling is extremely important as it provides a tool for simulating the effects of different control strategies and techniques. Using the back-propagation algorithm a feed-forward network can be trained to approximate arbitrary

    non-linear input-output mappings from a collection of example input and output patterns.

    This learning technique has been applied in a wide variety of pattern classification and

    modelling tasks. Neural networks have found several fields of application including

    medicine, finance, management and other signal processing applications.


    3.8 Summary

    Undoubtedly Neural Networks provide an extremely powerful information-processing tool.

They are ideal for control systems because of their non-linear approximation capabilities, adaptive control and computational efficiency due to their parallel architecture. Their ability

    to learn by example makes them both flexible and powerful. They also remove the need to

    explicitly know the internal structure of a specific task. Considering the specific control

    problem of the inverted pendulum, it is apparent that Neural Networks have certain features

    which make them extremely suitable for identification and control of such a challenging

    control problem.


    Chapter 4

    System Identification

    The modelling and identification of linear and non-linear dynamic systems through the use

    of measured experimental data is a problem of considerable importance in engineering [17],

    and has duly received much attention. System identification is essentially a process of

sampling the input and output signals of a system, subsequently using the respective data to generate a mathematical model of the system to be controlled, i.e. it is a procedure whereby a model is developed. Figure 4.1 depicts a system's inputs, outputs and disturbances. The motivation behind

    system identification is to obtain a model that enables the design and implementation of a

    high level performance control system, while providing an insight into system behaviour,

    prediction, state estimation, simulation etc. [18]

    Figure 4.1 Input, Output, Disturbances of a System

    Identification of multivariable systems is an extremely difficult problem due to the coupling

    between various inputs and outputs, further complicated when systems are non-linear [19].

Neural networks are becoming increasingly recognised for this purpose due to their attributes of parallelism, adaptability, robustness and their inherent capability to handle non-linear systems. Intuitively the inverted pendulum is extremely unstable; further to this, from the modelling of the pendulum in Chapter 2 the system is open-loop unstable. However stability is a necessary criterion for system identification. Duly the system was placed in a closed loop and stabilised using PID control. As the system is in a closed loop, it is desirable that little of the dynamics of the controller be seen at the system output. To achieve this the controller was de-tuned, i.e. the control was left loose; this ensures that the input/output data generated emphasise the dynamics of the pendulum and thus are suitable data for system identification.



    Before neural networks are directly used for system identification, conventional linear

    techniques such as auto regressive with exogenous input (ARX) and recursive auto

    regressive moving average with exogenous input (RARMAX) will be investigated and

    applied to the linear pendulum.

    4.1 System Identification Procedure

    System identification is essentially the process of adjusting the parameters of the model

    until the model output resembles the output of the real system. The procedure for system

    identification can be viewed graphically in Figure 4.2. The procedure can be categorised

    into three main stages [20]:

    Experimental input/output data from the process that is being modelled is required. With

    respect to the inverted pendulum system this would consist of the input force on the cart

    and the pendulum angle.

    The second stage is to choose which model structure to use.

    Subsequently the parameters of the model will be adjusted until the model output

    resembles the system output.

    Figure 4.2 System Identification Procedure

(Flowchart blocks: Experimental Design; Data; Choose Model Set; Choose Criterion of Fit; Calculate Model; Validate Model; Prior Knowledge; Not OK: Revise; OK: use it.)


    4.2 Conventional linear system identification

The ARX model structure is a simple linear difference equation which relates the current output y(t) to a finite number of past outputs y(t-k) and inputs u(t-k):

y(t) + a_1 y(t-1) + \ldots + a_{na} y(t-na) = b_1 u(t-nk) + \ldots + b_{nb} u(t-nk-nb+1) , (4.1)

or in more compact form

y(t) = \frac{B(q)}{A(q)} u(t-nk) + \frac{1}{A(q)} e(t) , (4.2)

Thus the ARX structure is defined by the three integers na, nb, and nk. na is the number of poles and nb-1 is the number of zeros, while nk is the pure time-delay in the system. For a system under sampled-data control, generally nk is equal to 1. The main method used to estimate the a and b coefficients in the ARX model structure is the Least Squares method. It proceeds by minimising the sum of squares of the right-hand side minus the left-hand side of the expression above, with respect to a and b [21].
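To make the least-squares estimation concrete, a minimal MATLAB sketch is shown below for a small second-order ARX structure; the synthetic data, orders and variable names are illustrative, not the signals used later with the toolbox arx function.

% Minimal sketch: least-squares estimation of a small ARX model,
% y(t) + a1*y(t-1) + a2*y(t-2) = b1*u(t-1) + e(t).
% The data below is synthetic, purely for illustration.
N = 1000;
u = randn(N, 1);                                        % excitation input
y = filter([0 0.5], [1 -1.2 0.5], u) + 0.01*randn(N, 1); % "true" system plus noise

Phi   = [-y(2:N-1), -y(1:N-2), u(2:N-1)];  % regressors [-y(t-1) -y(t-2) u(t-1)]
Y     =  y(3:N);                           % vector of outputs y(t)
theta = Phi \ Y;                           % least squares: [a1; a2; b1]
disp(theta')                               % should be close to [-1.2 0.5 0.5]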

RARMAX is an extension of the ARMAX structure, which in turn is an extension of the ARX model structure. The RARMAX model recursively estimates the coefficients of the ARMAX model structure. However, the ARMAX structure also includes an extra C polynomial in the noise spectrum model. Consequently RARMAX provides greater accuracy.

y(t) = \frac{B(q)}{A(q)} u(t-nk) + \frac{C(q)}{A(q)} e(t) , (4.3)

The data from the model of the linear pendulum is exported to the Matlab workspace and

    subsequently split into estimation and validation data. Changing the initial seed of the

    excitation signal in the simulation creates the validation data. The ARX model is

    implemented using Matlab functions as follows:

    input_estim = force_1; %input data for estimation

    output_estim= theta_1; %output data for estimation

    input_val= force_2; %input data for validation

    output_val= theta_2; %output data for validation

    orders = [4 5 1]; % defines model structure


    arx_model = arx([output_estim input_estim],orders); %arx function

    compare([output_val input_val],arx_model); %compare function

The parameters of the ARX model are chosen heuristically and subjectively. The compare

    function is used to compare the model output with the validation data. Initially the open

    loop unstable model of the inverted pendulum is modelled. A criterion for system

    identification is that the system is stable. As the open loop system is inherently unstable the

    ARX model completely fails to identify the pendulum as expected, see Figure 4.3.

    Figure 4.3 ARX model output with measured output

    Subsequently input/output data is generated from the controlled closed loop model of the

    inverted pendulum. It is anticipated that the ARX model should identify the stable model.

Further to this, as the complexity of the models increases the accuracy should also increase; from the results obtained this proved to be the case. Table 4.1 shows the different

    parameters tested and the resulting performance from these models.


ARX [na nb nk]    ARX performance
[1 1 1]           21.94%
[2 2 1]           93.28%
[3 2 1]           93.28%
[3 3 1]           99.90%
[4 2 1]           93%
[4 3 1]           100%

Table 4.1 ARX model performance

    Figure 4.4 shows the best model performance tracking both actual output from the system

    and the model output. From the results the ARX model identifies the pendulum dynamics

    with extremely good accuracy.

    Figure 4.4 ARX [4 3 1] model output and measured output

Having identified the linear system using the arx model, the rarmax model is subsequently tested. The results obtained are marginally better than those using the arx method, see Table 4.2 and Figure 4.5. This is expected as this method includes an additional C parameter in the noise spectrum model. The Matlab script for this is implemented as follows

orders = [3 3 3 1]; % model structure [na nb nc nk]
[rarmax_model, yhat] = rarmax([output_estim input_estim], orders, 'ff', 0.99); % recursive ARMAX; 0.99 is an assumed forgetting factor (value not shown in the original)


RARMAX [na nb nc nk]    RARMAX performance
[2 1 1 1]               98.98%
[2 2 2 1]               99.01%
[3 2 2 1]               99.34%
[3 3 2 1]               99.58%
[3 3 3 1]               100%

Table 4.2 RARMAX model performance

    Figure 4.5 RARMAX [3 3 3 1] model output and process output

    The results using linear identification techniques (arx, rarmax) verify that the system must

    be stabilised before identification can be performed. Also these conventional linear

    identification techniques performed extremely well in modelling the dynamics of the linear

    pendulum. However their representation ability of non-linear systems is restricted. For

    completeness the arx and rarmax are tested to verify this. The input-output data for the non-

    linear pendulum is generated in a similar manner as previously for the linear model. Again

    parameters for both models are chosen heuristically and subjectively. Figure 4.6 shows the

    best model identified. As expected these conventional identification techniques cannot

    identify the full dynamics of the non-linear pendulum.


    Figure 4.6 RARMAX [4 3 3 1] model output and validation data

    4.3 Non-linear System Identification using NARMAX

    Linear identification techniques are well established. However their representation ability of

    non-linear processes is clearly limited. Subsequently, non-linear black-box model structures

    have been developed, however they are still the subject of much debate. One such technique

    is non-linear auto regressive moving average with exogenous inputs or narmax. The narmax

    model can be described by

\hat{y}(t+1) = h\big(y(t), \ldots, y(t-n_y+1),\, u(t), \ldots, u(t-n_u+1)\big) + e(t) , (4.4)

The main problem with narmax is how to construct a model that is easily estimated and used to reconstruct a system's dynamics in practical terms [22]. The main disadvantage in the

    narmax estimation procedure is the need to select the most useful terms to be included in the

    model, which are chosen from a large number of available model terms usually running into

    thousands. This presents the most challenging procedure in the estimation of narmax

    structures since it is dependent on factors like the sampling frequency and prior knowledge

    about the system orders [23]. As such non-linear identification techniques such as narmax

    do not offer a suitable solution for system identification in practical terms.


    4.4 System Identification using Neural Networks

    In this section neural networks are implemented for the identification of both the linear and

non-linear inverted pendulum models. In Chapter 3, different neural network structures were examined; consequently two structures were identified as suitable for the system identification of the inverted pendulum: feed-forward and recurrent (Elman) networks. A common structure for achieving system identification using neural networks is forward modelling, see Figure 4.7. This form of learning structure is a classic example of supervised learning. The neural network model is placed in parallel with the system, both receiving the same input; the error between the system and network outputs is calculated and subsequently used as the network training signal.

    Figure 4.7 Forward modelling of inverted pendulum using neural networks

    In order to provide targets for the network, the previously developed simulink models of the

    inverted pendulum with feedback control are used. The control force is used as the input to

    the neural network while the target for the network is the angle of the pendulum theta

    (radians). For completeness the linear model of the inverted pendulum shall be identified

first. The first type of neural network tested is the feed-forward network. Using Matlab it is possible to develop multi-layer perceptrons. However this shall be restricted to either one or two hidden layers, as previous research has shown this to be sufficient for the identification of non-linear systems. It is also expected that increasing the number of neurons in the hidden layer will improve the model's accuracy.



    A feed-forward back propagation network is created using Matlab script as follows

net = newff([-10 10], [10 1], {'tansig' 'purelin'}, 'trainlm');  % 10 tansig hidden neurons, 1 purelin output
net.trainParam.epochs = 400;
net.trainParam.lr = 0.0001;
net = train(net, in(1:2000)', theta(1:2000)');  % input force and target angle aligned over the same samples

    In the example above a two-layer feed-forward network is created. The network's input

    ranges from [-10 to 10]. The first layer has ten tansig neurons, the second layer has one

purelin neuron. This is the standard set-up of activation functions for multi-layer perceptrons: the hidden layer has non-linear functions, whereas the output layer always has a linear activation function. The trainlm network training function is used. Back-propagation updates the weights. The number of epochs and learning rate can be set and adjusted. By examining the training diagram it can be determined when the network is sufficiently trained, and whether the convergence is too fast; this can sometimes account for getting stuck in a local minimum, see Figure 4.8.

    Figure 4.8 Neural network training

    When the network is sufficiently trained, the network is exported to the simulink

    environment using the gensim command. To ensure the network is adequately validated

the initial seed of the input signal must be changed. The mean squared error and a comparison of the system's and network's outputs are used to assess the quality of each model. The mean squared error alone is not sufficient to determine the quality of the model as there


    could be a low mean squared error and yet a poor prediction of the dynamics of the system.

    The simulink set-up for model validation may be viewed in Figure 4.9.

    Figure 4.9 Model validation set-up in simulink
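The validation comparison of Figure 4.9 can be reproduced with a few lines of MATLAB once the process output and the network output have been logged to the workspace; the signals below are synthetic stand-ins with illustrative variable names.

% Minimal sketch: validating an identified model against logged process data.
% theta_sys and theta_nn are synthetic placeholders for the logged
% validation-run outputs of the process and of the network model.
t = (0:0.01:10)';
theta_sys = 0.05*sin(2*t);                    % placeholder process output
theta_nn  = theta_sys + 0.002*randn(size(t)); % placeholder network model output

mse_val = mean((theta_sys - theta_nn).^2);    % mean squared error criterion
fprintf('validation MSE = %g\n', mse_val);

plot(t, theta_sys, t, theta_nn)               % visual check of the dynamics
legend('process output', 'network model output'), xlabel('time (s)')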

    The following table 4.3 summarises the results obtained.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
feed-forward      1               4                         400               0.000061404
feed-forward      1               10                        400               0.0029
feed-forward      1               20                        400               0.00007839
feed-forward      1               35                        400               0.001
feed-forward      1               50                        400               0.0013
feed-forward      2               [4 2]                     400               0.000059481
feed-forward      2               [10 4]                    400               0.000060253
feed-forward      2               [20 10]                   400               0.00040555
feed-forward      2               [30 20]                   400               0.0002063

Table 4.3 Summary of feed-forward neural network performance

A nominal number of neurons in the hidden layer was sufficient to identify the model extremely well; above this threshold the performance decreased. Tim Callinan's work on identification of the inverted pendulum using feed-forward neural networks attributed this to the fact that the system is in a closed loop; as such, de-tuning the controller has a greater effect on the model's performance than an increase in the number of neurons [24]. The lowest mean squared error and overall best performance were achieved using two hidden layers with four and two neurons respectively. This is expected since, as previously discussed, an increase in hidden layers should improve the model's performance. Figure 4.10 and 4.11 show the


optimum performance obtained using one- and two-hidden-layer feed-forward networks respectively, plotting the process output against the model output. Overall the feed-forward networks model the process well, predicting the pendulum angle with a low MSE.

    Figure 4.10 feed-forward network, 1 hidden layer, 4 hidden neurons

    Figure 4.11 feed-forward network, 2 hidden layers, 4 and 2 hidden neurons respectively


The next type of ANN tested is the Elman network. As discussed in Chapter 3, Elman networks are expected to perform well, due to their feedback loop providing dynamic memory, subsequently making them suitable for the prediction of dynamic systems. The

    Elman network is implemented using Matlab functions as follows:

    net = newelm([-10 10],[10 5 1], {'tansig' 'tansig' 'purelin'},'trainlm');

    net.trainParam.epochs = 400;

    net.trainParam.lr = 0.0001;

    net = train(net,in(1:2000)',theta(1:2000)');

    In the example above, an Elman network with two hidden layers is created. The network's

    input ranges from [-10 to 10]. The first layer has ten tansig neurons, the second layer has

    five tansig neurons and finally the output layer has a single linear purelin neuron.

After initial testing it is found that the Elman models fail to predict the angle of the pendulum and the model predictions are completely out of range. Consequently several pre-processing techniques for neural network training signals were implemented. It is found that scaling the input/output data for training has a small filtering effect, subsequently improving the network's performance during training. A scaling factor of ten is used, and all validation data is also scaled accordingly. Filtering the data in this way decreases the possibility of the network getting caught in a local minimum. The improvement in results is substantial and the models successfully predict the pendulum angle. Table 4.4 summarises the results obtained.
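As a minimal sketch of this scaling step, assuming the same excitation signal in and pendulum angle theta used for training above:

    scale   = 10;                        % scaling factor chosen heuristically
    in_s    = in/scale;                  % scaled training input
    theta_s = theta/scale;               % scaled training target
    net = train(net,in_s(1:2000)',theta_s(1:2000)');
    % validation data is scaled by the same factor, and the network output is
    % rescaled before being compared with the unscaled process output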

NN Architecture     Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
Elman (Recurrent)   1               4                         400               0.0334
Elman (Recurrent)   1               10                        400               0.0033
Elman (Recurrent)   1               20                        400               0.0014
Elman (Recurrent)   1               25                        400               0.0011
Elman (Recurrent)   2               [4 2]                     400               0.0023
Elman (Recurrent)   2               [10 4]                    400               0.00359
Elman (Recurrent)   2               [15 10]                   400               0.003

Table 4.4 Elman network performance

Despite some Elman models having a low MSE, Figure 4.12 shows that they still failed to adequately identify the pendulum angle. It was found, however, that an increase in the number of neurons and hidden layers significantly improved their performance, see Figure 4.13.


    Figure 4.12 Elman network, 2 hidden layers, 4 and 2 neurons respectively

    Figure 4.13 Elman network, 2 hidden layers, 15 and 10 neurons respectively

    Overall both networks performed well. The feed-forward network slightly outperformed the

    recurrent network with a lower mean squared error. However both models identified the

    pendulum dynamics well. Having successfully identified the linear pendulum, the next step

    is the identification of the non-linear model. Identification of the non-linear model proceeds

in the same manner as that for the linear model. Input-output data is generated, the network is trained and imported into Simulink, and validation is performed.


The first network tested is the feed-forward network. Table 4.5 summarises the models tested and their performance. The mean squared error is low, and an increase in neurons in the hidden layer does improve performance as expected. This was not the case when identifying the linear model, as there the controlled closed loop had a greater impact on the pendulum dynamics seen at the output. Figure 4.14 and Figure 4.15 show the best models identified using one and two hidden layer feed-forward neural networks respectively, plotting the process output against the model output. The graphs show that the models identify the pendulum dynamics extremely well.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
feed-forward      1               4                         400               0.000051028
feed-forward      1               10                        400               0.000050314
feed-forward      1               20                        400               0.000048409
feed-forward      1               35                        400               0.00004851
feed-forward      1               50                        400               0.000046509
feed-forward      2               [4 2]                     400               0.000049856
feed-forward      2               [10 4]                    400               0.00005027
feed-forward      2               [20 10]                   400               0.000048053
feed-forward      2               [30 20]                   400               0.000046268

Table 4.5 Summary of feed-forward network performance

    Figure 4.14 Feed-forward network, 1 hidden layer with 50 neurons


    Figure 4.15 Feed-forward network, 2 hidden layers, 30 and 20 neurons respectively

The next step is to identify the non-linear pendulum model using the Elman recurrent network. It is found that, as with identification of the linear pendulum, a scaling factor is required for the training data. This value is chosen heuristically; the ideal scaling factor is determined to be 1000. This improves results dramatically, however the network still performs poorly. Figure 4.16 shows the difference made by the scaling factor. The models do not identify the dynamics with the same degree of accuracy as the feed-forward models.

Figure 4.16 Difference between networks trained with and without scaled data


Clearly the scaling of the training data vastly improves the model performance; yet on examination of the best model identified, the Elman network's performance is still inferior to that of the feed-forward networks. This is unexpected, as the feedback loop providing dynamic memory in the Elman network should enhance its prediction of non-linear systems. However, in cases where the required depth of memory is much larger than the size of the tapped delay line, a recurrent network may operate poorly; essentially the information needed to predict the future is not concentrated in the current sample neighbourhood [25]. Thus the network is unable to fully identify the dynamics of the pendulum. Table 4.6 summarises the models tested and their performance, and Figure 4.17 shows the best model identified.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
Elman             1               4                         400               0.1243
Elman             1               10                        400               0.0535
Elman             1               20                        400               0.3049
Elman             1               25                        400               0.3868
Elman             2               [4 2]                     400               0.6928
Elman             2               [10 4]                    400               0.1017
Elman             2               [15 10]                   400               0.05

Table 4.6 Elman network performance

Figure 4.17 Elman network, 2 hidden layers, 15 and 10 neurons respectively

The overall results show that the feed-forward network outperforms the recurrent Elman network in identifying the non-linear model's dynamics. Subsequently the report proceeds focused primarily on feed-forward networks.


4.5 Javier's Linearised Model

Javier's linear model of the inverted pendulum is of considerable interest, as it is upon this model that controllers for the physical pendulum were developed; it therefore gives an intuitive feel for how accurate the models developed of the linear and non-linear inverted pendulum are. The following is the transfer function of the system [26].

\[
G(s) = G_1 G_2 = \frac{\frac{1}{M}}{s^{2} + \frac{F}{M}s}\cdot\frac{\frac{1}{ML}\left(s^{2} + \frac{F}{M}s\right)}{s^{2} - \frac{g}{L}}
\]

    using the following parameters

F = 0.303 kg/s
M = 0.091 kg
L = 0.32898 m
g = 9.8 m/s²

    Thus giving the following linearised model

\[
G(s) = G_1 G_2 = \frac{k_1}{s\,(s+a)}\cdot\frac{k_2\,s\,(s+a)}{(s+b)(s-b)}
\]

where

a = 3.33,  b = 5.46,  k1 = 11,  k2 = 30
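For illustration, the factored model above can be entered in Matlab with the Control System Toolbox; a minimal sketch using the constants just quoted:

    a = 3.33; b = 5.46; k1 = 11; k2 = 30;
    G1 = tf(k1,[1 a 0]);                       % k1 / (s(s+a))
    G2 = tf(k2*[1 a 0],conv([1 b],[1 -b]));    % k2*s*(s+a) / ((s+b)(s-b))
    G  = G1*G2;                                % linearised pendulum model G(s)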

    This linearised system of the inverted pendulum is modelled using Matlab simulink. As this

    model proved extremely robust in the design of controllers for the physical system, it

    provides a good insight into the behaviour of the physical rig. Figure 4.18 shows the

    simulink set-up. The system is stabilised using a PID controller.


Figure 4.18 Javier's Linearised Model

The input to the system is a dither signal in the form of a Pseudo-Random Binary Sequence (PRBS). The signal has a spectral content rich in frequencies [27]. The objective of this excitation signal is to generate input-output data which contains the process dynamics over the entire operating range; a sketch of generating such a dither signal is given below. Figure 4.19 shows the pendulum angle theta.
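A minimal sketch of generating such a dither signal; the amplitude, bit time and experiment length are assumed values, not those used in this project:

    Ts   = 0.01;                                     % sample time [s]
    Tbit = 0.2;                                      % hold time of each random bit [s]
    T    = 100;                                      % length of the experiment [s]
    A    = 0.05;                                     % dither amplitude
    bits = A*(2*(rand(ceil(T/Tbit),1) > 0.5) - 1);   % random +/-A levels
    t    = (0:Ts:T-Ts)';
    u    = bits(floor(t/Tbit) + 1);                  % hold each level for Tbit seconds
    dither = [t u];                                  % time/value pairs for a From Workspace block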

Figure 4.19 A comparison of Javier's Model & Non-linear Model

Comparing the two models, although they differ slightly, they display the same underlying dynamics. Thus the next step is to identify Javier's model using neural networks. The same procedure


that is used to identify the non-linear pendulum is adopted, and the generation of input-output data for training and validation is carried out using the same method. For completeness, recurrent networks in the form of Elman networks are also tested; a similar pattern emerges as in previous testing: the training data had to be scaled, and the feed-forward networks out-performed them. Table 4.7 shows the different feed-forward networks tested and their respective performance.

NN Architecture   Hidden Layers   Neurons in Hidden Layer   Training Epochs   MSE
Feed-forward      1               4                         400               1.296E-08
Feed-forward      1               10                        400               2.3718E-08
Feed-forward      1               20                        400               1.2917E-08
Feed-forward      1               35                        400               1.2901E-08
Feed-forward      1               50                        400               1.2892E-08
Feed-forward      2               [4 2]                     400               1.2931E-08
Feed-forward      2               [10 4]                    400               1.2942E-08
Feed-forward      2               [20 10]                   400               1.2917E-08
Feed-forward      2               [30 20]                   400               1.2916E-08

Table 4.7 Summary of feed-forward network performance

Table 4.7 clearly shows that the mean squared error is extremely small. An increase in the number of hidden layers and the number of neurons in the hidden layer does improve performance slightly. However, even a single hidden layer with four neurons gives an extremely small MSE. Figure 4.20 shows the best model identified, plotting the process output against the model output. The model identifies the system dynamics extremely well.

    Figure 4.20 Feed-forward NN, 2 hidden layers, 30 and 20 neurons respectively


    4.6 Summary

In this chapter an overall feel for system identification has been presented and its associated procedure explained. Conventional identification techniques were first applied to the linear model of the inverted pendulum. These traditional techniques modelled the linear pendulum with good accuracy; however, they cannot capture the complexity of the non-linear pendulum. Extensions have been made to these linear identification techniques in the form of non-linear ARMAX (NARMAX). The main problem with NARMAX is how, in practical terms, to construct a model that is easily estimated and that captures a system's dynamics. As such, system identification proceeded using neural networks. Two main network structures were used: feed-forward and recurrent. Initially the linear pendulum model was identified, and both structures performed well. However it was found necessary to pre-process the data for the recurrent network by using a scaling factor, which has a filtering effect on the data; this helps prevent the network getting stuck in a local minimum during training. The next step was identification of the non-linear pendulum. Again both network architectures were tested, and results similar to the linear identification were obtained, with pre-processing of the training data required for the recurrent Elman network. Overall the feed-forward networks out-performed the recurrent networks. This is unexpected, as recurrent networks, with their dynamic memory, should identify a dynamic system with good accuracy; it can be attributed to the fact that in some cases recurrent networks perform poorly when the length of the fixed delay line is smaller than the depth of memory required to predict the next sample. Consequently the remainder of the report uses feed-forward networks exclusively. Finally, Javier's linearised model of the inverted pendulum was examined. This model is important because it is upon this model that controllers for the physical pendulum were developed; as such it gives an intuitive feel for the accuracy of the models identified. It is seen that overall the models contain largely the same dynamics. For completeness, Javier's model is also identified using both feed-forward and recurrent networks. In the next chapter, identification of the physical pendulum is examined and implemented.


    Chapter 5

    Real Time Identification

The pendulum rig comprises a pole mounted on a cart, free to swing only in a vertical plane. The cart is driven by a DC motor and moves on a rail of limited length. Two optical encoders are used to measure the pendulum angle and the cart position. The two output signals are received by a control algorithm via the interface card, which subsequently determines the control action necessary to keep the pendulum upright. The control signal is limited to a normalised range from -1 to 1. Figure 5.1 shows the pendulum control system.

    Figure 5.1 Set-up of pendulum rig

The control algorithm is implemented in Matlab. Figure 5.2 shows the real-time kernel (RTK) in the Matlab environment. The RTK is an encapsulated block implementing the control tasks. The input to the RTK block can be in the form of an excitation signal or a desired cart position. The outputs from the RTK contain all data regarding pendulum angle, angular velocity, cart position, cart velocity and the control value. There is, however, no visible feedback control loop because the controller is embedded in the RTK.



    Figure 5.2 Real Time Task in simulink environment

    5.1 Closed-loop controller

    The experiment starts with the pendulum in a downward position. The pendulum is steered

to its upright unstable position and subsequently kept erect by the Linear-Quadratic (LQ) controller. As such, two independent control algorithms are required:

    Swinging algorithm

    Stabilising algorithm

    Only one control algorithm is active in each control zone. Figure 5.3 shows these zones.

    Figure 5.3 Zones of Control Algorithms



The swinging control algorithm is a heuristic one, based on energy rules, and has the form

\[
u = u_{old} + \mathrm{sign}(u_{old})\cdot Friction , \qquad (5.1)
\]

where the control u is a normalised value from -1 to 1.

The linear-quadratic control law used to keep the inverted pendulum stabilised has the form

\[
u = -\left(K_1\varepsilon_1 + K_2\varepsilon_2 + K_3\varepsilon_3 + K_4\varepsilon_4\right), \qquad (5.2)
\]

where

ε1 = desired position of the cart - measured position of the cart,
ε2 = desired angle of the pendulum - measured angle of the pendulum,
ε3 = desired velocity of the cart - observed velocity of the cart,
ε4 = desired angular velocity of the pendulum - observed angular velocity of the pendulum,

and K1...K4 are positive constants. The optimal feedback gain vector K = [K1...K4] is calculated such that the feedback law

\[
u = -K\varepsilon , \qquad (5.3)
\]

where ε = [ε1...ε4], minimises the cost function

\[
J = \int \left(x'Qx + u'Ru\right)dt , \qquad (5.4)
\]

where Q and R are weighting matrices [28].
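For illustration, a gain vector of this kind is typically computed in Matlab with the Control System Toolbox lqr function. The state-space matrices A and B and the weights Q and R below are placeholders only, not the values used on the rig:

    % placeholder linearised state-space model, state = [cart pos; angle; cart vel; ang vel]
    A = [0 0 1 0; 0 0 0 1; 0 -0.7 -3.3 0; 0 29.8 3.3 0];   % assumed values
    B = [0; 0; 11; -11];                                    % assumed values
    Q = diag([10 100 1 1]);     % weight position and angle errors most heavily
    R = 1;                      % weight on control effort
    K = lqr(A,B,Q,R);           % optimal feedback gains for u = -K*eps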

    5.2 Identification of physical system

At this stage there is a closed-loop stabilised system; this is necessary for identification of an open-loop unstable system. The non-linear model identified using neural networks is first compared with the physical rig. The input to both the physical system and the model is a small dither signal, used to generate input-output data which contains the non-linear process dynamics over the entire operating range. Both systems are in a controlled closed loop. Comparing the pendulum angle of each, the dynamics are similar, so the modelling of the system in Simulink has been accurate, see Figure 5.4.


    Figure 5.4 NN non-linear model output & physical system output

Having confirmed that the two systems' dynamics are similar, the next step is the identification of the real rig. The data for this is generated in a similar manner; the experiment starts with the pendulum in a downward position. Figure 5.5 shows the pendulum angle during the test.

    Figure 5.5 Pendulum angle of Real System


Having generated input-output data for the physical system, the next stage is to develop a neural network which identifies the pendulum angle of the physical rig. A feed-forward network is used. Training of the network is performed and the trained network is subsequently imported into the Simulink environment. Validation of the network is performed on-line, see Figure 5.6. A single hidden layer is adopted, and different numbers of neurons in the hidden layer are tested.

    Figure 5.6 Validation set-up

Several different models were tested but were unable to identify the pendulum angle, see Figure 5.7. The problem is that two different controllers are used to swing up and stabilise the pendulum; depending on which zone the pendulum is in, the control output is calculated in a different manner, which affects the data used for training of the neural network and the subsequent identification.

    Figure 5.7 feed-forward NN, 1 hidden layer with 75 neurons


Given that a neural network can approximate data on which it has not been trained, it is decided to train the neural network using only the data from the stabilised zone and see if it can approximate the swing-up action. Thus the neural network is re-trained using only the input-output data recorded while the pendulum is in the stabilised zone under linear-quadratic control, i.e. it is not trained on the swing-up action produced by the swinging algorithm.
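A minimal sketch of this data selection; the zone threshold and variable names are assumed:

    stab  = abs(angle) < 0.2;            % samples inside the stabilisation zone, threshold in rad
    in_st = in(stab);                    % excitation/control samples logged in that zone
    th_st = angle(stab);                 % corresponding pendulum angle samples
    net   = train(net,in_st',th_st');    % re-train on stabilised-zone data only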

    Figure 5.8 shows the best model obtained.

    Figure 5.8 feed-forward neural network, 1 hidden layer with 75 neurons

From the diagram a vast improvement in the model is observed: it identifies the pendulum angle in the stabilised region extremely well and also models the pendulum swing-up motion, on which it has not been trained. This is possible because of the ability of neural networks to approximate functions on which they have not been trained. At this stage a suitable model has been developed for the physical rig; however it must be subjected to further tests. The model identified must be robust in order to achieve neuro-control. Subsequently a disturbance is added on-line to see how the model responds. See Figure 5.9 for the set-up.


    Figure 5.9 System set-up with Disturbance

    From Figure 5.10 it can be seen that the model correctly identifies the pendulum angle

    during the disturbance.

    Figure 5.10 Pendulum angle during disturbance

To further test the model, the dither signal is increased so that the system is unstable throughout the experiment. Figure 5.11 shows that the model still identifies the pendulum angle. Thus an accurate model of the inverted pendulum has been developed.



    Figure 5.11 Pendulum Angle, with large excitation signal

    5.3 Summary

In this chapter the physical rig set-up and its closed-loop controllers are discussed. The system comprises two controllers, one to swing up the pendulum and one to maintain it in the stabilised region. Neural networks had difficulty in identifying the process because, depending on which zone the pendulum is in, the control output is calculated in a different manner; as such the outputs from the two controllers do not relate to each other. Neural networks can approximate data on which they have not been trained; consequently the network is trained using only the data recorded while the pendulum is in the stabilised region. The neural network subsequently identifies the pendulum angle with good accuracy and successfully predicts the swing-up action. The model is then subjected to further experiments to test its robustness. The model successfully identifies the process when subjected to a disturbance and a larger excitation signal. The next stage is neuro-control, which is dealt with in the following chapter.


    Chapter 6

    Neuro Control

The inverted pendulum is open-loop unstable, non-linear and a multi-output system. The physical rig has one input, a normalised control value between -1 and 1, and two outputs, the pendulum angle theta and the cart position. A model of the physical rig has been identified using static feed-forward networks modelling the pendulum angle. Thus the next step is neuro-control. However, before a neuro-controller is developed, a comparison between standard linear control techniques such as PID and neuro-control is made. Subsequently the different techniques of neuro-control are discussed and a controller developed.

Standard linear control techniques such as PID cannot map the complex non-linearities of the pendulum system. They have been used to control the physical rig, but only on the condition that the experiment starts with the pendulum in the stabilised zone, and even then their control of the system is extremely limited. ANNs have the capability of adaptively controlling and modelling a wide range of non-linear processes at a high level of performance. The inverted pendulum is a SIMO system; in order to have full-state feedback control several PID controllers would be necessary, whereas due to the parallel nature of neural networks a single neural network is sufficient.

    Before a neuro-controller is developed, the merits of the main types of neuro-control

    are discussed. The main types of neuro control include supervised, unsupervised, model

    reference, direct inverse and adaptive.

    6.1 Supervised Control

In supervised control the neural network uses an existing controller to learn the control action. One may question why mimic an existing controller if it performs satisfactorily. The problem is that traditional controllers may operate well around a specific operating point; however, if a disturbance or uncertainty occurs these traditional controllers fail. The advantage of neuro-control is that the network can adjust and update its weights. A neuro-controller can also approximate data on which it has never been trained. Supervised control proceeds with a teacher providing the control output for the neural network to learn. The simplest approach to this method is to teach the network off-line; subsequently the neural network is placed in the feedback loop, see Figure 6.1.


    Figure 6.1 Supervised learning using existing controller

    6.2 Unsupervised Control

Unsupervised control does not require prior knowledge. However the behaviour of such a network may be extremely complex; at best unsupervised control is still not fully understood. In unsupervised learning the neural network tests different states, determining which produces the correct control action. The learning process is computationally inefficient. However an unsupervised neuro-controller can deal with complex non-linear control. Anderson et al [29] developed an unsupervised controller for the inverted pendulum; however a certain amount of prior knowledge is incorporated, in that a failure signal is supplied to the neural network based on pendulum angle and cart position.

    6.3 Adaptive neuro control

The main advantage of adaptive control is the ability to adapt on-line. This is achieved by presenting the neuro-controller with an error signal, calculated by subtracting the actual output from the desired output. Subsequently the error is used to adjust the weights on-line, see Figure 6.2; a generic sketch of such an update is given below.
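As a generic illustration of this idea (not the implementation used in this project), a delta-rule style on-line weight update driven by the error signal can be sketched as follows, using synthetic placeholder data:

    lr = 0.01;                         % learning rate
    w  = zeros(4,1);                   % weights being adapted (illustrative size)
    x  = randn(4,1000);                % placeholder input samples
    y_des = randn(1,1000);             % placeholder desired outputs
    for k = 1:size(x,2)
        e = y_des(k) - w'*x(:,k);      % error: desired output minus actual output
        w = w + lr*e*x(:,k);           % adjust the weights on-line using the error
    end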



    Figure 6.2 Adaptive neuro control

    6.4 Model Reference Control

Model reference control differs from adaptive neuro-control in that the desired closed-loop response is specified through a reference model. Thus the error signal is calculated using the reference model. The neuro-controller forces the plant output to follow the reference model output, see Figure 6.3.

    Figure 6.3 Model Reference Control



    6.5 Direct inverse control

In direct inverse control the neural network is trained to model the inverse of the plant. The plant output is fed to the neuro-controller; subsequently the neuro-controller's output is compared with the plant input and the network trained, see Figure 6.4. The main difficulty with this method is that the inverse model must be extremely accurate; as such the method is limited to open-loop stable systems [30]. This can be attributed to the fact that in a closed loop too much of the plant dynamics is removed, so an accurate model cannot be identified.

    Figure 6.4 Direct inverse control

Considering the different control techniques possible with respect to the inverted pendulum, supervised control emerges as a suitable solution. Inverse control is not possible because, as previously stated, an extremely accurate inverse model of the plant is required; this is unobtainable as the inverted pendulum is open-loop unstable and the closed-loop model contains some of the dynamics of the controller. Anderson has proved unsupervised control of the inverted pendulum possible; however this is an extremely complex method not yet fully understood, and given the time constraints of the project it is not a viable technique. Similar to inverse model control, model reference control also requires an accurate model of the plant, and as such the method is limited to an open-loop stable system. Given that there is an existing controller for the inverted pendulum rig, based on a swinging algorithm and a stabilising algorithm, supervised control offers the best solution. Consequently neuro-control proceeds using supervised learning.



    6.6 Neuro Control in Simulink

It is decided to develop a neuro-controller using supervised learning. Using the existing feedback controller for the non-linear pendulum, a feed-forward network is trained to model the controller. The model is developed and trained using the same techniques covered in system identification, with the exception that the input used to train the network is the angle theta and the target output is the control signal produced by the existing controller; a minimal training sketch is given below. When training is complete the model is imported into the Simulink environment and placed in the feedback loop, replacing the existing controller, see Figure 6.5.
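A minimal sketch of this supervised training step, assuming logged vectors theta (pendulum angle) and u (the existing controller's output); the input range is illustrative:

    net = newff([-0.5 0.5],[10 1],{'tansig' 'purelin'},'trainlm');  % angle in, control value out
    net.trainParam.epochs = 400;
    net = train(net,theta',u');        % learn the mapping angle -> control action
    gensim(net);                       % generate a Simulink block for the trained controller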

    Figure 6.5 Neuro Controller in Simulink

Figure 6.6 shows the pendulum angle controlled using the neuro-controller; clearly the controller maintains stability, keeping the pendulum upright.

    Figure 6.6 Pendulum angle using neuro controller in Simulink


    The next step is placing the controller in closed-loop with the non-linear pendulum model

    identified using neural networks, see Figure 6.7.

    Figure 6.7 neuro control of non-linear pendulum model

Figure 6.8 shows that the neuro-controller also stabilises the model identified using neural networks.

    Figure 6.8 neuro control of non-linear model

    6.7 Real time neuro-control

At this stage neuro-control of the non-linear pendulum model, and of the model identified using neural networks, has been carried out in the Simulink environment. The next step is neuro-control of the physical system. It is possible to test and develop controllers for the physical rig using the external controller function. The external controller is a file containing the control routine, which is accessed at interrupt time. The control algorithm


    must be written in C code and a dynamic link library created. Thus the steps to develop the

    neuro-controller can be categorised into four main stages:

    Train the neural network offline, import into simulink.

    Validate model online.

    Code the neuro-controller in C.

    Test online.

Up until this point system identification has primarily focused on the pendulum angle. In order to achieve control of the physical system both the pendulum angle theta and the cart position must be taken into account, due to the limited length of the rail. The neural network training procedure is carried out in a similar manner as previously, with the exception that the inputs for training are the pendulum angle and the cart position, and the target output is the control value. The Matlab script is as follows:

    tempP = [angle'; position'];              % training inputs: pendulum angle and cart position
    net = newff([-3.5 3.5; -1 1],[20 1],{'tansig' 'purelin'},'trainlm');  % input ranges: angle, position
    net.trainParam.epochs = 1000;             % training epochs
    net.trainParam.lr = 0.0001;               % learning rate
    net = train(net,tempP,control');          % targets: logged control value (variable name assumed)

Feed-forward networks are used and different parameters are tested, i.e. the number of hidden layers and the number of neurons in the hidden layer. These are kept to a minimum so that the subsequent C coding of the model is easier. As in the identification of the physical rig, the network is trained using only data from the stabilised zone, and the network is allowed to approximate the swing-up action. The next step is to validate the model before implementing it in C. Figure 6.9 shows the validation set-up. It is found that the optimum structure is a single hidden layer with 20 neurons.


    Figure 6.9 Validation set-up

The control output from the existing controller is compared with the output from the neuro-controller; it is found that the two are similar, with a low mean squared error, see Figure 6.10.

    Figure 6.10 Neuro Controller output

The neuro-controller structure is shown in Figure 6.11; the computation it implements is sketched below.
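Before the controller is re-coded in C, the computation it performs can be written out explicitly in Matlab by extracting the trained network's weights and biases; this is the same arithmetic the C routine must reproduce (a minimal sketch for the single-hidden-layer network trained above, with assumed sample values):

    IW = net.IW{1,1};    % 20x2 input-to-hidden weights (angle, position)
    b1 = net.b{1};       % 20x1 hidden-layer biases
    LW = net.LW{2,1};    % 1x20 hidden-to-output weights
    b2 = net.b{2};       % output bias
    x  = [0.1; 0.0];                  % example [angle; position] sample (assumed values)
    u  = LW*tanh(IW*x + b1) + b2;     % tansig hidden layer, purelin output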


Figure 6.11 Neuro Controller Structure