
Universidade de Aveiro Departamento de Electrónica e Telecomunicações

“NAVBOT” - Agente robótico autónomo com aprendizagem neuronal de estratégias de mapeamento e navegação autónomas

Pedro Manuel Casal Kulzer

Lic. em Eng. Electrónica e Telecomunicações pela

UNIVERSIDADE DE AVEIRO

AVEIRO, SETEMBRO DE 1996

Universidade de Aveiro Departamento de Electrónica e Telecomunicações

“NAVBOT” - Autonomous robotic agent with neural network learning of autonomous

mapping and navigation strategies

Pedro Manuel Casal Kulzer

Lic. in Electronics and Telecommunications Engineering from the

UNIVERSITY OF AVEIRO, PORTUGAL

AVEIRO, SEPTEMBER 1996

Estratégias de navegação de agentes autónomos com redes neuronais


“NAVBOT” - Agente robótico autónomo com aprendizagem neuronal de estratégias de mapeamento e

navegação autónomas

- um modelo comportamental -

Apesar dos agentes autónomos serem ainda alvo de investigações recentes, parecem ser um campo muito promissor, onde o progresso feito de pequenas conquistas levará, eventualmente, a uma conquista e ajuda maior para a Humanidade.

Pedro Manuel Casal Kulzer

Lic. em Eng. Electrónica e Telecomunicações pela

UNIVERSIDADE DE AVEIRO

Tese submetida para a satisfação parcial dos requisitos do programa de Mestrado em ENGENHARIA ELECTRÓNICA E TELECOMUNICAÇÕES

Execução por: Eng. Pedro C. Kulzer (Departamento de Electrónica e Telecom., Univ. de Aveiro)

Orientação por: Prof. Dr. Francisco C. Vaz (Universidade de Aveiro, INESC)

Prof. Dr. Keith L. Doty (Machine Intelligence Laboratory, University of Florida)

Prof. Dr. José C. Príncipe (Computational NeuroEngineering Laboratory, Univ. of Florida)

Universidade de Aveiro Departamento de Electrónica e Telecomunicações

AVEIRO, SETEMBRO DE 1996


“NAVBOT” - Autonomous robotic agent with neural network learning of autonomous mapping and navigation strategies

- a behavioral model -

Although autonomous agents are still the subject of recent research, they appear to be a very promising field, where progress made from small achievements will eventually lead to a greater achievement and help for Mankind.

Pedro Manuel Casal Kulzer

Lic. in Electronics and Telecommunications Engineering from the

UNIVERSITY OF AVEIRO

Dissertation submitted in partial fulfillment of the requirements of the Master's program in ELECTRONICS AND TELECOMMUNICATIONS ENGINEERING

Carried out by: Eng. Pedro C. Kulzer (Department of Electronics and Telecomm., Univ. of Aveiro)

Supervised by: Prof. Dr. Francisco C. Vaz (University of Aveiro, INESC)

Prof. Dr. Keith L. Doty (Machine Intelligence Laboratory, University of Florida)

Prof. Dr. José C. Príncipe (Computational NeuroEngineering Laboratory, Univ. of Florida)

Universidade de Aveiro Departamento de Electrónica e Telecomunicações

AVEIRO, SEPTEMBER 1996


Resumo

Neste trabalho, é tentada a implementação de um agente robótico autónomo minimalista com capacidades de navegação avançadas, as quais dependem de arquitecturas especializadas com base em redes neuronais artificiais. Pretende-se mostrar que uma plataforma robótica comparativamente simples e dotada de componentes baratos é capaz de realizar tarefas complexas de navegação, tais como dead-reckoning (intuição espacial), circunscrição de pistas visuais e correspondente discriminação / reconhecimento, e a construção eficiente de mapas e o seu uso para navegações futuras com a tomada de atalhos e voltas. São usados shaft-encoders para a implementação de uma bússola interna grosseira, a qual permite a realização de integrações dos caminhos, usadas para a avaliação e posterior cálculo de trajectórias. Os shaft-encoders também são usados para a sincronia geral da velocidade nas redes de reconhecimento. Sensores de infravermelhos são usados para a detecção e circunscrição de pistas visuais. Os dados resultantes desta circunscrição são usados para construir ou alimentar redes neuronais em anel fechado já existentes, cuja tarefa é a de discriminação / reconhecimento de pistas visuais. Estas redes neuronais incorporam informação espacial bem como temporal, para melhorar o seu desempenho global. Após esta construção / reconhecimento é construído o mapa e guardadas as redes de reconhecimento das pistas visuais, bem como todas as distâncias conhecidas relativamente às outras pistas visuais. Esta informação de mapa pode depois ser utilizada para cálculos futuros de trajectórias, onde os atalhos e voltas representam um papel especial na eficiência de um agente. Antes de realizar a implementação final, é feita uma pesquisa bibliográfica inicial, a qual permite saber o que existe de actual neste campo da navegação em agentes autónomos e alguns campos relacionados.

Palavras-chave: agentes autónomos, estratégias de navegação, construção de mapas, uso de mapas, mapas cognitivos, redes neuronais, hippocampus, neurónios posicionais, zonas posicionais, propagação de actividade, pistas visuais, inspiração biológica, minimalismo, simplicidade.


Abstract

In this work, the implementation of a minimalist autonomous robotic agent with advanced navigation abilities is attempted, relying on specialized architectures based on artificial neural networks. It is intended to show that a comparatively simple robot platform with cheap components is able to perform complex navigation tasks such as dead-reckoning (navigation through spatial intuition), landmark circumvention and the corresponding discrimination / recognition, and efficient map construction and use for future navigation with shortcuts / detours. Shaft-encoders are used to implement a coarse internal compass, which enables the path integration used for trajectory evaluation and future computation. They are also used for general speed synchronization in the recognition networks. Infra-red sensors are used for landmark detection and circumvention. The data resulting from this circumvention is used on-line to build, or to feed already existing, circular neural networks that have the task of landmark discrimination / recognition. These neural networks incorporate spatial as well as temporal information to enhance their overall performance. After this construction / recognition, map construction is performed, in which the landmarks' recognition networks are stored, as well as their already known displacements relative to the other landmarks. This map information can then be used for future trajectory computations, where shortcuts and detours play a special role in agent efficiency. Before the final implementation, an initial bibliographic survey is carried out, allowing the state of the art in this field of autonomous-agent navigation, and in some related fields, to be assessed.

Keywords: autonomous agents, navigation strategies, map construction, map usage, cognitive maps, neural networks, hippocampus, place neurons, place fields, activity propagation, visual landmarks, biological inspiration, minimalism, simplicity.
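The dead-reckoning mechanism summarized above, in which wheel shaft-encoder counts feed a coarse internal compass and are integrated into a position estimate, can be sketched roughly as follows. This is only a minimal illustration, not the thesis's actual code: the function name, tick resolution, wheel radius and differential-drive wheel base are all assumed values.

```python
import math

def integrate_path(tick_pairs, ticks_per_rev=512, wheel_radius=0.03, wheel_base=0.15):
    """Dead-reckoning by path integration: accumulate (x, y, heading)
    from per-step left/right shaft-encoder tick counts.
    All geometry parameters are illustrative assumptions."""
    x = y = theta = 0.0
    dist_per_tick = 2 * math.pi * wheel_radius / ticks_per_rev
    for left_ticks, right_ticks in tick_pairs:
        dl = left_ticks * dist_per_tick      # left wheel arc length
        dr = right_ticks * dist_per_tick     # right wheel arc length
        d = (dl + dr) / 2                    # forward displacement of the center
        theta += (dr - dl) / wheel_base      # coarse "internal compass" update
        x += d * math.cos(theta)
        y += d * math.sin(theta)
    return x, y, theta
```

With equal tick counts on both wheels the heading stays constant and the agent moves in a straight line; opposite counts turn it in place, which is why the heading estimate can serve as the coarse compass the abstract describes.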


Agradecimentos

por ordem alfabética, em que todas as contribuições foram especialmente valiosas:

Ao Eng. António Lebre Branco, aluno de Mestrado na Universidade de Aveiro, pelas conversas valiosas onde eram discutidos certos pormenores mais difíceis deste trabalho, bem como pelo apoio moral incondicional.

À Christina Willrich e ao Richard Man da ImageCraft, pela sua ajuda no uso do compilador de “C” ICC11 e na remoção de bugs, e por me terem deixado usar as suas últimas versões do programa ainda em fase de testes.

Ao Prof. Dr. Francisco Cardoso Vaz, director do pólo INESC de Aveiro, pela sua orientação geral sobre os trâmites de uma Tese, e por ir lembrando quais os passos a dar.

Ao Prof. Dr. Gregor Schöner, director do Centre de Recherche en Neurosciences Cognitives, C.N.R.S Marseille, França, pelas suas aulas que permitiram começar a dar uma perspectiva diferente dos agentes autónomos.

À JNICT-Junta Nacional de Investigação Científica e Tecnológica, pela concessão da Bolsa de Mestrado PRAXIS XXI / BM / 836 / 94, sem a qual seria muito difícil suportar os custos inerentes resultantes deste trabalho.

Ao Prof. Dr. José Carlos Príncipe, director do Laboratório de Neuroengenharia da Universidade da Florida, pelas suas ajudas nos pormenores teóricos e práticos em certos aspectos deste trabalho, bem como pela minha visita ao seu laboratório.

Ao Prof. Dr. Keith L. Doty, director do Laboratório de Inteligência de Máquina na Universidade da Florida, pela sua constante orientação e sabedoria na área dos agentes autónomos, bem como pela minha visita ao seu laboratório.

Ao Eng. Kelly Snow, por ter construído e cedido a plataforma robótica sobre a qual foi realizado todo este trabalho.

Ao pessoal do Laboratório de Inteligência de Máquina, pelas suas contribuições e conversas pontuais em certos aspectos deste trabalho.

Ao Eng. Neil Euliano, aluno de Doutoramento no Laboratório de Neuroengenharia da Universidade da Florida, pela sua ajuda no desenvolvimento dos mecanismos neuronais ligados ao reconhecimento.

Aos Engs. Scott Jantz e Eric de la Iglesia, por várias dicas dadas.

À Universidade de Aveiro e Universidade da Florida, pelos meios disponibilizados para a realização desta Tese.

A todos cujas conversas e dicas foram importantes para a progressão desta Tese.


Acknowledgments

in alphabetical order, where all contributions were especially valuable:

To Eng. António Lebre Branco, Master's student at the University of Aveiro, for the valuable conversations in which some of the more difficult details of this work were discussed, as well as for his unconditional moral support.

To Christina Willrich and Richard Man of ImageCraft, for their help in using the ICC11 C compiler and in removing bugs, and for letting me use their latest versions of the program while still in testing.

To Prof. Dr. Francisco Cardoso Vaz, director of the INESC branch at Aveiro, for his general guidance on the formalities of a Thesis, and for reminding me which steps to take.

To Prof. Dr. Gregor Schöner, director of the Centre de Recherche en Neurosciences Cognitives, C.N.R.S Marseille, France, for his course, which helped open a different perspective on autonomous agents.

To JNICT - Junta Nacional de Investigação Científica e Tecnológica, for granting the Master's scholarship PRAXIS XXI / BM / 836 / 94, without which it would have been very difficult to bear the costs resulting from this work.

To Prof. Dr. José Carlos Príncipe, director of the NeuroEngineering Laboratory at the University of Florida, for his help with theoretical and practical details of certain aspects of this work, as well as for my visit to his laboratory.

To Prof. Dr. Keith L. Doty, director of the Machine Intelligence Laboratory at the University of Florida, for his constant advice and knowledge in the area of autonomous agents, as well as for my visit to his laboratory.

To Eng. Kelly Snow, for having built and provided the robotic platform on which all this work was carried out.

To the people of the Machine Intelligence Laboratory, for their occasional contributions and conversations on certain aspects of this work.

To Eng. Neil Euliano, Ph.D. student in the NeuroEngineering Laboratory at the University of Florida, for his help in developing the neural mechanisms related to recognition.

To Engs. Scott Jantz and Eric de la Iglesia, for their various tips.

To the University of Aveiro and the University of Florida, for providing the means to carry out this Thesis.

To everyone else whose conversations and tips were important to the progress of this Thesis.


Abbreviations

2D - 2 Dimensions

3D - 3 Dimensions

AA - Agente(s) Autónomo(s) (Autonomous Agent(s))

IA - Inteligência Artificial (Artificial Intelligence)

EEG - ElectroEncephaloGram

HC - HippoCampus

IM - Informação Métrica (Metric Information)

IT - Informação Topológica (Topological Information)

LCD - Liquid Crystal Display

NP - Neurónio(s) Posicional(is) (Place Neuron(s))

PV - Pista(s) Visual(is) (Visual Landmark(s))

RAM - Random Access Memory

RNA - Rede(s) Neuronal(is) Artificial(is) (Artificial Neural Network(s))

RF - Radio Frequency

WTA - Winner-Take-All network or mechanism

ZP - Zona(s) Posicional(is) (Place Field(s))

Conventions

• [Author, 19xx] - published references

• {Author, 19xx} - unpublished material or conversations referred to in a paper

• word / expression in italics - technical word / expression used for the first time in a section, English word / expression, or expression to be distinguished from the rest of the text

• word in bold - important detail in an explanation


Contents

I. INTRODUCTION .......................................................................................................................................................... 1

I-1 RATIONALE AND MOTIVATION FOR AUTONOMOUS AGENTS ................................................................................. 1

I-2 OBJECTIVES OF THIS WORK .................................................................................................................................... 2

I-3 CHARACTERISTICS OF THE AUTONOMOUS AGENT ............................................................................................... 3

I-4 GENERAL ASPECTS .................................................................................................................................................. 4

I-4.1 Artificial intelligence in autonomous agents ........................................................................................................ 4

I-4.2 Navigation and map usage .................................................................................................................................... 5

I-4.3 Recognition of places and visual landmarks ........................................................................................................ 7

II. BIOLOGICAL PROCESSES ....................................................................................................................................... 9

II-1 INTRODUCTION ....................................................................................................................................................... 9

II-2 TEMPORAL AND RELATIONAL PROCESSES ........................................................................................................ 10

II-3 MAP CONSTRUCTION AND PLACE FINDING ....................................................................................................... 11

II-3.1 Evidence of map construction in organisms ..................................................................................................... 12

II-3.2 Particularities of biological map construction ................................................................................................. 13

II-3.3 The hippocampus phenomenon ......................................................................................................................... 14

III. EXISTING THEORIES, MODELS AND IMPLEMENTATIONS ........................................................................ 18

III-1 MODELING OF THE STRUCTURES AND FUNCTIONS OF THE HC .................................................................... 18

III-2 BIOLOGICALLY INSPIRED MODELS ................................................................................................................... 19

III-3 NAVIGATION THROUGH ARTIFICIAL INTELLIGENCE ...................................................................................... 21

IV. PROJECT DEVELOPMENT ................................................................................................................................... 23

IV-1 BASIC REACTIVE SYSTEM .................................................................................................................................. 23

IV-2 INTERNAL COMPASS AND PATH INTEGRATION ............................................................................................... 23

IV-3 LANDMARK DISCRIMINATION AND RECOGNITION ......................................................................................... 24

IV-4 MAP CONSTRUCTION ......................................................................................................................................... 27

IV-4.1 Basic aspects .................................................................................................................................................... 27

IV-4.2 Addition of a new visual landmark .................................................................................................................. 28

IV-4.3 Map-based navigation ..................................................................................................................................... 28

V. A REAL IMPLEMENTATION ................................................................................................................................. 31

V-1 LANDMARK DISCRIMINATION AND RECOGNITION .......................................................................................... 32

V-2 MAP CONSTRUCTION .......................................................................................................................................... 33

V-3 REAL TESTS, CALIBRATIONS AND RESULTS ..................................................................................................... 34

V-3.1 Comments .......................................................................................................................................................... 38

VI. FINAL WORDS ....................................................................................................................................................... 39

VI-1 COMMENTS ......................................................................................................................................................... 40

VI-2 LESSONS TO BE DRAWN FROM THIS THESIS ................................................................................................... 41

VI-3 FUTURE WORK AND IMPROVEMENTS .............................................................................................................. 41

ANNEX


Prefácio

Este trabalho refere-se principalmente a estratégias de navegação, e aspectos relacionados tais como o reconhecimento de pistas visuais e dinâmicas neuronais.

A intenção do seguinte relatório é de fornecer informação acerca de trabalho passado e presente, servindo como referência para interesses futuros nesta área. Está organizado em seis capítulos principais: no primeiro, são apresentados os objectivos deste trabalho e são explicadas algumas considerações introdutórias sobre agentes autónomos. Aqui também são expostos os principais pontos de investigação deste trabalho. No segundo, é realizado e analisado um resumo do trabalho passado mais relevante relacionado com a área biológica experimental deste campo de investigação, bem como alguns modelos e teorias. Esta exposição do trabalho passado é propositadamente feita apenas depois das considerações do primeiro capítulo, por duas boas razões: em primeiro lugar, o leitor tem de se consciencializar dos aspectos e problemas que vão ser discutidos ao longo deste trabalho e, em segundo lugar, já havia a preocupação de desenvolver teorias e modelos para navegação, antes de se obterem dados biológicos experimentais. No terceiro, são apresentadas, analisadas e comentadas teorias, modelos e implementações biológicas e artificiais. No quarto capítulo, após uma extracção das ideias-base que irão desempenhar um papel fundamental neste trabalho, são discutidos os mecanismos a utilizar para cada módulo ou nível de competência para uma implementação final de um agente autónomo. Finalmente, no quinto, é descrita a implementação real, onde todos os mecanismos são finalmente escolhidos e onde resultados intermédios dos diversos módulos são mostrados e comentados. Aqui também se realizam os testes e são apresentadas simulações e resultados. No final, são deixadas algumas linhas-guia para possível trabalho futuro e é apresentada uma extensa lista das referências, bibliografia, software e hardware utilizados.

Ao longo de todo o texto, significados e explicações são dispostas em rodapé para facilitar a leitura daqueles que não precisem delas, e para não sobrecarregar o texto.

A fluência deste relatório pretende dar uma profundidade progressiva nos aspectos que são tratados. Iniciando com texto introdutório superficial, será cada vez mais específico e detalhado. Essencialmente, são tratados os mesmos aspectos em cada capítulo, mas a níveis diferentes. Desta forma, passa-se da apresentação de conceitos, implementações e teorias, problemas e limitações, propostas de implementação, para a implementação final e testes.

Será dada uma especial ênfase a imagens ao longo de todo o texto, já que elas fornecem a melhor ideia do que se está a passar. São usadas legendas longas com insistência, para explicar as imagens in-loco, em vez de as referenciar a meio do texto. Isto mantém a continuidade das ideias sem ter de interromper a leitura do texto para procurar as imagens. O propósito principal disto, é de tornar este relatório o mais comunicativo possível. É necessário comunicação e não texto seco sem imagens.

“ Se não consigo desenhá-lo, então é porque não o entendo ” - Albert Einstein -


Preface

This work concerns mainly navigation strategies and related aspects such as landmark recognition and neural dynamics.

The following report is intended to provide information about previous and present work, serving as a reference for future interests in this area. It is organized in six main chapters: in the first, the objectives of this work are presented and some introductory considerations concerning autonomous agents are explained; the main research points of this work are also exposed here. In the second chapter, an overview of the most relevant previous research related to the experimental biological side of this field is presented and analyzed, as well as some models and theories. This exposure of previous work is deliberately made only after the considerations of the first chapter, for two good reasons: first, the reader must be aware of the issues and problems that are going to be discussed throughout this work; second, there was already a concern with developing theories and models for navigation before experimental biological data became available. In the third, existing biological and artificial theories, models and implementations are presented, analyzed and commented on. In the fourth, after an extraction of the base ideas that are going to play a major role in this work, the mechanisms to use for each module or competence level of a final implementation of an autonomous agent are discussed. Finally, in the fifth, the real implementation is described, where all the mechanisms are finally chosen and where intermediate results of the various modules are displayed and commented on; here, the tests are also carried out, and simulations and results are presented. At the end, some guidelines for possible future work are left, and an extensive list of the references, bibliography, software and hardware used is presented.

Throughout the text, meanings and explanations are placed in footnotes, to ease the reading for those who do not need them and to avoid overcrowding the text.

The flow of this report is intended to provide increasing depth on the aspects treated. Starting with superficial introductory text, it becomes more and more specific and detailed. Essentially, the same aspects are treated in each chapter, but at different levels. In this way, one passes from the presentation of concepts, implementations and theories, through problems and limitations and implementation proposals, to the final implementation and tests.

Pictures are strongly emphasized throughout the whole text, since they give the best idea of what is happening. Long captions are used insistently to explain the pictures on the spot, instead of referring to them in the middle of the text. This maintains the continuity of the ideas, without having to interrupt the reading to search for the pictures. The main aim of all this is to make this report, and the corresponding work, as communicative as possible. What is needed is communication, not dry text without pictures.

“ If I cannot draw it, then it’s because I don’t understand it ” - Albert Einstein -

CHAPTER I

“It is at the beginning that all things are begun” - Anonymous

I - Introduction


I-1. Rationale and motivation for autonomous agents

There is a multitude of applications for the so-called “Autonomous Agents” (AA). Assuming that these AA intrinsically possess some navigation capability, such as map construction or, at least, goal-reaching strategies (obstacle avoidance, light following, etc.), some fields of potential application can be pointed out immediately:

• Extraterrestrial exploration - space probes, rovers for planetary surfaces [Gat, Desai, Ivlev, Loch & Miller, 1994].

• Underwater exploration - exploration and repair at great depths [Herman & Albus, 1988] [Herman et al, 1988], autonomous submarines [Buholtz, 1996].

• Operation in toxic and dangerous environments - inspection robots for areas with dangerous radiation, detection of faults and radiation leaks, clean-up of nuclear accident sites*, oil and natural-gas fields, mines, clearing operations in minefields left behind by wars.

• Surveillance and security systems - security robots for the detection and pursuit of intruders [Branco & Kulzer, 1995].

• Rescue operations - mine floodings, nuclear accidents, aviation accidents.

• Aids to industry - autonomous and intelligent transport robots, dangerous industrial operations, and others [Arkin & Murphy, 1990].

• Explaining physiological phenomena - beyond all the imaginable real applications listed above, there is also much isolated work that “merely” tries to understand certain physiological phenomena, especially in the brain areas involved in mapping and navigation in animals. This understanding can then be applied in practice with greater or lesser success.

* This could have saved the lives of thousands of workers who had to carry out the clean-up operations at the Chernobyl nuclear power plant.

Of course, today's AA do not yet possess the sophistication required for this kind of specialized work, but they may one day become a reality.


Although AA are still the subject of recent research, they appear to be a very promising field, where progress made of small achievements may eventually one day lead to a greater achievement and a help for Mankind. It is therefore a very strong motivation to try to accomplish something that contributes to this still very “young” field.

Note that this field of AA has little to do with the field of Classical Robotics [Fu, Gonzalez & Lee, 1987]. The latter focuses more on the design of non-autonomous mechanisms, controlled by centralized systems, that perform repetitive and deterministic tasks, such as robotic arms and other mechanisms on a factory assembly line.

I-2. Objectives of this work

The main objective of this work is to present the most important previous work done in the area of navigation strategies and related aspects of AA. This presentation will be compact and commented, mentioning the most important and interesting aspects of each research topic that should be retained from previous work. More or less detailed summaries of theories, models, simulations and practical implementations will be presented.

After having widened the horizons in this growing research field, based on Artificial Intelligence (IA) and other more biological theories, and in which there is not yet much consensus as to which mechanisms are best to use, a practical implementation will be carried out which emphasizes certain mechanisms and demonstrates their functionality. The implemented model is inspired by many existing ones and tries to follow two guiding ideas closely: simplicity and minimalism, without however compromising performance. Simplicity, in that the number of modules, systems and components is kept to the strictly necessary; minimalism, in that these modules, systems and components are as simple as possible. Once again, a certain limit is imposed on these ideas when a given level of performance is desired. This performance level will be exposed when the real implementation is carried out.

The sequence of work in this thesis will be the following:

• Carry out an extensive bibliographic survey, to become familiar with previous theoretical and experimental work in the field of autonomous navigation, focusing on mapping strategies. While this survey is carried out, focus further on one particular topic; the chosen topic was the Hippocampus Phenomenon*. At the end of this survey, summarize the most important retained aspects. In addition, while exposing the work of other authors, make the appropriate comments with the implementation done in this work in mind.

• Think of ideas for the implementation of a real AA. Here, all the previous work will serve as a starting point and as a set of base ideas.

• Carry out the implementation itself, progressively building the various modules, and discover the problems and limitations of these modules. The AA should be capable of performing shortcuts, detours and simpler trajectories between Visual Landmarks (PV).

• Test the whole implementation achieved in the available time, in a test environment. It is not absolutely necessary for the AA to behave in a 100% precise and robust way. The intention is only to demonstrate whether or not, as implemented, a sufficient global behavior is produced. It is hoped that it will behave well enough to survive the complex task of navigating between PV.

* The Hippocampus is a brain structure that will be detailed later.
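The landmark discrimination / recognition step mentioned above can be given a minimal, purely illustrative reading. The thesis itself uses circular neural networks for this task; the sketch below replaces them with a plain rotation-invariant matcher over the ring of infra-red readings recorded while circling a landmark. All names, the signature format and the threshold are assumptions, not the thesis's mechanism.

```python
def ring_distance(sig_a, sig_b):
    """Compare two landmark signatures (equal-length rings of IR range
    samples taken while circling a landmark).  Because the agent may
    start circling at any point, every cyclic rotation is tried and the
    best match kept -- a rotation-invariant distance."""
    n = len(sig_a)
    best = float("inf")
    for shift in range(n):
        d = sum((sig_a[(i + shift) % n] - sig_b[i]) ** 2 for i in range(n))
        best = min(best, d)
    return best

def recognize(signature, known, threshold):
    """Return the name of the best-matching stored landmark,
    or None if nothing matches well enough (i.e. a new landmark)."""
    scored = [(ring_distance(signature, sig), name) for name, sig in known.items()]
    if not scored:
        return None
    d, name = min(scored)
    return name if d <= threshold else None
```

A rotated copy of a stored ring matches with distance zero, so recognition does not depend on where the agent began its circumnavigation; a signature far from every stored ring is treated as a new landmark to be added to the map.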


Durante este trabalho, serão seguidas algumas ideias-guia, o mais de perto possível: • Em vez de utilizar IA estrita, com estruturas if-then-else, serão usados extensivamente também

mecanismos neuronais. Estruturas if-then-else serão apenas usadas onde estas simplifiquem o trabalho. Assim, todos os mecanismos deveriam ser biologicamente inspirados, com a intenção de mostrar alguns mecanismos funcionais que podem existir em animais e no Homem.

• Tentar fazer sempre que os modelos implementados satisfaçam a maior parte dos dados experimentais disponíveis. Já que alguns até podem estar errados, pontos que não sejam confirmados pela experiência possuem ainda alguma plausibilidade biológica, e podem eventualmente até trazer alguma luz sobre os reais mecanismos biológicos subjacentes.

• Try to interconnect and relate different sources of experimental data and theories into a global theory of recognition and mapping. Theories that initially seemed unrelated may become related within a more global model. Each author studied separate problems, data and theories, so some “gluing” must be performed.

• Prefer models and theories that can be parallelized and distributed over parallel hardware. These have the potential to be much faster than centralized ones, and may provide data for future research.

• Prefer on-line* learning mechanisms, instead of mechanisms where the complete data must be presented all at once. In other words, the AA should be able to learn while performing its tasks.

• Always try to use values relative to others, so as to avoid cumulative errors and keep only the local (relative) errors, which are usually small. This should be done in topics such as displacements, PV recognition and mapping.

• Keep hardware costs as low as possible, using cheap sensors, processors, platforms, etc.

VI-3. Characteristics of the autonomous agent
The AA will be based on the previously implemented “THOMAS” platform [Snow, 1995]. This platform received several improvements, such as a Liquid Crystal Display (LCD), a keypad and a Radio-Frequency (RF) transceiver, among others. In addition, an entire operating system was developed to command the AA's most basic capabilities, such as driving the motors, the LCD, the keypad, real-time routines and interrupts, etc. THOMAS already had a set of sensors for avoiding obstacles and circumnavigating PV, as well as a pair of shaft-encoders† for measuring the wheels' movements.

The final characteristics desired for the AA that will emerge from this work are the following:
• Basic reaction mechanism

• PV discrimination and recognition

• Construction of a map of the environment

• Map-based navigation, to reach targets and to perform shortcuts and detours.

* This word means something like “simultaneously with”, “a constant process”, “while…”, meaning that the learning in question takes place while the AA behaves normally, with no need to stop.
† These are devices placed on the wheel axles, producing pulses at a frequency proportional to the wheels' speed.
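The shaft-encoder pulses described above can be turned into the relative displacement estimates the guiding ideas call for. The following is an illustrative sketch only, not THOMAS's actual firmware: the encoder resolution, wheel radius and wheel base below are hypothetical values, and the update is a standard first-order differential-drive odometry step.

```python
import math

# Hypothetical parameters -- NOT the real THOMAS values.
TICKS_PER_REV = 128      # encoder pulses per wheel revolution
WHEEL_RADIUS = 0.03      # wheel radius, metres
WHEEL_BASE = 0.15        # distance between the two wheels, metres

def odometry_step(pose, left_ticks, right_ticks):
    """Update the pose (x, y, theta) from one sampling period of pulses.

    Only relative measurements are used, so each step's error is the
    small local one; errors still accumulate over long runs, which is
    why the text favours relative values wherever possible.
    """
    x, y, theta = pose
    dist_per_tick = 2 * math.pi * WHEEL_RADIUS / TICKS_PER_REV
    dl = left_ticks * dist_per_tick    # left wheel arc length
    dr = right_ticks * dist_per_tick   # right wheel arc length
    d = (dl + dr) / 2.0                # displacement of the centre point
    dtheta = (dr - dl) / WHEEL_BASE    # change in heading
    # First-order update: move along the mean heading of the step
    x += d * math.cos(theta + dtheta / 2.0)
    y += d * math.sin(theta + dtheta / 2.0)
    return (x, y, theta + dtheta)

# Equal tick counts on both wheels produce straight-line motion
pose = odometry_step((0.0, 0.0, 0.0), 64, 64)
```

Each call consumes one sampling period's pulse counts, matching the real-time routines mentioned above; a higher-level mapping layer would integrate these small relative steps.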


VI-4. General aspects
By the notion of navigation is meant the ability of an AA to make its way through the environment in which the AA is to exist. Behaviours such as avoiding obstacles and reaching targets are some of the most important navigation tasks. The present work will address two other important tasks: PV recognition and the construction / use of maps. These last two tasks substantially improve the AA's ability to survive, as well as its efficiency in executing its tasks. In other words, the AA becomes more intelligent and robust.

VI-4.1 Artificial intelligence in autonomous agents
It is well known that it is very easy to implement AA that perform very simple tasks, such as wall avoidance in small toys. A very simple reactive system lets a toy exhibit basic reactions. On the other hand, it is not at all easy to integrate many different reactions and behaviours, or to implement higher intelligence in such AA. The greatest difficulty lies in choosing the hardware and software structures for such complex mechanisms. The choice of architectures and calibrations is very difficult even in highly controlled laboratory environments, let alone in natural ones.

When an AA is built, one cannot expect to obtain an entity capable of understanding expressions such as “Please fetch me a beer from the fridge”. For us this is simple and immediate, but for the AA a single unforeseen situation is enough to leave it completely confused and lost. We do not think about the angles to give the arm joints, lighting conditions, image segmentation, object recognition, or even errors or failures, because everything happens naturally. For an AA, however, these are tasks almost impossible to execute robustly. If a robust AA is desired, then it must have robust behaviours starting from the bottom: at the reactive level [Brooks, 1985]. In other words, classical AI may not be the best way of doing things. According to [Brooks, 1985], the hope that high-level symbolic programming will lead to Human-like intelligence is an unfounded faith. Whereas AI rests on a symbolic-system hypothesis, Brooks insists instead on a physical-grounding hypothesis. Whereas the former decomposes Human intelligence into interdependent functional modules that will generate the desired behaviour (top-down architecture), the latter composes several behavioural modules whose coexistence gives rise to the emergence of more complex behaviours (bottom-up architecture).

Brooks's innovative idea rests on the hope that “the best model of the world is the world itself, which contains all the necessary information and is always up to date”, whereas classical AI holds that one should have a strong centralized model representing the external environment. In other words, instead of building high-level models upon which the AA acts, let the AA act directly on the real physical world. Whereas AI tries to demonstrate high levels of sophisticated behaviour under very specific laboratory conditions, hoping to generalize its robustness, Brooks's new AI paradigm tries to implement robust AA in very noisy, low-level environments, hoping to generalize towards more sophisticated tasks. Brooks even went so far as to argue against the need for internal representations of the environment [Brooks, 1987] and to demonstrate complex AA [Brooks, 1989].

When an AA is programmed, one has to descend to its level of existence, that is, to perform a kind of mental link with the AA's “mind”. Our brain has the innate ability to abstract images* such as “the book is on the table”, “I am far from the table” or “it is the book about Biology”,

* By images is meant everything sensory that converges on the brain, such as vision, touch, hearing, smell and taste, as well as combinations of these. For example, when one thinks of one's favourite dish, the brain is flooded with images of sight, smell and taste, in resonance.


without ever worrying about the technical details mentioned earlier. This is why we sometimes have such great difficulty understanding the low-level mechanisms underlying the behaviour of the artificial AA. The most common and most crippling mistake when building and evaluating an AA's behaviour is to forget that the AA does not “see” what we “see”. Sometimes certain bizarre behaviours of the AA are perfectly correct in terms of following the program it contains, yet the observer is left perplexed, unable to understand the reason for this “wrong behaviour”.

To fully understand what is going on with an AA, the observer must see nothing but the information the AA actually receives and processes, as well as the capabilities actually programmed in the face of that information. Hardware limitations and software flaws must be carefully studied. Essentially, the following points must not be forgotten:

• An AA only sees and operates on sensory data provided by its limited sensors.

• An AA cannot perform tasks that are impossible to perform with those sensors.

• An AA does not foresee unforeseen situations; it merely reacts according to its internal behaviours.

Simplicity is another keyword that must be kept in mind when implementing an AA {Doty, 1995} (in [Kulzer, 1995]). If an AA's mechanisms are kept simple, it is much easier to improve, debug and inspect them. In other words, one should not complicate systems that can be implemented simply. A striking example here is the use of very simple, noisy and cheap IR sensors for obstacle avoidance [Kulzer, 1995] [Seed, 1994] [Branco & Kulzer, 1995], instead of computationally intensive stereo-camera mechanisms [Braunegg, 1993] [Dhond & Aggarwal, 1989] [Tsuji, 1986] [Kriegman, Triendl & Binford, 1989]. Simplicity is also related to minimalist agents, where only a minimum number of mechanisms is used to achieve the desired behaviour.

When trying to implement a mechanism that attempts to imitate a biological behaviour present in animals, or the results of such a behaviour, such as mapping, PV recognition, etc., there are three main problems that may underlie the poor performance obtained:

• The mechanism may be implemented in a completely different way in the organism.

• The data the mechanism receives may be completely different from the biological case.

• Something may be misinterpreted, either in terms of results or of implementation.

The whole puzzle in this field of research consists in discovering which of the problems above is the most limiting one.

VI-4.2 Navigation and map use
In this work, the emphasis will be on autonomous navigation, where PV, basic navigation techniques and maps will play the main roles. The system implemented in this work will consider only objects fixed in space. Examples of implementations with moving objects are given in [Griswold & Eem, 1990] [Fujimura & Samet, 1989].

The basic navigation system gives the AA its basic capabilities, such as avoiding obstacles, following walls, following lights, etc. It also allows the AA to satisfy higher-level commands such as finding PV. High-level planning coordinates those navigation actions so as to reach a given desired target or targets. This planning can arise from the use of an internal map, also autonomously built by the AA.


Mapping is a feature that can greatly improve an AA's ability and efficiency in reaching a desired target. Taking the security AA [Branco & Kulzer, 1995] as an example again, it is 100% capable of watching over rooms in a building and eventually pursuing intruders. The issue here is one of total inefficiency, because it is nothing more than a reactive system. It does not know whether it has already entered a given room, whether it has finished inspecting a room and should therefore leave it, where it should go next, and so on. If it loses an intruder (stimulus), it forgets the pursuit and returns to normal surveillance. Mapping mechanisms provide a memory of the past, where previous actions and sights can be stored for future use and manipulation. Note that mapping is not necessary when no higher spatial intelligence is required. In that kind of AA, the only intelligence needed is the ability to recognize PV or unique places, through an external system of emitters that lets the AA know its position at all times. One example is an AA that follows an arrangement of external emitters [Leonard & Durrant-Whyte, 1991].

According to [Penna & Wu, 1993], some of the key questions that arise when building maps are the following:

• Type of representation - how should a 2D or 3D map be built from the sensory data, and how should that information be integrated so as to allow efficient storage and retrieval?

• Updating - how should the map be updated so as to best adapt to changes in the environment?

• Dimension - can range from 2D (land surfaces) to 3D (air, underwater, space).

• Boundaries - an environment may be bounded (a maze) or virtually unbounded (sea, space, terrain, air). Unbounded environments pose greater problems for storage efficiency.

• Structure - the environment may be highly structured (laboratory, maze) or very loosely structured (outdoor nature). Unstructured environments pose great difficulties for recognition tasks.

• Stationarity of visual cues - PV may be stationary (rocks, mountains, walls) or non-stationary (sun, moon). Non-stationary PV present obvious difficulties for place recognition.

• Visible and distinguishable visual cues - PV may be visible or partially hidden, and may or may not be distinguishable. This places demands on pattern-completion and discrimination techniques.

• Number of visual cues - this number may or may not be known. In most cases it is not known, and PV are stored as they appear.

• Accessibility of visual cues - PV may or may not be accessible. With accessible PV, recognition methods based on circumnavigating them can be used [Doty, Caselli, Harrison & Zanichelli] [Caselli, Doty, Harrison, & Zanichelli]. Otherwise, other methods can be used, such as vision [Braunegg, 1993], sonar [Drumheller, 1987] [Elfes, 1987] [Barshan & Kuc, 1992] [Watanabe & Yoneyama, 1992] and laser.

• Global north - the presence of a global directional reference can greatly simplify the mapping and recognition mechanisms, but it takes some autonomy away from the AA, which comes to depend on the precision and availability of that source of information. Local techniques are preferred.

• Distance and angles between cues - distances and angles may or may not be measurable. If they are, the AA can use that information to recognize PV and the corresponding places.


• Continuous or discrete observations - many similar PV can be discriminated through continuous observations, as opposed to the case of discrete observations where only snapshots are taken.

• Presence of obstacles - how to distinguish between obstacles and PV? What characteristics tell them apart?

The question of place recognition has been influenced by “brute-force” AI techniques [Moravec, 1981] [Zhang & Faugerhaus, 1992] that build a 3D metric representation of the environment, and has been limited by the computational power required, as well as by the inherent errors of stereo vision [Braunegg, 1993] [Dhond & Aggarwal, 1989] [Tsuji, 1986] [Kriegman, Triendl & Binford, 1989] and sonar [Drumheller, 1987] [Elfes, 1987] [Watanabe & Yoneyama, 1992]. More recently, some topological-map techniques have appeared that learn geometric relations between distinct PV [Kuipers, 1978] [McDermott & Davis, 1984] [Levitt, Lawton, Chelberg, & Nelson, 1987] [Kuipers & Levitt, 1988] [Bachelder & Waxman, 1994] [Penna & Wu, 1993].

VI-4.3 Recognition of places and visual cues
Two mechanisms are fundamental to autonomous navigation with planning capabilities and spatial map building: the detection, and eventual unique recognition, of PV and places. PV can be seen as landmarks of the environment that can be distinguished based on sensory information of various kinds (vision, smell, touch). The main problems that arise are related to their distinguishability and observability, as discussed in [Penna & Wu, 1993].

First, the detection system must separate the PV from the background, making them observable. This background may be noise or other objects (in a vision image), a mixture of smells (in a smell detector), or any other kind of confusing data that “hides” the PV, and it has to do with the detection system's discrimination capability*. This already raises problems related to image segmentation and feature detection. Next, the recognition or classification system must be able to “label” the detected PV, so that the AA has some information about where it is in the environment, in order to begin navigating correctly with the map. In 2D and 3D environments these recognition tasks become harder, since there are many more variants and degrees of freedom, for the environments, for their PV, and for the systems themselves. A reduction of the input space must be performed in order to achieve an efficient system that fulfils its purposes†. Error handling in unstructured environments is a necessity [Sutherland & Thompson, 1994] [Beckerman & Oblow, 1990].

* In a radio receiver, there is a lower threshold above which signals can be detected. This still does not guarantee that they are recognized as special symbols; that is the classifier's task.
† The visual cortex in the brain performs an astonishing compression of visual data, by abstracting lines and corners. These line and corner detections then feed the more advanced areas of the brain, such as the hippocampus, believed to be the area where a living organism's spatial mapping takes place, which will be detailed later.

CHAPTER II

“We have always copied Nature, thereby obtaining civilization; we should instead learn from her, so as to achieve our own nature”

- Frank J. Martin

II - Biological processes

Beyond research in the field of artificial AA, aimed at obtaining better machines capable of imitating Human intelligent-navigation abilities, there is also an enormous field of research that tries to understand and imitate more biological processes of map building and PV recognition, through plausible models. Plausibility means that the model produces behaviours in some respect similar to those observed in experimental data. Here the topics range from low-level behaviours, such as the Electroencephalogram (EEG) in certain areas and neurons of the brain, to high-level mechanisms related to Human planning and thought.

Whatever discoveries are made can then be tested on the artificial architectures, in the hope of achieving better machine performance. Indeed, biological performance is hoped for: reliability, robustness and efficiency. Only once the solutions found in Nature are understood can simplifications and transfers be made, owing to the limitations of current technology. This is exactly the current trend, as more and more data from experimental research become available.

Of course, biological solutions or tendencies are not as widely used in areas where algorithmic solutions must be implemented for a set of deterministic applications. In assembly lines, digital computers, calculators, classical control systems and the like, core qualities such as precision, repeatability and predictability are demanded. It is curious to note that attempts are currently being made to merge these two realities, to obtain the best of each. Examples of this are neural control systems and fuzzy-logic systems.

VI-1. Introduction
Biological organisms adapt the way they behave with the aim of maximizing their survival rate. The main mechanisms are genetically imposed (such as the formation of the brain), and this brain can then further refine such behaviours.

Temporal processes and mechanisms are of fundamental importance to animals. The brain is flooded with sequence-based temporal operations. Relational processes are tied to this temporal sequencing in the brain, since the brain discovers that things are related through their temporal contiguity.

Orientation abilities are also commonplace, being present in virtually all complex organisms such as mammals and birds. The word orientation originates in the Latin verb oriri, which means “to arise from” or “to originate in”. The word oriens, originally used for the daily rising of the sun, came to denote the direction of that rising - the East - and finally the


countries lying at that cardinal point, the “East” or “Orient”. Every behaviour is in some way directed. Whenever the animal walks, washes itself, hunts, or interacts with a social partner, the “where” and the “in which direction” are indispensable features of its behaviour pattern. Orientation can therefore be defined as the process organisms use to organize their behaviour with respect to spatial features [Schöne, 1984]. In other words, orientation is not limited to discovering where the animal is; it also covers the behaviour that follows from that discovery.

VI-2. Temporal and relational processes
As stated earlier, the brain processes streams of temporally related information, linking, synchronizing and relating temporal events. There is ample evidence that such temporal processing really exists:

• Typing - relations between movements that press keys in sequence to produce words. If one key were displaced, deterministic and persistent errors would occur, since the movements are relative. It is very difficult to start or rewrite a word from the middle, since the initial synchronization is missing.

• Spatial memory - contiguous places are intimately related in mental processes. Contiguity means that they occurred in a temporal sequence. Places that are contiguous but never occurred in sequence are not directly related. Nevertheless, new, directed (non-arbitrary) shortcuts are readily performed by animals [Poucet, 1993], which indicates that they also possess mapping abilities.

• Music - listening to and playing music involves complex synchronization mechanisms, in that we are easily able to detect novelties or errors in sequences, playing without much conscious effort and singing without any difficulty. It is hard to start playing or singing from the middle of a song; it is always easier to start from the beginning.

• Speech and writing - speech processes are inherently temporal, since words and sentences are spoken in a well-determined way. Acoustic and spelling errors are easily detected. At first, it is difficult to pronounce or write long, rarely used words (e.g. “Otorhinolaryngologist”), unlike what happens with the more frequent ones. Altered versions of similar words are also very hard to learn, and often lead to saying or writing the better-known, related ones (e.g. the scrambled Portuguese “Cocaquinha bem frescola” is hard to pronounce because of the pre-existing tendency towards “Coca-Cola bem fresquinha”, “nice cold Coca-Cola”).

• Novelty / error detection - it is not only in music and speech that we are able to tell whether or not there are strange points in previously learned sequences.

• Mental processes - thinking in a kind of stream of passing images and ideas is a common process that happens every day. The emergence of new, related ideas out of the previous ones is also very common. Sometimes, if one happens to be diverted from something one was thinking about, one is left only with the feeling that such desired thoughts occurred. One says something like “What was I thinking about?” or “I wanted to say something but forgot it”. Rarely able to recall such past thoughts by simply waiting, one frequently resorts to earlier thoughts, such as places and stimuli, in the hope of reconstructing the sequence that gave rise to the desired thoughts. This usually works well.


Temporal processes govern almost all biological mechanisms in the brain, so they must never be neglected or underestimated in what they may do to solve the various tasks the animal faces.

VI-3. Map building and place finding
Most animals spend much of their time travelling from one place to another, moving into and out of habitats. These movements are an adaptive process absolutely necessary for the successful survival and reproduction of many species. It is therefore no wonder that the animals that survive best are those that “genetically chose” the best and most correct mechanisms for mapping and for navigating with that map. More efficient and reliable solutions afforded a higher probability of survival and, for that very reason, a higher reproduction rate.

The basic elements needed for map building and subsequent navigation are exploration*, memory, repetition and learning. At a lower level, there are the innate, involuntary mechanisms, such as dead-reckoning†, PV detection and recognition, smell trails, touch, sonar in bats, and any other thermal, mechanical, chemical, magnetic or electrical cues [Waterman, 1989]. These processes depend much more on the senses. There are also cues used to obtain absolute knowledge of directionality‡. The processes involved are called internal compass and direction finding [Waterman, 1989], and use cues such as vision (fixed or moving PV), the sun (birds), the stars, the moon, a sense of time (birds), chemical gradients, chemoreceptive sensors (flies, ant pheromones), light gradients (fish, insects), sky/light polarization (ants), planetary magnetism, touch (rodent whiskers in tunnels, a blind person), vibration sensing (sonar in bats), and thermoreceptive, electrical (fish) and magnetic sensors (birds, bees, fish, with some evidence in Humans).

Human canoe navigators of the Pacific, in the recent past, may hold 75 to 100 vectors emanating from their home atoll, forming a polar spatial map (a tower with radiating vectors), with the directions and distances to various PV and places [Waterman, 1989]. Each place may itself have another local polar map, forming a global map of densely interconnected places. The scale is whatever best fits the needs (small for rats, large for birds and for Human navigators at sea). The problem lies in discovering whether the animal's map is Cartesian or polar. The former could provide trajectories between any two places, whereas with the latter the animal could get lost unless each PV had vectors to all the others. Most experimenters report a polar map rather than a Cartesian grid-like map.
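The polar map just described can be sketched in a few lines; the place names, bearings and distances below are hypothetical, for illustration only. The sketch also makes the stated limitation concrete: a route between two non-home places is not stored directly and must be derived from the two home vectors.

```python
import math

# A polar map: each known place stored as a (bearing, distance)
# vector emanating from "home".  Hypothetical example values.
polar_map = {
    "reef":   (math.radians(40.0), 12.0),    # bearing (rad), distance (km)
    "island": (math.radians(250.0), 30.0),
}

def to_xy(bearing, dist):
    """Convert a polar vector from home into Cartesian coordinates."""
    return (dist * math.cos(bearing), dist * math.sin(bearing))

def route_between(a, b):
    """Bearing and distance from place `a` to place `b`.

    In a purely polar map this route is not stored; it must be derived
    from the two home vectors -- exactly the computation without which
    the animal could get lost between non-home places.
    """
    ax, ay = to_xy(*polar_map[a])
    bx, by = to_xy(*polar_map[b])
    dx, dy = bx - ax, by - ay
    return math.atan2(dy, dx), math.hypot(dx, dy)

bearing, dist = route_between("reef", "island")
```

A Cartesian grid map would store coordinates directly and make this derivation trivial for any pair of places; the polar form instead privileges the home-centred vectors that most experimenters report.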

Place finding [Waterman, 1989] with instruments and data tables can uniquely define a place on Earth, through its longitude and latitude coordinates. Animals are able to recognize a PV and relate it to home. Smells, sights, dead-reckoning and inertial navigation are also possible.

* Exploration means that the animal does not know what to do or where to go. It simply wanders, hoping to find something, rather than searching for it. Conversely, as soon as the environment becomes known, the animal knows what to do and where to go, in a directed search.
† This term originates in its use to denote the ability to estimate one's current position by means of a magnetic compass and a logbook, on a ship. In this logbook, every direction and distance travelled was noted, so that the current position could be approximately computed relative to a starting point. In biological organisms, the word refers to something like a “feeling” of being at a given place in the spatial domain. It relates to the organism's ability to estimate its current position, or vector distance relative to a place, using only an internal compass or a path-integration mechanism. It does not refer to the direct use of global positioning systems such as GPS (Satellite Global Positioning System), place or PV recognition, or any other mechanism that provides absolute knowledge of one's location. Technically, it means that an AA is able to estimate its position relative to a reference, for use in local mapping computations.
‡ When something is said to be absolute in spatial terms, this generally means that it is absolute, or nearly so, within the organism's limited space of existence, i.e. within its limited environment. For example, one may use the sun to navigate within a country, but one could never use it to navigate in space, since it is not fixed. Thus, outside that limited space it may not be usable for long-distance navigation tasks, so other kinds of information must be taken, or complex computations must be performed.
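The path-integration reading of dead-reckoning described in the footnote above can be sketched as a running vector sum. The headings and distances are hypothetical inputs, as if supplied by an internal compass and a step counter; the result is the single home vector that a desert ant appears to maintain.

```python
import math

def integrate_path(steps):
    """Path integration (dead-reckoning): accumulate relative moves.

    `steps` is a sequence of (heading, distance) pairs.  The return
    value is the home vector: the bearing and distance pointing
    straight back to the starting point, maintained incrementally
    rather than recalled from any stored map.
    """
    x = y = 0.0
    for heading, dist in steps:
        x += dist * math.cos(heading)   # accumulate eastward component
        y += dist * math.sin(heading)   # accumulate northward component
    home_bearing = math.atan2(-y, -x)   # direction back to the start
    home_distance = math.hypot(x, y)
    return home_bearing, home_distance

# A wandering path: 3 east, 4 north, 3 back west -> home is 4 due south
bearing, dist = integrate_path([(0.0, 3.0),
                                (math.pi / 2, 4.0),
                                (math.pi, 3.0)])
```

Because only the accumulated vector is kept, the mechanism is cheap but drifts: each step's small relative error is summed, which is consistent with the approximate (not exact) homing errors reported for ants.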


VI-3.1 Evidence of map building in organisms
With the assistance of a powerful mechanism for building and using maps, living organisms can effectively become more aware of the properties of the environment around them, beyond the immediate sensory information. This allows them to avoid previously dangerous or less comfortable locations, and to find food, shelter, storage and other interesting places. When speaking of maps, what is meant is the whole mechanism that allows the animal to know where it is.

Fig. 1 - On the left, an experiment in which ants and rats find their way through a maze with one exit and one entrance, containing many dead ends. The rats perform better, probably owing to their larger brain size. The use of pheromones was not mentioned. On the right, a desert ant wanders for about 21 minutes over an area of roughly 120 m². The point (0,0) marks the departure point and the white (+) the return point. If the ant is disturbed along its way, it can return directly home, in a straight line and with a relatively small error. This ability must rest on some sophisticated dead-reckoning system, operating through vector integration of the path or by keeping the compass direction home. (adapted from [Waterman, 1989])

Fig. 2 - In a study by {Tolman} (in [O’Keefe & Nadel, 1978]), after rats were trained in the left-hand maze, where there was food at place (G), they were released in the right-hand maze, where the former food arm had been sealed. The largest share of the rats (36%) chose arm (6) (the direct shortcut to the food). This may be evidence that rats rely heavily on their internal map. (adapted from [O’Keefe & Nadel, 1978])


Fig. 3 - Illustration of shortcut taking by dogs in a forest [Chapuis & Varlet, 1987]. On the left are the training paths, along which the dogs were led to the food locations (A) and (B) through the sequence SA-AS-SB-BS. When released from the starting place (S), they first headed for place (A) and then took shortcuts to place (B). 47% of the shortcuts deviated by less than 5°. The experiment clearly shows the animals' ability to integrate Topological Information (IT) and Metric Information (IM) in order to compute approximate trajectories. (adapted from [Poucet, 1993])

Fig. 4 - IT captures the spatial connectivity between places, without specifying distances or directions (top), whereas IM only gives distance and direction values, without specifying sources, destinations or connectivity (bottom). As an example, imagine the Paris sewer network: connectivity means being able to go from one pit to another, knowing that such a path exists, but without knowing which tunnel to enter or that tunnel's length. Both forms of spatial information are most probably stored in the Hippocampus (HC) of the animal's brain. While the former allows it to draw conclusions about the geometric spatial relations between places, the latter supplies the precise metric data. Note that the latter can do nothing on its own, since it knows neither references nor geometry.
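The complementarity of IT and IM described in the caption above can be sketched as a graph whose edges carry metric labels; the place names and distances are hypothetical. Connectivity alone only says that a path exists, and the metric values alone do not say which places they join; only together can a path be both validated and measured.

```python
# IT: which places connect to which (adjacency only, no distances).
# IM: a distance attached to each known link (no connectivity on its own).
# Hypothetical place names and values, for illustration only.

connectivity = {                       # Topological Information (IT)
    "nest": ["tree", "rock"],
    "tree": ["nest", "pond"],
    "rock": ["nest"],
    "pond": ["tree"],
}
metric = {                             # Metric Information (IM), metres
    ("nest", "tree"): 5.0,
    ("tree", "pond"): 7.0,
    ("nest", "rock"): 3.0,
}

def path_length(path):
    """Total metric length of a topologically valid path."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        if b not in connectivity[a]:
            raise ValueError(f"no link {a} -> {b}")      # IT veto
        total += metric.get((a, b), metric.get((b, a)))  # IM lookup
    return total

length = path_length(["nest", "tree", "pond"])   # 12.0
```

Deleting either dictionary cripples the agent in a different way: without `connectivity` it cannot tell real routes from impossible ones, and without `metric` it cannot compare or plan them, mirroring the division of labour attributed to the HC.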

There are many experiments with rats {Tolman} {Olton & Samuelson, 1976} {Olton, 1977} {Olton, Collison & Werz, 1977} (in [O’Keefe & Nadel, 1978]), pigeons, etc., that clearly demonstrate the mapping abilities of animals.

II-3.2 Particularities of biological map building

There are some important aspects and particularities of map building in animals which may help reveal the operations and rules of these brain mechanisms.

Fig. 5 - Y-shaped maze, showing that the animal spends far more time inspecting interesting spots (corners and the junction) than crossing the arms. One might think that polar vectors could supply the spatial information needed for the rat to quickly compute trajectories between places.


[Poucet, 1993] presents and discusses a summary of the knowledge accumulated by experimental neurophysiologists, as well as of spatial experiments on animals. He assumes that the HC performs a kind of mapping computation built around polar coordinates, since these would explain much of the experimental data. This representation would be an efficient and complete* way of storing and recalling spatial information.

A very interesting experiment [Morris, 1981] shows that no visual landmarks (PV) are needed at the desired goal location for the rat to compute precise trajectories towards it. Here, a rat was placed in a tank of opaque water surrounded by external PV. Once the rat had found a submerged platform by swimming at random, it was able to swim straight back to the platform from any starting point.

Further considerations and findings are made by {McClearly & Blaut, 1970} {Munn, 1950} {Trowbridge, 1913} {Moore, 1973} (in [O’Keefe & Nadel, 1978]), [Hebb, 1949] [Collet, 1987] [McNaughton, Leonard & Chen, 1989].

II-3.3 The hippocampus phenomenon

The mapping processes are believed to be intimately linked to a brain area called the Hippocampus. Neuroscientists have shown that it contains neurons responsible for signalling the animal's position in a given environment [O’Keefe & Dostrovsky, 1971] [O’Keefe & Nadel, 1978] [Muller, Kubie & Ranck, 1987] [Muller & Kubie, 1987] [McNaughton, 1988] [O’Keefe, 1989].

Fig. 6 - On the left, the rat's brain; on the right, a cut through the HC area, which resembles a sausage. Here lie the pyramidal neurons that fire in response to the animal's position in a given place and environment. (adapted from [O’Keefe & Nadel, 1978])

* By complete it is meant that any trajectory to a previously visited place could easily be computed through simple additions and rotations.


Fig. 7 - On the left, four different neural-activity maps of four different Place Neurons (NP), neurons that fire when the animal is in a certain region of the environment. Each NP responds in a certain Place Zone (ZP) of the environment [Muller, Kubie & Ranck, 1987], the area in which the corresponding NP fires, meaning that each NP encodes a small part of the environment. On the right, an example of how the different ZP cover an environment surrounded by PV. (adapted from [Burgess, Recce & O’Keefe, 1994])

It has been shown [Muller & Kubie, 1987] that appreciable changes in the geometric arrangement of the PV also cause changes in the ZP, which shift in location, size and shape. Meanwhile, features such as colour, texture and size do not seem to play an important role in NP activity [Muller & Kubie, 1987].

Experiments show that IT is most likely stored in the HC (relations between places), while the more detailed but incomplete IM is most likely stored in the parietal cortex [Poucet, 1993].

The factors that most seem to influence NP activity are the direction of movement and the potential trajectories [Eichenbaum, Wiener, Shapiro & Cohen, 1989] [McNaughton, Barnes & O’Keefe, 1983] [Muller & Kubie, 1987] [Wiener, Paul & Eichenbaum, 1989] [Foster, Castro & McNaughton, 1989].

Fig. 8 - The information resulting from movements between places is somehow stored in the connections between NP, so that the animal can recall trajectories even in the dark. Thus, while visual stimuli act as a start signal for learning, that information is not at all necessary for the animal to navigate correctly from a known starting place to another. This is most likely related to the potentiation of Hebbian synapses between the NP [Muller, Kubie, Bostock, Taube & Quirk, 1991]. When the animal traverses the places in the sequence (A-B-C), excitatory synapses are formed, making up the place-correlation mechanism [McNaughton, Leonard & Chen, 1989] [Foster, Castro & McNaughton, 1989].


There are some human studies [Levitt, Lawton, Chelberg & Nelson, 1987] [Levitt, Lawton, Chelberg & Koitzsch, 1988] [Levitt & Lawton, 1990], {Thompson, 1980} {Pick & Rieser, 1982} (in [Collett, Cartwright & Smith, 1986]), that try to devise models explaining some of the human mapping abilities. Examples are the QualNav and Tour models. In general, the geometric relations between the PV are used.

CHAPTER III

“Constant dedication to a single objective can often overcome ingenuity” - Cicero

III - Existing theories, models and implementations

The questions of navigation, motion planning, PV recognition and map building have generated much activity in this research field, where there are already countless attempts to imitate these biological abilities of animals. There is also a growing trend towards simulating structures similar to those of the HC and related areas.

This chapter is divided into three main research groups: direct modelling of the HC, other biological inspirations, and AI, covering only the most interesting papers.

III-1. Modelling the structures and functions of the HC

• [Burgess, O’Keefe & Recce, 1993] [Burgess, Recce & O’Keefe, 1994] postulate the existence of neurons that fire according to the rat's distance and direction relative to the desired goal, called goal neurons. On this basis they build a neural architecture with NP as suppliers of the spatial information. Hebbian synapses are formed between these and the following layers, so as to form a distributed global representation with very broad activities. These activities arise in the subicular neurons of the HC and make it possible to correlate any position of the rat with the goal position.

• [Redish, 1995] presents a model for navigation with the HC, called CRAWL. Local PV are represented as vectors holding the PV type, distance and global direction, together with dead-reckoning positional information. The ZP are thus formed by a combination of these data, allowing activity in the dark, as well as re-acquisition of the map should the autonomous agent (AA) get lost.

• [Poucet, 1993] presents a theoretical model that integrates local as well as relational information into a cognitive map. This information results from motor activities and includes IT and IM. It is assumed that the relations between places are encoded as polar vectors. Each place A, B, C and D has its own local reference rA, rB, rC and rD, with which the animal can compute where it wants to go. Links between distant places can be established through multiple local maps. The animal would know how to go from a to b by following the linked places. This does not rule out direct shortcuts. Another hypothesis is that a global IT network relating distant places could be formed.

• [Hetherington & Shapiro, 1993] proposed an Artificial Neural Network (RNA) with recurrent connections meant to simulate the recurrent lateral connections between the NP of the HC. By training these connections, the simulated rat was able to navigate in the dark, following the whole learned route to the goal. Backpropagation [Lippman, 1987] was used for training, which requires many training cycles and is, moreover, not biologically plausible. A curious fact is that positive connections corresponded to nearby places, while distant places gave rise to negative connections between the corresponding NP. These results are suggested by Hebbian and anti-Hebbian phenomena in the HC [Stanton & Sejnowski, 1989].

• [O’Keefe & Nadel, 1978] present a model of the HC using a type of wave called the theta rhythm (TR), which serves to synchronise the spatio-temporal information that arises as the rat moves between places. This cyclic rhythm excites successive groups of NP, so that they store the activity pattern of the sensory input lines in the same sequence as the corresponding places. This model may exist in two distinct zones of the HC: CA3 or CA1. It may eventually explain the rapid construction of the ZP in the rat.

• [Bachelder & Waxman, 1994] implemented an AA called MAVIN, which tries to recognise 3D PV arranged around it and to create ZP as in the HC. The use of an ART-type RNA [Carpenter, 1989] [Carpenter, Grossberg & Rosen, 1991] yields the ability to adapt to incremental changes in the environment without disrupting previously learned patterns. In a laboratory environment with four PV, MAVIN generated three distinct ZP.

• [Penna & Wu, 1993] present a modified version of the QualNav model [Levitt, Lawton, Chelberg & Nelson, 1987] [Levitt, Lawton, Chelberg & Koitzsch, 1988] [Levitt & Lawton, 1990], in that they make a different interpretation, distinguishing between a cognitive map and a physical map. While the former is the computational mechanism through which sensory data are pre-processed, the latter is the actual spatial map that forms from the former. It is only said that the physical map is of the Kohonen type [Kohonen, 1982, 1988]. Places were distinguished by the lines joining the PV (left). The cases of distinguishable and indistinguishable PV are analysed, the differences lying at the level of the cognitive map. This model does not require knowledge of a “global North”.

• [Zipser, 1985, 1986] considers the location and size of the PV to be the most important features determining NP activity in the HC. He runs some simulations showing how the ZP can be formed from that information. He also shows the dilation effect and other changes that the PV can cause in the ZP, which is very similar to what happens in the HC.

III-2. Biologically inspired models

• [Brooks, 1985b, 1986] proposes a new model called the subsumption architecture, in which there are several layers of control of the AA. Each layer may contain one or more behaviours that are triggered by certain external sensory stimuli. The AA is thus purely reactive, consisting of stimulus-reaction pairs arranged in layers. In essence, this is a behavioural decomposition of the AA's functions into interconnected layers, where behaviours in higher layers can inhibit those in lower layers, imposing their own will. There is therefore a fixed arbitration of behaviours. In this example, the light-following behaviour can be inhibited by the obstacle-avoidance behaviour, which imposes its own action. This architecture is seen as introducing a new AI paradigm for AA navigation, since it does not process symbols as in classical AI, yet neither does it have direct biological structures. It is merely inspired by animal behaviour, while still being programmed in the classical way. For that very reason, it is called the behavioural control paradigm*, which is based on the physical grounding hypothesis [Brooks, 1990]. This paradigm consists of decomposing the autonomous-control problem into simple, independent tasks, instead of interconnected, dependent functions. Brooks showed that many complex tasks could be performed reactively, i.e. by simply coupling the sensors to the actuators through simple transfer functions with few or no internal states (memory).

• [Payton, 1990] [Payton, Rosenblatt & Keirsey, 1990] present a very interesting architectural concept aimed at eliminating the problems that arise from rigid arbitration, as in the subsumption architecture [Brooks, 1985b, 1986]. Here, both the behaviours and the arbiter are distributed. In this way a kind of information fusion is achieved that weighs the opinions of each behavioural level, so as to exploit all the information each one provides. In the subsumption architecture [Brooks, 1985b, 1986], when one level was inhibited by another, it was completely ignored in the next decision. This is the so-called command-arbitration problem. Whereas in the subsumption architecture the decision would be arbitrary, here it is opportunistic. All this has to do with not abusing abstraction, which loses information already at the low levels, and this approach also goes by the name of fine-grained behaviours.

• [Mataric, 1989, 1990b, 1992] implemented an AA called TOTO, demonstrating that mapping and recognition tasks could be performed with the subsumption architecture [Brooks, 1985b, 1986]. Using sonar, TOTO followed walls and classified them as corridors, walls, confusing obstacles and corners. On both the left and the right, two mapping examples can be seen, where a network of nodes is built that store the PV type and distance. After the exploration phase, to reach a desired goal, the goal node spread activity through the neighbouring links until it reached the current node. The AA then chose the path from which the most activity came. In essence, the AA chose the physically shortest path, even if it corresponded to the longest topological distance.

• [Caselli, Doty, Harrison & Zanichelli] [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994] [Seed, 1994] present a radically different alternative for PV recognition and map building. It is based on circumnavigating each PV, storing the various corners and wall lengths encountered until a complete loop is made. On a recognition loop, those stored data were compared with the new ones, to decide whether it was a new PV or a known one. The global map was built from these PV data sets and the distances between them. The process is entirely similar to what happens when a blind person tries to recognise an object by touch. The PV necessarily had to be polygonal.

• [Euliano & Príncipe, 1995] implemented a Kohonen-type network [Kohonen, 1982, 1988], but with an extra temporal character. Instead of organising the input patterns entirely independently of their temporal sequence, the neurons of this RNA now take time into account, in that the winner also depends on the activity of its neighbours in time. This type of network proved very suitable for recognising previously learned pattern sequences. Recognition is identified by the growth of a cumulative wave-front. The “secret” of this RNA is that it clusters temporally contiguous patterns, instead of the former clustering by similarity.

• [Schöner, 1995] [Schöner & Dose, 1992] implement dynamical systems through simple differential equations, dominated by concepts such as attractors and repellers, as well as bifurcations and intelligent decisions. AA controlled by these mechanisms are able to avoid obstacles quite efficiently.

* From the English behavior-control paradigm.


III-3. Navigation through artificial intelligence

• [Asada, Fukui & Tsuji, 1990] present an implementation of a global map consisting of multiple interconnected local maps. A camera extracted segments from a laboratory environment and chose a reliable reference for the current local map. As soon as that reference was lost (it left the image), another local map was started and linked to the previous one through the distance to it. Here too, the PV had to be polygonal.

• [Dudek, Jenkin, Milios & Wilkes, 1991] propose that an environment can be explored by building trees containing no IM at all. This model only works for mazes and requires the use of a marker that the AA can drop and pick up. By inspecting the branches of the tree, the AA knew which zones had not yet been explored.

• [Zelinsky, 1992] presents a mapping algorithm based on quadtrees, for their greater efficiency compared with ordinary trees. Here, obstacles were represented by nodes of the required size. The trajectory to the goal was then determined through a distance transform, in which cells farther away held higher values. The AA simply followed the steepest transitions towards the goal.

• [Gonçalves, Ribeiro, Kulzer & Vaz, 1996] implemented an AA with limited internal-compass and dead-reckoning capabilities. These served only to rotate or translate a local map centred on the AA, which stored the positions of obstacles and hazards. The local map was a 5x5 array and the shaft-encoders had only 8 pulses per revolution. The whole system was implemented on a 68HC11 microcontroller [Motorola], using less than 10 Kbyte of RAM. The purpose of this AA was to give it the ability to know where it has already seen obstacles and where exits exist, drastically reducing the random behaviour of other memoryless AA.

• [Fennema, Hanson, Riseman, Beveridge & Kumar, 1990] propose an AA able to navigate a partially modelled, static environment through visual processing of the features of the modelled PV. There are no unmodelled obstacles. Before any action starts, a feature of a PV in the environment is selected, using knowledge stored in the database. The action is then performed incrementally: before each increment, the motion of that feature is predicted. The error of the actual motion is used to correct the AA's motion parameters. This amounts to servo-like control, but at the level of the actions taken by the AA, and is originally called “action-level” perceptual servoing.

• [Borenstein & Koren, 1991] [Guldner & Utkin, 1995] [Hwang & Ahuja, 1992] [Kim & Khosla, 1992] [Rimon & Koditschek, 1992] all present implementations based on potential vector fields or gradient fields. Here, the AA must follow the successive vectors.

CHAPTER IV

“Projects never have the size of our dreams” - Anonymous

IV - Project development

After all the foregoing expositions, the accumulated knowledge will finally be applied and adapted so as best to serve this AA project.

IV-1. Basic reactive system

This system basically consists of the mechanisms responsible for the AA's abilities to avoid obstacles and to follow walls (circumnavigating PV). Dynamics [Schöner, 1995] [Branco & Kulzer, 1995] were used to smooth the motor movements. For obstacle avoidance and wall following, the fine-grained behaviours technique was used [Payton, Rosenblatt & Keirsey, 1990] [Rossey, 1996].

IV-2. Internal compass and path integration

Fig. 9 - When the AA moves with wheel speeds (SL) and (SR) during a short time interval ΔT, the distances ΔL = SLΔT and ΔR = SRΔT are travelled. With this information, the compass angle can easily be computed as follows:

\alpha_{compass}(t) = \frac{1}{D}\int_0^t (dR - dL) \approx \frac{1}{D}\sum_t (\Delta R_t - \Delta L_t)    Eq. 1

where D is the distance between the two wheels.


Fig. 10 - To compute the path-integration vector (VT), all the infinitesimal vector displacements must be integrated. Since here the integral is a sum, as many segments as possible must be added.

The vector pointing from the starting position to the current position is given by:

\vec{V}_T = \int_t d\vec{V} = \left( \int_t dx \; ; \int_t dy \right) = \left( \int_t \cos\alpha \, dV \; ; \int_t \sin\alpha \, dV \right)    Eq. 2

Since the AA can only add non-infinitesimal displacements, this expression is approximated by the sum of all available displacements:

\vec{V}_T \approx \left( \sum_t \Delta V_t \cos\alpha_t \; ; \sum_t \Delta V_t \sin\alpha_t \right)    Eq. 3

Fig. 11 - For the AA to be able to return home, or to any other place, it needs a mechanism that always keeps the correct direction until the desired displacement vector has been fully travelled, which amounts to servo-like behaviour.

As the AA moves along, the vector still to be travelled is corrected as follows:

\vec{V}_2 = \vec{V}_1 - \vec{d}_1 \; ; \ \vec{V}_3 = \vec{V}_2 - \vec{d}_2 \; ; \ \vec{V}_4 = \vec{V}_3 - \vec{d}_3 \; ; \ldots    Eq. 4

IV-3. Landmark discrimination and recognition

PV discrimination and recognition will be implemented through basic AI algorithms and, essentially, RNA. The hardest part is the classification and recognition mechanism. Considerable attention was given to this module, since all subsequent higher behaviour levels will depend on it. The steps of the logical sequence of the AA's behaviours follow, together with the details of the internal workings of the implemented neural mechanism.

Fig. 12 - The detection part (left) is easily accomplished with sensors pointing forwards and sideways. The subsequent “docking” part (right) uses only two lateral sensors that detect the moment the AA becomes parallel to the PV's wall.

IV - Desenvolvimento do projecto

25

Fig. 13 - Circumnavigation is performed in the same way for polygonal and non-polygonal PV. The AA always tries to keep a constant lateral distance along the whole trajectory, thereby acquiring a “feel” for its shape. Conventional algorithms [Seed, 1994] [Caselli, Doty, Harrison & Zanichelli] [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994] fail completely when trying to recognise PV such as (B).

Fig. 14 - While the AA circumnavigates the PV, corresponding circular RNA are created whose synaptic connections store the local turning angles. For example, when turn (3) is made, a new neuron is added whose synaptic circuit will hold the value -90°. Here, PV (A) only differs from (B) from turn (6) onwards, so the AA is expected to discriminate between them only from that point on. Note that polygonal PV were used in the examples merely for simplicity.

Fig. 15 - Details of the activity propagation during the recognition phase. Here, as the AA circumnavigates the PV, all the RNA are fed with the angular values of the turns made. The one with the greatest similarity keeps a high activity peak, while the others lose activity whenever there is a larger discrepancy. For example, a discrepancy of 90° may cause attenuation by a factor of 2, while 180° may already cause a factor of 4.


Fig. 16 - Visualisation of the various wave-fronts that appear in RNA (A) when each of the PV is circumnavigated. With PV (A), all the wave-fronts corresponding to wrong places decay and only the main wave-front survives, the one corresponding to the correct sequence of places around the PV, so the PV is recognised as (A). PV (B), on the other hand, causes all the wave-fronts to decay, meaning that the PV was rejected as being (B). The AA may start recognition from any point of the PV, which merely causes a different unique wave-front to survive.

Fig. 17 - To solve the problem that the angular turns are not always made at exactly the same points during recognition, which would lead to intolerable discrepancies, a motor cortex was also implemented, whose mission is to let the AA take finer samples (over-sampling) and detect when the desired matches occur (top). Below are the permanent connections resulting from the learning phase, where each synaptic circuit is connected to a neural output of the motor cortex through Hebbian synapses. Note that the latter has tuning curves that discriminate between different angles.

Fig. 18 - This is the route executed by the AA when performing a circumnavigation. Docking at a certain point, it starts sampling a little further ahead and later stops near that last point, by dead-reckoning.

IV - Desenvolvimento do projecto

27

Note that there are no if-then-else structures here, this model having the following main features:

• The PV can have any shape, not just polygonal.

• Neurons are created uniformly as the AA circumnavigates the PV.

• The synaptic circuit of each created neuron holds angular information.

• The need for spatial compression / dilation is eliminated by the uniform sampling.

• The recognition mechanism is direct and only needs a Winner-Take-All (WTA) network to select the RNA with the highest activity.

• Angular deviations are well tolerated (this will be shown in simulations).

• Optimal decisions are inherent to this model, since the WTA network always selects the most active RNA, i.e. the one with the most similarities, even if that activity is low.

• The position along the circumnavigation is easily obtained from the position of the main wave-front.

• The structure and propagation mechanism are very similar to Markov chains* [Rabiner & Juang, 1993] as regards the “recall” process.

• Rotations and translations of the PV do not affect performance at all, just as in the rat [O’Keefe & Nadel, 1978] [Muller, Kubie & Ranck, 1987].

• Changes in the arrangement of the angular turns also change the RNA responses, just as in the rat [Muller & Kubie, 1987].

IV-4. Map building

The mapping mechanisms will depend strongly on the performance of the previous mechanisms. Here only the theoretical platform is given that allows the efficient construction of spatial maps of the environment, as well as the taking of shortcuts and detours.

IV-4.1 Basic aspects

The architecture chosen for building the map of the environment is one in which space is represented by a network of nodes, or PV. In other words, different PV will be interconnected through polar vectors radiating from each one. The basic ideas of [Poucet, 1993] will be employed here.

Fig. 19 - Example of a stored network of PV. As pointed out by [O’Keefe & Nadel, 1978], each pair of PV has a “bi-directional” polar link between them. Thus, (L2) has a switchable link to (L3), and (L3) has two such links, to (G) and (L4).

* In a Markov chain, the probability of a given observation sequence is obtained as the product of the probabilities of independent observations. In the end, depending on the overall similarity of the input sequence to the stored values, this probability will be higher or lower. The model presented here, however, has no “null transition” or “null probability” problems, because it was implemented differently. Moreover, training is local and simple.


Bi-directional links between each pair of PV are characterised by a displacement and by an angle relative to a local reference:

\vec{l}_i = \{ d_i \; ; \alpha_i \}    Eq. 5

Fig. 20 - PV (L1) has a polar displacement vector \vec{d}_i = \{ d_i \; ; \alpha_i \} directed at (L2), according to the local reference (R1). (L2) also has a polar vector to (L1) (not shown), but this time according to reference (R2). Note that \vec{d}_{12} \neq \vec{d}_{21} because of the different references used for each.

As will be verified further on, this structure will allow all the interesting operations observed in biology.

IV-4.2 Adding a new visual landmark

When a new PV is found, discriminated and stored, a new node is added to the current map. This node will hold a polar vector directed at the last PV the AA came from. The vector's magnitude will be the distance travelled and its phase will be chosen according to a local reference. The simplest choice is to take this reference as the position where the first neuron of the recognition RNA fires maximally (wave-front passing through it), at which point the compass will assume a null reference value.

IV-4.3 Map-based navigation

The present map mechanism will work in a way similar to that presented in [Mataric, 1989, 1990b, 1992], where activity is propagated from the desired goal node. Thus, when the AA “thinks” of the desired goal, the corresponding node becomes active and propagates activity until it reaches the current node. This triggers a process of following successive nodes up to the goal.

Fig. 21 - Propagation of activity from the goal node to the others along the various possible paths. The activity decays as a consequence of the greater or lesser confidence of each path (dangers, difficulties). Thus, for example on the first path, the confidence of 0.5 cuts the propagation to 50%, and so on. One can imagine this propagation process being biologically realised through local resonances plausibly occurring in the HC.


Fig. 22 - As an AA explores its environment, the links between PV become more or less “confident”, giving rise to preferred paths (thicker lines). The activity propagation will pick precisely the preferred path to the goal, taking all possibilities into account. As in [Mataric, 1989, 1990b, 1992], turning back on suddenly closed or dangerous paths is a by-product of this activity-propagation mechanism.

Fig. 23 - Taking as an example the displacement network of the map at top left, several situations are possible when taking paths to the goal. A “normal” path would be (S - L2 - L3 - G). The shortest path (shortcut) would be (S - G), as shown at bottom left. This kind of path requires some extra computation, which consists in also propagating the polar vectors, so as to obtain their “sum” at the current node. This sum directly represents the polar vector to the goal. If the link (L3-G) is obstructed, the AA simply ignores it and can then take the path (S - L2 - L4 - L5 - G).

Re-acquiring the current position on the map, after the AA has got “lost”, is done simply. As soon as the AA finds a known PV, it knows where it is and can resume any previously planned path.

CHAPTER V

“Dreaming is beautiful, but if it does not turn into reality it is good for nothing” - Anonymous

V - A real implementation

Following the above considerations, the AA was implemented. An already-built robotic platform called “THOMAS” [Snow, 1995] was used. Some improvements were made, such as the addition of an LCD, a keypad for external commands, a new ultra-fast program loader, and an RF transceiver.

A plataforma de programação utilizada foi o ICC11 v3.5 C compiler / assembler da ImageCraft [ImageCraft, 1996].

Fig. 24 - Top view of the platform and photograph of the AA with its various organs.


V-1. Landmark discrimination and recognition

A few preliminary simulations were run in MatLab© v4.2b [MathWorks, 1994] to verify the expectations regarding the implemented model.

Fig. 25 - The top-left image shows two overlapping VL, where the ANN learned for the solid-line VL is used. To its right are the corresponding deviation angles recorded while circumventing both. Just below are the activity progressions of the several wavefronts, where only the first VL is recognised. In the lower images, the wavefront propagation for each VL can be seen in more detail.

Fig. 26 - As would be expected for symmetric VL, as many main wavefronts appear as there are symmetries. Here the VL is square, so the AA cannot tell at which of the corners / sides it is. To discriminate better in this situation, some additional source of sensory data would have to be used.


Fig. 27 - Further cases with different VL, to which strong uniform noise was added, with the over-sampling mechanism already in use. Moreover, the AA starts halfway along the VL. Even in these situations there is enough discrimination to allow correct recognition.

V-2. Map construction

[Block diagram with the following blocks: shaft-encoders, compass, path integration, motor cortex, working memory, VL storage, map storage, IR sensors, visual-landmark behaviours, goal behaviours, motors, high-level planning.]

Cht. 1 - Block diagram of the overall structure that allows map construction in the AA. This structure was fully programmed on the 68HC11, but was never tested due to overly time-consuming problems in the lower-level mechanisms (VL circumvention and recognition). The "visual-landmark behaviours" refer to the search for and circumvention of VL, while the "goal behaviours" refer to the directed search for VL. The "working memory" refers to the temporary ANN generated while the AA tries to recognise a VL: if the VL is new, the ANN is stored in the "VL storage", and discarded otherwise. The "high-level planning" would be an entity giving a logical sequence to the actions the AA performs (for now controlled only by the designer).


V-3. Real tests, calibrations and results

                   Trials                                 Mean
X_internal     -58  -111   +38  -139   -88  -123   -81     -80 mm    X_internal = internal abscissa value
Y_internal     -90   -45   -54   -22   -45  -101   -65     -60 mm    Y_internal = internal ordinate value
D_internal     108   119    66   140    98   159   104     113 mm    D_internal = modulus of the internal distance
β_internal    -122  -157   -54  -170  -152  -140  -141    -133°      β_internal = internal angle
X_real         -90  -130   +10  -160   -20   -60  -140     -84 mm    X_real = externally measured abscissa
Y_real         -90   -40   -70   -30   -50   -90   -60     -61 mm    Y_real = externally measured ordinate

Tab. 1 - Results of experiments to verify the correct operation and precision of the path-integration mechanism, when the AA is ordered to drive a 360° circular loop and stop. The values are the measured errors, with the following mechanism constants: distance between wheels = 166 mm, distance travelled per shaft-encoder pulse = 1.4 mm. Note that the internal means come close to the external ones and differ from zero, so the AA does notice that it did not, in fact, stop at the starting point.
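The path-integration mechanism being calibrated here is classic differential-drive dead reckoning. A minimal sketch using the two constants quoted in the caption (the function and its input format are assumptions for illustration, not the actual 68HC11 code):

```python
import math

WHEEL_BASE_MM = 166.0   # distance between wheels, from the calibration above
MM_PER_PULSE = 1.4      # distance travelled per shaft-encoder pulse

def integrate_path(pulse_pairs):
    """Accumulate (x, y, heading) from successive (left, right)
    shaft-encoder pulse counts of a differential-drive robot."""
    x = y = theta = 0.0
    for left, right in pulse_pairs:
        dl = left * MM_PER_PULSE
        dr = right * MM_PER_PULSE
        d = (dl + dr) / 2.0                   # distance of the robot centre
        theta += (dr - dl) / WHEEL_BASE_MM    # heading change in radians
        x += d * math.cos(theta)
        y += d * math.sin(theta)
    return x, y, math.degrees(theta) % 360.0
```

Errors like those in the table accumulate because wheel slip and quantisation enter this sum at every step, which is why the turning performed matters more than the distance travelled.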

        Trials                                               Mean
Δα      -5   -7  -20   +1  +10   -8  +13  -16  -12  -12     -5.6°
D      341  122  209  172  153  128  421  171  472  128      232

Tab. 2 - Results of 10 attempts at circumventing a rectangular VL with a perimeter of 4 m. The maximum compass error is below 6%, while the maximum displacement error is below 12%. These results appear to be better than those in [Seed, 1994].

                                  Trials
Total distance (m)            3.8  3.0  3.6  2.0  2.1  2.7  4.4  5.0  5.7  3.3  4.0
Total turning Φ = Σᵢ αᵢ (°)   240  260  220  180  180  180  320  420  200  650  700
Return error Δd (cm)           25   19   27   20   12    6   21   50   30   60    9

Tab. 3 - When the AA leaves a "home" position and returns after some turning, final errors are observed. With no more than 250° of total turning, the return error stayed below 10%. These errors grow more with the turning performed than with the distance travelled. Since the AA should only travel directly between VL, the errors will be even smaller (under 4%).

Fig. 28 - After the calibrations above, a VL is finally stored and recognised. This first case is a VL with a rectangular (almost square) shape, shown in the figure above. With sides of 72 and 82 cm, it yields a trajectory of approximately 102 and 112 cm for the AA, respectively. The motor cortex was fixed at a resolution of 22.5°, and is thus only able to discriminate between the angles -90°, -67.5°, -45°, -22.5°, 0°, 22.5°, 45°, 67.5°, 90°. The distance between refined samples is 250 mm, slightly more than the AA's length. The refinement (number of samples between each angular storage) was fixed at 4.
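The motor-cortex quantisation described above can be illustrated with a short snap-to-bin function. This is a hypothetical sketch (the averaging refinement over 4 samples is omitted, and the function name is invented):

```python
RESOLUTION_DEG = 22.5   # motor-cortex resolution used in the experiment

def quantize_heading(delta_deg):
    """Snap a heading deviation to the nearest motor-cortex bin,
    clipped to the [-90°, +90°] range the cortex represents."""
    snapped = round(delta_deg / RESOLUTION_DEG) * RESOLUTION_DEG
    return max(-90.0, min(90.0, snapped))
```

With this coarse quantisation, a 90° corner taken in two steps can come out as 67.5° + 22.5° or 67.5° + 45°, which is exactly the angular-sum variability reported in Tab. 4 below.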


LAP 1    67.5 - 0 - 0 - 90 - 22.5 - 0 - 0 - 45 - 22.5 - 0 - 67.5 - (-22.5) - 0 - 22.5               Stopped normally
LAP 2    67.5 - 22.5 - 0 - 45 - 45 - 0 - 0 - 67.5 - 22.5 - 0 - 22.5 - 67.5 - 0                      Stopped early
LAP 3    67.5 - 0 - 0 - 67.5 - 0 - 0 - 45 - 45 - 22.5 - 22.5 - 45                                   Stopped early
LAP 4    90 - 0 - 0 - 45 - (-45) - 0 - 67.5 - 22.5 - 0 - 45 - (-22.5) - 0 - 0 - 90                  Did not stop
LAP 5    67.5 - 22.5 - 0 - 22.5 - 90 - 0 - 0 - 22.5 - 45 - 0 - 0 - 67.5 - 0 - 0 - 0 - 67.5          Did not stop
LAP 6    67.5 - 22.5 - 0 - 45 - 45 - 0 - 0 - 45 - (-22.5) - (-22.5) - 45 - (-22.5) - 0 - 0 - 45     Did not stop
LAP 7    90 - 22.5 - 0 - 0 - 67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 0 - 90 - 0 - 0 - 67.5 - 22.5          Did not stop
LAP 8    22.5 - 0 - 0 - 45 - (-22.5) - 0 - 67.5 - 22.5 - 0 - 22.5 - 67.5 - 0 - 0 - 22.5 - 67.5 - 0  Did not stop
LAP 9    67.5 - 22.5 - 0 - 45 - 22.5 - 0 - 0 - 67.5 - 0 - 0 - 67.5 - 22.5 - 0                       Stopped normally
LAP 10   67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 22.5 - 22.5 - 0 - 0 - 90 - 0 - 0                          Did not stop
LAP 11   67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 0 - 67.5 - 0 - 0 - 45 - 22.5 - 0 - 0 - 45                 Stopped normally
LAP 12   45 - 0 - 0 - 22.5 - 67.5 - 0 - 0 - 45 - 0 - 0 - 22.5 - 67.5 - 0 - 22.5                     Did not stop
LAP 13   90 - 0 - 0 - 45 - 45 - 0 - 22.5 - 45 - 0 - 0 - 45 - 45 - 0 - 0                             Stopped normally

Tab. 4 - Stored angular sequences for different test circumventions, including the stopping conditions. The AA often did not stop at all, due to large dead-reckoning errors (more than 200 mm over a total of 4 m). Angles are given clockwise. Corners sometimes showed an angular sum different from 90° (67.5° and 112.5°) due to the relatively coarse quantisation. Sides have angles of 0°. In general, a large diversity of angles and sequences is observed, which makes evident how difficult recognition would be with conventional methods.

Testing the recognition algorithm, the activities of the main wavefronts were found to lie between 0.49 and 0.90, which suggests that the discrimination power between VL will still be good. This is eventually shown in later experiments.
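The ring network with temporal activity propagation (section IV-4.4.2) that produces these wavefronts can be sketched as follows, with one node per stored sample of the circumvention sequence: a node whose stored angle matches the current input passes its ring predecessor's activity on intact, while mismatching nodes attenuate it, so only wavefronts that stay aligned with the stored sequence survive. This is a simplified illustration (exact-match scoring and a single decay constant are assumptions, not the thesis's exact update rule):

```python
def recognize(stored, observed, decay=0.5):
    """Return the strength of the best-surviving wavefront when an
    observed angle sequence is played into the ring network storing
    `stored`. Both are sequences of quantised motor-cortex angles."""
    n = len(stored)
    activity = [1.0] * n                 # every alignment starts possible
    for sample in observed:
        nxt = [0.0] * n
        for i in range(n):
            passed = activity[i - 1]     # activity from ring predecessor
            if stored[i] == sample:
                nxt[i] = passed          # match: wavefront advances intact
            else:
                nxt[i] = passed * decay  # mismatch: this front is cut down
        activity = nxt
    return max(activity)
```

A circumvention started anywhere along a known VL still yields one strong front (the starting offset just selects which node carries it), which is why the AA can begin halfway along the VL as in Fig. 27; for a square VL, four symmetric alignments survive equally, giving the four main fronts of Grf. 1.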

Grf. 1 - Experimental results for the stored sequence LAP 12 and one of the recognition sequences. The top-left plot shows the several wavefronts, of which 4 or 5 are most active. Note that four main fronts are expected for this almost-square VL, as explained in an earlier section. Below it, the 3D progression of the front activities is shown, where the existence of four most-active fronts is clearly visible. Interestingly, one pair is more active than the other, meaning the AA was actually able to discriminate between the two pairs of symmetric positions possible around the rectangular VL. In the plots on the left, from top to bottom and left to right, the wavefront activities were taken at refined samples 3, 5, 7, 10, 20 and 55. Sample 55 corresponds to the stopping point. At the start there are clearly still many fronts, which decay as the AA advances. After less than a fifth of the circumvention, the AA already shows two main wavefronts corresponding to its two possible positions on the VL. Towards the end, the discrimination becomes better. Note how the fronts of each pair have similar shapes; this shows that the symmetry of the VL is faithfully represented in the AA.


Fig. 29 - This is the second, non-symmetric VL. Without the chair, this VL becomes the same as the first. In this case, the recognition values lay above 0.53.

Grf. 2 - As before, these are the wavefronts and their corresponding progression in time. Unlike VL 1, here the AA can tell exactly where it is on the VL, since there is one clearly dominant wavefront. This is due to the VL not being symmetric.


Fig. 30 - Third VL configuration, this time with rounded zones. The recognition mechanism makes no special distinction between corners, curves and straight segments. In this case, too, one most-active wavefront appeared.

                        LA1/LB1  LA1/LB1  LA2/LB2  LA2/LB2  LA6/LB6  LA6/LB6  LA12/LB5  LA12/LB5
                          S1       S2       S1       S2       S1       S2       S1        S2
Visual landmark A (1)    1.51     1.65     0.98     1.13     0.90     1.23     1.61      1.99
Visual landmark B (2)    1.34     1.37     1.40     1.32     1.22     0.90     1.35      1.20

Tab. 5 - These results show the discrimination ratios for different combinations of stored sequences and new circumventions. For example, (LA12/LB5 - S1) means that the stored sequences (12) of VL (1) and (5) of VL (2) were used, tested against recognition sequence (1). The leftmost column indicates which VL was circumvented. For example, circumventing VL (1) yielded an activity 1.51 times higher in the ANN of VL (1) than in that of (2). A few "inversions" (ratios below 1.0) of what would be expected for correct recognition are observed, leading to an error rate of 18.8%.

Grf. 3 - On the left, a desired recognition-ratio progression: the ratio always rises without major jolts. On the right, an anomalous progression, where the ratio "inverts", probably due to a mismatch between the sequence lengths.

                            LA1/LC1  LA1/LC1  LB1/LC1  LB2/LC1  LA2/LC2  LA2/LC2  LB2/LC2  LB2/LC2
                              S1       S2       S1       S2       S1       S2       S1       S2
Visual landmark A(1)/B(2)    2.78     2.5      1.37     2.18     0.97     0.86     1.51     1.42
Visual landmark C (3)        1.05     1.33     1.3      1.25     1.4      1.77     1.3      1.39

Tab. 6 - These results concern combinations of the third VL with the two previous ones. Here the error rate drops to 12.5%, showing that this third VL is more easily discriminated from the other two. The ratio values also show larger cases than in the previous table.


V-3.1 Comments

All the work developed here on VL recognition was presented so as to expose the operating characteristics of this new neural mechanism. Much work remains to be done to make the mechanism more robust and reliable. To reach better discrimination results, it might be worth trying sensory constellations more complex than simple motor information. In other words, one should try to use more complete information about the sequence of positions around a VL, through additional data from other sensors. It would be interesting to implement something along the lines of [O'Keefe & Nadel, 1978], especially regarding the sensory cortex that feeds the recognition ANN.

CHAPTER VI

"Some books are to be tasted, others to be swallowed, and some few to be chewed and digested" - Francis Bacon

VI - Final words

VI-1. Comments

Usually, little importance is given to the temporal-sequencing aspects of neural mechanisms, even though they may be a decisive factor in implementing intelligent and robust systems. "Thinking temporally" could eventually shed more light on experimental data as well as on plausible neural systems. Most of the available experimental data only show static, local neural phenomena related to navigation aspects.

The use of memory and context information also still deserves greater attention, even though it appears to be absolutely indispensable for building truly robust and efficient AA.

Little importance is sometimes given to attempts at connecting initially uncorrelated theories. In this thesis, some work was done with the aim of reaching a global theory for the mapping mechanisms, in which many things fitted together relatively well.

The most important contributions of this work were the following:

• Recognised landmarks can have any shape - it works not only for polygonal VL, but also for round VL or VL with rounded zones.

• Tolerant recognition - the recognition mechanism is based on a distributed structure that adjusts itself automatically to deliver the best possible performance.

• The AA's position on the landmark can be extracted - as long as the VL is not symmetric about both Cartesian axes, extracting the AA's current position relative to it becomes trivial.

• Biological plausibility - care was taken to use neural mechanisms with some biological plausibility, by observing experimental data and relating the mechanisms to them.

• Flexible and trivial shortcuts and detours - the mapping mechanism produces normal paths, as well as shortcuts and detours, in a simple and direct way.

• Low-cost, low-power platform

Overall, I believe I have managed to show a neurally plausible way of implementing autonomous mapping and navigation mechanisms. I also believe I have presented some innovative ideas, such as the motor cortex and the way activity is propagated in the PN that correspond to PF on the VL.

The main driver of this work was the need to think of new neural solutions for the problems of mapping and map-based navigation. Previous research served only as a starting point, to see where other mechanisms failed and where they genuinely explained biological data. Throughout the implementation of the present AA, I tried to stay as faithful as possible to the experimental biological data, arriving at necessary solutions that, somewhat surprisingly, bore similarities to mechanisms found in animals.

VI-2. Lessons to be drawn from this thesis

• The bibliographic survey turned out to be the most important part of this work, since it provides a global and detailed view of existing work and results. Without it, a future implementation may prove useless, redundant, without contribution, or even wrong.

• New theories, explanations and successes are achieved through many hours of serious, sequential thinking. When trying to solve a theoretical problem, one has to work hard at it.

VI-3. Future work and improvements

• Recognition ANN - it would be desirable to merge all the ANN into a single global ANN. Regarding the stopping method, a different process was suggested in conversations with [Euliano & Príncipe, 1995], which uses the wavefronts themselves instead of dead-reckoning.

• ANN adaptation - it would be more robust to let the AA adapt its already-acquired ANN, so as to account for eventual changes in the VL shapes or to correct earlier errors.

• Sensor robustness and calibration - a sensing mechanism invariant to lighting conditions and to VL texture and colour, performing some form of automatic self-calibration, would be desirable.

• Motor linearisation - both motors should react identically to the commands given, with neither travelling more than the other.

• Map use - the mapping structures are implemented; only minor but important details are missing (VL exit point, compensation for VL size, etc.)

• Dedicated CPU - since the path-integration mechanism needs constant processing, it would be interesting to see whether a CPU dedicated to this task would improve the results.

• Use of genetic and reinforcement learning - to obtain an AA able to adapt to certain aspects of the environment on its own, it would be interesting to implement some learning at the sensor and map levels, and to see whether some of the calibration problems would thus be minimised.

• Use of vision and other sensors - it would be interesting to add further sensors such as sonar and vision, in an attempt to improve the performance of the recognition ANN. Using complex sensory constellations to train these networks could make them more discriminative, instead of relying only on somewhat confusing motor information.

• Possible application - the AA developed here, with the improvements mentioned above, could eventually help in understanding physiological phenomena in animals and Humans. Since it is still very much at an experimental level, real applications would require a more stable platform and more robust overall behaviour.

ANNEX

Autonomous agent navigation strategies with neural networks

i

Abbreviations

1D - 1 Dimension

2D - 2 Dimensions

3D - 3 Dimensions

AA - Autonomous Agent(s)

AI - Artificial Intelligence

ANN - Artificial Neural Network(s)

EEG - ElectroEncephaloGram

EEPROM - Electrically Erasable Programmable Read-Only Memory

HC - HippoCampus

IR - InfraRed

LCD - Liquid Crystal Display

MI - Metrical Information

PN - Positional Neuron(s)

PF - Place Field(s)

RAM - Random Access Memory

RF - Radio Frequency

TI - Topological Information

VL - Visual Landmark(s)

WTA - Winner-Take-All network or mechanism

Conventions

• [Author, 19xx] - published references.

• {Author, 19xx} - unpublished material or conversations referred in a paper.

• italic word / expression - technical word / expression used for the first time in a section, or expression that should be distinguished from the rest of the text.

• bold word - important detail in an explanation.


Contents

I. INTRODUCTION ...............................................................................................................................................................1

I-1 REASON AND MOTIVATION FOR AUTONOMOUS AGENTS....................................................................................................1

I-2 OBJECTIVES FOR THIS WORK ............................................................................................................................................2

I-3 AUTONOMOUS AGENT CHARACTERISTICS.........................................................................................................................3

I-4 GENERAL ISSUES ..............................................................................................................................................................4

I-4.1 Artificial intelligence in autonomous agents ...........................................................................................................4

I-4.2 Navigation and map use ..........................................................................................................................................6

I-4.3 Place and landmark recognition .............................................................................................................................7

II. BIOLOGICAL PROCESSES .........................................................................................................................................10

II-1 INTRODUCTION .............................................................................................................................................................10

II-2 TEMPORAL AND RELATIONAL PROCESSES .....................................................................................................................11

II-3 MAP CONSTRUCTION AND PLACE FINDING.....................................................................................................................12

II-3.1 Evidence of map construction in biological behavior..........................................................................................12

II-3.2 Particularities of biological map construction.....................................................................................................15

II-3.3 The phenomenon of the hippocampus ..................................................................................................................16

II-3.3.1 Special operations on the topologic and metric information............................................................................................ 23

II-3.3.2 Other areas associated to the hippocampus...................................................................................................................... 24

II-3.3.3 Final facts, questions and hypotheses............................................................................................................... 25

II-3.4 Human studies......................................................................................................................................................27

II-4 LEARNING AND SELECTION ...........................................................................................................................................28

II-4.1 Natural or genetic selection .................................................................................................................................28

II-4.2 Neural learning ....................................................................................................................................................29

II-4.3 Reinforcement learning ........................................................................................................................................30

III. EXISTING THEORIES, MODELS AND IMPLEMENTATIONS ...........................................................................33

III-1 HIPPOCAMPAL FUNCTION AND STRUCTURE MODELING ................................................................................................33

III-1.1 Use of place neurons for navigation ...................................................................................................................33

III-1.2 CRAWL - hippocampus function model ..............................................................................................................35

III-1.3 Model of a cognitive map....................................................................................................................................35

III-1.4 Simulation of recurrent connections in the hippocampus...................................................................................37

III-1.5 Hippocampal place learning model....................................................................................................................38

III-1.6 MAVIN - visual localization through the hippocampus......................................................................................40

III-1.7 Tour model..........................................................................................................................................................41

III-1.8 Modified QualNav model ....................................................................................................................................41

III-1.9 Other hippocampus-inspired models and implementations ................................................................................44

III-2 BIOLOGICALLY INSPIRED MODELS................................................................................................................................45


III-2.1 Subsumption architecture ...................................................................................................................................45

III-2.2 Plan-guided reaction ..........................................................................................................................................46

III-2.3 Subsumption architecture map construction.......................................................................................................49

III-2.4 Map construction by landmark circumvention ...................................................................................................50

III-2.5 Spatio-temporal self-organizing feature maps....................................................................................................51

III-2.6 Other models and implementations.....................................................................................................................53

III-3 NAVIGATION THROUGH ARTIFICIAL INTELLIGENCE ......................................................................................................53

III-3.1 Global map through related local maps .............................................................................................................54

III-3.2 Exploration as a graph construction ..................................................................................................................54

III-3.3 Terrain acquisition through continuous movement ............................................................................................55

III-3.4 Quad-trees for map construction ........................................................................................................................56

III-3.5 Moving array map centered on the robot ...........................................................................................................57

III-3.6 Other implementations and theories ...................................................................................................................57

III-4 OTHER CONSIDERATIONS.............................................................................................................................................60

IV. PROJECT DEVELOPMENT........................................................................................................................................63

IV-1 PRACTICAL IDEAS ABSTRACT.......................................................................................................................................63

IV-1.1 Ideal aspects and real limitations .......................................................................................................................63

IV-1.1.1 Low-level reactions ........................................................................................................................................................ 64

IV-1.1.2 Landmark and place recognition .................................................................................................................................... 64

IV-1.1.3 Map construction and usage........................................................................................................................................... 65

IV-1.1.4 General aspects............................................................................................................................................................... 66

IV-1.2 Implementation aspects.......................................................................................................................................66

IV-2 BASIC REACTION MECHANISM .....................................................................................................................................69

IV-2.1 Movement smoothness.........................................................................................................................................69

IV-2.2 Obstacle-avoidance.............................................................................................................................................69

IV-3 INTERNAL COMPASS AND PATH INTEGRATION..............................................................................................................72

IV-3.1 Shaft-encoder problems ......................................................................................................................................73

IV-3.2 Turn-to behavior .................................................................................................................................................73

IV-3.3 Go-to behavior ....................................................................................................................................................74

IV-4 LANDMARK DISCRIMINATION AND RECOGNITION ........................................................................................................75

IV-4.1 Finding behavior.................................................................................................................................................75

IV-4.2 Docking behavior................................................................................................................................................75

IV-4.3 Circumvention behavior......................................................................................................................................75

IV-4.4 Classification and recognition ............................................................................................................................76

IV-4.4.1 Absolute temporal Kohonen 1-D network...................................................................................................................... 76

IV-4.4.2 Ring network with temporal activity propagation .......................................................................................................... 77

IV-4.4.3 Averaging motor-cortex as a preprocessor for the 1D network...................................................................................... 81

IV-4.4.4 Similarities with the hippocampal phenomenon............................................................................................................. 85

IV-4.4.5 Mathematical analysis .................................................................................................................................................... 86

IV-4.4.5.1 Hidden Markov Model similarity ........................................................................................................................... 87

IV-4.4.5.2 Saturating activity................................................................................................................................................... 88

IV-4.4.5.3 Additive activity increase ....................................................................................................................................... 88

IV-4.5 Last remarks........................................................................................................................................................89

IV-5 MAP CONSTRUCTION ...................................................................................................................................................90


IV-5.1 Basics ..................................................................................................................................................................90

IV-5.2 Special operations of map construction and usage.............................................................................................90

IV-5.3 Adding a new landmark ......................................................................................................................................91

IV-5.4 Updating a path ..................................................................................................................................................92

IV-6 MAP-BASED NAVIGATION ............................................................................................................................................92

IV-6.1 Activity propagation............................................................................................................................................92

IV-6.2 Learning and normal path computation .............................................................................................................93

IV-6.3 Shortcuts and detours..........................................................................................................................................94

IV-6.4 Failure and reacquisition....................................................................................................................................96

IV-6.5 Expectancy addition............................................................................................................................................97

IV-6.6 Final aspects .......................................................................................................................................................98

IV-6.7 Comments..........................................................................................................................................................100

V. A REAL IMPLEMENTATION....................................................................................................................................102

V-1 THOMAS ...................................................................................................................................................................102

V-1.1 Initial platform ...................................................................................................................................................102

V-1.2 Preliminary improvements .................................................................................................................................103

V-2 IMPLEMENTATION AND SIMULATIONS .........................................................................................................................104

V-2.1 Programming platform considerations...............................................................................................................104

V-2.2 Landmark discrimination and recognition .........................................................................................................105

V-2.2.1 Simulations.................................................................................................................................................................... 105

V-2.2.2 Particular cases .............................................................................................................................................................. 109

V-2.2.3 Final experiments .......................................................................................................................................................... 110

V-2.3 Map construction................................................................................................................................................111

V-3 REAL TESTS, CALIBRATIONS AND RESULTS .................................................................................................................112

V-3.1 Basic reaction mechanism..................................................................................................................................112

V-3.2 Internal compass and path integration...............................................................................................................112

V-3.3 Landmark discrimination and recognition .........................................................................................................114

V-3.3.1 Almost square landmark shape ...................................................................................................................................... 114

V-3.3.2 Second landmark ........................................................................................................................................................... 117

V-3.3.3 Third landmark .............................................................................................................................................................. 119

V-3.3.4 Presence of both landmarks 1 & 2................................................................................................................................. 119

V-3.3.5 Presence of landmarks 1/3 and 2/3 ................................................................................................................................ 120

V-3.3.6 Last experiments............................................................................................................................................................ 121

V-3.3.7 Comments...................................................................................................................................................................... 122

VI. FINAL WORDS............................................................................................................................................................124

VI-1 COMMENTS ...............................................................................................................................................................124

VI-2 LESSONS TAKEN FROM THIS THESIS ...........................................................................................................................125

VI-3 FUTURE WORK AND IMPROVEMENTS .........................................................................................................................125

REFERENCES, BIBLIOGRAPHY, HARDWARE AND SOFTWARE

CHAPTER I

“It is from the beginning that everything is started” - Anonymous

I - Introduction

1

I-1. Reason and motivation for autonomous agents

There is a multitude of applications for the so-called “Autonomous Agents” (AA). Assuming that they intrinsically possess some sort of navigation capability, such as map construction or at least goal-achievement strategies (obstacle avoidance, light-following, etc.), some potential application fields immediately stand out:

• Extraterrestrial exploration - space probes, planet surface rovers [Gat, Desai, Ivlev, Loch & Miller, 1994].

• Sub-aquatic exploration - deep sea exploration and repairs [Herman & Albus, 1988] [Herman et al, 1988], autonomous submarines [Buholtz, 1996].

• Operation in toxic and hazardous environments - inspection robots in hazardous radiation areas, detection of failures and radiation leaks, cleanup of nuclear accident areas*, oil and gas fields, mines, war mine-field cleanup operations†.

• Vigilance and security systems - security robots for intruder detection and following [Branco & Kulzer, 1995].

• Rescue operations - mine floodings, nuclear breakdowns, aircraft crashes.

• Help in industry - autonomous and intelligent transportation robots, dangerous industrial operations and others [Arkin & Murphy, 1990].

• Explain physiological phenomena - besides the real applications above, there is also much work that “only” tries to understand certain physiological phenomena, especially at the level of the brain areas responsible for mapping and navigation. This knowledge can then eventually be applied in reality.

* This would have saved the lives of thousands of workers who had to clean up the Chernobyl nuclear power plant. Of course, current AA are not yet sophisticated enough for that type of specialized work, but they could be some day.
† This is already being attempted by a German who built a machine that roams mine-fields, blowing up the mines it detects. I have no further information on this, only that the scientific community is skeptical about the machine. How many lives and limbs could be saved if such a machine actually works?


Although autonomous agents are still the subject of early research, theirs seems to be a very promising field, where progress made out of small achievements will eventually lead to a greater achievement and help for Mankind. It is therefore highly motivating to try to realize something that contributes to this young field.

Note that this field of autonomous agents has little to do with the field of Classical Robotics [Fu, Gonzalez & Lee, 1987]. This latter field concerns the design of non-autonomous mechanisms, controlled by centralized systems, that perform repetitive and deterministic tasks, such as robotic arms and other mechanisms on an industrial assembly line.

I-2. Objectives for this work

The main objective of this work is to present the most important previous research made on this field of autonomous navigation strategies and related issues for AA. The presentation will be compact and commented, without excluding the most important and interesting aspects of each research topic that should be retained from that previous work. Abstracts of theories, models, simulations and practical implementations will be presented.

After having broadened the horizons on this steadily growing field of research, spanning both Artificial Intelligence (AI) and more biological approaches, and where there is still not much agreement about which mechanisms are best, a practical implementation will be made that emphasizes certain mechanisms and demonstrates their functionality. The implementation’s model will be inspired by the many existing ones. It tries to respect two major guidelines, simplicity and minimalism, without jeopardizing performance. Simplicity, in that the number of modules, systems and components is kept to the strictly necessary; minimalism, in that those modules, systems and components are the simplest ones. Again, a limit is imposed when a certain degree of performance is required. This performance degree will be exposed when the actual implementation is made.

The sequence of work in this thesis will be the following:

• Make an extensive bibliographic survey, to get acquainted with the past and current theoretical and experimental work on this field of autonomous navigation, focusing on mapping strategies. While doing this survey, focus on a particular topic of the field; this topic has been chosen to be the phenomenon of the Hippocampus*. At the end, make an abstract of the most important issues. Furthermore, while exposing other authors’ work, make the appropriate comments regarding the implementation done in this work.

• Sketch implementation ideas for a real AA. Here, all the previous research will serve both as a starting point and as an idea pool.

• Make the actual implementation by progressively building the various modules, finding out the problems and limitations of each. The AA should be able to compute shortcuts, detours and simpler trajectories between Visual Landmarks (VL).

• Test, in a test environment, whatever part of the implementation could be built in the available time. It is not absolutely necessary that the AA behave 100% accurately and robustly; the intention is only to demonstrate whether the implementation, as made, produces sufficiently accurate overall behavior. Hopefully, it will behave well enough to survive the complex task of navigating through landmarks.

* The Hippocampus is a structure that will be detailed later.


During this work, some major guidelines will be followed as closely as possible:

• Instead of strict AI with if-then-else structures, neural mechanisms will be used extensively. If-then-else structures will only be used where they simplify the job. All mechanisms should thus be biologically inspired and have biological plausibility, with the aim of demonstrating working mechanisms that may exist in animals and man.

• Always try to account for as much experimental data as possible with the developed models. Since some of that data could even turn out to be wrong, model features that are not confirmed by experiment nevertheless retain some plausibility and could even shed further light on the biological mechanisms.

• Try to interconnect and interrelate different experimental data and theories into one global theory of recognition and mapping mechanisms. Theories that initially seem unrelated could eventually be related within some global model. Each author studied separate problems, data and theories, so some “gluing together” must be attempted.

• Give preference to models and theories that may be implemented in parallel and distributed computer hardware. These have the potential of being much faster than centralized ones, and could provide further research data.

• Give preference to on-line learning mechanisms, instead of mechanisms where complete data must be present at once. In other words, the AA should be able to learn “as it goes”.

• Always try to make every value relative to another, so that accumulated global errors are avoided and only accumulated local (relative) errors, which are usually small, remain. This should be followed in issues like displacements, landmark recognition, and mapping.

• In general, try to be as complete as possible on each issue, but without occupying too much space. Here, completeness means as few ambiguities as possible. This takes up pages, of course. When reporting previous work, biologically plausible and interesting works will be discussed in some depth, while others are covered only superficially.

• Keep hardware cost as low as possible, using cheap sensors, processors, robotic platform, etc.
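The relative-measurement guideline above can be made concrete with a toy sketch (the displacement values are invented for illustration): if the map stores displacement vectors between consecutive landmarks instead of global coordinates, a measurement error in one leg stays confined to that leg, and only paths that cross it drift.

```python
# Hypothetical illustration of the "keep every value relative" guideline:
# the map is a list of (dx, dy) vectors from each landmark to the next,
# so an error on one leg does not corrupt the relations between others.

def relative_map(displacements):
    """Map = list of (dx, dy) vectors between consecutive landmarks."""
    return list(displacements)

def leg(rel_map, i, j):
    """Displacement from landmark i to landmark j, summing local legs."""
    dx = sum(d[0] for d in rel_map[i:j])
    dy = sum(d[1] for d in rel_map[i:j])
    return (dx, dy)

legs = relative_map([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
legs[0] = (1.5, 0.5)       # measurement error on the first leg only
print(leg(legs, 1, 3))     # relation between landmarks 1 and 3: untouched
print(leg(legs, 0, 3))     # only paths crossing the bad leg 0 drift
```

A global-coordinate map, by contrast, would propagate the same error into every stored position measured after the faulty leg.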

I-3. Autonomous agent characteristics

The AA will be based on a previously assembled platform called “THOMAS” [Snow, 1995]. This platform received several improvements, such as a Liquid Crystal Display (LCD), a keyboard, and a Radio-Frequency (RF) transceiver, among others. Furthermore, a whole operating system was developed for commanding the most basic capacities of the AA, such as motor and Infra-Red (IR) sensor activation, LCD writes, keyboard reads, real-time interrupts and routines, etc. THOMAS already had the sensor suite to avoid obstacles and circumvent landmarks on the floor, as well as shaft-encoders for wheel motion measurements.
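The shaft-encoders just mentioned support path integration (dead reckoning). A minimal sketch of that computation for a differential-drive base is given below; the wheel radius, axle length and ticks-per-revolution are illustrative values, not THOMAS’s actual parameters.

```python
from math import cos, sin, pi

# Sketch of path integration from wheel shaft-encoder counts for a
# differential-drive platform. All three constants are assumed values.
TICKS_PER_REV = 512
WHEEL_RADIUS = 0.03      # metres (assumed)
AXLE_LENGTH = 0.20       # metres between the two wheels (assumed)

def integrate(pose, left_ticks, right_ticks):
    """Update (x, y, heading) from one pair of encoder increments."""
    x, y, th = pose
    dl = 2 * pi * WHEEL_RADIUS * left_ticks / TICKS_PER_REV
    dr = 2 * pi * WHEEL_RADIUS * right_ticks / TICKS_PER_REV
    d = (dl + dr) / 2                # distance travelled by the centre
    th += (dr - dl) / AXLE_LENGTH    # heading change from wheel difference
    return (x + d * cos(th), y + d * sin(th), th)

pose = (0.0, 0.0, 0.0)
for _ in range(10):                  # ten equal straight-line steps
    pose = integrate(pose, 64, 64)
```

Note that each step only adds a small local (relative) displacement, in line with the relative-measurement guideline of section I-2; the global pose nevertheless accumulates those local errors, which is why landmark recognition is needed to reset the drift.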

The final desired characteristics for the AA emerging from this work will be:

• Basic reaction mechanism

• Landmark discrimination and recognition

• Map construction of the environment

• Map-based navigation with goal finding, shortcuts and detours.


I-4. General issues

Under the notion of navigation, one understands the capacity of an AA to make its way through the environment in which it exists. Behaviors like obstacle-avoidance and goal-finding are some of the most important navigation tasks. This work will concern two other important tasks: landmark recognition, and map construction and use. These last two tasks greatly enhance the AA’s chances of survival and of more successful and efficient task completion. In other words, the AA turns out to be more intelligent and robust.

I-4.1 Artificial intelligence in autonomous agents

It is known to be easy to implement an AA that performs some very simple task, like wall avoidance in small toy cars. A very simple reactive system allows the toy to have basic reactions. On the other hand, it is not easy to integrate many reactions or behaviors, as well as superior intelligence, in such an AA. The biggest difficulty lies in choosing the hardware and software structures for such complex mechanisms. It is very difficult to choose architectures and tweak parameters, even in highly constrained laboratory environments, let alone in natural outdoor environments.

When an AA is built, one cannot expect to get an entity capable of understanding things like “Go and get a beer out of the refrigerator for me, please”. For us it is simple and immediate, but one unexpected situation suffices to leave the AA completely confused or lost. We do not think about which angles to give our arm and hand joints, about illumination conditions, image segmentation or object recognition, nor do we think about errors or misses, because everything comes naturally. For an AA, however, this is an almost impossible task to execute robustly. If a robust AA is desired, then it must have robust behaviors starting from the bottom: the reaction level [Brooks, 1985]. In other words, classical Artificial Intelligence may not be the best way to do things. According to [Brooks, 1985], the belief that high-level symbolic programming will lead to Human-like intelligence rests on unfounded faith. While classical AI bases itself on a symbol system hypothesis, Brooks insists on a more physical grounding hypothesis. While the first decomposes Human intelligence into interdependent functional modules which generate the desired behavior (top-down architecture), the second composes several individual behavioral modules whose coexistence originates the emergence of more complex behaviors (bottom-up architecture).

Brooks’ new idea is based on the belief that “the best model of the world is the world itself, it contains all the necessary information, and it is always updated”, while traditional AI holds that there should be a strong central model reflecting the outside world in some representational terms. In other words, instead of making high-level models upon which the AA acts, let the AA act directly on the real physical world. While classical AI tries to demonstrate high levels of sophisticated reasoning under very specific laboratory conditions, with the hope of generalizing its robustness, Brooks’ new AI paradigm tries to implement robust AA in very noisy and complex low-level environments, with the hope of generalizing to more sophisticated tasks. Brooks even argued against the need for internal representations of the environment [Brooks, 1987] and showed complex AA [Brooks, 1989]. As said in [Connel, 1990], the complexity of a creature’s actions is not necessarily due to deep cognitive introspection, but rather to the complexity of the environment it lives in. In this environment-creature interrelation, funny but distressing situations may arise. For example, when a cat ducks while waiting for the best moment to jump and catch a prey, it waggles its tail because of the internal “jump / wait” conflict and scares the prey [Desmond, 1986].
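The bottom-up composition of behaviors can be sketched as a toy priority-based arbitration in the spirit of Brooks’ subsumption idea. The behavior names, sensor format and thresholds below are all invented for illustration; a real subsumption architecture wires behaviors as asynchronous networks rather than a simple priority list.

```python
# Toy behavior arbitration: each behavior maps sensor readings to a motor
# command or None, and higher-priority (lower-level) behaviors suppress
# the rest. Sensor keys and thresholds are illustrative assumptions.

def avoid(sensors):
    """Lowest level: veer away whenever an obstacle is close."""
    if sensors["ir_front"] > 0.8:
        return "turn_left"
    return None

def seek_light(sensors):
    """Higher level: steer towards the brighter side, if any light."""
    if max(sensors["light_left"], sensors["light_right"]) < 0.2:
        return None                     # too dark: no opinion
    if sensors["light_right"] > sensors["light_left"]:
        return "turn_right"
    return "turn_left"

def wander(sensors):
    """Default behavior when nothing else fires."""
    return "forward"

BEHAVIOURS = [avoid, seek_light, wander]   # priority order, highest first

def arbitrate(sensors):
    for behaviour in BEHAVIOURS:
        command = behaviour(sensors)
        if command is not None:
            return command                 # this layer subsumes the rest
```

For instance, an obstacle in front of a strong light still produces an avoidance turn, because the reactive layer wins; with no obstacle and no light, the agent simply wanders.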


When an AA is programmed, one must descend to its level of existence, i.e. make a mind meld* with the AA’s “mind”. Our brain has the innate ability of abstracting images† to a level where only ideas like “the book is on the table”, “I am away from the table” or “it is the book about biology” exist, without even troubling with the technical issues mentioned above. That is why we sometimes have great difficulty understanding the underlying low-level mechanisms present in an artificial AA. The most usual and incapacitating mistake made during AA construction and behavior evaluation is to forget that it does not “see” what we “see”. Sometimes, weird behaviors of the AA are perfectly correct in terms of program execution, but the observer cannot understand the reason for the “wrong” behavior.

Fig. 1 - [Branco & Kulzer, 1995] implemented a security AA which followed light. When it was confined in a room with a window through which the sun entered directly, the AA insisted on going to the opposing wall. Why didn’t it follow the window like we would? Simply because its sensor was at floor level, where the higher window was not visible. Only the reflection on the lower wall was visible, so the AA did in fact behave correctly according to its capabilities, if not to our expectations.

To fully understand what is going on with an AA, the observer must see nothing other than the information the AA really receives, and the actual programmed capabilities faced with that information. Hardware limitations as well as software brittleness have to be carefully studied. Essentially, the following points must not be forgotten:

• An AA only sees and operates on the sensory data provided by its limited sensor suite.

• An AA cannot perform tasks impossible to be performed with those particular sensors.

• An AA does not predict unforeseen situations, it just reacts according to the built-in behaviors.

Simplicity is another keyword that must be kept in mind when implementing an AA {Doty, 1995} (in [Kulzer, 1995]). If the mechanisms of an AA are kept simple, it is much easier to improve, debug and inspect them. In other words, one must not complicate systems that can be implemented in a simple way. A striking example here is the use of very simple, noisy and cheap IR sensors for obstacle-avoidance [Kulzer, 1995] [Seed, 1994] [Branco & Kulzer, 1995], instead of complex and processor-intensive stereo camera mechanisms [Braunegg, 1993] [Dhond & Aggarwal, 1989] [Tsuji, 1986] [Kriegman, Triendl & Binford, 1989]. Simplicity is also related to minimalist agents, where only the minimum number of mechanisms is used to achieve a desired behavior.
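Cheap IR sensors are noisy, but a short moving average followed by a threshold is often all the “signal processing” a minimalist agent needs. The window size, threshold and reading values below are illustrative assumptions, not the parameters of any of the cited implementations.

```python
from collections import deque

# Minimalist treatment of a noisy, cheap IR proximity sensor: average the
# last few raw readings and compare against a threshold. Window size and
# threshold are invented for illustration.

class SmoothedIR:
    def __init__(self, window=4, threshold=0.6):
        self.readings = deque(maxlen=window)
        self.threshold = threshold

    def update(self, raw):
        """Feed one raw reading; return True if an obstacle is flagged."""
        self.readings.append(raw)
        mean = sum(self.readings) / len(self.readings)
        return mean > self.threshold

ir = SmoothedIR()
# A single noise spike in the middle is rejected; a sustained high level
# is flagged. (The very first reading is averaged over a window of one,
# so a start-up spike still passes.)
flags = [ir.update(r) for r in (0.9, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9)]
```

The design choice here is exactly the simplicity argument of the text: a four-element buffer and one comparison replace any elaborate filtering, at the cost of a short detection delay.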

* In the TV series “Star Trek”, Mr. Spock was a Vulcan with the ability to enter and inspect people’s minds. This was called the Vulcan Mind Meld, and the notion is strongly emphasized by Prof. Keith L. Doty of the Machine Intelligence Laboratory, University of Florida.
† By images, one means everything sensory that converges in the brain, like vision, touch, hearing, smell and taste, as well as combinations of these. For example, when one thinks about one’s favorite dish, vision images, smell images and taste images flood the corresponding brain areas in resonance.


When implementing some mechanism that tries to mimic the behavior or results of a biological one present in animals, such as mapping, VL recognition, etc., three main problems may be at the origin of the relatively poor performance obtained:

• The mechanism may be implemented in a totally different way in the biological organism.

• The data that the mechanism receives may be totally different from the biological case.

• One can be misinterpreting the results or the implementation.

The whole head-breaking problem of research in this biologically inspired field consists of finding out which of the above-mentioned problems is the real limiting one.

I-4.2 Navigation and map use

In this work, autonomous navigation will be emphasized, where VL, basic navigation techniques and maps play the major roles. The system implemented here will only account for fixed objects. Examples of navigation among moving objects are given in [Griswold & Eem, 1990] [Fujimura & Samet, 1989].

The basic navigation system gives the AA basic abilities like obstacle-avoidance, wall-following, light-following, etc. It allows the AA to satisfy higher-level orders like VL finding. Some high-level planning coordinates those navigation actions in order to reach a desired goal or goals. This planning can arise from the use of an internal map that is also autonomously constructed by the AA.

Mapping is a feature that can enormously enhance the AA’s capacity and efficiency in achieving a desired goal. Taking again the security AA [Branco & Kulzer, 1995]: this AA is 100% able to survey rooms in a building and eventually chase intruders. The issue here is a total lack of efficiency, because it is only a reactive system. It does not know whether it has already entered a room, whether it has finished inspecting a room and should therefore exit, where it should go next, etc. If it loses an intruder (stimulus), then it forgets about chasing and returns to normal surveillance. Mapping mechanisms provide a memory of the past, where previous actions and perceptions can be stored for future use and manipulation. Note that map construction is perfectly dispensable when there is no need for superior spatial intelligence. In that kind of AA, the only intelligence needed is the ability to recognize unique VL or places via an external artificial beacon system, which allows the AA to know exactly where it is at all times. An example is a beacon-tracking AA [Leonard & Durrant-Whyte, 1991].

Some questions and issues that arise when performing map construction are the following [Penna & Wu, 1993]:

• Type of representation - how should a 2D or 3D map be constructed from sensory data, and how should that environmental sensory information be integrated to allow efficient storage and retrieval?

• Update - how should the map be updated in order to best adapt to environment changes?

• Dimension - maps can range from 2D (land surfaces) to 3D (air, undersea, space).

• Spatial limits - an environment can be limited (maze) or virtually unlimited (sea, space, land, air). An unlimited environment poses greater difficulties for efficient representation.

• Structure - the environment may be highly structured (laboratory, maze) or highly unstructured (outdoor nature). Unstructured environments pose great difficulties for recognition tasks.

• Stationary landmarks - VL can be stationary (rocks, mountains, walls) or non-stationary (sun, moon). Non-stationary VL present evident difficulties in recognition of places.


• Visible and distinguishable landmarks - VL can be visible or partially hidden, further being distinguishable or not. Demands are placed on pattern completion and discrimination techniques.

• Number of landmarks - this number can be known or not. Often it is unknown and VL are memorized as they are found on the way.

• Landmark accessibility - VL can be accessible or not accessible. In case of accessible VL, recognition methods based on its circumvention are possible [Doty, Caselli, Harrison & Zanichelli] [Caselli, Doty, Harrison, & Zanichelli]. If not, other methods like vision [Braunegg, 1993], sonar [Drumheller, 1987] [Elfes, 1987] [Barshan & Kuc, 1992] [Watanabe & Yoneyama, 1992] and laser ranging may be used.

• Global North - the presence of a global reference direction can greatly ease the mapping and recognition mechanisms, but it takes away some autonomy from the AA, which then depends on the accuracy and availability of that source of information. Local techniques are preferred.

• Distance and angles between landmarks - distance and angles may be measurable or not. If they are, the AA can use them as features to recognize VL and corresponding places.

• Continuous or discrete observations - many similar VL can be discriminated through continuous observation, contrary to discrete observation, where only snapshots are taken.

• Obstacle presence - how to tell obstacles and VL apart? What features distinguish them?

Position-finding in an AA has been influenced by “brute-force” AI techniques [Moravec, 1981] [Zhang & Faugerhaus, 1992] which construct a 3D metric representation of the environment, and has been limited by the computing power needed as well as by the inherent errors in stereo vision [Braunegg, 1993] [Dhond & Aggarwal, 1989] [Kriegman, Triendl & Binford, 1989] and sonar [Drumheller, 1987] [Elfes, 1987] [Watanabe & Yoneyama, 1992]. More recently, some topological-map techniques have appeared which learn geometric relations between distinct VL [Kuipers, 1978] [McDermott & Davis, 1984] [Levitt, Lawton, Chelberg, & Nelson, 1987] [Kuipers & Levitt, 1988] [Bachelder & Waxman, 1994] [Penna & Wu, 1993] [Wu & Penna, 1993].
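The topological alternative can be sketched very compactly: landmarks become nodes of a graph, traversed paths become weighted edges, and a standard shortest-path search yields shortcuts and detours. The landmark names and distances below are invented, and Dijkstra’s algorithm is used here only as a familiar stand-in for whatever path-computation mechanism a given topological model actually employs (this thesis’ own implementation uses activity propagation instead, as described in chapter IV).

```python
import heapq

# Toy topological map: landmarks are nodes, traversed paths are weighted
# edges, and shortest-path search yields shortcuts and detours.

def add_path(graph, a, b, cost):
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost    # paths traversable both ways

def shortest(graph, start, goal):
    """Dijkstra's algorithm over the landmark graph."""
    queue, seen = [(0.0, start, [start])], set()
    while queue:
        cost, node, route = heapq.heappop(queue)
        if node == goal:
            return cost, route
        if node in seen:
            continue
        seen.add(node)
        for nxt, step in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + step, nxt, route + [nxt]))
    return float("inf"), []

g = {}
add_path(g, "A", "B", 2.0)
add_path(g, "B", "C", 2.0)
add_path(g, "A", "C", 3.0)    # a newly discovered direct path
print(shortest(g, "A", "C"))  # the direct path beats going through B
```

Note how cheap this is compared to a 3D metric reconstruction: only a handful of inter-landmark relations need to be stored, which is precisely the appeal of the topological approach.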

I-4.3 Place and landmark recognition

One cannot forget two fundamental mechanisms that are indispensable to autonomous navigation with planning and spatial map-construction capabilities: place and landmark detection, and their eventual unique recognition. Landmarks will be further referred to as VL, since visual landmarks are the most common in research. A VL can be viewed as any feature in the environment that can be distinguished and eventually recognized, based on sensory information of any kind (vision, smell, touch, etc.). The main problems that arise with VL are related to their distinguishability and observability, as mentioned in [Penna & Wu, 1993].

Fig. 2 - The left picture displays three cases where invariance in VL recognition would be highly desired, always along the line of sight. On the right, an aspect sphere [Bachelder & Waxman, 1994] is shown, where a VL is mapped as a circular sequence of aspects. These aspects vary smoothly around the VL. (adapted from [Bachelder & Waxman, 1994])


First, it is necessary for the detection system to separate the VL from the background, making it observable. This background could be noise or other objects in an image (for vision), a mixture of smells (for a smell detector), or any other kind of confusing data that ‘hides’ the VL; it relates to the discrimination power of the detection system*. This already raises problems of image segmentation and feature detection. Second, the recognition or classifier system must be able to ‘label’ the detected VL, so the AA has some information about where it is in the environment, enabling correct navigation through a map. In 2D or 3D environments this recognition task gets more difficult, as there are many more variations and degrees of freedom, both in the environments and their VL and in the system itself. Reduction of the sensory input space must be performed to reach an efficient system that suits the desired purposes†. Dealing with measurement errors is a necessity in unstructured environments [Sutherland & Thompson, 1994] [Beckerman & Oblow, 1990].
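The ‘labelling’ step just described can be illustrated with a deliberately simple nearest-prototype classifier: each known VL is a prototype feature vector, a detected feature vector receives the label of the nearest prototype, and anything beyond a rejection radius stays “unknown”. This is only an illustrative stand-in for the neural classifier developed in later chapters; the feature vectors and the radius are invented.

```python
# Toy nearest-prototype VL classifier. Each known landmark is a prototype
# feature vector; a detection is labelled by the closest prototype, or
# rejected as "unknown" beyond a rejection radius. Values are invented.

def classify(features, prototypes, reject=1.0):
    best_label, best_dist = "unknown", reject
    for label, proto in prototypes.items():
        dist = sum((f - p) ** 2 for f, p in zip(features, proto)) ** 0.5
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

known = {"L1": (1.0, 0.0), "L2": (0.0, 1.0)}
print(classify((0.9, 0.1), known))   # close to prototype L1
print(classify((5.0, 5.0), known))   # far from everything: rejected
```

The rejection radius embodies the detection/recognition distinction made above: a signal may be detected (non-zero features) and still not be recognized as any known symbol.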

* In a radio receiver, there is a lower boundary above which signals can be detected. That still does not guarantee that the receiver recognizes them as a special symbol; that is the classifier system’s task.
† The visual cortex in the brain performs a huge visual data compression by means of line and corner abstraction. Those line- and corner-detecting outputs then feed the more reasoning areas of the brain, like the hippocampus, which is thought to be the mapping area of an organism and will be further detailed.

CHAPTER II

“We have always copied Nature, obtaining the civilization; we should, however, learn with it to achieve our own Nature”

- Frank J. Martin

II - Biological processes

10

Beyond the research made on the field of artificial AA, aimed at building better machines able to imitate the Human capacities of intelligent navigation, there is also a huge field of research that tries to understand and mimic the more biological processes of map construction and landmark recognition through plausible models. Plausibility means that the model behaves similarly to the real biological experimental data in some aspect. Here, the topics range from low-level neuronal behavior, such as the Electroencephalogram (EEG) in certain areas and neurons of the brain, to the high-level mechanisms of Human reasoning and planning.

Any findings can then be tentatively included in the artificial architectures, in the hope of getting better performance out of the machines. In fact, one expects biological performance: reliability, robustness and efficiency. Only when one fully understands the solutions that evolved in Nature can simplifications and translations be made to fit the limitations of current technology. This is exactly the current trend, since more and more biological experimental data become available from research.

Of course, biological solutions or trends are not adopted as much in areas where the implementation of algorithmic solutions for a group of deterministic applications is necessary. In assembly lines, digital computers, calculators, classic control systems and others, different and essential qualities are required, like high precision, repeatability and predictability. It is curious to note that nowadays one is trying to fuse these two realities, to get the best out of each. Examples are neuronal control systems, fuzzy logic systems, etc.

VI-1. Introduction

Biological organisms adapt the way they behave to maximize their survival rate. The main mechanisms are genetically imposed (like the formation of the brain), and this brain can then further refine those behaviors.

Temporal processes and mechanisms are of fundamental importance for animals. The brain is overwhelmed by temporal operations based on sequencing. Relational processes are tied to this temporal sequencing, since the brain knows that many things are related simply by their temporal contiguity.

Orienting capabilities are also widespread and are present in virtually all complex organisms like mammals, insects and birds. The word orientation stems from the Latin verb oriri, which means “to arise from” or “to originate in”. The word oriens, which originally referred to the daily rising of the sun, came to be used for the direction of its rising - the East - and finally for the lands that lay at that compass point, “the East” or the “Orient”. Every behavior is oriented in some way. Whether the animal walks, grooms, catches


prey, or interacts with a social partner, “where” or “in which direction” is an indispensable feature of its behavior pattern. Thus, we can define orientation as the process that organisms use to organize their behavior with respect to spatial features [Schöne, 1984]. In other words, to orient is not only to find out where the animal is, but also to behave according to that finding.

VI-2. Temporal and relational processes

Keywords: sequences, continuity, relations.

As said above, the brain processes temporally related information streams. It links, synchronizes and relates temporal events. There is much evidence that temporal processing really exists:

• Typewriting - relations between movements for pressing keys in a sequence to produce words. If there were a relative displacement of one key, deterministic and persistent errors would occur, since the movements are relative. It is very difficult to start or re-write a word from its middle, since the initial synchronization is missing.

• Spatial memory - contiguous places are closely related in mental processes. By contiguous, it is meant that they occurred in a temporal sequence. Places that are adjacent but never occurred in sequence are not directly related. However, novel and directed (non-arbitrary) shortcuts are readily performed by animals, which indicates that they also have mapping strategies.

• Music - hearing or playing music involves complex synchronization mechanisms, in that we are capable of easily detecting sequencing novelties or errors, playing without much conscious effort, and singing without any difficulty. It is also difficult to start playing or singing at some middle point of a piece. It is always easier to start from the beginning and keep going.

• Speech and writing - speech processes are inherently temporal, since words and sentences are spoken in a well determined manner. Acoustic and written spelling errors are easily caught. It is initially difficult to say or write rarely used long words, as opposed to the most often used ones. Similar but altered words are also difficult to learn and often lead one to say or write the better known and related ones (Ex: “Absotively Posilutely” is difficult to learn because of the already existing built-in bias towards “Absolutely Positively”).

• Novelty / error detection - not only in music and speech, we are perfectly capable of determining whether or not there are strange spots in previously learned sequences.

• Mental processes - thinking as a stream of ideas and images that fly by is a common process that happens to us every day. The emergence of new ideas from, or related to, the last ones is also very common. Sometimes, if we veer away from something we were thinking of, we are only left with the sensation of the past occurrence of those desired thoughts. We say something like: “What was I thinking of?” or “I wanted to say something that I forgot”. Rarely able to recall those past thoughts by simply waiting, one often goes back to older thoughts, places or stimuli, in the hope that the sequence is performed again, reaching those thoughts. This usually works well.

Temporal processes govern almost all biological mechanisms in the brain; therefore, they should never be neglected or underestimated in what they can do to solve the various tasks an animal faces.


VI-3. Map construction and place finding

Keywords: landmarks, places, place finding, map construction, intelligent navigation.

The majority of animals spend much of their time traveling from one place to another, moving inside and between habitats. These movements are an adaptive process which is absolutely necessary for the survival and reproductive success of many species. So, it is no surprise to realize that only those animals that ‘genetically chose’ the best and correct mechanisms for mapping and for navigating with that map survive better. More efficient and reliable solutions provided higher survival and, thus, a higher chance of reproduction.

The basic elements needed for map construction and consequent navigation are exploration*, memory, repetition and learning. At a lower level, there are the more innate and involuntary mechanisms found in animals, such as dead-reckoning†, detection and recognition of VL, smell cues, touch, sonar in bats, and other thermal, mechanical, chemical, magnetic or electric cues [Waterman, 1989]. These processes depend much more directly upon the senses. There are some particular stimuli that are used by animals for knowledge of absolute directionality‡. The involved processes are called internal compass and direction finding [Waterman, 1989] and use cues like visual targets (fixed or moving), the sun (birds), the stars, the moon, time sense (birds), chemical gradients, chemoreceptive sensors (flies, ant pheromone tracks), luminosity gradients (fish, bugs), light / sky polarization (ants), planetary magnetism, touch (rodent whiskers in tunnels, blind persons), vibration sensing (sonar in bats), thermoreceptive sensors, electric fields (fish), and magnetic fields (birds, fish; there is some evidence in Humans).
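Dead-reckoning of the kind just mentioned amounts to path integration: each step contributes a vector given by a compass heading and a traveled distance, and their running sum estimates the displacement from the starting reference point. A minimal sketch, with illustrative headings and distances:

```python
import math

def dead_reckon(steps):
    """Integrate (heading_degrees, distance) steps into an (x, y)
    displacement estimate relative to the start; heading 0 = +x axis."""
    x = y = 0.0
    for heading, dist in steps:
        x += dist * math.cos(math.radians(heading))
        y += dist * math.sin(math.radians(heading))
    return x, y

# Walk 3 units east, then 4 units north: estimated displacement (3, 4),
# so the 'home vector' is simply the negation (-3, -4), 5 units away.
pos = dead_reckon([(0, 3), (90, 4)])
```

Errors accumulate with every integrated step, which is exactly why animals (and AAs) complement dead-reckoning with landmark recognition.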

Pacific human canoe navigators of the recent past may have had 75 to 100 vectors fanned out from the home atoll, forming a polar spatial map (hub with radiating vectors) with directions and distances to various landmarks and places [Waterman, 1989]. Each place can even have another local polar vector map, forming a whole global map of densely interconnected places. The scale is such as to suit their needs (small scale for rats, large scale for birds and human navigators at sea). The problem is to know whether an animal's map is a Cartesian or a polar map. The first could give the traveling route between any 2 sites, whereas with the second, once off the route, the animal is lost unless each landmark has a hub with vectors to all other places and landmarks (fixed relations). Most experimenters report a route-like (polar) map, instead of a mosaic/grid-like (Cartesian) map.
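Such a hub-with-radiating-vectors map can be sketched as a table of polar vectors stored at one hub; a route between two mapped sites then follows from vector subtraction, provided the hub holds vectors to both. The place names and coordinates below are illustrative assumptions:

```python
import math

# Hypothetical polar map: from the home atoll (the hub), each known place
# is stored as (bearing_degrees, distance), like the radiating vectors.
home_map = {"reef": (30.0, 12.0), "island": (120.0, 20.0)}

def to_xy(bearing, dist):
    return (dist * math.cos(math.radians(bearing)),
            dist * math.sin(math.radians(bearing)))

def route(frm, to, polar_map):
    """Bearing and distance from one mapped place to another,
    computed via the shared hub by vector subtraction."""
    ax, ay = to_xy(*polar_map[frm])
    bx, by = to_xy(*polar_map[to])
    dx, dy = bx - ax, by - ay
    return math.degrees(math.atan2(dy, dx)) % 360.0, math.hypot(dx, dy)

bearing, dist = route("reef", "island", home_map)
```

This also illustrates the fragility noted in the text: a place absent from the hub's table simply cannot be routed to, unlike in a full Cartesian map.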

Position finding [Waterman, 1989] with instruments and tables of data can uniquely define a place on Earth in longitude and latitude coordinates. Furthermore, animals can uniquely recognize a landmark and relate it to home. Smell, sights, dead-reckoning and inertial navigation are also possible.

VI-3.1 Evidence of map construction in biological behavior

With the assistance of a powerful map-construction and retrieval mechanism, organisms can effectively gain a larger awareness of the properties of the environment that surrounds them, beyond the direct sensory information, being able to efficiently reach goal locations. This also allows them to avoid

* By exploration, it is meant that the animal does not know what to do or where to go. It just wanders about in the hope of finding something, instead of searching for it. On the other hand, exploitation means exactly the opposite, where one knows what to do and where to go, in a directed search for something.

† Originally, this term referred to the ability to estimate one's position on a ship using a magnetic compass and the ship's log. In this log, all the traveled directions and distances are noted, so that the actual position can be approximately calculated relative to a starting reference point. In biological organisms, this word refers to something like a 'feeling' of being somewhere, in the spatial domain. It relates to the organism's capability of estimating its current position or displacement relative to some place, using only an internal compass or integrating mechanism. Generally, it does not refer to the direct use of an absolute sensing device like GPS (satellite Global Positioning System), landmark and place identification, or any other mechanism that provides absolute knowledge of where it is. More technically, this means that an AA with this capability can estimate its position relative to a reference, for local mapping computations.

‡ When one refers to something absolute in spatial terms, it generally means that it is absolute or almost absolute in the limited space-frame of the organism, that is, in its limited environment. For example, we can use the sun to navigate inside a country, but could never use it to navigate in space, since it does not stay fixed. Outside of that frame, it would not be suitable for a longer-range navigation task, where other kinds of information must be taken in or complex computations must be performed.


dangerous or previously unpleasant locations, and to retrieve previous food, shelter, storage and other interesting locations. By maps, we mean any mechanism that allows the animal to know where it is, based on an internal description of the environment, and what path it should take to achieve a goal. It is a complex “Where am I? Where to go? How to go?” mechanism. No external absolute references can be used in a map, or else it would reduce to a simple localization mechanism.

Fig. 3 - On the left, ants and rats find and learn their way through a maze with one outlet and many dead-ends; the rats perform better, probably because they have more brain resources. Nevertheless, the ant's final performance is comparable to that of the rat. As can be seen from the error curves on the right, the rat learns to go around the maze much faster. The use of pheromones by the animals was not mentioned. On the right, a desert ant meanders around for about 21 minutes over an area of about 120 m². The (0,0) coordinate point is the starting point and the white (+) sign is the ending point. If disturbed on course, the ant can home directly and accurately. This straight-line following mechanism is called piloting or servoing and works like a classical control system. This ability must depend on some dead-reckoning mechanism, either by vector integration along the way or by simply keeping track of the compass heading of the home location. Sky polarization allows the ant to return in a straight line. When it reaches the home range (a distance sense is also present), it starts looking for home. Piloting navigation works like a control system with constant feedback and error-correction, whereas non-piloting navigation relies on local data to activate feedforward-only navigation tasks, which receive error-correcting information only when the task is re-evaluated at a later point in space (a newly reached place, resulting from that task). The return trajectory can be affected by the local presence of VL, which demonstrates the importance given to visual information [Etienne, 1987] [Etienne, Teroni, Hurni & Portenier, 1990]. See also [Wehner & Räber, 1979]. (adapted from [Waterman, 1989])

Fig. 4 - In a {Tolman} study (in [O'Keefe & Nadel, 1978]), after training rats on the left maze, where food was available at location (G) (a light at (H) was also shown), they were allowed to choose freely between the arms of the right maze, where the previous food arm had been sealed off. The largest number of rats (36%) chose arm (6) (the direct short-cut to the previous food location) despite the misleading light stimulus at arm (5) (location H). 17% chose arm (5), which might indicate that some rats associated the light with food. Although this might show that rats heavily rely on map information to get to a goal, it is quite impossible to conclude anything definitive, since different mechanisms were mixed in this experiment. (adapted from [O'Keefe & Nadel, 1978])


{Olton & Samuelson, 1976} {Olton, 1977} {Olton, Collison & Werz, 1977} (in [O'Keefe & Nadel, 1978]) made an experiment where a rat had to choose between 8 or 17 arms radiating from a central platform. Food was placed in each arm at the beginning of each trial. The problem for the rat was to enter each arm while avoiding entering a previously visited, and therefore empty, arm. The rats learned this quickly and visited 7 or 8 arms before a mistake in the 8-arm maze, and 14 or more in the 17-arm maze. No obvious strategies, like entering the arm next to the one they just exited, smells or visual cues, could be used. They had to store some sort of spatial memory, which was long-lasting, since confining rats to the center platform for several minutes after they had entered 3 arms, or increasing the overall time for the choices, did not affect the accuracy. This memory can therefore be modified in a single trial. This clearly shows the efficiency of animal navigation strategies.

Fig. 5 - Illustration of the short-cuts taken by dogs in a forest [Chapuis & Varlet, 1987]. On the left, the training paths are shown, where the dogs were guided on the leash to food locations (A) and (B) with the sequence (SA-AS-SB-BS), where (S) is the starting point. On the right, the paths are shown for the dogs subsequently freed at (S). The dogs first went to (A), which was nearer, and then it is clear that they were able to compute a short-cut from there to (B). In 47% of the trials, the dogs chose a more or less direct shortcut (1) (less than 5° of angular error), while the rest chose a more deviated path, having to correct it afterwards. It is curious to note that more dogs chose path (2) than (3). It seems that the neuronal mechanisms involved make more undershoot than overshoot errors*. This experiment clearly shows the capacity of the animal to integrate Topological Information (TI) as well as Metric Information (MI), to be able to compute approximate trajectories. (adapted from [Poucet, 1993])

Fig. 6 - TI accounts for the spatial connectivity between places without specifying accurate displacements (top), whereas MI gives only distance and direction values without specifying sources or destinations (bottom). In the Hippocampus† (HC), probably some MI is also stored alongside TI, giving it the coarse ability to give relative directions like: “When you approach place (2) coming from place (1), you can move on to place (3) only, or go back to place (1)”, whereas MI alone can only give something like: “Place (3) lies 10 m away from place (2), at an angle of 90°”. While the first type of information lacks accuracy (while still being a complete representation), the second lacks the indispensable relational information (“turn 90° in relation to what?”, “are those places even connected?”). MI is like a pool of unconnected and unrelated vectors. Just imagine the sewers of Paris: TI or connectivity means being able to go from one manhole to another, knowing that the path exists, but not knowing which tunnel to take nor its length.

* Just think of what happens when you try to find your way around your home when you go to the bathroom at night, with the lights off: most often you bump against or touch the corners, instead of steering farther away from them.

† Brain area that will be presented later.
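The TI/MI distinction can be sketched as two data structures: TI as an adjacency list (which places connect, but not how far or in which direction) and MI as a bag of unattached vectors; only their combination supports a statement like "place 3 lies 10 m from place 2, at 90°". The place numbers and values below are illustrative:

```python
# Topological Information: pure connectivity between places.
TI = {1: [2], 2: [1, 3], 3: [2]}

# Metric Information: (distance, angle) vectors with no source
# or destination attached - a pool of unrelated vectors.
MI = [(10.0, 90.0), (5.0, 0.0)]

def can_reach(a, b, ti):
    """TI alone answers reachability, but says nothing about distances."""
    seen, frontier = {a}, [a]
    while frontier:
        p = frontier.pop()
        for q in ti.get(p, []):
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return b in seen

# Combining both: attach a metric vector to a topological edge.
combined = {(2, 3): (10.0, 90.0)}   # "place 3 lies 10 m from place 2, at 90 degrees"

ok = can_reach(1, 3, TI)            # a path 1-2-3 exists, length unknown
```

The `can_reach` search shows what TI buys on its own (the Paris-sewer "the path exists" knowledge), while `combined` shows the edge-annotated form a map needs to also answer "which tunnel, and how long".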


Unpublished observations by {Nadel, O'Keefe & Somerville, 1976} (in [O'Keefe & Nadel, 1978]) tell that, for rats trained to avoid shock by reaching a platform in one corner of a square arena inside a room (external cues only), when the square was rotated by 180 degrees, 11 of 12 chose the previous platform place instead of the new place. This clearly indicates that they were relying on previously stored map information, as place learning. Distal VL could not be used, since they keep approximately the same spatial configuration over the whole experimental square/environment and so cannot be used by the animal for local place localization. No local cues were used either, so the rats had to learn local displacement relations. Distal cues can only provide information about directions.

Amazing facts like homing in pigeons, the migration of birds over large distances, and blindly displaced animals being capable of returning home can be partially explained by the gradient-search mechanism where it is possible (magnetism, light), instead of by long-range maps. Young pigeons cannot home with magnets on their heads, so they also need magnetism besides the sun, whereas adult ones managed to learn to home when only the sun was available (they had to learn time relations).

VI-3.2 Particularities of biological map construction

There are several important aspects and particularities of map construction in animals, which may help to reveal the macroscopic as well as the microscopic operation rules of the mapping-related brain mechanisms.

Fig. 7 - Y-shaped labyrinth. This is a pictorial illustration of the fact that the animal spends most of its time inspecting complex and decision locations, such as the bifurcation and the corners of the edges, while running fast between those places. One could think that stored vectors in polar coordinates deliver the necessary connective information to rapidly calculate displacements between places, whereas more detailed local maps would store the more finely inspected VL information. This reflects the highly non-uniform nature of the exploration behavior of the animals, which is characterized by the spanning of large environments combined with the close inspection of interesting sites.

{McClearly & Blaut, 1970} (in [O'Keefe & Nadel, 1978]) note that a cognitive map is not a mental picture that models the environment; instead, it is an information structure from which map-like data or images can be reconstructed. So, animals do not have a topologically correct built-in map in the brain; instead, they have neuronal connections and structures that allow them to remember directions, paths and trajectories to get to some desired place.

{Munn, 1950} (in [O'Keefe & Nadel, 1978]) says that experiments on place-versus-response learning have, in general, confirmed Tolman's prediction that, in a heterogeneous environment, rats will learn to run to one place from two different directions more readily than they will learn to make the same turn (left or right) to different places. These turns belong to a class of behaviors called motor-schemas*. This means that the type of information that most probably initiates activity in PN is related to the position in the animal's cognitive map, and not to its behaviors.

* Motor-schemas are generally sequences of motor behaviors that are elicited by particular stimuli.


[Hebb, 1949] pointed out that distal cues remain essentially stationary in the animal's view compared to proximate ones, so that the former can be better used for direction estimation in the current environment; however, they cannot be used for precise place discrimination, since nearby places cannot be distinguished from one another by them. For this latter purpose, proximate cues must be used.

{Trowbridge, 1913} (in [O'Keefe & Nadel, 1978]) distinguishes between two types of map: the egocentric map, where civilized man uses an internal compass to find his way, and the domicentric map, where birds, beasts, young children and “primitive” man guide themselves according to a fixed reference point, usually home (stars, sun, etc.). O'Keefe & Nadel criticize {Lee, 1970} (in [O'Keefe & Nadel, 1978]) for stating that distances are non-commutative, that is, that a mental map does not encode a distance from A to B and from B to A simultaneously. How, then, would it be possible to store that information in the same structures that provide alternative paths to the goal and rapid path reversals? Non-commutativity would imply that action patterns are stored for each directed path traversal. All this seriously overloads the neuronal system and would raise difficulties in converting the concept to some neural reality. {Moore, 1973} (in [O'Keefe & Nadel, 1978]) supports O'Keefe & Nadel, stating that the reversibility of spatial operations (going from A to B and from B to A) has to do with map-like representations. This should be even more true for unfamiliar environments, where the animal can only rely on very sparse/incomplete new data. When the two opposite paths become more familiar, the animal could also store route information for each direction, which enhances the precision of each path direction. Basically, the flexibility of cognitive maps argues against the assumption that they encode the behaviors related to motion.

[Poucet, 1993] presents and comments on a summary of the knowledge accumulated by experimental neurophysiologists, as well as by experiments on animal spatial behavior. He even tries to develop a conceptual model that integrates most of this information. The basic assumption he uses is the fact that polar coordinates could explain much of the experimental data. He points out that the HC should be performing some kind of polar-coordinate map computations. This would give an efficient way to code and retrieve complete* spatial information.

An interesting experiment made by [Morris, 1981] emphasizes the fact that no VL are necessary at the goal location for the rat to compute a correct trajectory to it. Here, a rat was placed in a tank of milky water, surrounded by external VL. Having found an underwater hidden platform by exploratory swimming, the rat was subsequently able to swim directly to the platform starting from a random location.

In another experiment, [Collet, 1987] suggests that gerbils are able to plan their goal trajectory by simple observation of the starting VL constellation. Gerbils do not alter this trajectory until some difficult-to-specify instant (even if the VL constellation changes), at which point they re-plan their move. These observations are rather difficult to relate to the more consistent observations in rats.

[Rolls & Treves, 1993] make a theoretical exposition of the observed synaptic connections between the centers in the HC and related areas. They talk about the associative nature of the HC connections, especially those that backproject to the neocortex.

VI-3.3 The phenomenon of the hippocampus

The construction of maps and the development of related navigation strategies is simply amazing in some animals, like mammals and birds. Those processes are thought to be intimately related to a brain area called the Hippocampus (HC). Neuroscientists have shown that there exist neurons responsible for signaling the detection of the animal's position in a certain environment [O'Keefe & Dostrovsky, 1971] [O'Keefe & Nadel, 1978] [Muller, Kubie & Ranck, 1987] [Muller & Kubie, 1987] [McNaughton, 1988] [O'Keefe, 1989].

* By complete, it is meant that any trajectory to any mapped place can be computed from those polar coordinates, by simple vector additions and rotations.


Fig. 8 - On the left, a view of a rat's brain. The hippocampus is a small sausage-shaped portion deep inside the brain. On the right, we can see a magnification of that area, with its most important neural zones and some connections. These differently classified zones are distinguished by their different cell morphology. The hippocampus is mainly made up of layers of neurons. Excluding the CA4 area, the basic layout pattern is the same for all areas: large neurons (pyramidal CA3- and CA1-type cells) are packed together in one layer, and their dendrites all run off in the same direction for several hundred microns. Many of the inputs to each area traverse these dendrites at roughly right angles, making synapses en passage. So, these synaptic links exist for many neurons, some in each traversed layer. (adapted from [O'Keefe & Nadel, 1978])

Fig. 9 - Another look at the HC: longitudinal cut of the rat’s hippocampus showing the laminar structure of the different pathways and neural organization. DG-dentate gyrus; HF-hippocampal fissure; V-lateral ventricle. (adapted from [O’Keefe & Nadel, 1978] [Miller, 1991])


Fig. 10 - Examples of CA1 and CA3 pyramidal cells, and examples of granule and basket cells. These form highly interconnected paths, causing the emergence of feedback that may play a fundamental role in the most interesting and flexible spatial behaviors like navigating in the dark and finding shortcuts or detours. (adapted from [O’Keefe & Nadel, 1978])

More concretely, there are neurons in the HC called Place Neurons (PN). The electrical activity of these PN is directly related to the animal's movement in particular locations of the environment. In other words, there are so-called Place Fields (PF), small spatial zones of the environment that correspond to particular PN, and these PN fire when the animal is currently moving inside the corresponding well-defined PF. In yet other words, when the animal moves within a particular ground parcel (PF), the corresponding PN fire and the others do not. It is like a coarse surface-encoding mechanism, where each PN encodes the presence of the animal in a certain piece of ground surface (PF). The form of these PF tends to be such that the activity decays exponentially with the radial distance from the center peak [Muller, Kubie & Ranck, 1987].
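The PN/PF relation described above can be sketched as a simple rate model: each PN has a PF center, and its firing rate decays exponentially with radial distance from that center; the most active PN then signals which PF the animal currently occupies. The peak rate, decay constant and center positions below are illustrative assumptions, not values fitted to the cited data:

```python
import math

def pn_rate(pos, center, peak_rate=20.0, decay=0.5):
    """Firing rate of one place neuron: exponential decay of activity
    with radial distance from the place-field center."""
    d = math.hypot(pos[0] - center[0], pos[1] - center[1])
    return peak_rate * math.exp(-d / decay)

# Three overlapping PF coarsely covering a strip of the environment.
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

def most_active(pos):
    """Index of the PN firing hardest: a coarse 'where am I' signal."""
    rates = [pn_rate(pos, c) for c in centers]
    return rates.index(max(rates))

where = most_active((0.9, 0.1))   # nearest PF center is (1.0, 0.0)
```

Because the fields overlap, the full rate vector carries finer position information than the winning index alone, which is the coarse-coding idea behind covering an environment with a small number of PF.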

Fig. 11 - On the left, a new measurement technique is illustrated: with a four-point electrode called a tetrode [Recce & O'Keefe, 1989], four different PN of the HC can be measured simultaneously as the rat moves around. This opens the possibility of discovering correlations between different PN at the same time instant. On the right are four different activity maps for four different PN, taken with a tetrode. Each square map represents the environment space where the rat moves around, but for a different PN; that is, each map shows the activity of one PN at the various locations. As observed, each PN has a peak at a different location of the environment. Each PN's activity map has been low-pass filtered and has the rough shape of an ellipse with the peak activity at the center. Outside these ellipses of activity, there is no firing of the PN. This means that each PN represents a different piece of the total environment map, where its firing indicates that the rat is at a particular location (at the corresponding PF). (adapted from [Burgess, Recce & O'Keefe, 1994])


Fig. 12 - Here is an example of the distribution of PF around a maze-like environment with some landmarks (represented by the ellipses outside the maze). After a while, when the exploration phase has passed, different PN become active as the rat moves around the maze. This is how the rat codifies the environment, by 'placing' PF all over it. Whenever the rat moves through a particular PF, the corresponding PN fires, giving the rat the 'sensation' of where it currently is. These PF are sensitive to the identity and orientation of the landmarks. The PF partially overlap each other, and a small number is sufficient to code an entire environment.

It has been shown [Muller & Kubie, 1987] that a substantial change to the arrangement of the present VL also results in substantial changes to the shape and location of the PF. Sensory aspects like color, texture and size do not seem to greatly influence the activity of the PN [Muller & Kubie, 1987]. It seems that the qualitative perception of VL is the major visual factor, which leads to a large degree of invariance. Also, PN are very sensitive to the spatial arrangement of VL, because changing it also causes changes to the PF [Muller & Kubie, 1987]. On the other hand, PF exhibit relative invariance to the absolute position and distance of VL, in that they scale, rotate and translate with the spatial arrangement of those VL (even if only one VL is present in the environment) [Muller, Kubie & Ranck, 1987].

Fig. 13 - Here is an example of how a particular PF can rotate and translate to accompany the corresponding VL (represented by the ellipse). Also, when the limits of the environment are increased (bottom two pictures), the PF also grows, with everything else unchanged. These experiments clearly show that PN are very sensitive to the orientation and relative location of the VL. Almost all experiments lead to the thought that PN code spatial arrangements of VL. So, each PN fires at a certain location, related to the VL, and moves around as these VL move around. It has also been found that, when the whole environment expands proportionally, the PF also expand proportionally, with the possibility of new PF appearing. Also, a more radical change (e.g. of the spatial VL distribution) causes unpredictable changes in the PF.


Fig. 14 - Schematic illustration of the relations between the different types of information that can be obtained from VL. Two different pathways exist in the brain [Mishkin, Ungerleider & Macko, 1983], which extract two types of information: the “What?” and the “Where?” components. These are then somehow reintegrated in the HC [McNaughton, Leonard & Chen, 1989], to form PF. Note that motor information also makes an important contribution to the “Where?” component.

Fig. 15 - EEG from a single hippocampal site during various behaviors. Note the irregular activity when sitting still, chattering the teeth, moving the head, during sleep, and when the rat was awakened but did not move about; more regular activity when swimming, moving, jumping, and moving around. A highly irregular, large-amplitude pattern was observed when sleeping (possibly due to fast memory processing). (Adapted from [O'Keefe & Nadel, 1978]).

There are theta cells in the HC that exhibit a firing pattern similar to the theta-rhythm (TR) of the hippocampal EEG. This TR is best characterized as a sinusoidal wave with a frequency that can range from 4 to 12 Hz [O'Keefe & Nadel, 1978]. It appears in the EEG only when the rat is moving around the environment, but not when it is doing stationary things like grooming, eating and sitting still [Vanderwolf, 1969]. In relation to the firing patterns of the PN, it has further been discovered that there exists a systematic temporal correlation between the activity of the PN and the phase of the hippocampal EEG [O'Keefe & Recce, 1993]. The destruction of the brain area responsible for the generation of this TR “pacemaker” results in a severe deficit in the spatial problem-solving capabilities of the rat [Winson, 1978].

Rhythmical (regular) slow activity (RSA): walking, running straight, backing up, turning, rearing, jumping, climbing, struggling when held, swimming, head movements, getting up, lying down, digging.

Large- and small-amplitude irregular activity (LIA and SIA): immobility (standing still, hanging from a support), licking, chewing, chattering of the teeth, salivation, piloerection, defecation, face washing, licking and biting the fur, scratching with a hind foot, vocalization (squealing in the rat, barking in the dog), shivering, startled rat leaps to its feet but does not run, rat jumps out of the box and stops.

Tab. 1 - {Vanderwolf, 1969, 1971} (in [O’Keefe & Nadel, 1978]) reports this correlation between behaviors and the TR. RSA seems to be most correlated with navigation-like behaviors, while LIA and SIA seem to be related to non-navigational behaviors. It seems that the HC relies on this TR to carry out some internal neuronal synchronization that is indispensable for navigation tasks. (Adapted from [O’Keefe & Nadel, 1978]).


Fig. 16 - Typical firing pattern of PN relative to the TR, when a rat traverses a PF in a straight line. (A) shows the firing of one PN through time, whereas (B) shows those same firings in relation to the TR phase. (C) shows the TR progression at one pacemaker site. The vertical segments show the zero-crossing instants. When the rat enters a PF, the PN fires ‘late’ (up to 360°) relative to the TR phase. As the rat traverses the PF, this phase gets progressively ‘earlier’, until the rat exits the PF and the phase lag is near 0° relative to the TR. Since there are many PF, there are firings at all phases of the TR. (Adapted from [O’Keefe & Recce, 1993])
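The phase precession just described can be sketched with a toy linear model (the linearity is an assumption for illustration; the data only establish the qualitative ‘late to early’ progression):

```python
# Toy model: firing phase relative to the TR slides linearly from
# 360 deg at PF entry down to 0 deg at PF exit.

def firing_phase(pos, pf_entry, pf_exit):
    """Phase lag (degrees) of PN firing relative to the TR."""
    frac = (pos - pf_entry) / (pf_exit - pf_entry)   # 0 at entry, 1 at exit
    frac = min(max(frac, 0.0), 1.0)                  # clamp outside the PF
    return 360.0 * (1.0 - frac)

# Sample the phase at five positions along a straight traversal:
print([firing_phase(x, 0.0, 1.0) for x in (0.0, 0.25, 0.5, 0.75, 1.0)])
# -> [360.0, 270.0, 180.0, 90.0, 0.0]
```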

There are two general classes of neurons in the CA1 and CA3 fields of the HC formation: PN and displace neurons. PN have been termed complex-spike cells, since all of them show complex-spike patterns: a complex spike is a burst of several spikes within a brief period and with decreasing amplitudes. Displace neurons never fire that way; they always fire with a TR-like pattern and belong to the theta-cell group {Ranck, 1973} (in [O’Keefe & Nadel, 1978]). Displace neurons fire in correlation with some unexpected event, like the absence of food or the appearance of an unexpected object or landmark.

Theta cells (including displacement cells):
- Regular firing
- 8-147 spikes per second
- Clear phase relation to theta rhythm
- Located in the CA1, CA3 and Fascia dentata fields

Complex-spike cells (Place cells):
- Irregular firing in bursts
- 2-40 spikes per second
- Clear phase relation to theta rhythm
- Located in the CA1, CA3 and Fascia dentata fields

Tab. 2 - Both classes of neurons found in the CA1 and CA3 fields of the HC have different firing characteristics, as shown in this comparative table.

It has been reported {Brown, 1968} (in [O’Keefe & Nadel, 1978]) that cats have a TR of widely ranging and irregular frequency during exploration of a novel environment (3-6 Hz). Later, when the cat sits in one place making lots of eye movements, the TR is more marked and regular in frequency (4.3-4.7 Hz). This could mean that during exploration there are many different, quick and unstable resonances forming, while under more focused attention (one place) there is a focused resonance as well. This phenomenon can also be seen in the early stages of conditioning, when the novel stimulus is first presented. The same phenomenon has been pointed out by [Miller, 1991], where novelty led to a range of 3-7 Hz, focusing to 4-5 Hz. [Goebel, 1993] describes an oscillator ANN for vision.


                 New environment       Explored environment   Specialization
Mental state     confusion             confidence             automation
Adaptation       adapting              adapted                wired
TR presence      highly variable TR    focused TR             TR absent

Tab. 3 - Plausible characteristics of an animal’s physiological behavior when faced with these three progressive situations. When placed in a new environment, the TR is highly irregular in frequency, which suggests that the HC neural resonance mechanisms are still learning (many resonances competing) and that the animal is confused and lost. After a while, when the environment becomes familiar, the animal gets more confident and knows where it is, and the TR becomes focused to a narrow range of frequencies, suggesting that the neural resonances are now much more selective. After a longer period of time, when the environment has become banal, the animal already seems to move automatically along certain well-used paths, which is reflected in the absence of the TR (no resonances needed), turning the whole goal seeking into a wired feed-forward mechanism (in the sense that each PF elicits movements towards another one, without any need for “goal thinking”). Speculations arise on the possibility of this TR disappearing only for non-ambiguous tasks, but remaining for ambiguous ones where discrimination is still needed. Also, experiments indicate that rats are not able to perform spatial computations once the TR “pacemaker” has been surgically destroyed.

It has also been discovered that the positive half of the TR is responsible for modulating the long-term potentiation (LTP) of hippocampal synapses [Pavlides, Greenstein, Grudman & Winson, 1988], more precisely in the dentate gyrus area of the HC, where recurrent connections exist between neurons. Furthermore, the location of the PF is independent of the goal locations [Speakman & O’Keefe, 1990], which means that the PN are not excited by the presence of goals. This rules out the hypothesis that PN change their synapses in response to the presence of goals; they do so only in response to the VL that are present. Rats are able to navigate well after a short period of exploration [Tolman, 1948], which implies ultra-fast learning strategies that make coarse information readily available.

It is well known that the HC does not present an organized structure of its PN, in the sense of a topological correspondence to the outer environment space. In other words, there is no topologically correct map. This means that neighboring PN can correspond to PF widely apart in the environment, and that neighboring PF can correspond to PN that are far apart [O’Keefe & Speakman, 1987]. Although each PN usually has only one PF in a given environment [Kubie & Ranck, 1983], it can have more than one [McNaughton, Barnes & O’Keefe, 1983] [Eichenbaum, Wiener, Shapiro & Cohen, 1989], generally one per environment. The PF of different PN overlap, and one PN can have several PF in different environments [Kubie & Ranck, 1983] [Muller & Kubie, 1987]. These facts and others suggest that the PN in the HC are not feature detectors for unique places, but make up a distributed spatial representation in which each PN contributes to the encoding of one or more places, and groups of PN give a precise representation of a place.
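The idea of a distributed code in which a group of PN, rather than any single one, pinpoints a place can be sketched as a population read-out. The Gaussian PF shapes and the population-vector decoder below are illustrative assumptions, not the (unknown) biological mechanism:

```python
import numpy as np

# 50 PN with random, overlapping Gaussian PF in a 10 x 10 arena.
rng = np.random.default_rng(0)
centers = rng.uniform(0.0, 10.0, size=(50, 2))
sigma = 1.5

def activity(pos):
    """Activity of every PN for a given position (Gaussian PF)."""
    d2 = np.sum((centers - pos) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma ** 2))

def decode(act):
    """Population-vector read-out: activity-weighted mean of PF centres."""
    return act @ centers / act.sum()

pos = np.array([4.2, 6.7])
est = decode(activity(pos))
print(np.linalg.norm(est - pos))   # decoding error of the population read-out
```

No single PN localizes the animal, yet the weighted combination of the whole ensemble recovers the position with an error far smaller than the width of any individual PF.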

It is not yet completely known what features of the environment cause the firing of the PN, but it has been shown that the simplest one (radial distance to VL) is probably not among them [Muller, Kubie & Ranck, 1987]. Speculations about the nature of these features point to the angles between VL [Shapiro, 1990], the retinal area of the VL [Zipser, 1985], functions of groups of coordinates centered around the VL, movement kinematics of the observer [Muller & Kubie, 1987], and predictions from VL aggregations [Schmajuk, 1990]. Furthermore, there are still speculations about factors that may influence this local exploratory feature information, like the effects of rewards at certain locations [Nadel, 1991], movement between locations [Muller & Kubie, 1987] [Muller & Kubie, 1989] [Schmajuk, 1990], and temporal discontiguity between locations [Hughey, 1985]. In primates, the HC is viewed as the convergence point of the “What?” and “Where?” pathways of the visual system [Mishkin, Ungerleider & Macko, 1983] [DeYoe & Van Essen, 1988].

Experiments show that TI (relations between places) is more probably stored in the HC, and that more detailed but more incomplete metric information is more probably stored in the parietal cortex [Poucet, 1993].


VI-3.3.1 Special operations on the topologic and metric information

The factors that seem to most influence the activity of PN are direction of movement and potential trajectories [Eichenbaum, Wiener, Shapiro & Cohen, 1989] [McNaughton, Barnes & O’Keefe, 1983] [Muller & Kubie, 1987] [Wiener, Paul & Eichenbaum, 1989]. When a transparent plastic barrier is put in the middle of a PF, PN associated with neighboring high-activity PF will show less activity when the rat enters them along another trajectory [Muller & Kubie, 1987] [Muller, Kubie & Ranck, 1987]. Furthermore, when rats are moved passively through an environment, the activity of PN is small relative to free movement [Foster, Castro & McNaughton, 1989]. This suggests that visual information is not sufficient for the maintenance of PF, with potential movement being of major importance. In fact, it has been shown [Muller & Kubie, 1989] that PN fire before the animal reaches the corresponding PF, with a temporal lead of approximately 120 ms, which suggests that some kind of “expectation” mechanism makes PN fire for the predicted location instead of the current one.

After a short period of exploration, during which the PN progressively acquire a certain activity pattern, the animal is able to accurately navigate in the dark. This activity pattern maintains its properties even in situations where VL have been removed or the lights have been turned off. More concretely, when the animal travels to a previously explored place, the PN show PF very similar to the original ones (in both shape and size). This again leads to the thought that visual information only serves as a place-learning trigger for exploration and as a confirmation signal for the PF activities, with motor activity and past trajectory history playing the major role in activating the corresponding PN.

Into this theory of past activities enters the evidence of Hebbian synapses between PN. These synapses learn correlation information between simultaneously active PN in the HC [Muller, Kubie, Bostock, Taube & Quirk, 1991], so that some sort of relational place-information storage occurs. This activity contiguity is naturally caused by the movements along sequential places, which allows the storage of TI in the HC. These lateral topologic connections allow us to hypothesize something about the previously mentioned PF retention in the dark: one PN facilitates* the firing of the next PN, with the motor information as the primary excitation source. PN activity does not even strictly need explicit sensory or motor input. Cognitive activity, such as visualizing or “thinking about” the goal, is sufficient [Eichenbaum & Cohen, 1988].
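A minimal sketch of how such Hebbian links could arise from co-activity along a path follows; the activity values, learning rate and the uniform decay term standing in for LTD are all illustrative assumptions:

```python
import numpy as np

# Four PN with PF laid out along a path; neighbours overlap, distant
# PF never fire together.  Each row is the population activity at one
# successive place along the traversal (illustrative values).
steps = np.array([[1.0, 0.4, 0.0, 0.0],
                  [0.4, 1.0, 0.4, 0.0],
                  [0.0, 0.4, 1.0, 0.4],
                  [0.0, 0.0, 0.4, 1.0]])

n = 4
W = np.zeros((n, n))   # lateral connection weights between PN
eta = 0.1

for a in steps:
    # Hebbian term (co-activity -> LTP) minus a uniform decay (stand-in
    # for LTD between temporally discontiguous PN).
    W += eta * (np.outer(a, a) - 0.05)
np.fill_diagonal(W, 0.0)

print(W[0, 1] > 0, W[0, 3] < 0)   # True True: neighbours linked, distant pairs depressed
```

After the traversal, PN with overlapping PF end up excitatorily linked while PN with distant PF end up with depressed connections, which is exactly the topological chaining hypothesized in the text.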

Fig. 17 - Illustration of a possible mechanism of lateral recurrent excitation: the PN receive activities that originate in visual and motor information, as well as from other PN. When the animal traverses PF in the sequence ‘A’, ‘B’ and ‘C’, excitatory Hebbian connections are made between the corresponding PN that have overlapping PF (simultaneous pre- and post-synaptic activity). By means of long-term synaptic depression (LTD) it is also possible that inhibitory connections emerge between PN that do not fire in correlation (distant PF ⇒ temporally discontinuous firing of the corresponding PN) [Hetherington & Shapiro, 1993]. These connections can be further related to the corresponding motions, so that VL neighborhood and related motion are closely tied together [McNaughton, Leonard & Chen 1989] [Foster, Castro & McNaughton, 1989].

* Synaptic facilitation is a well-known process [Thompson], where a certain excitation source is not enough by itself to fire a neuron, but successfully fires the neuron when a facilitation synapse is also active. This latter synapse is also not able to fire the neuron by itself. So, it resembles a catalyst in chemistry.


This expectation pre-excitation of neighboring PN could be part of the process by which animals navigate in the dark or with removed landmarks. This capability also enhances robustness and tolerance to missing VL along a long trajectory to the goal, allowing the animal to keep going. These TI capabilities lead to the hypothesis that navigation between near places, which have these connections, is easier and more precise than navigation between distant places, which do not [Poucet, 1993].

Maze-running experiments were used to test the animals’ ability to construct internal maps [Gallistel, 1989]. After the rats had familiarized themselves with a new environment, the arms of the maze were shortened or lengthened. The rats ignored visual sensory input and ran into the shortened walls, whereas they stopped on the lengthened ones at locations corresponding to their previous lengths.

VI-3.3.2 Other areas associated to the hippocampus

While the HC is thought of as having the function of recognizing places by means of its PN, and of storing the topological relations between those places, other areas of the brain, highly related and connected to the HC, seem to execute complementary operations. These hypothetical operations are also based on experimental data, often from lesions to those areas.

• Parietal cortex (PC) - probably the area responsible for the storage of MI about the relations between places. This system seems to receive TI from the HC and MI from the motor cortex (MC). By associating both, a refined representation of the relations between places is generated. This refinement learning is much slower than the TI storing in the HC, maybe because it requires the animal to travel many times through the same trajectories to enable its repetitive learning nature. So, this MI only becomes available after a longer time of exploration. The advantage of this type of information is that it allows the animal to perform its moves much more accurately. While the HC allows a coarse topological map to be rapidly built up, so that the animal can also react quickly, MI offers the refinement of such reactions.

• Pre-frontal cortex - this area is probably used for the computation of trajectories, receiving information from the HC and PC. It may combine TI and MI to compute short-cuts, detours and other maneuvers. It gets TI for finding paths, and MI for effectively computing accurate directions and distances of travel.

Fig. 18 - Simplified scheme of how the different brain areas might be connected in order to attend to experimental data. These connections may explain the various details of biological mapping and navigational skills. (adapted from [O’Keefe & Nadel, 1978])

The hypothesis that the HC gives fast, inaccurate TI for fast familiarization with a new environment, and that the PC then refines this map with MI to produce more efficient movements, is reinforced by experimental lesion data on animals. HC lesions produce a loss of spatial performance: in an already highly familiar environment an animal may still retain good navigation capabilities, whereas animals in new environments have a deep navigational deficit [Poucet, 1993]. In this case, the animal still knows distances, but loses some of the TI. On the other hand, PC lesions produce general deficits, where the animal never gets to move accurately about the environment [Poucet, 1993].

This is quite an efficient way of solving the problem of the generally slow learning of complete information sets in neural networks*: as soon as the animal faces a new, unknown environment, the HC almost instantly provides information that can be used for coarse but still useful navigation, while the slower refinement learning process gradually provides accurate information to increase final performance. So, first comes instant survival, followed by better survival. If the animal had to wait for the second source of information, it would eventually not survive novel dangerous situations.
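The two-timescale idea can be caricatured as blending a fast coarse estimate with a slowly learned fine one; the blending rule and the numbers below are assumptions, purely for illustration:

```python
# Fast/slow blending: the coarse HC-like estimate is available at once;
# confidence in the slow PC-like estimate grows with repeated visits.

def blended_estimate(coarse, fine, n_visits, n_full=50):
    """Mix a fast coarse estimate with a slowly-learned fine one."""
    w = min(n_visits / n_full, 1.0)   # confidence in the slow estimate
    return (1.0 - w) * coarse + w * fine

coarse, fine = 10.0, 7.3              # e.g. rough vs. refined distance to a place

print(blended_estimate(coarse, fine, 1))    # close to the coarse value
print(blended_estimate(coarse, fine, 50))   # 7.3 -- fully refined
```

The point mirrored here is only the schedule: something usable is returned from the very first visit, and accuracy improves as the slow learner converges.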

Fig. 19 - There are head-direction cells found in the postsubiculum of the HC and other areas. These neurons seem to respond selectively to the animal’s current head direction in a certain environment [Taube, Muller & Ranck, 1990a, 1990b] [McNaughton, Barnes & O’Keefe, 1983] [McNaughton, 1988] [Redish, 1995]. In other words, there is a constant preferred firing direction for each environment. The neuron ensemble fires maximally when the animal’s head turns to that direction and its activity then decays for other directions, in the form of a tuning-curve. This directional information could be used to compensate for the animal’s direction of gaze, and is suggestive of navigation. Note that each neuron possesses a receptive field with a Gaussian-like sensitivity shape relative to the animal’s head direction angle, with a standard deviation of about 20°-60°. This subicular phenomenon is similar to the tuning-curves of the visual cortex, where neighboring neurons have neighboring sharp tuning-curves to the angle of visual line segments [Kulzer & Branco, 1994]. One detail that is not yet fully clear is whether the direction of movement of the rat influences the PN’s activity. Some experiments indicate that it does [McNaughton, Barnes & O’Keefe, 1983] [McNaughton, 1988], whereas others fail to show this [O’Keefe, 1976] [Muller, Kubie & Ranck, 1987]. The latter two suggest a panoramic view of the VL. (Adapted from [Redish, 1995] and [Kulzer & Branco, 1995])
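A head-direction tuning curve of the kind described can be sketched as a Gaussian of the angular difference to the preferred direction; the sigma of 40° below is an assumed value inside the reported 20°-60° range:

```python
import math

def tuning(head_dir, preferred, sigma=40.0):
    """Head-direction cell response: Gaussian in angular difference."""
    d = (head_dir - preferred + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)
    return math.exp(-d * d / (2 * sigma * sigma))

print(tuning(90.0, 90.0))                          # 1.0 at the preferred direction
print(tuning(130.0, 90.0) > tuning(170.0, 90.0))   # True: decays with angle
```

The explicit angle wrapping matters: without it, a head direction of 350° would look maximally far from a preferred direction of 10°, although they are only 20° apart.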

[Redish, 1995] points out that the neural basis of the biological path-integration mechanism is not yet known, but any such representation must enable vector arithmetic. This is because gerbils search for food along a circuitous route, but when they find it, they return directly home. Also, when their nest is moved, they search where the nest was, ignoring the strong sensory cues from the displaced nest. In other words, rodents seem to keep a homing vector from the current position, which must be constructed with some reliable directional sense.
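Path integration as vector arithmetic can be sketched directly: the displacements along the circuitous outward route are summed, and the homing vector is simply the negated sum (the step values are illustrative):

```python
import numpy as np

# Displacements of a circuitous foraging route (illustrative values):
steps = np.array([[1.0, 0.0],    # east
                  [0.0, 2.0],    # north
                  [-3.0, 1.0],   # roundabout searching...
                  [1.0, 1.0]])

position = steps.sum(axis=0)     # accumulated position relative to the nest
homing_vector = -position        # direct route back home

print(position, homing_vector)   # [-1.  4.] [ 1. -4.]
```

However winding the outward path, the return leg is a single vector, which matches the observation that the animal heads home directly rather than retracing its route.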

VI-3.3.3 Final facts, questions and hypotheses

Anatomical and experimental data on the HC suggest that this formation is the site of cognitive maps [Tolman, 1948] [Kaplan, 1973]. But how does it integrate all the experimental information available on its operation?

Let’s review all the important facts about the HC known until now:

• The HC is a small sausage-like area of the brain.

* Learning processes in the brain are quite slow. This is particularly true for large networks and huge amounts of data. Many artificial neural networks (ANN) spend lots of time learning something before they can really solve a problem. This is because they must learn the whole data set at once, so they converge slowly. In the brain, learning is more distributed and parallel, with different areas learning limited or coarse sets of data.


• PN fire in correspondence to PF which are small and confined areas of the environment.

• Usually, each PN has only one PF in an environment, but can have more than one.

• PN can have multiple PF in multiple environments.

• PF usually have the shape of an ellipse, with activity decaying exponentially from its center.

• PF usually cover all the environment by partially overlapping each other.

• The location of PF depends on the VL arrangement in the environment. Major changes to this arrangement also originate major changes in the PF’s location, size and shape.

• PF exhibit invariance to the absolute position of and distance to VL. PF scale, rotate and translate in accordance with the same types of spatial changes in the VL.

• If the environment is proportionally expanded, PF also expand, and new PF may appear in-between the existing ones.

• The map formation in the HC is not topologically correct, that is, PN are not spatially organized in the same way as the corresponding PF.

• There are head-direction neurons with steep tuning-curves that provide information to the HC.

• Direction of movement and potential trajectory seem to be the factors that most influence PN’s activity.

• The TR appears usually only when the animal is performing navigation-like behaviors, essentially movements. This TR ‘pacemaker’ is indispensable for good navigation and mapping performance.

• PN fire consistently in relation to the TR’s phase. As the animal moves across a PF, they begin firing ‘late’ and end firing ‘early’ relative to the TR.

• Displacement cells fire when some stimulus or VL was expected to be at a certain place, but has been removed.

• LTP and LTD create long-lasting connections, forming feedback loops between PN.

• The presence of rewards does not seem to influence the HC processes.

• HC map building is ultra-fast, where the animal only needs a short exploration period to navigate well afterwards.

• Animals are able to accurately navigate in the dark or with removed landmarks, after a short exploration period.

• Animals will learn to go to one place from different directions more readily than they will learn motor-schemes.

• Animals are able to make shortcuts, detours and rapid path reversals.

According to [Penna & Wu, 1993], some of the many questions that remain when studying the operation of the HC are the following:

• What causes the firing of the PN? In other words, what exact information is presented to those neurons? Until now, it is thought that PN receive visual and motor information, as well as lateral excitations and inhibitions.

• What mechanisms are involved in PN firing? What connections are there? There are excitatory and inhibitory connections, but their purpose and exact wiring are not well known.

• What is the effect of the PN firing? What information do they output? It is thought that PN firings serve as behavior-triggering signals or as some kind of data that will be further processed and abstracted by other brain areas.

• What is the role of place in all this? How does the position of the animal in the environment affect the operation of the HC? It seems that places play a fundamental role in PN firing.

• What is the role of the TR in synchronization, PN firing and spatial reasoning?


Besides these facts and questions, there are also some important hypotheses and mechanisms that might answer some questions in the future:

• Motion affects the readiness with which PN fire. It acts as an expectation factor.

• Visual information serves as a trigger for first exploration and for the later place recognition, but motor information is the most important source of excitation for PN.

• Rewards may bias an animal to seek a particular place, through some activity backpropagation* mechanism, but they do not affect the way PN fire and PF are constructed.

• Temporal contiguity and discontiguity between places form LTP and LTD connections between the corresponding PN, constructing an interlinked place-map called a topological map, which holds only TI.

• When one PN is firing, it pre-excites other neighboring PN through its previously learned LTP connections, generating some sort of expectation to those places, which leads to the capability of navigation in the dark or with missing VL.

• The TR is part of a pacemaker mechanism for shifting excitation focus within the HC layers of PN. Its frequency is proportional to the speed of the rat, so that its spatial phase progression through those PN is constant.

• The HC fuses both “What?” and “Where?” information, in the sense that it holds TI both for the place relations and for VL identities and arrangements.

• MI is stored in some other location of the brain and permits precise navigation once the TI has been used to go from one place to another in coarse approximation. TI becomes available first, in a new environment, and afterwards MI gradually becomes available for navigation refinement.

• The PC receives MI in the form of motor information from the motor cortex. It slowly associates this MI with the fast-acquired TI of the HC, allowing further refinement of movements and trajectory computation.

• The Pre-frontal cortex in conjunction with the HC and PC, is the center for trajectory computation, where shortcuts, detours and other maneuvers are produced.

• The mapping mechanisms use polar coordinates to code and compute spatial information. It is both efficient and complete.

• For local navigation and place discrimination, animals use only local VL.

All these experimental findings from biological research on navigation, as well as the hypotheses, give clues and important data for imagining and implementing a similar but artificial system. It is fundamental to discover how to solve the navigation and map-construction problem in an intelligent and efficient way. Once more, it is not intended to make a complete and accurate implementation in this work. Instead, only the main ideas of hippocampal operation are going to be used as inspiration for an implementation that has capabilities and robustness similar to those of the biological HC.

VI-3.4 Human studies

There are two essential complementary models for Human map construction: the Tour model [] [] [] and the QualNav model [Levitt, Lawton, Chelberg & Nelson, 1987] [Levitt, Lawton, Chelberg & Koitzsch, 1988] [Levitt & Lawton, 1990]. The Tour model has a preference for environmental structures like the street net inside a city, where places are interconnected by streets, while the QualNav model is intended for open space, where places are defined by the spatial constellation of VL.

* It is not intended to refer to the backpropagation algorithm [Haykin, 1994]. It only refers to some mechanism by which the desired goal place spreads activity or anything else that will ‘attract’ the animal towards this location.


These models were inspired by several Human studies, where it was verified that most errors people made when trying to speak about their learned internal maps were metric and very rarely topologic []. In other words, people were much better able to memorize topological relations among places than their metric distances and angles. Even if the drawn maps came out distorted, at least the topological relations were mostly preserved.

After inspecting the surroundings from one spot and then closing their eyes, Humans are able to walk accurately towards a goal, or to walk some way and point to an object they have not seen since they started walking {Thompson, 1980} {Pick & Rieser, 1982} (in [Collett, Cartwright & Smith, 1986]). This clearly shows that some sophisticated mapping mechanism exists.

Finally, studies made by cognitive psychologists and zoologists showed that both Humans and animals memorize distinctive VL and use relations between those VL to predict and identify places, and to plan and execute movements.

VI-4. Learning and selection

The most important learning methods present in nature and AI are going to be presented. This is an issue of major importance in creating adaptability for dynamic AA.

VI-4.1 Natural or genetic selection

Keywords: Genes, chromosomes, populations, reproduction, crossovers, mutations.

There are navigation aspects that are genetically pre-wired and others that are learned through life. While the latter are related to neural learning, the former are related to natural selection, i.e., only the best-adapted animals have the best chance to survive*. Significant environmental disturbances are accompanied by the selection of the most genetically fit, so that each species progresses in a continuous random search for the best solution in a given environment.

Because of the already huge repertoire of information stored in animal genes, resulting from millions of years of evolution, it is certainly instructive [Brooks, 1990] to study the biological mechanisms chosen in nature.

“… there’s nothing like millions of years of really frustrating trial and error to give a species moral fiber and, in some cases, backbone.” - Reaper Man, Terry Pratchett (in [Korning, P. 199X])

The existing species are the result of a long evolution, in which only the fittest branches of organisms survived. For example, almost all species have at least two eyes, which enable stereo vision for distance measurement, whereas a single eye would be far less powerful. Although there are insects with more than two eyes, these mutations are not better than the two-eyed design (e.g. spiders have multiple eyes and are nevertheless almost blind). Therefore, it is important to take biological processes as a starting point for navigation mechanisms as well.

Nature “took” much more time to develop the basic environment-interaction mechanisms for survival and exploration than the seemingly more complex and environmentally less related mechanisms like speech, thought, reasoning, problem resolution, etc. That suggests that these later progressions are not so problematic as long as there already exists a good base of interaction with

* An incorrect idea about survival can arise, such as: “That animal behaves like that, so it survives better” or “Animals evolve towards a better chance of survival”. It is not quite like that. In reality, the opposite happens: an animal behaves like that and has a certain survival chance just because natural selection imposed the death of the genetically less fit ones. Note that nature does not know what it is doing! It is a random trial-and-error process.


the environment [Brooks, 1990]. As soon as symbolic representations became available to organisms, things started progressing rather fast. This clearly suggests that basic environment-interaction mechanisms, combined with high-level abstract representations, lead to highly capable, intelligent and robust systems.

The disadvantage of genetic learning algorithms is the need for large populations in which genetic manipulation may thrive. It is not yet possible to allow that type of evolution with artificial AA. What is presently done is simply to apply genetic learning to each AA itself.

There are papers that implement and theorize about genetic algorithms applied to the construction of Artificial Neural Networks (ANN) [Angeline, Saunders & Pollack, 1994] [Maniezzo, 1994] [Fogel, 1994]. Others discuss general genetic algorithms [Goldberg, 1989] [Dorigo & Schnepf, 1993]. [Nolfi, Floreano, Miglino & Mondada, 1996] discuss and implement some very interesting genetically built ANN for autonomous evolutionary robot navigation learning.
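The selection/crossover/mutation cycle can be sketched with a minimal genetic algorithm on a toy bit-counting fitness function; the chromosome encoding, parameters and fitness are all illustrative assumptions:

```python
import random

random.seed(0)
LEN, POP, GENS = 16, 20, 40

def fitness(chrom):
    """Toy fitness: number of 1-bits in the chromosome."""
    return sum(chrom)

# Random initial population of bit-string chromosomes.
population = [[random.randint(0, 1) for _ in range(LEN)] for _ in range(POP)]

for _ in range(GENS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]            # selection: keep the fittest half
    children = []
    while len(children) < POP - len(parents):
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, LEN)         # one-point crossover
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:              # occasional point mutation
            child[random.randrange(LEN)] ^= 1
        children.append(child)
    population = parents + children

best = max(population, key=fitness)
print(fitness(best))                           # close to the optimum of 16
```

Keeping the parents in the next generation (elitism) guarantees the best fitness never decreases, which mirrors the "only the fittest branches survive" framing of the text.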

Comments: Difficulty in knowing what to evolve and how to encode it.

VI-4.2 Neural learning

Keywords: Neurons, synapses, axons, dendritic trees, synaptic plasticity, synaptic weights, facilitation, Hebb learning.

Artificial Neural Networks (ANN), also called connectionist models, try to mimic the enormous and still relatively mysterious learning capabilities of the nervous system of animals and Humans. The brain is the most interesting part, whose organizational architecture one tries to model.

"A neural network is a massive group of simple processing units (usually adaptive), interconnected in a highly parallel manner, hierarchically organized, and which should interact with the outside world in a similar way as biological nervous systems.”
- Kohonen (1984)

The basic properties of ANN are as follows:
• Learning capabilities (Plasticity / Adaptability)
• High parallelism and distribution
• Generalization
• Associative memory characteristics
• High response speed to external stimuli

Changes in environments can be learned by the brain, which gives animals extra survival potential.

Fig. 20 - Illustration of two neural cells (neurons) with their synapses, dendritic trees and axons. Synapses realize the connections with other neurons’ axons, and the dendritic trees serve as “synapse collectors”. The axons are the links that come out of the neurons and that propagate the outgoing pulse trains reflecting the neurons’ activities.

II - Biological processes


In 1949, D.O. Hebb presented a general learning rule, later called the Hebb rule, which predicted the biological synaptic learning found afterwards. This rule postulated that there is plasticity in synapses and that they learn by a correlation process: “if neuron A is firing and connects to a neuron B that is also firing, this connection synapse will be enforced proportionally”.
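The correlation rule can be illustrated with a minimal weight-update sketch (the learning rate, the activity encoding and the two-neuron layout are illustrative assumptions, not part of Hebb’s formulation):

```python
def hebb_update(w, pre, post, lr=0.1):
    """Hebb rule: strengthen each synapse w[i][j] in proportion to
    the correlation between presynaptic activity pre[j] (neuron A)
    and postsynaptic activity post[i] (neuron B)."""
    return [[w[i][j] + lr * post[i] * pre[j]
             for j in range(len(pre))]
            for i in range(len(post))]

# Two presynaptic and two postsynaptic neurons, zero initial weights.
w = [[0.0, 0.0], [0.0, 0.0]]
pre = [1.0, 0.0]   # neuron A fires; its neighbor stays silent
post = [1.0, 0.0]  # neuron B fires; its neighbor stays silent
w = hebb_update(w, pre, post)
# Only the synapse between the two co-active neurons is enforced.
```

Repeated co-activation keeps raising the same weight, which is the proportional enforcement the rule describes.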

Some general references on this issue are [Kung & Hwang, 1989] [Lippmann, 1987] [Kohonen, 1988] [Haykin, 1994] [Aleksander & Morton, 1991].

Comments: Very powerful field, but there are still many difficulties in achieving human-like performance in larger systems.

VI-4.3 Reinforcement learning

Keywords: State space, action space, iterative learning, rewards, punishments, trial-and-error learning.

Reinforcement learning is a natural phenomenon in animals, especially noted in higher animals like mammals, where one can observe the effects of specific training with rewards and punishments. Circus-trained horses, rewarded with sugar, are one of many examples.

Conceptually, the biological system responds in a very intuitive manner: through the memory of past experience of rewards and punishments, the animal is capable of responding in a way that tries to maximize the rewards and minimize the punishments. To acquire this global behavior, there must be random first trials, where the outcome dictates their repetition or not. This is a trial-and-error mechanism, oriented for survival. This mechanism relies on internal states and external sensory information as well. Each state and sensory information elicit a specific behavior, similar to an association.

Fig. 21 - Illustration of a very simple reinforcement learning case: a person or animal is put before a red and a green button. Receiving food for pressing the red button and an electric shock for pressing the green button, the person or animal rapidly avoids pressing the green one and sticks only with the red one. Looking at the “tendencies” as a matrix of probabilities where larger dots mean larger values, one can see that, in this example, the first tendency will be to press the green button more often than the red one. This initial condition could be caused by previous experiences in other situations that involved these colors. After some trials, the person or animal learns to avoid the punishment and is attracted by the reward, so that those matrix values invert. The actions are the two button pressings, and the only state is facing the buttons. In real-world situations, these matrices can have hundreds of states and actions.

There are several reinforcement learning mechanisms and variants [Mataric, 1991], but Q-learning [Watkins, 1989] is the simplest and most widely known. Others that try to overcome limitations in Q-learning include the Bucket Brigade [Holland, 1985]. All these methods are gradient search methods in a limited state / action space.

The purpose of reinforcement learning is to solve a mapping problem (also called policy or action map) between all possible states and actions, through an incremental process of reward and punishment, thus learning the “optimum transfer function” between each state and action.

The basic reinforcement learning algorithm is as follows:

• Initialize the internal memory I to I0.
• Do forever:
  - Observe the current state s of the external world.
  - Choose an action a according to an evaluation function a=F(I,s).
  - Execute action a.
  - Observe the immediate reward r for this action a in current state s.
  - Update the internal memory according to an update function U: I=U(I,s,a,r).

The internal memory I codes the information that the learner keeps through this process, generally in the form of a transition matrix which relates states with actions. Variants of this basic process use different F and U functions.

For the example above, the two buttons define the current state s, and pressing the red one is an action a, retrieved by the function F that chooses the best action. r is the resulting reward, and U is the update function that raises the corresponding probability.
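Under illustrative assumptions (a single state, a scalar “tendency” per action as the memory I, a greedy F, and a U that moves the chosen tendency toward the reward), the button example maps onto the generic loop as:

```python
# Internal memory I: one tendency value per action (single state).
I = {"red": 0.2, "green": 0.8}  # initial tendency favors green

def F(I):
    """Evaluation function: pick the action with the highest tendency."""
    return max(I, key=I.get)

def U(I, a, r, lr=0.5):
    """Update function: move the chosen action's tendency toward r."""
    I[a] += lr * (r - I[a])
    return I

def reward(a):
    """Food for the red button, electric shock for the green one."""
    return 1.0 if a == "red" else -1.0

for _ in range(20):        # repeated trials
    a = F(I)
    I = U(I, a, reward(a))
# After a few trials the tendencies invert and red is preferred.
```

The first trial presses green, is punished, and the tendency drops below red’s; from then on the reward keeps reinforcing red, mirroring the matrix inversion described in the caption.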

Fig. 22 - Q-learning introduced by [Watkins, 1989]. This method consists in learning through delayed rewards, instead of just using the immediate rewards. The purpose is to maximize a function Q(s,a) which gives the expected discounted reward of taking action (a) in state (s). The picture shows an illustrative example of the transition matrix of values generated by this learning method. The larger the dots, the better the action taken for each state. In this case, the best actions to take would be: (s0→a4, s1→a0, s2→a6, s3→a1, s4→a9, s5→a4, s6→a7).
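The delayed-reward idea corresponds to the standard update Q(s,a) <- Q(s,a) + alpha*(r + gamma*max over a' of Q(s',a') - Q(s,a)), where alpha is a learning rate and gamma the discount factor; the sketch below uses illustrative values for both:

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One Q-learning step: blend the immediate reward with the
    discounted value of the best action in the next state."""
    best_next = max(Q[s_next].values())
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q

# Two states and two actions, all Q-values initially zero.
Q = {s: {a: 0.0 for a in ("a0", "a1")} for s in ("s0", "s1")}
Q = q_update(Q, "s0", "a0", r=1.0, s_next="s1")
# Q["s0"]["a0"] is now 0.5 * (1.0 + 0.9 * 0.0 - 0.0) = 0.5
```

Iterating this step over many trials fills the transition matrix of Fig. 22, where the largest entry per row marks the best action for that state.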

Bucket-brigade - this variation was introduced by [Holland, 1985] and consists of learning by reinforcement propagated through a chain of classifiers, along which the reward is distributed.

Fig. 23 - Feudal reinforcement - to enhance the speed of reinforcement learning, [Dayan & Hinton, 1993] proposed a feudal architecture, where masters give tasks to sub-masters which, in turn, also give sub-tasks to sub-sub-masters in order to solve their problem, and so on. Rewards are given to sub-masters that achieve the masters’ tasks, without bothering with the sub-sub-masters’ actions (reward hiding). Furthermore, each sub-master needs only to work in its own state-space granularity level, relaying further refinement to their sub-sub-masters, and so on (information hiding). This is a method of speed-up through information abstraction. It is also eventually parallelizable. (adapted from [Dayan & Hinton, 1993])

Some other references are [Montague, et al, 1993] [Moore & Atkeson, 1993] [Fujita, 1993].

Comments: Very slow, many iterations needed until convergence, very low-level learning, easy to implement.

CHAPTER III

“Continuous dedication to a single goal frequently surpasses skill” - Cicero

III - Existing theories, models and implementations

The issues of navigation, motion planning, VL recognition and map construction in AA have elicited a very large and broad horizon of theories, speculations, models, simulations and implementations. A quick overview of the existing literature clearly shows that there is no preferred direction of research yet, with many different ways of looking at the problems to solve. There is still much experimentalism and many low-level studies. There is, however, already a growing trend of trying to model the biological structures of animals, essentially the Hippocampus and its related areas.

As referred to in [Penna & Wu, 1993], robotic navigation has traditionally been quantitative and based on exact knowledge of distances, directions, objects and other metric data. These implementations tend to be brittle and error-prone, accumulate errors, are limited by the sensors’ range, and depend on exact measurements with expensive sensors.

This chapter is divided into three main groups of research: direct HC modeling, other biological inspirations, and artificial intelligence.

VI-1. Hippocampal function and structure modeling

VI-1.1 Use of place neurons for navigation

Keywords: Positional neurons, subicular neurons, hippocampal CA1 field, goal neurons, head-direction neurons.

[Burgess, O’Keefe & Recce, 1993] [Burgess, Recce & O’Keefe, 1994] postulate the existence of neurons that fire according to the rat’s distance and orientation to the goal (usually food). These neurons are called goal cells. During exploration, synaptic connections are established between PN and these goal cells. During food search, these goal cells indicate the location of the goal, according to the activity of the PN. Subicular neurons perform some sort of PN activity diffusion.

This biologically plausible model intends to enable an AA to navigate to interesting places (food rewards), rapidly building up a map representation. This contrasts with implementations that require many learning cycles. Rats are able to navigate well after a short period of exploration [Tolman, 1948], which implies ultra-fast learning strategies that make coarse information readily available.

Fig. 24 - Simplified architecture of the first model of the HC. A subicular neuron synapse gets active (ON) when this neuron and a PN are simultaneously firing. This way, a subicular map of activity is rapidly built-up during exploration, as the overlap of several PF. There is also a competitive dynamics between subicular neurons. (adapted from [Burgess, O’Keefe & Recce, 1993])

Keeping in mind that very separated PF do not have any correlation, in the sense that one’s activity has died out long before the other activates, the biological utility of these subicular neurons is immediate: by “feeling” the activity of those neurons, the rat is always able to determine the distance to the goal, since they now correlate those PN over greater distances. These correlation activities decay with distance but do not die out completely. Goal neurons will therefore have a cone-shaped activity map, centered around the food place. These subicular neurons seem to realize some sort of information abstraction.

Enhancement of synapses from subicular to goal neurons is gated through the head-direction neurons. These neurons determine which goal neuron (direction) will be reinforced at the goal location.

Fig. 25 - On the left, a typical activity map of the subicular neurons is shown, whereas on the right a typical goal cell activity map is shown, for the model implemented here. Goal cells also received reinforcement signals at the food location, to trigger the synapses formation between them and the subicular neurons. These could be seen as an abstraction layer that further abstracted and compressed the data from the subicular neurons which abstracted the PN. (adapted from [Burgess, O’Keefe & Recce, 1993])

A group of goal cells was used for each interesting location, including obstacles. Furthermore, by performing a vector sum of the estimated locations of the goal and obstacles, where the latter ones are subtracted, the simulated rat was able to reach pre-visited goal locations while avoiding obstacles.
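The vector-sum readout can be sketched as follows; the 2-D Cartesian encoding of the estimated locations and the repulsion weight are illustrative assumptions, not taken from the model itself:

```python
def movement_vector(goal, obstacles, repulsion=0.5):
    """Sum the estimated goal vector with the obstacle vectors
    subtracted (scaled by a repulsion weight)."""
    gx, gy = goal
    for ox, oy in obstacles:
        gx -= repulsion * ox
        gy -= repulsion * oy
    return gx, gy

# Goal straight ahead, one obstacle ahead and to the right:
hx, hy = movement_vector(goal=(0.0, 1.0), obstacles=[(0.5, 0.5)])
# The resulting heading (-0.25, 0.75) bends away from the obstacle.
```
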

Comments: No evidence for goal neurons, fast learning, interesting form of viewing subicular neurons, place neurons do not fire in the dark, distances to distal cues are used.

VI-1.2 CRAWL - hippocampus function model

Keywords: Place fields, path integration, local views, dark firing, self-localization.

[Redish, 1995] presents a model for navigation with the HC, called CRAWL. Local VL are represented in a vector form with three elements: type of VL, distance to the VL, and allocentric bearing to the VL. It also contains a vector representation for path integration in the form of a Cartesian vector.

Fig. 26 - PF are generated by a combination of the VL local view and path integration representations, where angles between VL are extracted from the VL vectors. The strength of this place code is determined by the product of Gaussian tuning-curves (radial basis functions) of the path integrator coordinates and VL feature vectors. (adapted from [Redish, 1995])
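A minimal sketch of this place-code computation, assuming illustrative tuning centers and widths for the path-integrator coordinates and one VL feature:

```python
import math

def gaussian(x, center, width):
    """Radial-basis (Gaussian) tuning curve."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def place_code_strength(inputs, tuning):
    """Product of Gaussian tuning curves, one per input dimension
    (e.g. path-integrator x, y and a VL distance feature)."""
    strength = 1.0
    for value, (center, width) in zip(inputs, tuning):
        strength *= gaussian(value, center, width)
    return strength

# Place cell tuned to path-integrator (2, 3) and VL distance 1.5:
tuning = [(2.0, 1.0), (3.0, 1.0), (1.5, 0.5)]
at_center = place_code_strength([2.0, 3.0, 1.5], tuning)  # peak = 1.0
nearby = place_code_strength([2.5, 3.0, 1.5], tuning)     # weaker
```

Because the strengths multiply, the place code peaks only where all inputs match their tuned values, which is what makes the resulting PF localized.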

For self-localization, when the path integration coordinates are unknown, a parallel relaxation mechanism is used, where local VL view neural representation vectors “compete” for the possible coordinates of the current location. Also, with this path integration mechanism, goal location can be predicted, and dark firing is possible. In other words, when only path integration information is available, then the PF are generated by it.

Comments: Does not model place field learning.

VI-1.3 Model of a cognitive map

Keywords: Local information, relational information, polar vectors, local maps, distant places, local references.

[Poucet, 1993] presents a theoretical model which integrates local and relational information into a cognitive map. This information results from perceptual and motor activities, and includes metric and topological information. This model does not specifically address the HC, but tries to model characteristics that probably originate in the HC and associated brain areas.

Fig. 27 - [Poucet, 1993] hypothesizes about the integration of local views of a place, taken through the rotation of the animal’s head. Those initially independent local views are related through their temporal contiguity and corresponding angles (A). This contiguity allows the recognition of a place independently of any local view, when the animal is at the center (B) or even approaching the place (C). One local view could then reactivate the entire representation. (adapted from [Poucet, 1993])

It is assumed that spatial relations between places are coded as vectors represented in polar coordinates, which seems to be a privileged form of relation between places [Collett, Cartwright & Smith, 1986] [McNaughton, Chen & Markus, 1991] [O’Keefe, 1991], and one that suits the ballistic nature of planned movements towards previously unknown places. One clear example of such behavior is the novel shortcuts taken by dogs.
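Coding the relation between two places as a polar vector amounts to storing a distance and a bearing relative to the origin place’s reference direction; the sketch below assumes Cartesian place coordinates purely for illustration:

```python
import math

def polar_relation(origin, target, reference_dir=0.0):
    """Distance and bearing from origin to target, with the bearing
    expressed relative to the origin place's reference direction
    (in radians)."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - reference_dir
    return distance, bearing

# Place B seen from place A, whose reference direction is 0 radians:
d, b = polar_relation((0.0, 0.0), (3.0, 4.0))
# d == 5.0; b is the angle of the (3, 4) displacement, atan2(4, 3).
```

Such a (distance, bearing) pair is exactly what a ballistic movement needs: aim at the stored bearing and travel the stored distance.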

The distance parameter can be extracted from vision and motor activities, whereas direction extraction relies on more obscure and unknown mechanisms which involve the computation of some kind of reference. Nevertheless, [Poucet, 1993] proposes a very convenient model, in which the reference direction is calculated by the animal in a place-dependent way. The animal can then use the polar vectors to reach one place from another.

Fig. 28 - Schematic representation of a reference mechanism that depends upon the place. Each place (A), (B), (C) and (D) provides its own reference direction (rA), (rB), (rC) and (rD), respectively. This way, distant places’ vector information depends upon the place. Note that this mechanism does not imply reciprocity: a place may have a vector to another one, but it may happen that this latter one does not have a symmetric vector to the first place. With increasing experience, there could even arise different references for each place, from which trajectories to distant places can be computed. (adapted from [Poucet, 1993])

Until now, this MI does not permit efficient shortcuts and detours, since it is incomplete. For those purposes, TI has to be added (proximity, connectivity, order). The interesting part of this topological net, which forms from repetitive exploration, is the fact that the animal can obtain new information from information it already has. Shortcuts and detours are direct examples.

Short-range neighboring places which share common stimuli, and where those local nets of vectors are formed, are called local maps [Poucet, 1993]. The interconnection between these local maps, to enable long-distance navigation, poses problems for which sufficient consistent experimental data does not exist. By definition, distant places do not share common stimuli, which makes coding place relations difficult. One hypothesis [Poucet, 1993] would be that animals reach distant places through access to the local map of the target. The question is how these local maps are connected.

Fig. 29 - Links between distant places can be established through multiple intermediate local maps, which provide some sort of intermediate references or connection points. These intermediate maps can be formed through the close inspection of frontier, junction and obstacle places. This hypothesis does not take into account the possibility of direct shortcuts. (adapted from [Poucet, 1993])

Another hypothesis consists in the possibility that a global topological net may form, which relates those distant local maps among themselves. Yet another is that there may be a global reference which allows vector computation between distant places.

Comments: Potentially powerful way of encoding displacements.

VI-1.4 Simulation of recurrent connections in the hippocampus

Keywords: Recurrent connections, sequentially stored places, landmark removal tolerance.

Fig. 30 - [Shapiro & Hetherington, 1993] proposed a feedforward ANN which simulated hippocampal PN. Sensory input was composed of the angles of VL. This ANN showed some tolerance to VL removal, just like the HC, and was trained by backpropagation. Multiple PF and their overlaps were also studied. Sensory input was required for the PN to fire. (adapted from [Shapiro & Hetherington, 1993])

In a second experiment [Hetherington & Shapiro, 1993], the ANN possesses PN with modifiable recurrent links between them. The purpose was to test the ability of these links to maintain PN activity even with completely removed VL or with the lights turned off. In other words, PN would still fire without any external sensory inputs.

Fig. 31 - Illustration of a labyrinth where this memory effect could happen: during the initial exploration period, the animal learns how to get from the start arm to the goal arm, searching for food. After this period, the animal is placed at the start arm, and then the lights are turned off. The animal manages to reach the goal arm, where the PN fire in a very similar way as if the lights were still on. This means that PN and the corresponding PF code perceived places as well as memorized ones [O’Keefe & Speakman, 1987] [McNaughton, Chen & Markus, 1991] [Quirk, Muller & Kubie, 1990].

Taking into special account that potential movement seems to be the major factor that influences PN activity [Eichenbaum, Wiener, Shapiro & Cohen, 1989] [McNaughton, Barnes & O’Keefe, 1983] [Muller & Kubie, 1987] [Wiener, Paul & Eichenbaum, 1989], [Hetherington & Shapiro, 1993] implemented an ANN that exhibited similar results.

Fig. 32 - Recurrent ANN architecture used, where the hidden units represent the PN and the output squares represent the spatial location of the simulated rat in the environment. Again, backpropagation was used to train this ANN, except for the recurrent connections. These only provided an extra input, as the past activity of the PN themselves. This model does not generate motor commands, but just displays the progressive locations of the rat through time. After tedious training, the ANN was able to remember the learned sequence of locations, even without any sensory input, only with the recurrent connections and after presenting the first location. (adapted from [Hetherington & Shapiro, 1993])

An interesting fact was that the recurrent connections turned positive between near locations, and negative between distant ones. This model thus tries to show a way in which the HC may compute trajectories to a predetermined goal. The need for negative connections for the correct storage of topological information is well demonstrated in [Kohonen, 1982]. The plausibility of these results is supported by Hebbian and anti-Hebbian phenomena in the HC [Stanton & Sejnowski, 1989].

Comments: Many learning cycles, biological plausibility of recurrent connections, does not account for motor input information.

VI-1.5 Hippocampal place learning model

Keywords: Theta rhythm, excitation focus shifting, place learning, sparse memory.

In this HC map model [O’Keefe & Nadel, 1978], it is postulated that the theta system is part of a mechanism for shifting the focus of excitation from one set of place representations to another within the cortical layers, on the basis of the animal’s actual or intended movements. They also found a monotonic relationship between TR frequency and the distance covered by a jump, which indicates that theta provides the distance magnitude information for “internal navigation”. Direction information could be given either by external angles to VL (exteroceptive) or by an internal compass or calculations based on the turns made by the animal (interoceptive).

Two different models are plausible for the two different and probable CA1 and CA3 HC fields, depending on which one the map structure resides in.

Fig. 33 - Model of the CA3 mapping system. Each line is a lamella within the CA3 field, each one responding to different sensory combination*. TR then acts as to select inputs on the CA3 neurons on the basis of the speed with which the rat moves across the environment. On the right side, one input line (CA3 pyramidal cell dendritic trees spread vertically, traversing all the sensory lines) is selected to receive activation at different time instants. This way, different CA3 neurons get sensitized to different instants in a spatial path, forming a sparse memory representation of that path (each memory element is the sensory combination at each instant). The fascia dentata granule cells, that may be the entry point for the sensory lines, are gated by the TR and only allowed to fire at one phase of the theta cycle. Thus, TR imposes a synchronous bursting pattern of the granule cell output [O’Keefe & Recce, 1993]. Each place is stored in the Hebbian reinforced synapse in CA3 neurons that get activated both by the TR and by a sensory line. It was proven that HC neurons perform Hebbian synapse changes in the presence of the positive peak of the TR [Pavlides, Greenstein, Grudman & Winson, 1988]. For example, when the rat traverses place (α), sensory line (α) is active, and TR is sensitizing the left-most group of PN, then this place would be stored/represented in those PN of the CA3 field. Different sensory lines could be responsive to light at direction ENE and sound at direction S; light at ENE and card at WNW; card at WNW and wind at N, etc. Of course, as the animal moves from (α) to (δ) the sensory inputs will change and also will the active lamella. Thus, the path from (α) to (δ) will be systematically stored as the sensory representations of the corresponding sequential places. On the rightmost bottom picture, the sensory different but sequential places from α to δ are mapped into different but theta-related regions of the CA3 field. 
(adapted from [O’Keefe & Nadel, 1978])

After the exploration period, when the animal reencounters place α, it then generates a low-frequency TR scanning across the CA3 field until the representation of place α is activated (leftmost area in this example). This activation is possibly the “go” signal for the animal to move towards β and also served to synchronize the TR within the map (starting point). This sequencing mechanism to achieve a goal is similar to the one simulated by [Hetherington & Shapiro, 1993], with the difference that they used recursive connections between PN while here the TR is used on a sparse memory to reproduce the sequence. In this model, distance is used to encode the relation between sequential places, rather than the behaviors that led to one another {Munn, 1950} (in [O’Keefe & Nadel, 1978]). Greater movements across greater distances are accompanied by higher theta frequencies, so that the phase relationship between places in the CA3 representation are still the same. That means that, if the rat moves faster from α to δ, the faster TR would still address the same cells at the same time instants and for the same places. Similarly, if no sensory inputs were represented at place γ, those cells would not fire but would be reserved for future appearing place information. When this new information appears, it would still be placed in a correct spatial relationship within the sequence. This could be the basic mechanism through which the HC may perform some organization among places.

* These sensory inputs most likely stem from a sensory cortex, which may perform winner-take-all (WTA) and contrast-enhancement on information it constantly receives from visual, auditory and other cortices. This cortex could be arranged in a similar way to the visual cortex, where neurons have sharp response tuning curves. This way, only one line is usually active, giving a high discrimination capacity to different places where sensory combinations are different. These inputs may have their entry point into the HC through the fascia dentata field. Depending on the selectivity of these signals, they will be active for one or more places.

Fig. 34 - In the case where the mapping is made not in the CA3 field but in the CA1 field instead, it could be done in a similar way, based on the physiological data from the CA1 field. The active input is stored in different PN depending on its time of arrival relative to the TR, which allows a correct place-sequence storage. The diagonal links account for the experimental fact that the TR phase shifts with layer depth. (adapted from [O’Keefe & Nadel, 1978])

These models might also give clues to explain the experimental fact that a PN responds later in the theta cycle when the animal enters the corresponding PF, and earlier when it exits that PF. Still, they do not fully explain why each PF occurs almost only once in an environment. Is the sensory cortex that selective, or is there really a Winner-Take-All (WTA) mechanism?

Both these mechanisms could eventually explain the quick place-field build-up when a rat explores a previously unknown environment. It is also possible that the mapping mechanisms are distributed over a wider area, including both the CA1 and CA3 fields of the HC.

One question remains, related to the lack of an explanation for the consistent variation of the TR with movement speed [O’Keefe & Nadel, 1978] and also with novelty {Brown, 1968} (in [O’Keefe & Nadel, 1978]) [Miller, 1991].

Comments: Very plausible, integrates spatio-temporal information.

VI-1.6 MAVIN - visual localization through the hippocampus

Keywords: Place recognition, place fields build-up.

[Bachelder & Waxman, 1994] implemented a practical AA with the capacity for localization through the emulation of HC place learning. The proposal is that of a real-time neurocomputational AA for unsupervised 2D location learning, in a 3D space with some VL. This system simulates the evidence for the “What?” (object vision) and “Where?” (spatial localization) channels in animal vision processing.

Fig. 35 - Through aspect spheres, the AA maps each VL into a circular continuous storage, where the features vary smoothly. (adapted from [Bachelder & Waxman, 1994])

This model is based on [McNaughton, 1988], who made the assumption that PN respond only to a limited view of the scenario. The difference here is that the AA has a panoramic view. An ANN is implemented, based on [Seibert & Waxman, 1989, 1992], that combines both “What?” and “Where?” information channels. This ANN is of the ART (adaptive resonance theory) type [Carpenter, 1989] [Carpenter, Grossberg & Rosen, 1991], which is able to learn and update changes from the environment without disrupting existing stored patterns. Invariance to translation, rotation and scale was obtained by some mathematical transformations.

In a laboratory setting, MAVIN was able to distinguish between three different PF. This means that only three templates were needed to describe the environment, according to its place discriminability and the ANN attention threshold. As an example, a simple VL composed of a cylinder on a rectangle, viewed from 216 different directions, generates from 10 to 20 different but related aspects for the ART ANN. This way, the VL is parceled into those different aspect regions on the aspect sphere.

[Bachelder & Waxman, 1994] mention the future possibility of MAVIN being able to go from one PF to another by learned PF transitions. Like the aspect spheres for the VL, transitions could also contribute to environment recognition.

Comments: Very complex recognition system.

VI-1.7 Tour model

Keywords: Dead-reckoning, urban networks, topological information.

This model was idealized by [Kuipers, 1978]. It is based on a mapping scheme that links places with routes (like streets that link places in a city). It is not able to give information about the space between places, requiring dead-reckoning to travel among them. It is also not able to distinguish between places with non-distinguishable VL without extra contextual information. This model is intended for urban navigation.

Comments: Does not cope with non-distinguishable landmarks, works only for urban environments.

VI-1.8 Modified QualNav model

Keywords: Qualitative navigation, observation vectors, positional neurons, observation regions, cognitive map, physical map.

A modified QualNav model [Levitt, Lawton, Chelberg & Nelson, 1987] [Levitt, Lawton, Chelberg & Koitzsch, 1988] [Levitt & Lawton, 1990] is presented by [Penna & Wu, 1993]. The main differences reside in the way the cognitive and physical maps are interpreted. In this model, a cognitive map is a computational mechanism which provides pre-processed local information about the environment, while the physical map is the topologically correct spatial map of the entire environment.

While it is only said that the physical map is constructed in a Kohonen-like manner [Kohonen, 1982, 1988], without any further information, the cognitive map topologies and computations are available. The input vectors of this latter map are called observation vectors and are composed of local sensory information that changes as the AA explores the environment. The values of these observation vectors are angles between VL and a global “North”, or between VL.

Fig. 36 - In this example with three VL (Q1, Q2 and Q3), the cognitive map must distinguish between each possible PF (A1 to A7). Only these PF are distinguishable using VL angles, and the cognitive map has one unique output neuron for each one. The result is to have PF that are separated by the connecting lines between VL. When the AA traverses such lines, the output of the cognitive map changes accordingly. The number of input neurons equals the number of VL. The physical map then uses the outputs of this ANN to learn and organize the different PF. (adapted from [Penna & Wu, 1993])

Fig. 37 - On the left, one can see the partial physical map generated with the outputs of the cognitive map. Here, only the links between PF are shown, overlapped on the real spatial VL constellation. On the right, one can see the complete physical map, with all links and distances between PF. These PF are directly represented by the output neurons of a Kohonen-like [Kohonen, 1982] ANN, keeping it efficient, and where the links can be viewed as the lateral synapses between neurons. (adapted from [Penna & Wu, 1993])

Fig. 38 - In the modified QualNav model itself [Penna & Wu, 1993], PF are defined somewhat differently: each generic point (P) is traversed by M lines, originating 2M sectors. Each sector is called an observation region. The knowledge of a global “North” is needed for the convention of indexes. (adapted from [Penna & Wu, 1993])

Fig. 39 - On the left, all VL are distinguishable among each other, which originates one type of encoding for the observation vectors. On the right, however, the VL are not distinguishable, which calls for another type of encoding. While in the first case the vector elements are the indexes of the sectors in the sequence of the VL classification (Q1, Q2, Q3 and Q4), in the second case the elements tell the number of VL in each sector (also by sequence: S0, S1, …, S7). While the first vectors have as many elements as there are VL for each PF, the second ones have as many as there are sectors. (adapted from [Penna & Wu, 1993])
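The two encodings can be sketched as follows; the eight-sector layout and the bearing-to-sector mapping are illustrative assumptions, not the exact QualNav construction:

```python
def sector_index(bearing_deg, sectors=8):
    """Map a bearing (degrees clockwise from the global North)
    into one of the 2M observation-region sectors."""
    return int((bearing_deg % 360.0) // (360.0 / sectors))

def encode_distinguishable(bearings):
    """One sector index per landmark, in the fixed order Q1..Qn."""
    return [sector_index(b) for b in bearings]

def encode_indistinguishable(bearings, sectors=8):
    """One count per sector S0..S(2M-1): how many VL fall in each."""
    counts = [0] * sectors
    for b in bearings:
        counts[sector_index(b, sectors)] += 1
    return counts

# Four landmarks observed at these bearings from the current region:
bearings = [10.0, 100.0, 190.0, 200.0]
v1 = encode_distinguishable(bearings)    # [0, 2, 4, 4]
v2 = encode_indistinguishable(bearings)  # [1, 0, 1, 0, 2, 0, 0, 0]
```

Note how the first vector grows with the number of VL, while the second always has one entry per sector, matching the two cases in the caption.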

Fig. 40 - On the left, one can see the resulting “coordinates” for the case of distinguishable VL, whereas the right side shows the non-distinguishable VL case. Note that single line crossings are accompanied by single changes in the observation region coordinates. There are observation region coordinates that are never generated in this case, like (1,3) and (1,0,1,0) respectively, which are called phantom observation regions. (adapted from [Penna & Wu, 1993])
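The two observation-vector encodings illustrated in Fig. 39 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the assumption of 2M equal angular sectors measured clockwise from the global “North”, and all function names, are illustrative.

```python
def sector_index(bearing_deg, m_lines=4):
    """Index (0 .. 2M-1) of the observation sector a landmark falls into,
    measured from a global 'North'. Equal-width sectors are an assumption
    made here for illustration."""
    width = 360.0 / (2 * m_lines)
    return int(bearing_deg % 360.0 // width)

def observation_vector_distinguishable(bearings, m_lines=4):
    # One sector index per landmark, in the fixed landmark order (Q1, Q2, ...):
    # as many elements as there are VL.
    return [sector_index(b, m_lines) for b in bearings]

def observation_vector_indistinct(bearings, m_lines=4):
    # One count per sector (S0 .. S_{2M-1}): how many VL fall in each sector,
    # so the vector has as many elements as there are sectors.
    counts = [0] * (2 * m_lines)
    for b in bearings:
        counts[sector_index(b, m_lines)] += 1
    return counts
```

Crossing a separating line moves one landmark into an adjacent sector, which changes exactly one element of either vector — the single-transition property noted in Fig. 40.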

The dynamic construction can be easily implemented [Penna & Wu, 1993], with the creation of new neurons whenever new separating lines are traversed. After the complete construction of the physical map, it can be disconnected from the cognitive map, and used for autonomous navigation.

This model does not need exact measurements of the angles between VL, relying instead on the relative disposition of the VL. The authors argue that a three-layered ANN has better tolerance to VL removal. Biologically, the cognitive map is viewed as short-term storage of sensory information, which is then used for the construction of the physical map.

Comments: Very interesting and biologically plausible way of building the cognitive map; does not present an ANN for the physical map; treats cognitive and physical maps separately; addresses the problem of non-distinguishable landmarks.


VI-1.9 Other hippocampus-inspired models and implementations

Fig. 41 - [Zipser, 1985, 1986] considers that the location and size of VL are the most important features determining the activities of PN in the HC. He developed computer simulations demonstrating how PF can be formed by combining those features, using an egocentric coordinate system. He demonstrates the effect of dilating and changing the VL on the resulting PF. The left picture shows how a PF changes with the dilation of a VL, and the right one shows how increasing thresholds on the PF tolerance map into growing PF. (adapted from [Zipser, 1985, 1986])

[Blum & Abbott, 1995] suggest that the role of the HC is to provide the animal with successive directions to the goal, through a difference computation between the true and a forward-shifted location representation in the CA1 field. Since the CA1 field receives inputs both from the CA3 recurrent field and directly from the entorhinal cortex, this could be possible. They also discuss LTP effects on the sequential firing of PN, whereby the strength of lateral synapses reflects the animal’s motion between places.

[Schmajuk & DiCarlo, 1991] model some effects of classical conditioning*, but ignore the evidence for independence between place learning and reward / punishment. The model does not seem plausible, because not all places carry reward or punishment, yet they are still learned equally by animals.

[McNaughton, Leonard & Chen, 1989] propose that the cognitive map is a collection of local views, spatially related by means of a representation that arises from the movements necessary to go from one view to another. However, this associationistic model does not capture the flexibility demonstrated in animal experiments, and animals do not always rely on local views to plan their trajectories (the rat is even able to navigate accurately in the dark after the map has been acquired).

[Sharp, 1991] simulates hippocampal place cells, where local views of the environment are again thought of as the stimulation that produces the PF activities of the PN. The HC is again seen as an association device, where input sensory information is associated with the PN.

* Classical conditioning relates to experiments with animals, where certain behaviors become conditioned by the presence of certain learned stimuli. These stimuli come to correlate with that behavior, but were previously unrelated to it. A typical example is a dog that salivates whenever a bell is rung, after previous training in which food was presented simultaneously with the bell ring.


VI-2. Biologically inspired models

In this section, models that are also inspired by biological processes are visited. However, contrary to the previous section, the HC will not be the major issue here. Instead, more artificial models are discussed, or models of other aspects of Nature, which still hold a close relation to biological mechanisms.

VI-2.1 Subsumption architecture

Keywords: Simple modular behaviors, layered control, reactivity.

[Brooks, 1985b, 1986] proposes a model called the subsumption architecture, in which several layers of control exist. Each layer may contain one or more behaviors that are triggered by external sensory stimuli. The AA is thus purely reactive, composed of several stimulus / reaction pairs.

This architecture decomposes the AA’s functions into interconnected layers, where higher-layer behaviors can inhibit lower ones through a fixed arbitration scheme.

Fig. 42 - Example of this architecture: sensory information is processed in behavioral layers. The follow-light behavior can be inhibited by the obstacle-avoidance behavior, which may impose its own action. In that case, the arbiter (A) selects the behavior with the highest priority (layer II). (adapted from [Branco, Costa & Kulzer, 1995])

This architecture is viewed as a new AI paradigm: it does not process symbols like classical AI, but neither does it possess direct biological structures. It is merely inspired by animal behavior, while still being programmed in the conventional way. It is also called the behavioral control paradigm and is based on the physical grounding hypothesis [Brooks, 1990].

Although this architecture has no explicit plan, a plan exists implicitly, arising from the pre-established interactions between the behaviors. Through interaction with the environment and the coordination it imposes, an intelligent emergent behavior* arises. Everything relies on continuous evaluation of sensory data, allowing the AA to react accordingly, which gives this architecture great robustness. [Agre & Chapmann, 1987] even show that this continuous evaluation can lead to the accomplishment of complex tasks that would otherwise suggest planning was necessary.

Brooks showed that many complex tasks could be accomplished reactively, that is, through the simple coupling of sensors to actuators via transfer layers with little or no state (memory).
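The layered control with fixed-priority arbitration described above can be sketched as follows. This is a minimal illustration of the idea, not Brooks’ implementation; the behaviors, sensor names and thresholds are invented for the example.

```python
def avoid_obstacle(sensors):
    # Higher-priority layer: fires only when an obstacle is close.
    if sensors["front_dist"] < 0.3:   # metres; threshold is illustrative
        return "turn_left"
    return None                       # layer stays silent otherwise

def follow_light(sensors):
    # Lower-priority layer: always has an action to offer.
    if sensors["light_right"] > sensors["light_left"]:
        return "steer_right"
    return "steer_left"

def arbiter(sensors, layers):
    # Fixed arbitration: the first (highest-priority) layer that produces
    # an action subsumes all layers below it.
    for layer in layers:
        action = layer(sensors)
        if action is not None:
            return action

# Ordered from highest to lowest priority.
layers = [avoid_obstacle, follow_light]
```

With an obstacle at 0.1 m, `arbiter` returns the avoidance action regardless of the light readings; with a clear path, light-following takes over. Note that the inhibited layer’s preference is discarded entirely, which is exactly the loss of opportunism criticized in the next section.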

Comments: Fixed control and arbitration, fixed and limited behaviors, decentralized knowledge, new AI paradigm, extremely robust, non-opportunistic behaviors, information hiding between behaviors.

* For example, running away from bright places and avoiding obstacles may give rise to the emergence of a more intelligent behavior that boils down to hiding in dark places like a cockroach.


VI-2.2 Plan-guided reaction

Keywords: Distributed self-contained behaviors, internalized plans, refined arbitration, information fusion, behavioral opinions, neuronal way of implementing fuzzy logic, fine-grained behaviors, behavioral levels of competence.

[Payton, 1990] [Payton, Rosenblatt & Keirsey, 1990] present some very interesting architectural concepts, whose main purpose is to integrate high-level planning with the low-level reactive behaviors of an autonomous agent. These concepts aim to minimize the information loss inside and between behavioral layers, which is inherent to the subsumption architecture of [Brooks, 1985b, 1986]. Previous experience from [Daily et al, 1988] shows that too high a degree of abstraction can lead to bad decisions in autonomous agents, simply because a great deal of the richness of the information is lost through that abstraction. Though high levels of abstraction can be useful for understanding the interactions between behaviors and the various modules, they are most definitely limiting when it comes to making opportunistic decisions.

The more opportunistic an autonomous agent is, the higher its probability of surviving and executing its task efficiently. Opportunism is one of the most important features of biological organisms in nature. A simple example is an autonomous agent that must follow a road while avoiding any obstacles that may appear.

Fig. 43 - Example of a good and a bad decision involving an autonomous agent that has to avoid an obstacle and keep following a road thereafter. If the agent is slightly to the right of the obstacle, or if the avoidance behavior always veers to the right, it would choose path (B) and would have trouble re-finding the road [Brooks, 1985b, 1986]. However, if the agent is opportunistic and sees that it could veer either to the right or to the left, it would take path (A), since that is the best way to avoid the obstacle while keeping track of the road more easily [Payton, Rosenblatt & Keirsey, 1990]. Since the subsumption architecture inhibits the road-following behavior in the first place, it cannot be opportunistic and may decide very badly. (adapted from [Payton, Rosenblatt & Keirsey, 1990])

This loss of opportunism-related behavioral information is called the command arbitration problem. The main point of opportunistic reaction is that, instead of letting one behavioral layer completely inhibit another and decide all by itself which action to take next, regardless of better and worse choices, each behavioral layer gives its opinion about the possible actions without inhibiting the others’ opinions. Furthermore, each layer’s output, instead of being completely abstracted, is a distributed opinion, avoiding massive losses of information. This distributed and richer information is then combined with that of the other behaviors.

Another very important benefit of this distributed and refined behavioral information processing is the smoother action of the autonomous agent, which now has refined information about obstacle threats or whatever it is trying to do. This is achieved without having to program such smoothness explicitly.

One has to implement an arbitration that lets behaviors influence each other through the fusion of their opinion-rich outputs. We do not want to build knowledge about other behaviors directly into one behavior, because that would violate the independence and modularity of self-contained behavior-based architectures. All we want is an opportunistic arbitration that gets the best out of each behavior’s refined opinion information.


Fig. 44 - On the left is a second example, where the AA has too abstract a plan of the sub-goals to achieve in order to avoid the RF shadow behind the rock. There is a sub-goal (2) which can be achieved by following path (B) and then continuing on path (1) towards goal (3). If, however, the AA follows path (A), there is no further need to achieve sub-goal (2), and the AA could proceed immediately to (3). Because of the plan abstraction, however, there is no information about the efficiency of paths, the AA having only the sequence of sub-goals to achieve. This problem is called the plan abstraction problem. On the right, the same example is illustrated with the use of an internalized plan. Instead of a fixed plan, where the trajectory is already given, a kind of gradient field exists. This gradient field represents a distributed plan for each point in space, as a guide for the AA’s actions. Even if the AA falls into the RF shadow, where no gradient is available, as soon as it gets out it resumes its way towards the goal in the best possible way. (adapted from [Payton, Rosenblatt & Keirsey, 1990])

Basically, the fault of high-level planning lies in presenting a pre-determined plan without supplying the information necessary for intelligent and opportunistic decisions in unexpected situations like the above. The various behaviors process the sensory information appropriate to each of them, just as in the subsumption architecture [Brooks, 1985b, 1986]. But instead of command arbitration, where a command is selected by a fixed inhibitory structure [Brooks, 1985b, 1986], a refined, distributed and opportunistic arbitration is now made. This is called information fusion. The major difference lies in the fine-grained behaviors and internalized plans. Still, just as in the subsumption architecture, no computation-intensive world-modeling or centralized perception by sensor fusion is performed, again allowing the autonomous agent to react quickly to the dynamic environment.

Fig. 45 - Refined behaviors - To solve the problem of inhibitory arbitration, instead of using behaviors with a high-level abstracted output and inaccessible internal information, behaviors are composed of functional atomic elements where all the information is accessible after small amounts of abstraction. This way, several behaviors are combined in a distributed, non-subsuming way. Each behavior expresses its opinion about what action to take, whereas the arbitration scheme takes the most opportunistic choice. In fact, the arbiter is not a central unit; instead, it is a distributed entity composed of intermediate neurons that add up the activity of the behavioral ones. This activity can be excitatory (positive, full circles) or inhibitory (negative, open circles). Excitatory activities denote desired directions (to fulfill the behaviors’ competencies), and inhibitory ones denote directions to avoid (danger). Because an obstacle is in the middle of the course, the avoidance behavior issues two possible or safe directions (hard turns are preferred over softer ones) and one danger direction, whereas the road-following behavior issues three good directions (soft-left would be best, but hard-left and straight are alternatives). When combined by addition in the middle neurons, we get the most opportunistic direction (hard-left), without the agent endangering itself. Rightward directions are discarded. Bigger circles indicate bigger magnitudes. (adapted from [Payton, Rosenblatt & Keirsey, 1990])

The last step of this distributed neuronal arbiter is to choose the most active neuron and take the corresponding action. In this case, the agent would turn hard-left. So, instead of avoiding obstacles blindly, we now obtain an architecture that reasons over several possibilities. By carefully designing the synaptic weight values, we can obtain the desired opportunistic overall behavior, and even predict the outcomes of any real situation. For example, the obstacle-avoidance behavior neurons may have larger weight values, so that this behavior has a “heavier word” when it comes to dangerous directions.
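The information fusion of Fig. 45 can be sketched as a weighted sum of per-direction opinions. The direction set, opinion values and weights below are invented for illustration, loosely mimicking the figure; they are not the authors’ numbers.

```python
# Discrete steering choices, as in Fig. 45.
DIRS = ["hard-left", "soft-left", "straight", "soft-right", "hard-right"]

# Each behavior emits a graded opinion per direction: positive = desired,
# negative = dangerous. These values are illustrative only.
road_following = [0.4, 0.8, 0.3, 0.0, 0.0]    # soft-left best for the road
obstacle_avoid = [0.6, 0.0, -0.9, 0.0, 0.5]   # straight ahead is blocked

def fuse(opinions, weights):
    # Each "arbiter neuron" adds the weighted opinions for its direction;
    # the most active neuron determines the action taken.
    summed = [sum(w * op[i] for op, w in zip(opinions, weights))
              for i in range(len(DIRS))]
    return DIRS[summed.index(max(summed))]
```

Giving obstacle avoidance a larger weight, e.g. `fuse([road_following, obstacle_avoid], [1.0, 1.5])`, makes its danger opinion veto the blocked direction while the road-follower’s preferences still tip the balance between the remaining safe choices, yielding the opportunistic hard-left of the figure.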

With this distributed, non-subsuming architecture, one can add new behaviors without having others subsumed or neglected. In addition, all outputs come from a common type of unit, a neuron, which reduces interfaces and information representations to only one*. The addition of new behaviors is also straightforward, done by simply connecting them to the arbiter neurons. In [Brooks, 1985b, 1986] it is also easy to add new behaviors. Each module of behavior neurons is called a level of competence. A level of competence is an entity which is fully capable of giving a competent opinion about a specific aspect of the agent’s overall behavior.

The integration of several internalized plans is simple, because it is not necessary to abstract all of them into a single high-level abstraction. Instead, each one contributes with its “opinion” for the direction the AA should take next.

In this architecture, the notion of information fusion or information arbitration arises, where meaningful opinions are grouped together, instead of fusing raw sensory data (sensor fusion) or behavioral commands (command fusion) [Brooks, 1985b, 1986].

Refined behaviors [Payton, Rosenblatt & Keirsey, 1990]:
- less loss of information
- allows opportunistic arbitration
- allows smoothness in movements

Refined, distributed arbiter [Payton, Rosenblatt & Keirsey, 1990]:
- opportunistic arbitration
- smoothness in movements
- allows distributed processing

Simple behavior integration [Brooks, 1985b, 1986] (1) [Payton, Rosenblatt & Keirsey, 1990] (2):
- modular design of self-contained behaviors (1,2)
- simple integration by combining activities (2)
- unimodal information representation (1,2)

Inherent variable choice strength [Payton, Rosenblatt & Keirsey, 1990]:
- the strength of a choice depends on the sensors’ information
- smoother assertion of choices

Tab. 4 - Condensed characteristics of this opinion-based mechanism, and characteristics common to other highly-related mechanisms.

Cht. 1 - Small side comment about the different types of processing fusion: sensor fusion, command fusion and information fusion. Sensor fusion is used by classical AI and stuffs all sensory information into one large stream of data, which requires high processing power and is difficult to handle due to its inherent information mix-up and confusion. Command fusion happens in the subsumption architecture and concerns the arbitration of behavioral commands, leading to a lack of opportunism. Information fusion is a characteristic of this new architecture, where separate but rich data is analyzed for an opportunistic decision.

Comments: Simple interfaces between neurons, high degree of opportunism, biological plausibility.

* This is what happens in any biological brain: information present at the various levels of brain function is available as a common neural spiking activity, which permits any neuron to read information about any other neuron. This kind of unique information representation is called unimodal information.


VI-2.3 Subsumption architecture map construction

Keywords: Sonar, magnetic compass, subsumption architecture, map construction, activation spreading.

[Mataric, 1989, 1990b, 1992] implemented an AA called “Toto”, which performed spatial map construction and obstacle avoidance. The map is constructed from 12 sonar sensors and a magnetic compass. This work demonstrated that it is possible to incorporate an internal representation of the external environment into a subsumption architecture [Brooks, 1985b, 1986].

At the lowest level existed reactive behaviors like obstacle-avoidance and wall-following. At a medium level existed behaviors that tried to identify corridors, walls and clusters of obstacles. At the highest level existed the map construction and path planning centers.

Contrary to [Drumheller, 1987], where sensory information was compared with internal models (static, brittle, sensitive to measurement errors), recognition is performed by means of the continuous scanning of wall-following behaviors. Also, instead of using specific data structures for the map, behaviors were attributed to the landmarks. Behaviors corresponding to neighboring landmarks had neighborhood links between them. This is thus a distributed and active map representation, resembling a behavioral tree.

The addition of dead-reckoning through a magnetic compass [Mataric, 1990a, 1990b] considerably lowered the errors in constructing and using the map in more complex environments. Furthermore, a mechanism of expectancy was added, which allowed the AA to detect when it was lost and to reacquire its position in the map. This mechanism consisted of sending an expectancy signal to the neighbors of the current landmark.

Fig. 46 - On the left, the top picture shows two different trajectories taken by the AA, which give rise to the spatial map at the bottom. The links make the neighborhoods between landmarks evident. The symbols associated with each node denote the detected type of landmark (C=corridor, LW=left wall, etc.), while the numbers denote the magnetic compass heading at those positions, in a range of 0-15 (0° - 360°). To reach a goal from any initial position, an activation spreading mechanism emitted activity through the neighborhood links, starting from the goal node into all directions of this landmark tree. Activity was uniformly attenuated over distance and would eventually reach the current node. By choosing the most active direction from the current node, the AA was capable of following the shortest path to the goal. At each intermediate node, the AA chooses the most active direction, going from landmark to landmark, instead of planning a full path at the beginning. This allows it some opportunism [Payton, Rosenblatt & Keirsey, 1990] in unexpected situations (closed doors, obstacles, etc.). If, for some reason, the AA loses itself, as soon as it reacquires its position it resumes its course towards the goal. On the right, at the top, the initial exploration path of the AA is shown. The confusion of obstacles classified as (I) denotes uncertainty about the type of VL. At the bottom, the corresponding VL tree is shown. Starting at the current position (LW8) of the AA, activity is spread from the desired goal (C0), as the arrows indicate. Although the shortest topological distance is only 3, going through (I) and (LW12), this is also the longest metric distance, of 8. The AA follows the other, metrically shorter path (LW4, LW2, LW0, C0) to reach the goal. (adapted from [Mataric, 1992])


Activation spreading and its appropriate use for path formation are also discussed in [Kortenkamp & Chown, 1993] [Prescott & Mayhew].
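The activation spreading of Fig. 46 can be sketched over a landmark graph as follows. This is a minimal sketch, not Mataric’s implementation: the uniform decay factor, the greedy neighbor choice and the helper names are assumptions, and the example graph only mimics the LW8 → C0 situation of the figure with invented link lengths.

```python
def undirected(triples):
    # Build an adjacency map node -> [(neighbor, metric_length), ...].
    g = {}
    for a, b, length in triples:
        g.setdefault(a, []).append((b, length))
        g.setdefault(b, []).append((a, length))
    return g

def spread_activation(edges, goal, decay=0.8):
    """Emit activity from the goal node through the neighborhood links,
    attenuating it by a fixed factor per unit of metric length, and keep
    only the strongest wave-front reaching each node."""
    activity = {goal: 1.0}
    frontier = [goal]
    while frontier:
        node = frontier.pop()
        for nbr, length in edges[node]:
            a = activity[node] * decay ** length
            if a > activity.get(nbr, 0.0):
                activity[nbr] = a
                frontier.append(nbr)
    return activity

def next_landmark(edges, activity, current):
    # Opportunistic step: move towards the most active neighbor.
    return max(edges[current], key=lambda e: activity.get(e[0], 0.0))[0]

# Toy version of Fig. 46: short topological path (via I, LW12) but long
# metric distance, versus the metrically shorter LW4-LW2-LW0 chain.
edges = undirected([("LW8", "I", 4), ("I", "LW12", 2), ("LW12", "C0", 2),
                    ("LW8", "LW4", 2), ("LW4", "LW2", 1),
                    ("LW2", "LW0", 1), ("LW0", "C0", 1)])
```

Because activity decays with metric distance from the goal, the greedy step from LW8 picks LW4 rather than I, reproducing the metrically shorter path of the figure.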

Comments: Possible resemblance to the hippocampus, subsumption architecture extension to map construction, map update in a dynamic environment, flexibility and opportunism in the choice of the path to the goal.

VI-2.4 Map construction by landmark circumvention

Keywords: Circumvention of landmarks, polygonal landmarks, global landmark map.

[Caselli, Doty, Harrison, & Zanichelli] [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994] [Seed, 1994] present a radically different map construction technique for enclosed environments with polygonal VL. The mechanism works through VL circumvention to gather classification and recognition data. This is similar to what a blind person does when touching objects.

The AA has short-distance proximity sensors for wall-following, and wheel shaft-encoders for distance estimation. This distance estimation is used for both wall length and VL separation estimation. The shaft-encoders also allow the AA to measure direction changes, especially for corner angle measurement.

Fig. 47 - On the left is an illustration of the circumvention technique used by the AA. When the AA casually finds a VL, it approaches it, docks parallel to the wall, and starts circumventing it. While doing this, it keeps track of corner angles and wall lengths. After the shaft-encoders signal a full 360°, it goes around once more to confirm the recently gathered data. When everything checks out, it stores the data in a global map and exits parallel to the current wall, searching for the next VL. On the right it is shown that, while the AA circumvents VL, a global map is constructed with the data of each VL, including estimated distances between them. This example shows a map with four VL and several already-traveled distances. It is not necessary to have all possible distances, as long as there exists a directed path from the current start VL to the desired goal VL. If the AA happens to travel in a direction without any VL, it finds and follows the outer limiting wall. After a while, it restarts the search in some random direction away from the wall. If the AA loses itself on the way to a goal, it wanders around until it finds a known VL, from where it restarts the original plan towards the goal.

This VL classification and recognition mechanism is very similar to what a blind person does by touching around an object to “see” it. However, there is no physical contact between AA and VL: the AA “sees” the VL simply by keeping in proximity to it.
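The circumvention data can be thought of as a cyclic signature of (wall length, corner angle) pairs, which must be matched under rotation (the AA may dock at any wall) and with generous tolerances, given the large odometry errors reported below. This sketch is an illustration of that idea, not the authors’ algorithm; the tolerance values are invented.

```python
def matches(sig_a, sig_b, len_tol=0.3, ang_tol=10.0):
    """Compare two circumvention signatures, each a cyclic list of
    (wall_length, corner_angle_deg) pairs, under every cyclic rotation.
    len_tol is a relative length tolerance; ang_tol is in degrees."""
    if len(sig_a) != len(sig_b):      # different corner counts never match
        return False
    n = len(sig_a)
    for shift in range(n):            # try every docking point
        ok = True
        for i in range(n):
            la, aa = sig_a[i]
            lb, ab = sig_b[(i + shift) % n]
            if abs(la - lb) > len_tol * max(la, lb) or abs(aa - ab) > ang_tol:
                ok = False
                break
        if ok:
            return True
    return False
```

The comparison degrades exactly as described next: a single missed or spurious corner changes the signature length, so the whole recognition fails rather than degrading gracefully.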

Fig. 48 - [Seed, 1994] implemented a real AA which performed the operations previously explained. The AA was built around a 68HC11 EVBU board as the heart of the system. The sensors were cheap and noisy IR sensors, sonar, bumpers and light sensors. (adapted from [Seed, 1994])


Fig. 49 - These are some images taken from the AA performing VL circumvention in the enclosed environment. In this work, true map navigation was not performed, only its construction. Serious problems were observed when trying to discriminate low-angle corners. Generally, this mechanism is very brittle when the VL are not conveniently chosen. (adapted from [Seed, 1994])

The mechanism implemented by [Seed, 1994] showed serious problems in the data gathering and recognition processes. When the AA missed a corner, the whole recognition process could be wrong. Even if a previously undetected corner now appeared, the same problem could arise, because the relative distances between corners then appear very different from those previously learned during exploration. Perimeter measurement errors ranged from -30% to +50%. Round VL had to be perfectly circular; other curvatures were not measured at all.

Examples of polygonal VL recognition without contact (using cameras), and related problems, are given in [Bunke & Glauser, 1993] [Chen & Tsai, 1991] [Mehrotra & Grosky, 1989] [Ellis, 1991].

Comments: Polygonal landmarks only, exits only along a wall, very brittle feature detection and data comparisons.

VI-2.5 Spatio-temporal self-organizing feature maps

Keywords: Activity build-up, sequence pattern.

[Euliano & Príncipe, 1995] implemented a very interesting ANN of the Kohonen type [Kohonen, 1982, 1988], but with an extra temporal component. Instead of organizing the input patterns in a way that is totally independent of their temporal sequence (a loss of vital information), the neurons of this ANN also take the temporal activity of their neighbors into account, in that their winning capacity now also depends on it.

This ANN proved to be very appropriate for the recognition of previously learned pattern sequences. Recognition is signaled by the growth of an increasing wave-front of accumulated activity. While learned sequences produce an accumulation of coherent neural activity, unknown sequences produce non-coherent activity. The result in this latter case is that no wave-front grows much.

Fig. 50 - One-dimensional (1D) coupling of self-organizing feature-map nodes for spatio-temporal activity processing. (adapted from [Euliano & Príncipe, 1995])


Fig. 51 - Temporal wave-front activity created by a coherent (temporally ordered) and a non-coherent input sequence. While the first produces increasing activity (recognized), the second does not. (adapted from [Euliano & Príncipe, 1995])

Fig. 52 - 1D mapping of a 2D input space. On the left, without temporal coupling, one has exactly the normal Kohonen organization of nodes over the input patterns (dots). On the right, with temporal coupling, note that the nodes organized themselves in the L-shaped pattern zone (in the middle), which appeared in sequence. (adapted from [Euliano & Príncipe, 1995])

Fig. 53 - On the left, the input patterns were warped by 50% (higher density), which produced a corresponding warping of the mapping nodes. On the right, the winning sequences of the map nodes are shown. Note that there are periodic monotonic sequences, which correspond to the time intervals in which the L-shaped patterns are presented to the ANN. This is exactly the phenomenon that produces the coherent activity build-up. (adapted from [Euliano & Príncipe, 1995])

The possibility of integrating more than one pattern sequence is discussed and tested for two sequences. The “secret” of this modification to the traditional Kohonen network is that, instead of clustering patterns by similarity, it now clusters temporally contiguous patterns.
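One step of such a temporally coupled winner selection can be sketched as below. This is a sketch of the idea only, not the equations of [Euliano & Príncipe, 1995]: the scoring rule, coupling and decay constants are illustrative.

```python
def temporal_som_step(weights, activity, x, lr=0.2, coupling=0.5, decay=0.7):
    """One update of a 1-D SOM whose winner also depends on the residual
    activity of lattice neighbors, so temporally contiguous inputs tend to
    win at adjacent nodes (the wave-front effect)."""
    n = len(weights)
    scores = []
    for i, w in enumerate(weights):
        # Similarity to the input (negative squared distance) ...
        sim = -sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        # ... boosted by the recent activity of the lattice neighbors.
        nbr = sum(activity[j] for j in (i - 1, i + 1) if 0 <= j < n)
        scores.append(sim + coupling * nbr)
    win = scores.index(max(scores))
    # Standard Kohonen update for the winner and its immediate neighbors.
    for i in range(max(0, win - 1), min(n, win + 2)):
        weights[i] = [wi + lr * (xi - wi) for wi, xi in zip(weights[i], x)]
    # Activity decays everywhere and is refreshed at the winner: a coherent
    # input sequence therefore builds a travelling front of activity.
    activity[:] = [a * decay for a in activity]
    activity[win] += 1.0
    return win
```

With `lr=0` (frozen weights) one can see the coupling alone at work: after node 1 wins, its residual activity biases the next win towards an adjacent node, which is what distinguishes coherent from non-coherent sequences in Fig. 51.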

Comments: Very interesting way of integrating time into a self-organizing network.


VI-2.6 Other models and implementations

• [Arkin, 1987, 1989] proposes a reactive model similar to the subsumption architecture [Brooks, 1985b, 1986], where behaviors are called motor schemas and the stimuli perceptual schemas. For each activated motor schema, several perceptual schemas are generated, which in turn trigger other motor schemas. Competition is used to select the next motor schema.

• [Gat, Desai, Ivlev, Loch & Miller, 1994] use a programming language specifically conceived for reactive control through behavioral modules as in [Brooks, 1985b, 1986], called ALFA (A Language For Action). Instead of inhibiting each other, behavioral layers just give their opinions or add information to the others. In other words, in ALFA there exist different layers of abstraction, instead of layers of hard-limiting functionality as in [Brooks, 1985b, 1986]. With two 68HC11 [Motorola] microcontrollers equipped with 3.5 KByte of EEPROM (Electrically Erasable Programmable Read-Only Memory) and 100 Bytes of Random Access Memory (RAM), an AA called TOOTH was constructed. Later, ROCKY III and ROCKY IV were implemented, to demonstrate behavioral robustness for the exploration of planetary surfaces. Both of these platforms had 6 wheels and 20 KByte of program memory.

• [Rao & Fuentes] implemented an AA that learned motor responses to sensory inputs in an unsupervised manner, for simple navigational tasks. A three-layered ANN with competitive Hebbian learning was used. Similarities to the cerebellum were pointed out, in that the AA learns from an unstructured environment.

• [Brooks, 1985a] pointed out the problem of accumulated positional errors in AA that try to travel long distances in a global map. Such an AA would not be able to compensate for those errors unless some VL existed in the environment. An interesting idea was presented, consisting of the construction of an elastic and flexible map that would be able to expand or shrink to compensate for errors, but no implementation of this kind of map was ever proposed.

• [Nehmzow & Smithers, 1990] tried an implementation of an AA which organized sensory input vectors in a Kohonen ANN [Kohonen, 1982, 1988]. The purpose was for this ANN to find correlations in those inputs and learn from them, so as to be able to locate itself within the structured environment. One curious fact was that, when those input vectors were not pre-processed and were fed directly into the ANN, no useful activity patterns emerged from the chaotic input sequences. This means that those inputs had too little structure for the ANN to organize them conveniently. By detecting concave and convex corners and feeding this pre-processed information into the ANN, it was capable of organizing itself consistently. The new input vectors contained the present and past corner types, as well as the distance traveled between them. This result suggests that superior nervous centers may only work correctly at a more abstract level. Like bees, this AA generates an internal map representation out of snapshots of the environment.

• [Schöner, 1995] [Schöner & Dose, 1992] implemented dynamical systems through the use of simple differential equations, where the concepts of attractors and repellors, as well as bifurcations and intelligent decisions, are present. AA controlled by this mechanism show an efficient way of avoiding obstacles.
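The attractor/repellor heading dynamics of the last bullet can be sketched with a single differential equation for the heading angle. The functional forms and constants below are illustrative, in the spirit of [Schöner & Dose, 1992], not their exact equations.

```python
import math

def heading_rate(phi, goal_dir, obstacle_dirs, k_goal=1.0, k_obs=2.0, sigma=0.6):
    """Rate of change of the heading phi (radians): the goal direction acts
    as an attractor, each obstacle direction as a local Gaussian repellor.
    All gains and the repellor width are illustrative."""
    # Attractor: pulls phi towards goal_dir (zero and stable when aligned).
    d_phi = -k_goal * math.sin(phi - goal_dir)
    # Repellors: push phi away from each obstacle direction, with a range
    # limited by the Gaussian window of width sigma.
    for psi in obstacle_dirs:
        d_phi += k_obs * (phi - psi) * math.exp(-((phi - psi) ** 2) / (2 * sigma ** 2))
    return d_phi
```

Integrating `phi += heading_rate(...) * dt` steers the agent towards the goal while bending the trajectory around obstacle directions; when an obstacle lies exactly on the goal direction, the repellor creates the bifurcation (left-or-right decision) that the authors describe.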

VI-3. Navigation through artificial intelligence

Beyond the systems that try to model real biological mechanisms existing in animals, strict AI methods are also used for the purpose of creating beings capable of performing localization, recognition and navigation. Although very algorithmic and sequential, this does not mean that the models presented here have no biological inspiration at all, just much less than the previous ones. Some models even use ANN to perform their tasks, though not in a biologically inspired way.

III - Existing theories, models and implementations

54

VI-3.1 Global map through related local maps

Keywords: Local related maps, global map of relations, 3-D sensors, view comparison.

[Asada, Fukui & Tsuji, 1990] present a practical implementation of a global map of the environment through multiple inter-related local maps. A camera extracts segment features from a laboratory environment, on a horizontal image band near the floor. A reliable reference point is selected which is used for successive image comparison. A global map relates successive local maps of views.

Fig. 54 - RIGHT - Illustration of the building blocks of the whole mapping system. The view mapper transforms estimated information about contours in each image into a view from the top. The view comparator integrates successive views into a local map, according to a common reference point which is centered on the current object. When this reference is lost, a new local map is initiated and updated according to a new common reference point. The map integrator associates this sequence of local maps through simple geometric relations between them. The global map is this sequence of related local maps. Successive view comparisons are aided by shaft-encoder information. LEFT - Illustration of the update of a local map through three successive views a), b) and c). The selected reference point (O1) was the foremost corner of the first obstacle. The resulting local map d) shows the areas occupied by the image segments (AB) and (CD) as the AA moved along. When (O1) falls out of the view, the corner (S) may be selected as the next reference point. At the same time, the map integrator links the local maps through vector (R) in e). (adapted from [Asada, Fukui & Tsuji, 1990])

Comments: Artificial comparisons, artificial global centralized map, objects must be polyhedral, floor must be plane.

VI-3.2 Exploration as a graph construction

[Dudek, Jenkin, Milios & Wilkes, 1991] propose that environment exploration can be done by the construction of graphs, where no MI such as distance or orientation is assumed. It is assumed that the AA is capable of traveling along the branches of the graph (labyrinth corridors), of recognizing when it reaches a vertex (crossing of corridors), and of enumerating the branches that are incident on that vertex.


Fig. 55 - Illustration of the exploration graph. It is a non-oriented graph, where new vertices and branches are added as the AA explores. These branches are seen as yet to be explored. Formally, the algorithm keeps an already explored sub-graph and a group of branches yet to be explored. It is further demonstrated that the above task is not solvable without any marker or signaling. The AA is therefore equipped with markers which can be dropped on the floor and picked up again. A 3-D exploration algorithm is presented which is tested in 3-D mazes. The use of markers can be seen as a need to disambiguate between similar places (same-looking junctions with same-looking corridors going out), as in [Kuipers & Byun, 1987] [Kuipers & Levitt, 1988], but with a more general sense. Here, the markers are used to recognize the arrival at a particular vertex (labyrinth junction). (adapted from [Dudek, Jenkin, Milios & Wilkes, 1991])

Comments: Artificial use of markers, functions only for labyrinth-like environments, no metric information present at all.
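The bookkeeping described above, an already-explored sub-graph plus a set of branches yet to be explored, can be sketched as follows. This is a simplified Python sketch under the assumption that revisited vertices can be recognized (which, as noted, is exactly what the markers are for):

```python
def explore(graph, start):
    """Frontier-based exploration of a non-oriented graph: keep the explored
    sub-graph (vertices and edges) plus the branches still to be followed.
    `graph` is an adjacency dict mapping each vertex to its neighbors."""
    explored_vertices = {start}
    explored_edges = set()
    frontier = [(start, nbr) for nbr in graph[start]]   # branches yet to explore
    order = [start]
    while frontier:
        u, v = frontier.pop()
        edge = frozenset((u, v))
        if edge in explored_edges:
            continue                      # branch already traversed
        explored_edges.add(edge)
        if v not in explored_vertices:    # arrived at a new vertex
            explored_vertices.add(v)
            order.append(v)
            frontier.extend((v, w) for w in graph[v])
    return order, explored_edges
```

Running it on a small labyrinth graph visits every junction and traverses each corridor once.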

VI-3.3 Terrain acquisition through continuous movement

Keywords: Landmark circumvention.

[Lumelsky, Mukhopadhyay & Sun, 1990] propose two algorithms for the acquisition of an unknown terrain with VL of random sizes and shapes, through the planning of continuous movement.

The first algorithm is called “Sightseer”, where the terrain may be finite or infinite and the VL are mutually visible. The AA explicitly visits the closest VL, circumvents it, marks it as visited in a map, and proceeds to the next closest unvisited VL. This goes on until all VL have been visited. The second algorithm is called “Seed Spreader”, where the terrain must be finite but the VL do not need to be mutually visible. In the latter, it is assumed that the terrain is rectangular, and the AA executes “come-and-go” trajectories. Landmarks are annotated in the map as they appear.

As opposed to the “piano movers” problem [Schwartz & Sharir, 1983], the VL can have any shape. However, this work does not specify how those VL are recognized or classified. The efficiency of these algorithms is evaluated according to the total traveled distance needed to achieve complete mapping.
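The greedy schedule at the heart of “Sightseer” can be sketched as below, with landmarks reduced to points and circumvention abstracted away (both simplifying assumptions); note that it also returns the total traveled distance, the efficiency measure used by the authors:

```python
import math

def sightseer(start, landmarks):
    """Greedy 'Sightseer' schedule: repeatedly head for the closest unvisited
    landmark (reduced here to a point), mark it visited, and continue until
    every landmark has been visited."""
    pos, tour, travelled = start, [], 0.0
    unvisited = list(landmarks)
    while unvisited:
        nxt = min(unvisited, key=lambda p: math.dist(pos, p))
        travelled += math.dist(pos, nxt)
        unvisited.remove(nxt)
        tour.append(nxt)
        pos = nxt
    return tour, travelled
```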

Fig. 56 - Here is an example of the “Sightseer” algorithm. On the left are the trajectories of the AA, whereas on the right one can see the resulting complete map with 11 VL. This map has a graph form, where MI is registered. (adapted from [Lumelsky, Mukhopadhyay & Sun, 1990])

Comments: Landmarks with any shape, need of landmark circumvention, environmental constraints, visibility constraints, no real implementation tried.


VI-3.4 Quad-trees for map construction

Keywords: Hierarchical decomposition of space and obstacles.

In this algorithm, [Zelinsky, 1992] envisages the planning of trajectories towards a goal in a known environment. The AA maps the environment and its VL only at the necessary level of granularity, where sensory scanning is achieved by touch sensors.

Initially, all space is treated as being empty, until the AA “collides” with the VL positions and marks them on the map as being occupied. Afterwards, a new trajectory is planned towards the goal. This repeats until the goal is reached.

Fig. 57 - In the topmost picture, one can see an illustration of the distance transform [Jarvis & Byrne, 1986] which is used in this algorithm. Space is divided into a uniform grid, and distance to the goal is computed spreading out from the goal, like a “mathematical wavefront”. To compute the optimum path to the goal, one just needs to descend this “distance gradient” from the start position. The distance transform also works correctly for multiple goals (middle picture): the nearest goal can easily be reached starting from any point in space. Ambiguity between straight and diagonal transitions is solved by attributing the Euclidean distance of √2 to the latter ones. To avoid using floating-point numbers, the values of 1.000 and 1.414 are scaled to 3 and 4, respectively (bottommost picture). The distance transform applied here [Jarvis & Byrne, 1986] has the disadvantage of relatively low efficiency, in that the planning algorithm spends much time covering the entire space. (adapted from [Zelinsky, 1992])
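The scaled-integer wavefront of Fig. 57 can be sketched as follows. The grid encoding and goal list are illustrative assumptions; costs of 3 per straight step and 4 per diagonal step are propagated outward from the goal cells, and the same code works unchanged for multiple goals:

```python
import heapq

def distance_transform(grid, goals):
    """Wavefront distance transform on a uniform grid, using the scaled
    integer costs from the text: 3 per straight step, 4 per diagonal step
    (approximating 1 and sqrt(2) without floating point). Cells with value 1
    in `grid` are obstacles; `goals` is a list of (row, col) goal cells."""
    rows, cols = len(grid), len(grid[0])
    INF = float('inf')
    dist = [[INF] * cols for _ in range(rows)]
    pq = [(0, r, c) for (r, c) in goals]
    for _, r, c in pq:
        dist[r][c] = 0
    heapq.heapify(pq)
    steps = [(-1, 0, 3), (1, 0, 3), (0, -1, 3), (0, 1, 3),
             (-1, -1, 4), (-1, 1, 4), (1, -1, 4), (1, 1, 4)]
    while pq:
        d, r, c = heapq.heappop(pq)
        if d > dist[r][c]:
            continue                      # stale queue entry
        for dr, dc, cost in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                if d + cost < dist[nr][nc]:
                    dist[nr][nc] = d + cost
                    heapq.heappush(pq, (d + cost, nr, nc))
    return dist
```

The optimum path from any start cell is then obtained by stepping to whichever neighbor has the smallest stored value, i.e. by descending the “distance gradient”.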

Fig. 58 - Because of the above lack of efficiency in the distance transform, the final algorithm presented here uses a hybrid data structure: a quad-tree*. This structure efficiently adapts to the irregularities of the environment with respect to the presence of VL. Each 2-D spatial zone has its own quad-tree. Only when a VL is present are those zones fragmented, to adjust to the VL’s size and shape. Free zones are represented by white squares, while occupied ones are represented by black squares. VL can have any size and shape, although they are sampled into squares. (adapted from [Zelinsky, 1992])

Comments: The AA must know its absolute position in space, simple algorithm, sampled landmarks of any size and shape.

* A quad-tree is a tree data structure, where each node has exactly four branches.
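A minimal sketch of this hierarchical decomposition, assuming a square occupancy grid of power-of-two size: a region that is entirely free or entirely occupied becomes a single leaf, and only mixed regions are split into four quadrants, so detail is spent only where landmarks are.

```python
def build_quadtree(grid, r0, c0, size):
    """Recursively decompose a size x size region of an occupancy grid
    (1 = occupied, 0 = free) into a quad-tree: uniform regions become
    single leaves, mixed regions are split into four quadrants."""
    cells = [grid[r][c] for r in range(r0, r0 + size)
                        for c in range(c0, c0 + size)]
    if all(cells):
        return 'full'
    if not any(cells):
        return 'empty'
    h = size // 2
    return [build_quadtree(grid, r0,     c0,     h),
            build_quadtree(grid, r0,     c0 + h, h),
            build_quadtree(grid, r0 + h, c0,     h),
            build_quadtree(grid, r0 + h, c0 + h, h)]

def count_leaves(node):
    """Number of leaf regions in the tree (a measure of map compactness)."""
    return 1 if isinstance(node, str) else sum(count_leaves(ch) for ch in node)
```

A 4x4 grid with a single occupied cell needs only 7 leaves instead of 16 uniform cells, which is the memory saving the figure illustrates.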


VI-3.5 Moving array map centered on the robot

Keywords: Array-like grid map centered on the robot, shaft-encoders, obstacle position storage.

[Gonçalves, Ribeiro, Kulzer & Vaz, 1996] implemented an AA with limited dead-reckoning and mapping capabilities. It had a uniform array which was always centered on the AA and moved along with it, shifting its contents in order to remain consistent with the exterior environment. The array cells contained information about the presence or absence of obstacles.

Fig. 59 - Illustration of the basic mapping process with a 5x5 map array. After some exploration, the AA stores the map array (bottom left) where the obstacles and dangers appear. Whenever the AA rotates or shifts forward, the map array travels with it, which is why it always needs an opposite rotation or shift in order to keep the stored obstacles at the correct location relative to the AA. When an obstacle falls off the array, it is “forgotten”. Note that the movement information is given by wheel shaft-encoders with only 8 ticks per wheel revolution.

An internal dead-reckoning-like mechanism maintained a local displacement vector, i.e. a vector whose components tell whether the map array should be rotated and / or shifted. After a rotation and / or shift, the corresponding component is subtracted from this vector.

The purpose of this simple mapping scheme is to give the AA some short-term memory in the form of a local map. This local map then allows the AA to plan its movement in order to successfully and efficiently solve dead-end and trap situations. Some preliminary experiments showed that the AA was indeed able to efficiently find an opening when trapped. Also, when the only opening was temporarily closed while the AA was inspecting it, and opened again afterwards, it was curious to note that the AA really does not see that opening (already marked as a previous obstacle) and does not exit through it. Turning this short-term memory on and off, one can clearly note the difference in overall behavior.

The whole mechanism was implemented on a 68HC11 microcontroller using about 10 KByte of RAM in interpreted C code. Rotations of 45° and 90° were allowed, with optimized arithmetic and lookup tables to avoid sluggish code.
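The opposite shift and rotation of the robot-centred array can be sketched as follows. Grid size and encoding are illustrative assumptions, and the sketch is in Python for clarity, whereas the original ran as interpreted C on the 68HC11:

```python
def shift_map(grid, drow, dcol):
    """When the AA moves by (drow, dcol) cells, shift the robot-centred map
    the opposite way so stored obstacles keep their position relative to the
    AA. Cells shifted in from outside are unknown (0); cells falling off the
    edge are 'forgotten'."""
    n = len(grid)
    new = [[0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            sr, sc = r + drow, c + dcol        # source cell in the old map
            if 0 <= sr < n and 0 <= sc < n:
                new[r][c] = grid[sr][sc]
    return new

def rotate_map_cw(grid):
    """When the AA rotates 90 degrees counter-clockwise, rotate the map
    clockwise (the opposite way) to stay consistent with the exterior."""
    n = len(grid)
    return [[grid[n - 1 - c][r] for c in range(n)] for r in range(n)]
```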

Comments: Very simple shaft-encoders, simple mapping scheme, very light processor load, coarse mapping, more efficient obstacle-avoidance when compared to a memoryless method.

VI-3.6 Other implementations and theories

Here are some more implementations and theories that are interesting:

• [Arkin & MacKenzie, 1994] implemented an AA with perceptual sequencing of the algorithms in the reactive system. They refer to the importance of an AA being able to reconfigure its perception strategies, adapting them to the present situation while it is moving. The main question is “what to perceive, and when?”. Reactive control is performed with motor schemes, where perceptual schemes are chosen according to the current action (action-oriented perception). For example, it is considered a waste of resources to have long-range sensors activated when the obstacle is near and can only be correctly viewed by short-distance sensors.

Fig. 60 - Starting from an initial state, the AA transits between states according to the situations it encounters along the way. For example, after it finds the goal (state S2), it transits to state (S3), where it moves fast in the direction of the goal. After a while, it transits to state (S4), where speed is lowered for a secure docking procedure. (adapted from [Arkin & MacKenzie, 1994])

• [Tani & Fukumura, 1994] simulated a navigation learning model based on distance sensors and targeted to a goal. First, the AA is taught (with supervision) a goal-oriented task, using local sensory data. Analyzing the changes in the temporal sensory data flux, it learns to construct a neural mapping from sensory inputs to the desired action outputs, generating a hypothetical vector field oriented towards the goal. Future work will try to implement unsupervised reinforcement learning, instead of the present supervised one.

Fig. 61 - Example of the trajectories taken by the AA after supervised learning. Obstacle-circumventing trajectories were also simulated with success. (adapted from [Tani & Fukumura, 1994])

• [Fennema, Hanson, Riseman, Beveridge & Kumar, 1990] propose an AA that is capable of navigating in a partially modeled environment, through the visual processing of modeled VL features. There cannot exist non-modeled obstacles. Navigation is based on servoing at the action level of the AA: it guides itself by means of modeled external features. These features are pre-inserted in a database.

Fig. 62 - Before any action begins, a feature of a VL of the environment is selected for guidance, using knowledge from a database. Action is then performed step-wise: after each movement, the difference between the expected and the real deviation of the selected feature is used for movement parameter tuning. This results in an “action-level” perceptual servoing. (adapted from [Fennema, Hanson, Riseman, Beveridge & Kumar, 1990])


• [Whitehead & Ballard, 1990] present a reinforcement learning model that tries to overcome the problem of the environment’s state space size. Q-learning [Watkins, 1989] is used. Ambiguous perceptions, which may lead to different actions, are resolved by several limited views of the environment, from which a non-ambiguous representation is chosen.
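The Q-learning update [Watkins, 1989] at the core of such systems can be sketched on a toy deterministic world; the states, rewards, and parameters below are illustrative assumptions:

```python
import random

def q_learning(transitions, n_states, n_actions, episodes=500,
               alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning on a tiny deterministic world.
    `transitions[s][a]` gives (next_state, reward); a terminal state maps
    to itself with reward 0. Uses epsilon-greedy action selection."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(20):               # bounded episode length
            a = (rng.randrange(n_actions) if rng.random() < eps
                 else max(range(n_actions), key=lambda x: Q[s][x]))
            s2, r = transitions[s][a]
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

On a three-state chain with a single rewarded transition, the learned values converge to the discounted optimal values.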

• [Tsuji, 1986] developed an AA with 3D stereo vision that constructed a map of perspectives with extracted VL patterns out of a panoramic view of the environment. This 3D information was used to predict the localization of the VL in the perspective map.

• [Dunlay, 1988] used a perspective map with information about the heights obtained from depth images. This map was used to detect obstacles and to determine the angle and speed of movement. In this case and in the previous one, the coordinate system was centered on the AA.

• [Hebert & Kanade, 1986] analyzed depth images and constructed property maps of the observed surfaces, represented in Cartesian coordinates seen from above. The type of surface and other geometric information was thus obtained, making space segmentable into traversable and obstacle zones.

• [Daily et al., 1988] also construct a top view of the environment, based on height information, called a Cartesian elevation map. From that map, they compute a series of possible trajectories for navigation of an AA through the terrain. In this and the previous cases, a 2D map is built out of the 3D world.

• [Elfes, 1987] developed a navigation and mapping scheme based on sonar. It builds sonar maps viewed from the top and updates them with new information. In this case, 2D information is used to build up 2D maps, since integration is direct, contrary to the previous two models.

• [Asada, 1988, 1990] presented the construction of 3D maps of the local environment from distance information. Adopting the hierarchy of [Elfes, 1987], he extended it to a sensory, a local and a global map. Local maps are integrated by comparison of properties of the obstacle zones.

• [Kuipers & Levitt, 1988] propose a hierarchy composed of four spatial semantic levels: from the lowest to the highest, there is the sensory-motor level (perceptions and actions), the procedural level (actions for place finding and line following), the topological level (places, paths and their relations), and the metric level (metric relations between places and paths). In this model, a “test” procedure is used, where ambiguous places are distinguished by local exploration.

• [Kuipers & Byun, 1987] use the same previous “test” procedure for place disambiguation.

• [Schwartz & Sharir, 1983] developed an approach to motion planning, called piano movers*, which consists of computing a collision-free trajectory between the polygonal AA and the polygonal obstacles. The positions of all obstacles must be known beforehand. The performance of this type of algorithm is evaluated according to the total number of existing vertices. A related mechanism is navigation through Voronoi diagrams [Takahashi & Schilling, 1989], where the AA has to travel through the midpoints between obstacles.

• [Noreils & Chatila, 1995] study the problem of appropriate architectures for AA control. They emphasize that reactions should be adapted to the task to accomplish, instead of being the simple result of stimulus-action pairs as in [Brooks, 1985b, 1986]. They also emphasize that a purely reactive and stimulus-driven AA could never achieve its goal, due to behavior conflicts and systematic responses that could be incompatible with that goal. It is therefore necessary for the reactivity of an AA to be programmable and controllable. The AA must also have robustness and flexibility to treat errors and failures. They present an architecture divided into three levels: functional (low-level perception and action functions), control system (translates plans into tasks for the lower level to execute, and recovers from errors), and planning (generates plans to achieve goals). They believe that this approach responds better to the AA’s needs.

* This name was inspired by the fact that real piano and expensive furniture movers have to carefully plan their movements, so that no collisions occur that might damage the goods. In this kind of real problem, the position of all obstacles must be known beforehand.

• [Barto, Sutton & Anderson, 1983] [Barto & Sutton, 1981] [Watkins, 1989] [Wilson, 1987] present systems that are capable of performing simple navigation and other tasks by reinforcement.

• [Borenstein & Koren, 1991] [Guldner & Utkin, 1995] [Hwang & Ahuja, 1992] [Kim & Khosla, 1992] [Rimon & Koditschek, 1992] all present implementations based on potential vector fields or gradient fields. Here, the AA follows the directions of the vectors at its successive positions. This approach is highly artificial because these fields must be somehow globally generated.
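The field-following idea can be sketched as follows, using one common (assumed, not taken from any of the cited papers) form of attractive and repulsive terms: the AA simply moves along the local direction of the combined force.

```python
import math

def potential_step(pos, goal, obstacles, k_att=1.0, k_rep=0.5, d0=2.0):
    """One step of an artificial potential field: an attractive pull toward
    the goal plus a repulsive push from each obstacle closer than the
    influence radius d0. Returns a unit direction vector."""
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if 0 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / d ** 2
            fx += mag * dx
            fy += mag * dy
    norm = math.hypot(fx, fy) or 1.0
    return fx / norm, fy / norm

def follow_field(start, goal, obstacles, step=0.1, max_steps=500):
    """Follow the field direction at successive positions until near the goal."""
    pos = start
    for _ in range(max_steps):
        if math.dist(pos, goal) < 0.2:
            return pos, True
        ux, uy = potential_step(pos, goal, obstacles)
        pos = (pos[0] + step * ux, pos[1] + step * uy)
    return pos, False
```

The sketch also exposes the artificiality noted in the text: the field exists only because it is globally computed from known obstacle positions, and such fields are known to suffer from local minima.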

• Other implementations can be found in [Zaharakis & Guez, 1990] [Wang & Aggarwal, 1989] [Hu, Kahng & Robins, 1993].

VI-4. Other considerations

There are three main types of action that may lead to place achieving: Servoing / Guidance, Routing, and Mapping. While servoing and routing influence the actions more directly, mapping is more of a storage for future reference. Servoing / Guidance is like a classical control system that tries to maintain an output relative to a reference. Routing is like a set of instructions to get to a destination, executed in strict order no matter what the intermediate results are. Each of them has advantages and disadvantages over the others, as well as a preferred field of application.

Servoing / Guidance
  Positive: precise desired path maintenance; very little hardware required; no memory needed; comparisons are not possible; very simple and fast processing; no special coding necessary.
  Negative: cannot handle unexpected events; no flexibility; high data rate.
  Other: low-level raw data; stimulus-driven.

Routing
  Positive: little hardware required; path comparisons possible; medium data rate; fast processing; no special coding necessary; egocentric compass possible.
  Negative: imprecise desired path maintenance; cannot handle all unexpected events; little flexibility; some operational memory; errors lead to disorientation; requires precise distance and action sequences; paths must be well specified, with a correct sequence of places; shortcuts and detours are impossible; manipulations or operations between routes are impossible.
  Other: medium-level data abstraction; stimulus- and memory-driven; each route has a sub-goal.

Mapping
  Positive: can handle unexpected events; very high flexibility; low data rate (very abstracted); egocentric compass possible; disorientation can be recovered; does not require precise distance and action sequences; trajectories do not have to be precisely specified; instant position-finding from anywhere; detours and shortcuts are possible.
  Negative: imprecise desired path maintenance; well-designed underlying hardware structure required; large operational memory; slow operation; very special coding necessary.
  Other: high-level data abstraction processing; stimulus- and memory-driven; the total trajectory to the goal may be computed at once.

Tab. 5 - Based on [O’Keefe & Nadel, 1978], a distinction is made between servoing / guidance, routing and mapping. One must note that the positive aspects of mapping outweigh the negative ones, so that mapping is in general a very useful mechanism for organisms to survive. Besides, maps are among the most efficient storage devices, with very large capacity. Just think about it: exploring and storing every new path in an environment just once, an animal becomes capable of navigating to any point from any other point, starting and ending anywhere.


Fig. 63 - Example of two solutions to a place-finding problem: Routing and Mapping. On the left, one could generate a route from home to the University Campus laboratory: “go down 16th Avenue for 1 km, passing Main Street, until reaching an intersection. There you must turn right onto 13th Street and keep going for about 800 m, reaching another intersection with Museum Road, where you turn left and walk for about 1.1 km. At the next intersection, turn right and keep going for 300 m more. There you should be seeing Weil Hall, which you enter.” This is only the execution of a sequence of precise instructions, without tolerance to possible errors and misplacements. Some corrective capacity is still present, since one tries to reach some VL, but it is not sufficient in case the desired path is lost (note that it is assumed here that the observer does not have access to mapping data, in the brain or from a map). However, when using a map mechanism, in the brain or on paper, the observer can reacquire his current position when lost and recompute trajectories. Also, new trajectories like shortcuts and detours now become possible. A similar reasoning may be applied to the map on the right, which shows some English countryside.

CHAPTER IV

“Our projects never have the dimension of our dreams” - Anonymous

IV - Project development

63

After all the discussions made until now, the accumulated knowledge is going to be applied and modified to suit the project of the AA. There are several problems that must be solved and thought about before going to the actual implementation. This chapter is thus made up of ideas that will be used in the real implementation.

Ideas for implementable mechanisms (not necessarily working) are going to be briefly presented. For each problem, a correctly working mechanism is presented last, resulting from an evolving design sequence. Simplicity, minimalism, optimality, and appropriateness are issues of major importance.

VI-1. Practical ideas abstract

To have a more global idea of the existing implementation possibilities, some general choices are going to be made, concerning hardware as well as abstract mechanisms of the AA. Also, some specific considerations are made.

VI-1.1 Ideal aspects and real limitations

With the side-effect of abstracting everything that has been discussed so far, the essential and idealized aspects that will be most important for the real implementation will now be presented. These aspects include idealized structures for reaction and basic navigation mechanisms, landmark recognition, map construction and usage, etc., as well as their related problems and limitations that arise in reality. Here, the word ideal will be used in the sense that an ideal mechanism performs as well as possible in a given environment. In other words, under those environmental conditions it is the best that mechanism can do.

After each ideal mechanism is presented, suggestions will be made to minimize the effect of the corresponding limitations or problems. These suggestions aim at an implementation of an AA that will not become brittle because of those limitations and problems, being able to cope with errors and recover from them. In one word, the AA is desired to be robust. All this preliminary information is obtained by close inspection of existing implementations, theories, and models, as well as experimental information gathered by the author.

One must keep in mind that all these discussions are about a small AA platform that must perform in a more or less prepared environment, and not in a factory where everything is cluttered. This AA will perform on a flat floor on which VL with vertical walls lie around. Furthermore, as the real implementation approaches, the palette of possible directions to take gets more and more focused. These directions were explained in previous sections and therefore will simply be restated.

VI-1.1.1 Low-level reactions

• Speed and readiness - for the AA to survive in a dynamic and unknown environment with unexpected pitfalls, it must be able to react as fast as possible to those pitfalls. Ideally, the AA should take some action as soon as some attention-demanding event occurs (obstacle encountered, VL sensed, etc.). The subsumption architecture [Brooks, 1985b, 1986] provides a good approach to this fast-response reactive mechanism, and is going to be used as the basic reaction mechanism in this AA.
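The priority arbitration that characterizes such a layered reactive scheme can be sketched as below. The sensor names and behaviors are hypothetical, and the sketch reduces the architecture to its simplest expression (a true subsumption network runs its layers concurrently, with higher layers suppressing the outputs of lower ones):

```python
def subsumption_step(sensors, behaviors):
    """Minimal subsumption-style arbitration: layers are checked from
    highest priority (first) to lowest; the first layer whose trigger
    fires subsumes all the layers below it and drives the motors."""
    for trigger, action in behaviors:
        if trigger(sensors):
            return action(sensors)
    return 'cruise'                       # default lowest-level behavior

# Hypothetical layer stack: collision reflex subsumes avoidance subsumes cruise.
behaviors = [
    (lambda s: s['bumper'],          lambda s: 'back_up'),
    (lambda s: s['proximity'] < 10,  lambda s: 'turn_away'),
]
```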

VI-1.1.2 Landmark and place recognition

• Perception - the AA can perceive a VL in several ways (visually, auditorily, etc.). Using cameras provides the most flexible way to do this, but also generates information with so many degrees of freedom that it becomes too difficult, almost impossible, to implement a robust algorithm. It also requires much more computing power than other types. Instead of vision, one can use tactile information given by contact or nearness sensors. This method is equivalent to tactile object-shape recognition by a blind person, using minimal sensors, and is going to be used in this AA, where it tries to recognize VL with different shapes.

• Place ambiguity - when the AA faces ambiguous VL, it has to disambiguate with extra information. By increasing the precision of the data, one can discriminate between similar loose VL, but generalization is heavily degraded. Instead, one should use TI or another source already available in the mapping mechanism, which allows discriminating between VL without having to sacrifice generalization. This latter information could be expectation [Mataric, 1989, 1990b, 1992] or additional sensory data [Snow, 1995]. Building expectation flows into the mapping mechanism has the advantage of the AA being able to better discriminate between similar places without any additional source of information. The AA could also explore the immediate surroundings like people and animals do. To reduce the ambiguity problems, only distinguishable VL are going to be used for the real implementation of the AA.

• Recognition tolerance - ideally, the AA should try to recognize a VL as well as possible, concerning tolerance to perception errors and VL deformations. A Bayesian-like performance level*, which distinguishes different VL with the best possible “threshold”, is desirable. In the real implementation, this means that the AA should still recognize a VL as the correct one, until that VL is so distorted by the sensors that it looks more like another VL. Here, the “Bayesian threshold” is the similarity mid-point between VL. The present AA is going to achieve optimum decision capabilities, in that it simply decides for the VL it recognizes best.

• Recognition speed - ideally, the AA should be able to recognize a VL as fast as possible, with good certainty. In other words, as soon as a VL can be well distinguished from the others (above a certain threshold), it should be taken as recognized. Fatal misrecognitions could happen, just as in Humans. Experience is going to determine a good speed for the present AA.

* Here, it is not meant that the recognition algorithm will have a Bayesian structure, but that for the problem at hand it will try to give the best possible tolerance to sensory errors.
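The threshold idea in the last two points can be sketched as a winner-versus-runner-up margin test. This is a simplification of the “similarity mid-point” notion; the VL names and the margin value are illustrative assumptions:

```python
def recognize(similarities, margin=0.2):
    """Decide for a VL as soon as the best match is sufficiently better than
    the runner-up; otherwise keep sensing. `similarities` maps each known VL
    to a similarity score in [0, 1]."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    best, second = ranked[0], ranked[1]
    if best[1] - second[1] >= margin:
        return best[0]                    # confident enough: recognized
    return None                           # ambiguous: gather more data
```

A small margin gives fast but riskier decisions; a large margin gives slower, safer ones, which is exactly the trade-off the text leaves to experience.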


VI-1.1.3 Map construction and usage

• Dead-reckoning - the use of a mechanism that keeps track of the AA’s positions and displacements over large spaces is very difficult and brittle, since there are too many errors generated by wheel slippage, shaft-encoder errors, and other possible sources. These errors are generally cumulative and render this type of information useless after some time. Generally, several straight meters is still a usable distance. However, if the AA performs curves and other sinuous trajectories, going back and forth, the usable distance becomes much smaller. The use of VL by the AA can drastically increase this usable distance, since errors can always be cut off at each VL reached. This relational dead-reckoning is most probably used by animals. The present AA is going to use motor shaft-encoders for dead-reckoning.
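A minimal differential-drive dead-reckoning sketch from shaft-encoder ticks follows; the 8 ticks per revolution match the earlier description, while wheel circumference and wheel base are illustrative assumptions. It also makes the error behavior obvious: any tick miscount biases the heading, and that bias then corrupts every subsequent position update.

```python
import math

def dead_reckon(ticks, ticks_per_rev=8, wheel_circ=0.20, wheel_base=0.15):
    """Differential-drive dead reckoning from wheel shaft-encoder ticks.
    `ticks` is a list of (left_ticks, right_ticks) increments; returns the
    estimated pose (x, y, heading) in metres and radians."""
    x = y = theta = 0.0
    per_tick = wheel_circ / ticks_per_rev         # metres per encoder tick
    for lt, rt in ticks:
        dl, dr = lt * per_tick, rt * per_tick
        d = (dl + dr) / 2.0                       # distance of the midpoint
        theta += (dr - dl) / wheel_base           # heading change (radians)
        x += d * math.cos(theta)
        y += d * math.sin(theta)
    return x, y, theta
```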

• Topological information - this type of information will prevail as the basis of map construction. MI is useless without TI*. The present AA will have both, in several VL storage ANN.

• Metric information - the use of MI without any other source of complementary information, such as TI or other types of qualitative information, definitely results in an unusable AA for environments that are not precisely engineered for it. It becomes too brittle to even consider using it in a dynamic environment, where generalization and robustness are of major importance. However, for reactive and highly local actions, one can use such things as predefined motor-schemas.

• Partially visible landmarks - if a VL is big enough for the AA to see only a part of it at each time, one must build a system that somehow glues those views into one VL [Bachelder & Waxman, 1994]. Ideally, each possible view of one VL should be memorized into a unique data structure. Circumventing the whole VL is a very simple method that might work well for the tactile recognition mechanism. For identification purposes this seems to be a must, although animals might recognize the same place without necessarily having to fuse those views (different views, different but contiguous places). The present AA will always have full access to the entire VL.

• Dynamic construction - ideally, the spatial map is continuously updated while the AA explores or searches for some place. There are no specially dedicated exploration and search phases. Every time the AA finds a VL, the map is updated accordingly (a new VL originates a new PN in the map; an old VL can update slight changes to the existing PN in the map; a removed VL decrements the PN certainty of the existence of such a previous VL). The same applies to distances and angular bearings, which must also be updated as the AA goes around the environment. Updates are a difficult issue and will only be addressed theoretically.

• Correct topology - it is not necessary for the internal map to maintain a correct topology, in the sense of a direct correspondence between relative PN spatial positions in the AA’s cortex and the PF spatial positions. As in the HC, the PN may be arranged in any way, having only their synapses virtually realizing this topology. Of course, like in the HC, one can extract topologically correct information from this map, in order to build a virtual image of what the AA has in its memory. The present AA will store maps as data-structures with no direct resemblance to the real map, just as in the biological HC.

• Net-like map - ideally, the map should allow the incorporation of the necessary information and be efficient. Implementing it as a neural net, where the nodes are the PN and the links are the synaptic circuits that connect those PN, turns it into a very efficient and complete data structure to represent the environment by VL. There is no wasted memory, unless some shortcuts start to be represented as well†. A grid-like structure would be too wasteful in memory, and algorithmically it would be more difficult to extract useful direction information from it. Its uniform structure does not account for the lack of regularity of the environment. The present AA will have a complex mapping mechanism that theoretically allows for the most complex behaviors observed in animals, like shortcuts and detours.

* It is much simpler and more intuitive for a traveler to get a destination direction by means of relative directions given from interlinked sub-goals, than being given metric distances and angular bearings to maintain. While the first type of information allows the traveler to account for errors in a much more robust manner, the second one does not allow flexibility.

† First, only simple links between pairs of neighboring VL are represented. After a while, when the AA tries shortcuts, these could be represented as well. Although one can always figure out a shortcut knowing the individual simple paths, this new shortcut link can hold important information about the feasibility, danger, and more accurate displacement measures of that shortcut. Do not forget that those simple paths have errors that accumulate in the shortcut computation.

IV - Project development


VI-1.1.4 General aspects

• Decentralized knowledge - ideally, to minimize damage sensitivity and maximize processing speed, data structures should not be centralized in one processing site. Instead, they should be distributed. This is valid for reactive systems, map information, and detection and recognition systems. To have the highest possible survival rate, the AA should have some neural-like “brains”. The behavior refinement in [Payton, Rosenblatt & Keirsey, 1990] provides a means of distributing the systems at the reactive level, while the AA in [Mataric, 1989, 1990b, 1992] provides a means of distributing the map mechanism. These distributed mechanisms operate in a bilateral cooperation bond, where each one provides and receives information to and from the others to perform the tasks correctly. There is, however, a central controlling site in this AA, since it is powered by a microcontroller.

• Sensory fusion, fission and sequencing - ideally, sensory data processing should be fast, simple*, and robust. Sensory fusion calls for too much computing power, and also makes it very difficult to synthesize and analyze algorithms on the resulting mixed data stream. Sensory fission is based on sensor separation and can cause problems derived from the independence of the different sensory channels, such as behavior collision or incompatibility. Sensory sequencing might be the best choice for fast and efficient processing, and is going to be used in this AA, since it can turn on the specifically needed sensors at each moment.

• Information usage - ideally, one should be able to process each piece of information that the AA can grab from the external and internal environments with precise algorithms, allowing precise actions. This is clearly impossible and unwanted for a limited-resource AA. On one hand, too much computing power would be necessary. On the other hand, biological robustness relies on processing of highly abstracted data, to allow the animal to respond and generalize well to a changing environment. The most important thing is to choose well which abstractions and processing to make. In sensory sequencing one can choose the relevant sensors dynamically for each task [Arkin & MacKenzie, 1994]. [Payton, Rosenblatt & Keirsey, 1990] provide a good way of combining different sensory channels without the burden of heavy processing and the complexity of if-then-else clauses in inhibitory paths [Brooks, 1985b]. This architecture also allows adding some arbitration refinement, which avoids the bad consequences of information loss in the SA [Brooks, 1986]. This refinement allows the AA to be more opportunistic by having more information available at decision time.

• Physical movements - Ideally, there should be a good combination of movement accuracy and smoothness. Smoothness is important to keep mechanical strains as low as possible†. To add smoothness of movements to a simple motor command mechanism, some simple dynamics [Schöner, 1995] [Branco & Kulzer, 1995] will improve the overall movements of the present AA.

VI-1.2 Implementation aspects

[Cht. 2 diagram - sensor suite: expensive high-precision sensors (camera, sonar, laser) vs. cheap low-precision sensors (infra-red, light); active vs. passive.]

* Simple at each sub-system, but increasingly complex as abstraction increases along those sub-systems.
† In small robots like the one implemented in this work, smoothness is not very important. In bigger platforms, however, this can be of major importance to avoid mechanical breakdown.


Cht. 2 - There are several choices for the sensors’ characteristics. Although passive sensors (transducers) do not have the range limitation of active ones (transceivers), IR was chosen because of the already-built platform that is going to be explained later.

[Cht. 3 diagram - autonomous control: centralized vs. functional modules vs. behaviors.]

Cht. 3 - The AA’s autonomous control is traditionally either centralized in a single processing site, or decomposed into several functional modules (planner, environment modeler, etc.). The path followed in this work centers on the decomposition into behaviors which interact in order to get the AA’s job done.

[Cht. 4 diagram - environment mapping: uniform grid vs. tree/grid hybrid vs. tree/mesh; map information type: topological, metric, or both.]

Cht. 4 - Representation of the outer environment (mapping) can be done in grid or tree form. Furthermore, both types can be mixed, resulting in a multi-resolution representation of space. In this work, the tree type of representation is going to be used, since it is adequate for open spaces and still simple to implement. It also suits experimental HC data better. Furthermore, the information type must be both topological and metric, if an efficient mapping is desired.

[Cht. 5 diagram - reactive system: subsumption architecture vs. refined behaviors vs. sequenced states.]

Cht. 5 - The basic reactive system may be implemented according to the subsumption architecture, as a sequenced state machine, or with refined behaviors. This last one seems to be the most interesting.

[Cht. 6 diagram - computational hardware: central workstation with links vs. on-board microcontroller with host link vs. on-board microcontroller alone.]

Cht. 6 - Computational resources can be concentrated on a powerful external workstation which sends all commands and signals to the AA platform. The AA can also have its own less powerful microcontroller, without any external influences; this can be a multiprocessor or a single microcontroller. This latter AA can also have an interface link with an external host, preferentially via RF, which would send debugging data to a graphical display. This last possibility is going to be exploited in this work. Note that it is useless to build a powerful processing board, composed of many microcontrollers, to control many different tasks under central control. It is much more useful to dedicate one microcontroller to each task or to a few simple tasks. That is real parallel processing, like in the brain, without any central control.

[Cht. 7 diagram - response speed: slow vs. real-time; internal states (memory): few or none vs. extensive. Real-time with no memory = reflexive system (static reactivity); real-time with memory = dynamic reactivity.]

Cht. 7 - The reaction speed of an AA can be slow or real-time. Memory utilization can be extensive, or little to none (also called static reactivity). When the AA has both real-time and no-memory characteristics, it is called a reflexive system. When it has both memory and real-time response, it is called dynamic reactivity. In this work, the AA is going to be given both characteristics, to allow it better survival chances.

[Cht. 8 diagram - power source: external link vs. external link with backup battery vs. on-board battery.]

Cht. 8 - The computational system present on an AA must have a power source, chosen according to consumption. If there are no high-consumption devices in the system, then the AA can be powered by a battery pack mounted on the AA itself. Since this work is going to be centered on a low-power microcontroller and low-power sensing devices, a battery pack is going to suit fine. The motors are going to draw most of the consumed power, limiting the total lifetime. Nevertheless, the AA is going to be totally independent.

[Cht. 9 diagram - landmark recognition: static vs. dynamic.]

Cht. 9 - Landmark recognition can be static (vision-based image recognition) or dynamic (spatio-temporal sequence recognition). Since the latter is easier (object circumvention, for example), it is the most interesting for now.


VI-2. Basic reaction mechanism

VI-2.1 Movement smoothness

These speeds may be given directly or smoothly. As observed from a variety of implemented obstacle-avoidance algorithms, the smooth version is preferred, since the direct method originates too much jumping and jerking when backing up from a frontal obstacle. To smooth out AA movements, a very simple speed dynamics [Schöner, 1995] [Branco & Kulzer, 1995] is going to be used.

Grf. 1 - General form of the dynamics used for controlling each motor’s speed. The attractor is set by software to the desired final speed for the motor, and the dynamics slowly forces the real speed toward that value. This guarantees some controlled smoothness in motion changes. By varying the gain factor (steepness of the line), one can vary the response speed.

The corresponding differential equation that yields this dynamic behavior is:

dV/dt = λ ⋅ (Vdesired − V)

where Vdesired is the software-set final motor speed and λ is the gain factor. Transitory response speed is kept constant.

Grf. 2 - This is the resulting temporal response for the motor’s speed, due to the dynamics shown previously. The speed approaches the desired attractor value describing an exponential curve, starting at any other initial speed.
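The exponential approach to the attractor can be sketched with a simple Euler integration of the speed dynamics. This is an illustrative sketch, not the thesis code; the function name, gain and time-step values are assumptions.

```python
# Hedged sketch: one Euler step of the attractor dynamics dV/dt = gain * (v_desired - v).
# gain and dt are illustrative assumptions.

def smooth_speed(v, v_desired, gain=2.0, dt=0.05):
    """Move the real speed one step toward the software-set attractor."""
    return v + gain * (v_desired - v) * dt

# The speed climbs toward the attractor along an exponential curve:
v = 0.0
history = []
for _ in range(100):
    v = smooth_speed(v, 100.0)
    history.append(v)
```

With gain·dt = 0.1, the speed covers 10% of the remaining gap per step, reproducing the exponential response of Grf. 2.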

VI-2.2 Obstacle-avoidance

Contrary to the usual obstacle-avoidance algorithms [Branco & Kulzer, 1995], a more neuronal mechanism is used, which takes into account theoretical knowledge extracted from [Payton, Rosenblatt & Keirsey, 1990] and practical knowledge from a real implementation [Rossey, 1996].

Fig. 64 - This is an example of what we would like the AA to do in obstacle-avoidance situations of increasing danger. From left to right, the danger increases, and so does the need to avoid it harder. When the AA doesn’t see any obstacles, it just does what its plan says: going fully forward in this example. When it starts seeing an obstacle, it should also start driving only slightly away from it, since it is still at a safe distance. When the AA starts seeing it nearer, then it should readily turn more than before. When the obstacle is right in front of it, the danger is maximum and the AA should veer as hard as possible away from it to avoid collision.

Fig. 65 - A five-neuron network for each sensor can partially implement the obstacle-avoidance mechanism in this opinion-guided reaction manner, where each neuron corresponds to a preferred direction. These two networks will send opinions to a five-neuron arbiter that then drives the AA motors. Each one of the sensor neurons receives direct sensor input and normalizes it to the range [-1;+1]. This can be seen as one of the adaptation processes which occur inside a biological neuron [Kulzer & Branco, 1994]. Each neuron has a different transfer function, just like biological neurons. The output value of each neuron will be the normalized input passed through this transfer function. Each pair of neurons with the same direction correspondence feeds the corresponding arbiter neuron. The distributed arbiter could be nothing more than a simple winner-take-all network, or something smoother like contrast-enhancement and localized averaging.

Fig. 66 - These are the type of functions that can be used for the above sensory neurons. As in [Payton, Rosenblatt & Keirsey, 1990], positive values mean desired directions, and negative ones mean dangerous or undesired directions. For example, the topmost leftmost function was designed so that this neuron fires positively when the (RIGHT) sensor is very active, giving the opinion of veering hard left to avoid the obstacle. The opposite neuron (topmost rightmost) fires very negatively, so its opinion tells that veering hard right is definitely dangerous. The bottommost neurons do just the opposite for the (LEFT) sensor.


Fig. 67 - These are the motor results that come out of this neuronal architecture, for a simple left-wall avoidance. Only the left sensor neurons’ activities are shown, since the right neurons are not active at all. When the AA is approaching the wall but still doesn’t see it, there is no opinion whatsoever about what direction to take, since the competence of this obstacle-avoidance level can only judge avoidance situations. At this point, some other plan (another level of competence) must give its opinion (e.g. follow road). When the AA starts seeing something, it veers softly to the right. If the wall is even nearer, then the “veer hard-right” neuron becomes activated as well. Since there is not sufficient grain (neuron count), the AA still adopts the discrete soft-right direction. When the wall is dangerously near the AA, it veers hard-right. Note how non-winning neurons are giving their opinion too, since they could win when another level of competence is added on. These non-winning opinions are of huge importance when it comes to choosing alternative directions, because of the overall non-appropriateness of one competence level’s most active opinion.

This approach with only two levels of competence has a problem with concave corners where the AA gets stuck as expected, since it does not have the level of competence that resolves the contention between both existing levels (each one pushes to the opposite side, when in a concave corner). This was successfully enhanced by [Rossey, 1995] who added a level of competence which detected and resolved that situation. After that addition and some extra neural structure, the AA was virtually untrappable.

Fig. 68 - This level of competence has to do with the detection of situations where both previous levels have antagonistic opinions. This level receives the difference between those two sensors. Note how the functions return large values when the difference is small.

The basic rules for designing the first scrap-version of the transfer functions seem to be as follows:

• If, for a certain sensor value, you do not know what to do, then just leave it at zero. Other levels of competence will contribute with some value for this neuron index.

• If, for a certain sensor value, you know exactly what to do, then assign it a positive value.

• If, for a certain sensor value, you know exactly what NOT to do, then assign it a negative value.

• The magnitude of the non-zero values should be proportional to the assertion strength you want to give them. Normally, opinions near the most active ones should have a bell-shaped decay.

• All functions should have a normalized and centered [-1 ; +1] input and output range.

• More vitally important levels of competence will have larger weights to the arbiter than less vital ones.
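The opinion-and-arbiter scheme of Figs. 65-66 and the rules above can be sketched as follows. This is a hedged illustration, not the thesis implementation: the five direction labels, the weighted-sum arbiter and the example opinion vectors are assumptions.

```python
# Illustrative sketch of opinion-guided arbitration: each level of competence
# emits one opinion in [-1, +1] per preferred direction; a simple arbiter sums
# the weighted opinions and a winner-take-all picks the direction.

DIRECTIONS = ["hard-left", "soft-left", "straight", "soft-right", "hard-right"]

def arbiter(opinion_sets, weights):
    """Sum weighted opinions per direction neuron and pick the winner."""
    totals = [0.0] * len(DIRECTIONS)
    for opinions, w in zip(opinion_sets, weights):
        for i, o in enumerate(opinions):
            totals[i] += w * o
    winner = max(range(len(totals)), key=lambda i: totals[i])
    return DIRECTIONS[winner], totals

# Left sensor very active: the avoidance level votes for veering right and
# against veering left; a less vital "plan" level mildly prefers straight.
avoid_left_obstacle = [-1.0, -0.5, 0.0, +0.5, +1.0]
follow_plan         = [ 0.0,  0.0, +0.3, 0.0,  0.0]
direction, totals = arbiter([avoid_left_obstacle, follow_plan],
                            weights=[1.0, 0.5])
```

Note how the non-winning totals are preserved: a further level of competence added to `opinion_sets` can overturn the winner, exactly as described for Fig. 67.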

According to these rules of thumb, a wall-following competence was introduced. This level of competence had a serious problem: it was not able to resolve convex corners. This is definitely due to the robot’s sensor limitations. It shows that not every sensor distribution allows for correct behaviors, which can render any mechanism almost useless. Using only the right and the front-left sensors, with a very simple competence level, the robot was perfectly able to negotiate concave corners. Convex corners were never successfully negotiated and asked for some external help.

Due to the sensors’ limitations, and after spending quite some time trying to solve the problem, the robot was always helped to get around those convex corners and the next steps were implemented.


VI-3. Internal compass and path integration

Fig. 69 - When the AA moves with wheel speeds (SL) and (SR) for a short interval (∆T), respectively, then the distances (∆L=SL⋅∆T) and (∆R=SR⋅∆T) are traveled. With this information, the angle change (∆α) can be computed as follows.

Considering the approximation of short distances, the following computations can be made:

∆α ≈ tan ∆α ≈ (∆R − ∆L) / D    Eq. 1

V ≈ (∆R + ∆L) / (2⋅∆T)  ;  α̇ = ∆α/∆T = (VR − VL) / D    Eq. 2

Note that ∆α is measured relative to the perpendicular of the wheel distance line. To get the current angle of the AA after some random travel, it is ideally only necessary to integrate all the infinitesimal angle changes:

αcompass = ∫ dα = ∫ (dRt − dLt) / D = (1/D) ⋅ ( ∫ dRt − ∫ dLt )    Eq. 3

Noting that dRt and dLt are independent in t, this result further simplifies:

αcompass = ( ∫ dRt − ∫ dLt ) / D = (∆R − ∆L) / D    Eq. 4

This result means that the AA can keep an internal compass just by computing a number that accumulates the difference between wheel distances given by the shaft-encoders. There is no approximation; this is thus an exact compass, and this last equation is the one that will be implemented. The only additional care that the compass-angle computing code must take is to normalize the calculation between -180° and 180°. Also, it must maintain the D, ∆R and ∆L values either in shaft-encoder counts or in millimeters. In this compass mechanism, one can easily see that it does not matter in what units those variables are expressed, as long as they share the same unit. Since there will be one counter for each shaft-encoder, the counts will be useless as soon as they overflow. Using 16-bit counters, the experimental prediction is that if the AA travels continuously at full speed, those counters will remain correct for up to 15 minutes. For larger operating intervals, some extra resetting mechanism is needed.
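The Eq. 4 compass can be sketched in a few lines. This is an illustrative sketch (function and parameter names are assumptions, not the thesis code), using millimeters internally and wrapping the result into the [-180°, 180°) range mentioned above.

```python
# Hedged sketch of the internal compass of Eq. 4: heading is the accumulated
# right/left shaft-encoder difference divided by the wheel base D.
import math

def compass_deg(right_counts, left_counts, counts_per_mm, wheel_base_mm):
    """Heading in degrees from raw shaft-encoder counts."""
    d_r = right_counts / counts_per_mm   # distance traveled by right wheel
    d_l = left_counts / counts_per_mm    # distance traveled by left wheel
    alpha = math.degrees((d_r - d_l) / wheel_base_mm)
    # normalize into [-180, 180), as required by the turn-to behavior
    return (alpha + 180.0) % 360.0 - 180.0
```

As noted in the text, the units cancel: raw counts work as well as millimeters, provided D is expressed in the same unit.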


Fig. 70 - To compute the path integration vector (VT), it is necessary to integrate all infinitesimal displacement vectors (V). Each current vector has the current internal compass angle, and its magnitude is given by the mean shaft-encoder distance.

The total vector that points from the start position to the current one is given by:

VT = ∫ dVt = ( ∫ dxt ; ∫ dyt ) = ( ∫ cos αt ⋅ dVt ; ∫ sin αt ⋅ dVt )    Eq. 5

This expression cannot be further simplified to remove the infinitesimal operators, which means that this vector will only be an approximation when implemented in the AA. The AA will make as many samples of Vt as possible, summing all of them to compute the total displacement on-line:

VT ≈ ( Σt ∆Vt ⋅ cos αt ; Σt ∆Vt ⋅ sin αt )    Eq. 6
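The on-line summation of Eq. 6 can be sketched as follows. This is an illustrative sketch (names assumed): each sample projects the mean shaft-encoder distance onto the current compass heading and accumulates it into the total displacement vector.

```python
# Hedged sketch of on-line path integration (Eq. 6).
import math

def integrate_path(samples):
    """samples: iterable of (delta_right_mm, delta_left_mm, heading_deg)."""
    x = y = 0.0
    for d_r, d_l, heading in samples:
        step = (d_r + d_l) / 2.0   # mean shaft-encoder distance (Fig. 70)
        x += step * math.cos(math.radians(heading))
        y += step * math.sin(math.radians(heading))
    return x, y
```

A closed square path (four equal legs at headings 0°, 90°, 180°, 270°) integrates back to the origin up to floating-point error; on the real AA, encoder noise makes this only approximate.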

VI-3.1 Shaft-encoder problems

Since the wheel speeds can be positive or negative, and each shaft-encoder uses only one detector, there is a problem related to motor inertia when changing speeds. For example, take the extreme case of changing from 100% forward (where the counter is being incremented) to 100% backward (where the counter is being decremented): since the counting direction is directly given by the motor command, the counter is already decrementing while the wheel still goes forward as it slows down to reverse. This causes large accumulative errors in the internal compass and path integration.

Fig. 71 - On the left, the segmentation of the internal motor wheel circle. Since the shaft-encoder pad is divided into four sections [Snow, 1995], and since these are never perfectly equal, one must take a whole revolution as a count, instead of half a revolution. This avoids jitter in speed measurements. On the right, the wave-form of a stopping and reversing motor is shown. To solve the problem explained above, the counter should be given the order to reverse counting only after the corresponding wheel has reversed as well. In other words, there should be a software master-slave flip-flop kind of mechanism, where the master receives the immediate motor-direction flag, passing it to the slave only when an acceleration is detected. This acceleration-triggering signal is generated whenever the condition (∆Ti+1 < ∆Ti) holds, as shown in the picture for i=4 and i=5. This mechanism even works correctly when a motor is only slowing down and accelerating without changing direction. In this latter case, the flip-flop output remains unchanged, as desired.

Another problem that may confuse the compass and path integration is related to missed shaft-encoder pulses, caused by a deficient signal-processing circuit which extracts “1”s and “0”s from the analog signal. Much care must be taken with this circuit, to avoid unexplainable errors.
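The master-slave flip-flop of Fig. 71 can be sketched in software. This is a hedged illustration (class and method names are assumptions): the commanded direction is latched into the counting direction only when a shorter inter-pulse interval shows the wheel has actually started moving the newly commanded way.

```python
# Illustrative sketch of the master-slave direction latch described in Fig. 71.

class EncoderDirectionLatch:
    def __init__(self):
        self.master = +1     # immediate motor-command direction flag
        self.slave = +1      # direction actually used for counting
        self.prev_dt = None  # previous inter-pulse interval

    def command(self, direction):
        """Record the commanded direction: +1 forward, -1 backward."""
        self.master = direction

    def pulse(self, dt):
        """Called on each shaft-encoder pulse; dt = time since last pulse."""
        if self.prev_dt is not None and dt < self.prev_dt:
            self.slave = self.master   # acceleration detected: latch master
        self.prev_dt = dt
        return self.slave              # sign to add to the position counter
```

While the motor decelerates after a reverse command, the intervals keep growing and the slave keeps the old counting direction; only the first shorter interval (acceleration in the new direction) flips it.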

VI-3.2 Turn-to behavior

For the AA to be able to turn to a desired compass heading in an efficient way (in nature, the fittest have the better probability of surviving), it has to have a mechanism that finds the shortest turning direction.


Fig. 72 - Finding out whether the AA should turn clockwise or anti-clockwise to reach the desired headings (d1) or (d2) is not trivial, if one keeps in mind that the compass angle wraps from +180° to -180°. Here are the four possible cases, where “compass” is the current heading. The resulting formula is:

direction = sign[compass − desired] ⋅ sign( |compass − desired| − 180° ) ,  +1 ⇒ clockwise ; −1 ⇒ anti-clockwise    Eq. 7
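As a sanity check, Eq. 7 can be compared against the usual wrapped-difference formulation. This is an illustrative sketch (function names are assumptions); which turning sense each sign denotes depends on the compass sign convention, but both formulations always pick the shorter way around, including across the ±180° wrap.

```python
# Hedged sketch comparing Eq. 7 with the wrapped-difference method.

def sign(x):
    # convention for the degenerate value 0 is arbitrary here
    return 1 if x >= 0 else -1

def turn_direction(compass, desired):
    """Eq. 7: sign[compass - desired] * sign(|compass - desired| - 180)."""
    return sign(compass - desired) * sign(abs(compass - desired) - 180.0)

def shortest_turn_sign(compass, desired):
    """Reference: sign of the difference wrapped into [-180, 180)."""
    diff = (desired - compass + 180.0) % 360.0 - 180.0
    return 1 if diff >= 0 else -1
```

Both functions agree on non-degenerate headings, e.g. compass = 170°, desired = −170°, where the short turn crosses the wrap; the exactly ambiguous 180° case is undefined either way.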

VI-3.3 Go-to behavior

For the AA to be able to return home or to go to a specific location, it must have a mechanism that keeps the correct heading direction at all instants. It is not sufficient to turn the AA to the desired direction and then let it go until the distance is reached, because it would deviate from the desired heading due to wheel speed differences. The mechanism must have a servoing feedback loop where the AA always tries to minimize the vector it must travel.

Fig. 73 - When the AA engages a go to location behavior, it must correct deviations on-line. In other words, the AA performs a servoing behavior, where it successively recalculates the desired location vector from the current location vector and the last distance traveled. In this picture, this servoing mechanism is largely exaggerated for easier viewing. Note that the successive location vectors are defined from the START to the successive END positions.

The AA recalculates the location vectors in the following manner:

V2 = V1 − d1 ;  V3 = V2 − d2 ;  V4 = V3 − d3 ;  …    Eq. 8

VI-4. Landmark discrimination and recognition

VL discrimination and recognition will be implemented by some basic AI algorithms and by ANN. The most difficult part of this task is related to the classification and recognition mechanism. All other mechanisms are straightforward and conventional. A great deal of attention was given to this module, since all the higher levels of behavior depend on it. Also, only distinguishable VL are going to be used, since the non-distinguishable case would produce much more ambiguity for the AA. The distinction is going to be made on the shape of the VL.


VI-4.1 Finding behavior

Exploratory finding of a VL is implemented in the simplest way: the AA just wanders about until it sees an object in front of it. When it sees an object with either the FR, FL or FC sensors, it assumes that a VL was found.

Directed finding of a desired VL that was already mapped earlier is accomplished the same way; the only difference is that the AA knows where to go to find that particular VL, using the go-to behavior explained above. As before, the AA assumes it has found the VL when it sees an object in front of it. The necessary location vector for the VL direction determination is obtained by the more complex mapping mechanisms explained later.

Fig. 74 - In both finding processes, the AA keeps running straight until its sensors sense an object in front of it. Only front and lateral sensors are used for this purpose.

VI-4.2 Docking behavior

Fig. 75 - As soon as a VL is found, the AA rotates until its right side is parallel to the VL’s wall. As soon as it is docked with the VL, which is determined by equal distances at the laterally mounted right-side sensors, the AA starts circumventing the VL. This docking mechanism is always performed in such a way as to leave the AA with the VL to its right. This guarantees that the sampled angles and path remain the same in subsequent circumventions for the recognition mechanism.

VI-4.3 Circumvention behavior

In this work, the blind-person approach is extensively used. In other words, it was chosen that the AA would have to perform wall-following as the recognition data-gathering mechanism, since this was the most the AA could perform with its limited sensing capabilities. Animals use enhanced vision for distal VL recognition, which is more powerful but also more difficult.

Fig. 76 - After finding and docking a VL, the AA starts wall-following the VL, trying to keep a fixed distance from it. The AA negotiates convex and concave corners as it goes. For this task, the AA uses front and lateral sensors. Note that the VL is a closed and simple shape. In the case of large or open shapes, the AA would have trouble knowing where to stop. Because of this, only simply shaped VL are going to be used in this work.


Fig. 77 - If a normal circumvention* and corner detection are used (left picture), then there are problems like corner misses, limitation to polygonal VL, and a corner-sequence match that needs dynamic shifting. If an engineered precise circumvention is used (middle picture), then it is very slow and continues to have the same problems as before, solving only the missed corners. If a neural mechanism is used for recognition (right picture), then there is tolerance to missed / new corners, the VL need not be polygonal, training is very simple, and the matching process is dynamic and autonomous. For this latter mechanism, one can even use the sloppy circumvention used for normal corner detection. As explained earlier, circumvention is always done with the VL to the right. Just like in a classical control system, wall-following can elicit oscillations and overshoots if the motors respond too slowly to the commands. In other words, if the phase shift gets near -180° due to delays in the control loop, the system can even become unstable and veer away from the VL. Some care must be taken with the selected motor speeds.

VI-4.4 Classification and recognition

While wall-following the VL, the AA must somehow gather data about the VL as seen by the AA. This data is most definitely related to the VL’s shape. If only the corners and distances between them are taken as classification data [Seed, 1994] [Caselli, Doty, Harrison, & Zanichelli] [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994], then only polygonal VL could be recognized, and both the matching algorithm and the corner extraction itself would be brittle. To avoid these problems, a completely different approach must be taken. Furthermore, this method must be fast enough to recognize VL, since the survival of an AA depends on that.

Fig. 78 - Two VL with very different shapes: while landmark (A) is polygonal and has well-defined corners, landmark (B) is rounded and has neither well-defined corners nor straight walls. Conventional algorithms [Seed, 1994] [Caselli, Doty, Harrison, & Zanichelli] [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994] fail completely when trying to learn the shape of (B).

VI-4.4.1 Absolute temporal Kohonen 1-D network

Using a 1-D Kohonen network or self-organizing feature map [Kohonen, 1982, 1988], one can construct a memory structure where the neurons represent the temporally sequential positions of the AA around the VL. [Euliano & Príncipe, 1995] made a temporal 1-D self-organizing feature map where the nodes organized themselves along a temporal sequence of patterns, in a 2-D input space.

* Note that the circumvention implies the simultaneous actions of moving, scanning and registering.


Fig. 79 - In the leftmost pictures, two VL are learned for the first time as the AA circumvents them. While circumventing, 2-D spatial patterns (dots) are stored in a temporary buffer. After a turn of 360° or some other stopping criterion, the AA assumes it has circumvented the VL, and a 1-D SOFM is trained with the gathered data. The middle picture shows possible training results, with the neurons distributed around the input data. At this point, it is not yet very important to know what the criterion for the number of patterns and neurons is, because this model is still going to be modified several times. The rightmost pictures show a recognition attempt, where the AA circumvents the VL again. Of course, the path will not be exactly the same, but the patterns elicit a similar temporal sequence of winner neurons. It can easily be seen that even if the VL changes or the path deviates (to some maximum degree), the same winner sequence is obtained. In these figures, one can already see the similarity between hippocampal PN and PF: while the space between black nodes could be viewed as PF, the ANN neurons are going to be the PN that fire when the AA moves within those PF.

The trained network is stored in memory for later use. The weights of those neurons contain the 2-D coordinate values relative to the starting point, and their indexes show the sequence: 1-2-3-4-5-6-7-8-9-… This always-incrementing sequence is used to detect a match for any landmark (it is not stored). When the AA tries to recognize the VL and the path generates the winners 1-1-1-2-2-2-3-3-3-4-5-5-6-6-7-7-7-8-8-9-9-…, one can easily extract the matching sequence 1-2-3-4-5-6-7-8-9-… after discarding the repetitions. This is similar to time-warping [Neil, 19], but now more in the sense of space-warping. Note that any VL is recognized when the corresponding ANN produces the monotonically incrementing sequence 1-2-3-4-5-6-7-8-9-… If a network produces a non-monotonically incrementing sequence of winners, then the AA is not before the VL represented by that particular ANN.

There are several properties of this model that are advantageous relative to others:
• The VL can be of any shape, not only polygonal.
• Space-warping is easily performed just by extraction of repeating winners.
• The matching process is straightforward and needs only a monotony detector.
• The recognition process copes well with deviations from the previously learned path.
• The sampling rate is not critical and could be based on compass changes, traveled distance or a combination.
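The matching step, collapsing repeated winners and then checking monotony, can be sketched as follows. This is a hedged illustration (function names are assumptions, not the thesis implementation).

```python
# Illustrative sketch of the winner-sequence matcher: collapse repeats
# (space-warping), then a monotony detector signals a recognized VL.

def collapse_repeats(winners):
    """Remove consecutive duplicate winner indexes."""
    out = []
    for w in winners:
        if not out or w != out[-1]:
            out.append(w)
    return out

def recognizes(winners):
    """True when the collapsed sequence is monotonically incrementing."""
    seq = collapse_repeats(winners)
    return all(b > a for a, b in zip(seq, seq[1:]))

# 1-1-1-2-2-3-3-4 collapses to 1-2-3-4 -> a match.
# 1-2-5-3-4 is not monotonically incrementing -> no match.
```

Note that a strictly increasing check also tolerates skipped winners (e.g. 1-2-4-5), which matches the tolerance to missed corners claimed above.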

Fig. 80 - The major problem with this particular configuration lies in the fact that 2-D patterns are composed by absolute coordinates of type (x,y). This poses a problem when the AA starts circumventing the VL: it does not know where the first pattern should be sampled. If this first pattern fails, then all the patterns will have a persistent deviation and the ANN does not work at all. In other words, while the weights of neuron (1) are (0;0), of 2 are (0;1), etc., if the first pattern starts at (0;0) near neuron (6) that has weights (0;2), and assuming the same coordinate system, then the coordinates of the pattern sequence will be wrong and no correct matching can occur. On the other hand, patterns will accumulate absolute errors that rise with the distance from the first pattern.

This ANN must be modified in order to avoid this catastrophic problem, while maintaining all the other promising features. Therefore, the implementation in [Euliano & Príncipe, 1995] cannot be used directly for this particular application (aside from not allowing on-line learning). Also, this ANN should not have a specific training period where data is first gathered; that does not sound biologically plausible.

VI-4.4.2 Ring network with temporal activity propagation

The need for a neural mechanism whose matching capabilities are invariant to the circumvention starting point leads to a somewhat changed ANN. A very simple polygonal example of what is going to be done follows, just to keep explanations simple. Everything can be extrapolated to any VL shape. The model is going to be tested on three different VL.

Fig. 81 - To avoid accumulative errors, only local relational values are taken, namely the turn angles between each pair of distance segments. These segments are kept uniform in size, which means that the AA samples at equidistant points. This simplifies the ANN structure explained next. Note that the sum of all turn angles gives 360°. This is still detected by the internal compass mechanism. To avoid false detection of complete VL circumvention, the internal dead-reckoning mechanism must also detect a small residual displacement from the starting position. This prevents the AA from detecting a full circumvention of landmark (A) just before the +90° concave corner. Note that the sampled segments need not be parallel to the VL's walls, which is shown in landmark (C). Each segment spans from one point to another, even if it intersects the VL. Only when the current segment reaches the threshold distance is a new one initiated. Movement kinematics of an animal play an important role in PN firing [Muller & Kubie, 1987], and also movement between locations [Muller & Kubie, 1987] [Muller & Kubie, 1989] [Schmajuk, 1990].
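A minimal sketch of the equidistant sampling described above (Python; the function name and the chord-based simplification — leftover distance before a path vertex is discarded — are assumptions of this illustration, not the thesis implementation):

```python
import math

def resample_equidistant(path, step):
    """Return points spaced exactly `step` apart along straight chords
    of `path`, as the AA samples at equidistant travel points.  Each
    new segment spans from the last sample towards the next vertex,
    and may cut across the VL outline.  Leftover distance before a
    vertex is discarded in this simplified sketch."""
    samples = [path[0]]
    last = path[0]
    for p in path[1:]:
        d = math.dist(last, p)
        while d >= step:
            t = step / d
            last = (last[0] + t * (p[0] - last[0]),
                    last[1] + t * (p[1] - last[1]))
            samples.append(last)
            d = math.dist(last, p)
    return samples
```

For a 4x4 square circumvented with step 2, this yields 9 samples (the start point plus 8 equidistant points around the outline).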

Fig. 82 - When the AA circumvents a VL for the first time, it generates a neural network where the neurons have some similarity to the PN [O'Keefe & Nadel, 1978] in that they represent positions around the VL. At each distance segment, a new neuron is added and its synaptic circuitry contains the turning angle from the previous neuron. For example, when the turn (3) is made, a new neuron is added and its synapses will contain an "angular weight" of -90°. Here, landmark (A) starts differing from (B) only at turn (6). It is then expected that the AA can only discriminate between the two VL after it has turned at (6).

Fig. 83 - Details of the first network generation for VL (A). Each turn is represented in a synaptic link. Each "turn" is regarded as the angle formed by the last two displacement segments performed by the AA. Note that this synaptic link can be a complex circuitry and not a simple synapse. These synaptic circuits will then receive motor information during the recognition phase, where the mismatches are computed. Note also that the AA is always ahead with respect to the turn it is memorizing. This network construction is similar to the synapse sensitization in the rat's HC [O'Keefe & Nadel, 1978] [Pavlides, Greenstein, Grudman & Winson, 1988]. Phase progress proportional to the distance traveled is assured by the shaft-encoders, just like in the rat [O'Keefe & Nadel, 1978].
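The turn angles stored in the synaptic links can be sketched as the signed angle between consecutive, equal-length displacement segments; this hypothetical `turn_angles` helper (not from the thesis) also lets one check that a full circumvention sums to 360°:

```python
import math

def turn_angles(samples):
    """Signed turn angle in degrees between each consecutive pair of
    displacement segments; these become the 'angular weights' stored
    in the synaptic links of the ring network."""
    angles = []
    for a, b, c in zip(samples, samples[1:], samples[2:]):
        h1 = math.atan2(b[1] - a[1], b[0] - a[0])  # heading of first segment
        h2 = math.atan2(c[1] - b[1], c[0] - b[0])  # heading of second segment
        turn = math.degrees(h2 - h1)
        # wrap into (-180, 180]
        while turn <= -180:
            turn += 360
        while turn > 180:
            turn -= 360
        angles.append(turn)
    return angles
```

Circumventing a square counter-clockwise gives four +90° turns, totaling 360° as stated above.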


Fig. 84 - Details of the activity propagation during the recognition phase. Here, when the AA circumvents a VL, it feeds all synaptic circuits in parallel with motor information (turn information). All neurons start off with 100% activity. This activity then propagates only to the right neighbor, attenuated by a factor that depends on the mismatch coefficient. This mismatch coefficient is a multiplicative value that blocks more or less of the propagating wavefront. If there is a good turn match with the synaptic link, then the wavefront is less attenuated. If there is a large mismatch, then the wavefront is more attenuated. The topmost picture illustrates a matching sequence for landmark (A) with its principal wavefront propagating always at 100%, while the bottommost one shows two mismatches when circumventing landmark (B) and using the (A)-network. This last one will have its wavefronts heavily attenuated over time. In this example, a mismatch of 90° caused an attenuation by a factor of 2, while a mismatch of 180° corresponded to a factor of 4.
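The propagation-with-attenuation scheme of Fig. 84 can be sketched as follows, using the example's factors (a perfect match passes 100%, a 90° mismatch halves the wavefront, a 180° mismatch divides it by 4). The assignment of each stored angle to the incoming link of its neuron, and the two-level factor table, are assumptions of this sketch:

```python
def mismatch_factor(stored, observed):
    """Multiplicative attenuation: perfect turn match passes the
    wavefront at 100%; up to a 90-degree mismatch halves it; larger
    mismatches divide it by 4 (the factors used in the example)."""
    err = abs(stored - observed) % 360
    if err > 180:
        err = 360 - err
    return 1.0 if err == 0 else (2.0 if err <= 90 else 4.0)

def propagate(stored_turns, observed_turns):
    """Ring of N neurons; all start at 100% activity.  At each
    observed turn, every neuron's activity moves to its right
    neighbour, attenuated by the mismatch on that link."""
    n = len(stored_turns)
    act = [100.0] * n
    for obs in observed_turns:
        new = [0.0] * n
        for i in range(n):
            j = (i + 1) % n
            new[j] = act[i] / mismatch_factor(stored_turns[j], obs)
        act = new
    return act
```

Feeding the (A)-network its own turn sequence keeps the principal wavefront at 100%, while a single 90° mismatch halves every wavefront at that step.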

Landmark (B)

N1   N2   N3   N4   N5   N6   N7   N8
100  100  100  100  100  100  100  100
50   100  50   100  50   100  50   100
100  25   100  25   100  25   100  25
12   100  12   100  12   100  12   100
100  6    100  6    100  6    100  6
3    100  3    100  3    100  3    100
50   3    50   3    50   3    50   3
1    12   1    12   1    12   1    12
6    1    6    1    6    1    6    1
0    6    0    6    0    6    0    6
6    0    6    0    6    0    6    0
0    6    0    6    0    6    0    6
6    0    6    0    6    0    6    0
0    3    0    3    0    3    0    3
3    0    3    0    3    0    3    0
0    3    0    3    0    3    0    3
3    0    3    0    3    0    3    0

Tab. 6 - Activity propagation through all neurons, as the AA circumvents both VL. Since the responses of the (A)-network are shown for both VL, only the circumvention of landmark (A) generates a strong wavefront. For landmark (B) the wavefront dies out as soon as some distinction points are reached. The different neurons are denoted by (N) followed by the neuron index. The italic bold values indicate the principal wavefront diagonal. This principal wavefront is the one that is in the right sequence and that survives.

Landmark (A)

N1   N2   N3   N4   N5   N6   N7   N8
100  100  100  100  100  100  100  100
100  100  50   100  50   100  100  25
25   50   100  25   100  25   50   25
25   25   25   100  12   100  25   12
6    12   25   12   100  6    50   12
12   6    6    25   6    100  6    12
12   12   3    6    12   6    100  1
0    3    6    0    3    3    1    100
100  0    1    6    0    3    3    0
0    100  0    1    3    0    3    0
0    0    100  0    0    3    0    0
0    0    0    100  0    0    1    0
0    0    0    0    100  0    0    0
0    0    0    0    0    100  0    0
0    0    0    0    0    0    100  0
0    0    0    0    0    0    0    100
100  0    0    0    0    0    0    0


Fig. 85 - Visualization of the various wavefronts of the (A)-network, when both VL are circumvented separately. When landmark (A) is circumvented, the wavefront propagates through the neurons at the same displacement speed as the AA. Eight wavefronts are initially present, but only the one that is in the right sequence will survive. When landmark (B) is circumvented, all wavefronts die out and the network becomes inactive after a while. Detecting the most active wavefront among all networks in memory allows an easy method to determine the recognized VL. This can easily be done by a WTA network. Furthermore, the AA can figure out at which position of the VL it currently is, by simple inspection of the neuron index where the principal wavefront currently is. This is the position where the AA also finds itself in the VL circumvention path. The AA can then incorporate this extra information into the mapping mechanism, in order to compute trajectories from the current VL to another, starting at any point of the current one. In this simple artificial example, the AA recognized landmark (A), starting at the same first position as in the learned sequence. However, it could start at any one of the 8 positions, and the networks would still generate the same type of wavefronts with only one surviving. This will be shown in later simulations of the real implemented networks and corresponding algorithms. Note how this recognition process is similar to that of the blind person: first, she assumes that the found VL could be anything (100% activity in all ANN); while circumventing the VL, she starts assuming that some VL are more probable than others (wavefronts start to die out slowly); after circumventing a sufficient amount of the VL, she assumes it to be only one, while rejecting the others (one principal wavefront survives and all the others die out). This graph clearly shows the relation between PF and PN.
PN are added to the ANN while the AA moves within spatial regions around the VL. When the AA again circumvents the same VL, those PN fire exactly when the AA moves within the corresponding PF, just as in the biological HC. This peak activity moves from one PN to the next, as the AA passes from one PF to the other. The TR of the HC is thereby replicated, as it synchronizes the ANN activity progress from one PN to another.

This recognition mechanism allows the AA to declare a VL as recognized as soon as it is "sufficiently" discriminated from the others in memory. There is a tradeoff, of course: the AA could fail to detect a new VL because of a premature recognition assumption, but the blind person case is no better. Similarly as in the rat, where the TR starts with broad frequencies and then focuses [O'Keefe & Nadel, 1978] [Miller, 1991], initially there is a confusion of wavefronts that focuses into one single principal wavefront when the VL is discriminated.


Note that there is no artificial-intelligence if-then-else structure whatsoever; the main features of this model are the following:

• The VL can be of any shape, not only polygonal.
• Neurons are uniformly created while the AA circumvents the VL.
• The synaptic circuitry of each created neuron contains angular information.
• The need for space-warping is eliminated through the use of uniform spatial sampling.
• The matching process is straightforward and needs only a simple WTA network to select the ANN with the largest activity.
• Turn angle deviations are well tolerated (this will be shown in later simulations).
• Optimum decisions are inherent to this model, since a WTA network takes the most active network, i.e. the best match of all, even if the winning wavefront is low.
• The position in the circumvention is easily known from the principal wavefront position.
• The structure and propagation mechanism are very similar to a Hidden Markov Model* [Rabiner & Juang, 1993] in respect to the "recall" process.
• Rotations and translations of the VL do not affect the performance, just like in the rat [O'Keefe & Nadel, 1978] [Muller, Kubie & Ranck, 1987].
• Changes in the VL's turn angle arrangement change the ANN's response, which is similar in the rat [Muller & Kubie, 1987].

The major problem that arose from this new model is related to the sampling positions. The AA learned an ANN with weights given by the turn angles. Now what happens if the AA circumvents the VL starting at an intermediate position?

Fig. 86 - In this example, when the AA circumvents the VL, the corresponding ANN stores one angle sequence in its synaptic circuitry, while receiving a totally different sequence when the AA tries to recognize the same VL. This is caused by the fact that it started sampling segments at a different position, which is unpredictable, since the AA does not have any position synchronism mechanism. The VL is not recognized at all, and one must implement something that makes this recognition process invariant to the exact starting position.

VI-4.4.3 Averaging motor-cortex as a preprocessor for the 1-D network

The previous problem arises from the desirability of invariance of recognition to the sampling starting point. The recognition mechanism already handles sampling starts at different neuron positions, but does not handle intermediate deviations from this learned sequence (start positions between neurons).

The solution to this invariance problem lies in an "averaging motor-cortex" mechanism that takes a window of motor activity to compute the segments that are going to be fed to the recognition ANN.

* In the Hidden Markov Model, the probability of a given sequence of observations is obtained by the product of single observation probabilities or matches. At the end, depending on the overall match or mismatch of the input sequence with the stored values, this probability will be higher or lower. However, the model presented here does not have the problems of "null transitions" and "null state probabilities", because it was implemented in a different manner. Furthermore, training is local and simple.


Fig. 87 - Simple illustration of the advantage of having a preprocessing motor-cortex. In the top sampling sequence, the standard method is used: each new segment starts at the end of the previous one, so that each turn angle is given by the last sampled pair. Here, the -90° is missed simply because the pair that generates it is never formed. The top left picture indicates the segment starts and ends. In the bottom sequence however, instead of starting each new segment at the end of the previous segment, sub-samples are made between those start/end points. Now, a new pair of segments is generated at every new sub-sample distance. The start and end points of each segment "walk" along these sub-sampling points in a way that could be viewed as an "averaging sub-sampling window". The window size is the length of two segments. This ensures that the matching turn angles will emerge somewhere in time, with a temporal displacement of 4 in this case. This sub-sampling necessity is very similar to the phenomenon of overlapped hypercolumn receptive fields in the retina of the eye [Kulzer & Branco, 1994] [Hubel & Wiesel, 1974]. It is also similar to the need for overlapped sample segment windowing in speech recognition [Rabiner & Juang, 1993].

Fig. 88 - These are the tuning curves of the mentioned motor-cortex. Each motor-cortex neuron has its maximum response at a certain turn angle. The motor sensory area feeds these neurons with the two segments which represent the turning angle, and the motor-cortex outputs the most activity at the corresponding neuron. This motor-cortex in turn feeds the landmark ANN. Note that this motor-cortex and its tuning-curve structure is very similar to the organized columnar orientation-sensitive structure of the visual cortex [Kulzer & Branco, 1994] [Hubel & Wiesel, 1977]. There are also neurons in the postsubiculum dorsal section of the HC formation which are sensitive to the animal's head direction [Taube, Muller & Ranck, 1990a, 1990b] [McNaughton, Barnes & O'Keefe, 1983] [McNaughton, 1988]. In the visual cortex, systems like this are also called computational maps. They allow the existence of mechanisms such as lateral inhibition and facilitation that make great sense in such maps [Kulzer & Branco, 1994]. Huge amounts of continuous information streams are also efficiently processed, in that the results are rapidly output in an organized manner. This organization of unimodal data also allows higher layers to interface easily [Kulzer & Branco, 1994]. For the real implementation, cosines are going to be used, because they offer a smooth progressive roll-off of the peak when errors occur. Note that this motor-cortex resembles a fuzzy-logic mechanism with its response curves.
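A sketch of such cosine tuning curves follows; the 45° spacing of preferred angles and the clipping of negative responses to zero are assumptions of this illustration, not values from the thesis:

```python
import math

def tuning_response(preferred_deg, observed_deg):
    """Cosine tuning curve: maximum response (1.0) at the neuron's
    preferred turn angle, rolling off smoothly as the error grows;
    negative responses are clipped to zero in this sketch."""
    err = math.radians(observed_deg - preferred_deg)
    return max(0.0, math.cos(err))

# A small motor-cortex: one neuron per 45 degrees of preferred turn.
preferred = [-180, -135, -90, -45, 0, 45, 90, 135]

def cortex_output(observed_deg):
    """Population response; the most active neuron codes the turn."""
    return [tuning_response(p, observed_deg) for p in preferred]
```

For an observed +90° turn, the neuron tuned to +90° responds maximally while its neighbours respond at cos(45°) ≈ 0.71, giving the smooth roll-off mentioned above.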


Fig. 89 - Schematic illustration of the different pathways and types of information that go from the motors to recognition ANN’s. The motor shaft-encoders generate continuous motion data that converges into the sensory cortex. Here, turn angle data is output at the sub-sampling rate. The motor-cortex discriminates between different turning angles and outputs decoded “data lines” to the temporal ANN’s. These perform the actual VL recognition and learning. There is only one more problem that was generated by this sub-sampling mechanism: now, instead of generating turn angle information at the standard rate related to ANN’s neuron distances, information is generated at the higher sub-sampling rate. This causes a lack of correct activity propagation synchronization.

Fig. 90 - The top left picture shows the affected turns for this illustration. The top right picture shows the standard sampling sequence, where the activity peak propagates at the same speed as the turns are sampled (one neuron per turn/sample). The sub-sampling technique solved the matching problem, but now the activity peak must also take this into consideration, traveling more slowly along the axons. This is shown in the bottom sequence. After the matching turn (1), the peak travels along the axon during the sub-samples, until it reaches the next neuron exactly on the next matching sample. This way, the synchronization mechanism is preserved. Note that only one wavefront is shown, for simplicity. There will be as many wavefronts between each pair of neurons as there are sub-samples between them. Only one wavefront will survive, since all the others are out of synchrony with the correct sequence. This will be further demonstrated in later simulations.


Fig. 91 - Another way to look into the ANN’s creation. The two first pictures show a sequence of two neurons being sensitized. As these become active, they build up a Hebbian link with the correspondingly active motor-cortex neuron. On the right picture, one can see all the built up links between the ANN and the motor-cortex. Shaded areas indicate activated neurons.

Fig. 92 - The resolution of the AA's circumvention depends on the chosen size for the sample segments. If they are "too big", then the AA will miss sharper turns like in (α), (β) and (χ) on the left picture. "Too small" segments produce too many samples, which occupy more memory. As a rule of thumb, the segment size should roughly be the same size as the smallest segment of the VL, in this case the (χ). If the VL is round, then choose the size in such a way that the segments do not "cut" the VL too much ((α) and (β) on the right picture). Similarly, if there are too many refinement sub-samples, more processing power will be needed. On the other hand, too little refinement causes the ANN not to catch matches as easily.

When circumventing a VL, if the final principal wavefront activity is above a certain minimum recognition threshold, then the VL is recognized. However, if it is not, then a new ANN is created for this particular new VL. Also, note that in this mechanism, local turn angle errors are not accumulative.
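The recognize-or-learn decision just described might be sketched like this; the threshold value, the network naming scheme and the simple product score standing in for the wavefront mechanism are all illustrative assumptions:

```python
RECOGNITION_THRESHOLD = 10.0  # assumed minimum final principal-wavefront activity

def product_score(stored, observed):
    """Stand-in for the wavefront activity: start at 100% and divide
    by an attenuation factor per mismatched turn (2 for up to 90
    degrees of error, 4 beyond), as in the earlier example."""
    s = 100.0
    for a, b in zip(stored, observed):
        err = abs(a - b)
        s /= 1.0 if err == 0 else (2.0 if err <= 90 else 4.0)
    return s

def classify_or_learn(networks, observed_turns, score):
    """WTA over all stored networks: if the best final activity is
    above the threshold the VL is recognized, otherwise a new ANN
    is created from the observed turn sequence."""
    best_name, best_act = None, 0.0
    for name, stored in networks.items():
        act = score(stored, observed_turns)
        if act > best_act:
            best_name, best_act = name, act
    if best_act >= RECOGNITION_THRESHOLD:
        return best_name
    networks["VL%d" % (len(networks) + 1)] = list(observed_turns)
    return None  # a new landmark was learned
```

A matching circumvention returns the stored network's name; a badly mismatching one falls below the threshold and triggers the creation of a new network.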

Fig. 93 - Detail of the first and subsequent formation of the motor window vector pair. After the first pair of vectors has been formed, the motor cortex starts working normally. The motor vectors are calculated from displacement vectors as a difference. This simplifies the algorithm in that the AA just stores subsequently available displacement vectors at each sub-sampling point. Note that 9 sampling points are needed to get a pair of motor window vectors, from which the turn angle can finally be extracted. The algorithm thus always computes them from a walking window of 9 displacements. Note also that the displacement reference point can be any one.
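The walking window of 9 displacement samples can be sketched as follows; the assumption of 4 sub-samples per segment (reproducing the 9-point example) and the helper name are illustrative:

```python
def motor_window_vectors(displacements, subsamples_per_segment=4):
    """From a walking window of 2 * subsamples_per_segment + 1
    displacement samples (9 points in the example above), form the
    two motor window vectors as differences of displacement vectors:
    first-to-middle and middle-to-last."""
    w = 2 * subsamples_per_segment + 1
    assert len(displacements) == w, "window must hold exactly %d samples" % w
    first = displacements[0]
    mid = displacements[subsamples_per_segment]
    last = displacements[-1]
    v1 = (mid[0] - first[0], mid[1] - first[1])
    v2 = (last[0] - mid[0], last[1] - mid[1])
    return v1, v2
```

For an L-shaped run of 9 sub-sampled points, the pair of vectors captures the two displacement segments from which the turn angle is then extracted.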


Fig. 94 - Illustration of the real circumvention. After docking on the VL at point (•), the AA starts circumventing it until the point where the first motor window vector pair is formed. It is only after this that the AA starts classifying / recognizing the VL. Based on dead-reckoning, the AA stops in the proximity of the real VL processing starting point. Another more robust and biological stopping method was suggested in conversations with [Euliano & Príncipe, 1995], where the emergence of a second principal wavefront when circumventing the VL for the second time would be used to determine the stopping point. As said earlier, open and complex shapes are very difficult to circumvent and classify correctly. A solution could be to view these shapes as non-VL and take them as walls and obstacles as in [Caselli, Doty, Harrison, & Zanichelli] [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994] [Seed, 1994].

This motor-cortex added the feature of making the whole system completely invariant to synchronization aspects, since now the AA can start recognizing the VL from anywhere on that VL. Note, however, that this has nothing to do with the problem of equal VL in different places. All the method does is recognize a VL as being a certain one, without regard to duplicates and different absolute positions in the environment.

VI-4.4.4 Similarities with the hippocampal phenomenon

This network construction has many similarities with the internal processes of the HC, as well as with macroscopic behaviors that result from those processes. It must be kept in mind that these ANN work differently from the HC, in that they encode sequences of turning angles for only one VL at a time, whereas the HC encodes sequential VL constellations. Therefore, these comparisons are made with that "scale" difference in mind. Also note that all analogies arose because of the direct biological inspiration of the model, leaving only the motor-cortex mechanism as an added plausible biological feature in the animal's brain.

The similarities are the following:
• Place neurons - like in the HC, the synaptic circuitry of these similar PN acquires place information, in this case only turning angle data around a VL, instead of more complex sensory data [O'Keefe & Nadel, 1978]. The position where the AA currently is (PF) elicits the peak of the principal wavefront at the corresponding PN. This peak moves from one PN to the next, as the AA travels from one corresponding PF to the other. Note that, as in the biological HC, this activity propagation depends on the movements the AA makes from one PF to the next (turn angle and fixed distance).
• Theta rhythm potentiation - like the TR, which seems to successively sensitize the synaptic circuitry of PN in the rat's HC [O'Keefe & Nadel, 1978] [Pavlides, Greenstein, Grudman & Winson, 1988] and which keeps a consistent relation between traveled distance and phase progression, here the synchronization is also made by distance triggering. In other words, the TR mechanism of the HC has been directly replicated in this model of neural VL shape storage.
• Theta rhythm phase - the phase progress in the HC laminar structure is roughly proportional to the rat's traveled distance [O'Keefe & Nadel, 1978], as it is in the ANN, which is assured by the motor shaft-encoders in this AA.


• Theta rhythm frequency - when the animal explores a novel environment, the TR has a much wider range of frequencies than when the animal is already familiar with that environment [O'Keefe & Nadel, 1978] [Miller, 1991], where the spectrum gets focused. In these ANN, when circumvention of a VL begins, there are many wavefronts competing. But when a VL is recognized, there is usually only one principal wavefront present.
• Landmark rotation and translation - like in the HC [O'Keefe & Nadel, 1978] [Muller, Kubie & Ranck, 1987], it does not matter if a VL rotates or translates in space, since the ANN's operation is independent of those spatial transformations.
• Landmark arrangement - like the spatial arrangement of VL changes the HC's responses [Muller & Kubie, 1987], this ANN also changes its response if the current VL changed its arrangement of turning angles.
• Motor cortex - in a similar manner to the postsubiculum neurons in the dorsal section of the HC formation, which are sensitive to the animal's head direction in an environment [Taube, Muller & Ranck, 1990a, 1990b] [McNaughton, Barnes & O'Keefe, 1983] [McNaughton, 1988] [Redish, 1995], here the motor-cortex also gives vital directional information about the turning angles, although for a different purpose. Also, there are monkey motor cortex population vectors coding for the reaching direction, where each group of neurons has a preferred direction [Georgopoulos, Kettner & Schwartz, 1988] (in [Burgess, Recce & O'Keefe, 1994]). Finally, these motor cortex neurons could be viewed as biologically leaky integrating cells which integrate motor movements in a memory window, thus producing a turning angle trace through time. This makes the mechanism biologically plausible.
• Movement kinematics - like the movement kinematics of the animal play an important role in PN firing [Muller & Kubie, 1987] [Hetherington & Shapiro, 1993], these ANN only receive information from the motor-cortex. Also, as movement between locations may play a major role in firing [Muller & Kubie, 1987] [Muller & Kubie, 1989] [Schmajuk, 1990], these ANN rely only on the turning angle between positions and on the AA really moving (wavefront progression).

In the HC model from [Muller & Kubie, 1987], it seems plausible that the TR is synchronized to the first place of the sequence to be recognized. The model implemented here, although also inspired by that HC model, does not have any mechanism that allows for this first synchronization. This led to the need for the averaging motor-cortex and invariance to the starting position.

VI-4.4.5 Mathematical analysis

After getting a feeling for the operation of these ANN's, here are some mathematical results that shed some light on the expected experimental results. To get a general wavefront activity propagation equation, it is assumed that there is an initial activity C and that each neuron contributes with its own activity A_i. The propagated value sequence from time instants 1 through N (end of the ANN) will be:

W(0) = C
W(1) = W(0) \cdot M_1(\alpha_1) + A_1 = C \cdot M_1(\alpha_1) + A_1
W(2) = W(1) \cdot M_2(\alpha_2) + A_2 = C \cdot M_1(\alpha_1) \cdot M_2(\alpha_2) + M_2(\alpha_2) \cdot A_1 + A_2
...

where W(t) is the wavefront activity value at time instant t, and M_i(\alpha_i) is the match coefficient of the synaptic circuitry of neuron i.


This results in the general form:

W(N) = C \prod_{i=1}^{N} M_i(\alpha_i) + \sum_{i=1}^{N-1} A_i \prod_{j=i+1}^{N} M_j(\alpha_j) + A_N        Eq. 9

where W(N) is the activity when the wavefront got through the ANN once. Depending on whether Ai=0 or Ai=”noise”, and if the first component of this equation is used or not, there are three different operation variants that will be further analyzed ahead.
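The recursion behind Eq. 9 and its closed form can be checked numerically; this sketch uses arbitrary illustrative match coefficients and neuron activities:

```python
def wavefront_recursive(C, M, A):
    """W(t) = W(t-1) * M[t] + A[t], with W(0) = C (the recursion above).
    M and A are lists of match coefficients and neuron activities."""
    W = C
    for m, a in zip(M, A):
        W = W * m + a
    return W

def wavefront_closed_form(C, M, A):
    """Closed form of Eq. 9:
    C * prod(M) + sum_i A_i * prod_{j > i} M_j  (last term has empty tail)."""
    N = len(M)
    prod_all = 1.0
    for m in M:
        prod_all *= m
    total = C * prod_all
    for i in range(N):
        tail = 1.0
        for j in range(i + 1, N):
            tail *= M[j]
        total += A[i] * tail
    return total
```

Both routes give the same W(N), confirming that unrolling the recursion yields the product-and-sum form of Eq. 9.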

VI-4.4.5.1 Hidden Markov Model similarity

Note that Hidden Markov Models [Rabiner & Juang, 1993] are widespread and have many applications in sequential and temporal problems. Research is even trying to mix neural networks with these mechanisms [Bengio, et al, 1992] [Cho & Kim] and to use them where only ANN were thought to give the best results [Pieraccini, 1993] [Yang, Xu & Chen, 1994] [Zhu, 1991], in the hope of getting the best of both - Markov Models for catching the dynamics of temporal processes, and ANN for classifier front-ends. Markov Models are still a good alternative to temporal ANN.

In a Hidden Markov Model the probability of a sequence of statistically independent observations O = (o_1 … o_T), for a model λ, and for a particular state sequence q = (q_1 … q_T), is given by:

P(O \mid q, \lambda) = \prod_{t=1}^{T} P(o_t \mid q_t, \lambda) = \prod_{t=1}^{T} b_{q_t}(o_t)        Eq. 10

Since the sequence of neurons in the ANN is always the same (1-2-3-4-…-N), assuming C=1 and A_i=0, looking at observation o_t as being similar to the turning angle α_i, and looking at the probability b_{q_t} as being similar to the matching coefficient M_i, the above equation can be reduced and transformed into:

P(O \mid q, \lambda) = \prod_{t=1}^{T} b_{q_t}(o_t) = \prod_{i=1}^{N} M_i(\alpha_i)        Eq. 11

which is the first component of Eq. 9. In other words, if all neurons start at an initial activity of 1.0 and the wavefronts propagate thereafter only being affected by the matching coefficients, the ANN performs as a Hidden Markov Model, specifically as a right-only-model. Differences lie in the facts that in this ANN there are no null transitions and training is local.

Assuming \delta = \prod_{i=1}^{N} M_i(\alpha_i), the activity ratio between two ANN's A and B is given by the ratio of the overall matching coefficients, where the more discriminative the M_i(\alpha_i) are, the larger the resulting ratio:

r = \frac{\delta_A}{\delta_B}        Eq. 12

This will be the process used in the real implementation on the AA. Note that the training process is very different here. While Markov Chains are trained with multiple sequences, this ANN is trained only once. What is emphasized here is the "recall" process, which is very similar since it involves a kind of "certainty propagation".


VI-4.4.5.2 Saturating activity

Starting with Eq. 9, and assuming A_i = 1, the activity value after one circumvention will be:

W(N) = \sum_{i=1}^{N-1} \prod_{j=i+1}^{N} M_j(\alpha_j) + 1 \equiv Α        Eq. 13

and one can find out the activity value after n circumventions:

W(N) = Α
W(2N) = W(N) \cdot \delta + Α = Α \cdot \delta + Α
W(3N) = W(2N) \cdot \delta + Α = Α \cdot \delta^2 + Α \cdot \delta + Α
...
W(nN) = Α \cdot \frac{1 - \delta^n}{1 - \delta}        Eq. 14

In words, the wavefront saturates at a finite value only if the overall matching coefficient δ is less than one, that is, if there are deviations from the learned path. As n grows, the activity approaches the saturation level Α / (1 − δ). The question is how to control this saturation level to a useful value.
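The geometric-series behavior of Eq. 14 is easy to verify numerically; `Alpha` and `delta` here are arbitrary illustrative values:

```python
def activity_after_n_laps(Alpha, delta, n):
    """Unroll W((k+1)N) = W(kN) * delta + Alpha, starting at 0; after
    n laps this equals Alpha * (1 - delta**n) / (1 - delta)."""
    W = 0.0
    for _ in range(n):
        W = W * delta + Alpha
    return W

def saturation_level(Alpha, delta):
    """Limit of the activity as n -> infinity (requires delta < 1)."""
    return Alpha / (1.0 - delta)
```

With delta < 1 the unrolled recursion matches the closed form exactly and converges to the saturation level after many laps, as stated above.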

The activity ratio between ANN’s A and B is now given by:

r = \frac{Α_A (1 - \delta_B)}{Α_B (1 - \delta_A)}        Eq. 15

VI-4.4.5.3 Additive activity increase

Since making A_i = 1 does not lead to a useful model, making each neuron add matching activity A_i = M_i(\alpha_i), and always adding the activity wavefront to the next neuron, another operating mode is obtained:

W(N) = Α'
W(2N) = Α' + Α' = 2 Α'
...
W(nN) = n \cdot Α'        Eq. 16

In words, this mode of operation does not provide good discrimination because all wavefronts just keep on growing linearly, some faster than others.

The activity ratio between ANN’s A and B is now given by:

r = \frac{Α'_A}{Α'_B}        Eq. 17


VI-4.5 Last remarks

One could think that adding some competition between landmark-encoding ANN would enhance the overall recognition performance, since the most promising ANN would inhibit the others. Competition will not be used because it could lead to fatal misclassifications due to an initially poor activity start of the correct ANN, which would be killed by an initially stronger but incorrect one. Competition like WTA should only be used at the end, where the correct ANN has had time to fully activate.

A few last problems and remarks persist with this recognition model:

• Although the sub-sampling solves the problem of exact sampling positions, the surviving wavefront still needs the matching instants to be correctly synchronized. In other words, if for any reason the AA deviated from the correct path in such a way that causes an accumulating deviation of the matching instants, the principal wavefront could get weakened and another one could even try to emerge. This should be tolerated by the lateral wavefronts as explained later, but if it causes serious problems, then leaky neurons can be tried in place of the memoryless ones, trying to generate a "spreading" principal wavefront with a larger peak that still catches these deviations.
• The fact that activity always decreases leads to a less biologically plausible model, in that the winning network could have almost no activity (although much higher than the others). The ideal thing would be to have the winning network saturate at maximum activity, while the others died out. This implies some sort of simultaneous competition. Also, for Hebbian learning mechanisms to work, large activity would be a must.
• As in Hidden Markov Models, activity propagates and always decreases or remains the same. It never increases. This could be a limitation if a corner is badly detected and the correct network's principal wavefront almost dies out. This wavefront could never recover, and another network's wavefront wins. But what if this "bad" detection really corresponds to a much different corner in a different VL? Then this behavior would have been correct. In other words, one cannot look at this as a real problem.
• Contrary to fixed-point ANN like gradient recurrent networks such as Hopfield recurrent networks and Recurrent Backpropagation [Pearlmutter, 1995], where the outputs tend to a unique point after the process has begun to converge, these recognition ANN implement a system where surprises may occur. In other words, if one ANN has a large activity while all the others are much lower (leading to the thought that the VL has been recognized), it can still happen that this ANN dies out and another one survives despite its low activity (after all, it was not the expected VL).
• This VL recognition mechanism could be viewed as a microscopic version of the biological mapping mechanism in animals. In other words, instead of storing and relating PF, this circular ANN takes a smaller place sequence around a VL. If the sampling positions are places, then we have a mechanism that closely implements a model similar to that in [O'Keefe & Nadel, 1978]. This kind of ANN could also be used in macroscopic mapping (next section), where the AA could recognize sequences of VL when lost and trying to reacquire its current position. Still related to animal mapping, one could look at the traveling wavefronts as being context information. In other words, when a wavefront reaches a certain neuron inside the ANN, this means that this neuron receives an indication that it should fire next (the expected result in that context).
• The overall architecture can be viewed as an "analog shift-register" which shifts voltages (wavefronts) according to the gating values (matching). This shift-register is actually implemented in software, and is realized by the axons and neurons themselves, which have a controlled propagation speed.

IV - Project development


VI-5. Map construction

The map construction mechanisms build upon and depend on the previous landmark discrimination and recognition mechanisms. Here, only a theoretical framework will be presented, giving all necessary items, up to efficient shortcut and detour mechanisms.

VI-5.1 Basics

The chosen architecture for the environment map construction was one that represents space by a mesh of nodes or VL. In other words, VL will be interconnected by polar vectors that radiate from each one of them. Here, the ideas of [Poucet, 1993] will be closely followed.

Fig. 95 - Example of a stored landmark map mesh. As pointed out by [O’Keefe & Nadel, 1978], each pair of VL has a “bi-directional” polar vector link in-between. This way, (L2) has a commutable link to (L3). (L3) has two such links to (G) and (L4).

The bi-directional link between each pair of VL is characterized by a displacement and an angle relative to a local reference:

\vec{l}_i = \{ d_i ; \alpha_i \}    Eq. 18

Fig. 96 - Landmark (L1) has a polar displacement vector \vec{d}_i = \{ d_i ; \alpha_i \} directed towards (L2), according to the local reference (R1). (L2) also has a polar vector to (L1) (not shown), but according to reference (R2). Note that \vec{d}_{12} \neq \vec{d}_{21} because of the different references each one is related to.

As will be seen, this model really allows all the operations observed in the biological case.
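To make the representation concrete, the mesh of VL nodes linked by polar vectors (Eq. 18, Fig. 95) could be sketched in C as follows. This is only an illustrative data structure, not the actual NAVBOT code; the names (MapNode, PolarLink) and the fixed per-node link limit are assumptions.

```c
#define MAX_LINKS 8   /* assumed upper bound on links per VL node */

/* One polar vector link of Eq. 18, stored in the local reference
   frame of the node that owns it. */
typedef struct {
    double d;      /* displacement magnitude                   */
    double alpha;  /* angle relative to the local reference    */
    int    to;     /* index of the VL node this link points to */
} PolarLink;

/* One node (VL) of the landmark map mesh. */
typedef struct {
    PolarLink link[MAX_LINKS];
    int       n_links;
} MapNode;

/* Connect two VL bidirectionally, as in Fig. 95: each node stores
   its own polar vector, expressed in its own local reference. */
void map_connect(MapNode *nodes, int a, int b,
                 double d, double alpha_ab, double alpha_ba)
{
    PolarLink *la = &nodes[a].link[nodes[a].n_links++];
    la->d = d;  la->alpha = alpha_ab;  la->to = b;

    PolarLink *lb = &nodes[b].link[nodes[b].n_links++];
    lb->d = d;  lb->alpha = alpha_ba;  lb->to = a;
}
```

Note that the two halves of a link share the same magnitude but carry different angles, since each is expressed in its own node's reference, exactly the property stressed in Fig. 96.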

VI-5.2 Special operations of map construction and usage

Here is a list of plausible operations and events that may occur in the mapping mechanisms of the brain, especially the HC and related cortexes. This list abstracts many key points referred to in the reviewed papers.

• Expectancy - when an animal travels through its environment, it expects to find certain places or VL in the trajectory. PN are more likely to fire when the animal really executes motor movements to the expected place [Hetherington & Shapiro, 1993] [Eichenbaum, Wiener, Shapiro & Cohen, 1989] [McNaughton, Barnes & O’Keefe, 1983] [Wiener, Paul & Eichenbaum, 1989] [Muller & Kubie, 1987] [Muller, Kubie & Ranck, 1987] [Foster, Castro & McNaughton, 1989]. This is most probably due to the lateral synaptic circuitry that links different PN corresponding to neighboring PN [Hetherington & Shapiro, 1993]. In a certain sense, this expectancy is similar to the traveling wavefront used in the VL recognition ANN, since it pre-excites the next PN.

• Fast build-up - when an animal first explores an environment, it seems that relevant spatial information is rapidly built up [Burgess, O’Keefe & Recce, 1993, 1994]. Movements and trajectories become more precise only in a later stage. Complete TI and partial MI seem to be the first stored information.

• Shortcuts and detours - after the exploration phase, an animal is perfectly capable of finding direct shortcuts and detours to goals. It seems these paths are generated out of existent partial spatial information, as a “temporary not stored side-product”. It also seems plausible that these novel paths are subsequently stored in memory so they can also be learned with precision (they pass from shortcuts or detours to normal stored paths).

• Goal concentration - it seems evident that a person needs to stay concentrated on the goal she wants to reach when computing paths. When distracted, a person often follows a well-known or usual path, whether or not it leads to the goal. In other words, the internal path computation mechanism seems to compute all sub-paths or sub-goals on-line (as we go). In neural terms, it seems that the goal must always be spreading activity or be active in some way, to allow correct path computations. The “distracted” usual path following is similar to a feedback network [Hetherington & Shapiro, 1993] and to the fact that the TR disappears for automated navigation in usual environments. It is frequent for us to notice that we have deviated from our previously intended path to follow another one, usually a more familiar one.

Fig. 97 - Illustration of the simple feed-forward and “goal concentration” situations. When the person does not think about where she wants to go, it frequently happens that she ends up following the most usual path (S - a - b - c - G2). This is like “letting oneself go with the stream”, similar to gradient-descent mechanisms. On the other hand, if the person wants to go to (G1), she most likely has to think of (G1) to make the correct turn at (a) and follow (a - b - G1). Note that the confidence from (a) to (c) is higher than from (a) to (b). Confidences are given here by the thickness of the links.

VI-5.3 Adding a new landmark

When a new VL is found, discriminated and stored, a new node is added to the current map. This node will have a vector going out and pointing to the last VL. The magnitude will be the displacement traveled, and the phase will be relative to a locally chosen reference. It is simplest to consider this reference as corresponding to the position where the first neuron in the recognition ANN fires maximally (wavefront passing by), where the corresponding AA compass value will do as the zero reference.

The last VL will also receive the same polar vector, but inverted (rotated by 180°), to point to the currently found VL:

\vec{V}_{t+1} = \frac{\vec{V}_t + \vec{V}_c}{2}    Eq. 19

In words, the current displacement vector V_c changes the stored vector V_t to reflect VL position changes or simply displacement errors. Note that it is somewhat difficult to determine the magnitude of the displacement vector between VL. Ideally, some mechanism would determine the displacement between the VL mass centers (assuming uniform density). With another added mechanism that would tell how far the AA is from this center just before leaving the VL, it could calculate the exact displacement to the VL to find, “colliding” with it in the “find behavior”. Walls and obstacles are not considered in this theory.

VI-5.4 Updating a path

Whenever the AA finds a known VL, it updates the polar vector pointing to the last VL, and the one pointing to it as well, but only if that path had already been traveled sometime in the past. This update could be just storing the mean polar vector, computed from the existing one and the new displacement estimate.

Note that updating a map is essential to keep it usable. It’s like giving it some elasticity and flexibility [Brooks, 1985a].

VI-6. Map-based navigation

VI-6.1 Activity propagation

The present map mechanism will work through activity propagation or activity spreading, very similar to [Mataric, 1989, 1990b, 1992], where activation was also spread from the goal to the current location. More specifically, when the AA “wants” to reach a particular VL, it must “think” about it, activating the corresponding map node. This node then naturally spreads its activity over all links (synapses in the HC), reaching all connected nodes. These, in turn, will continue spreading activation. This necessity of “thinking on the goal” is in fact plausible, because of the previously observed operations people seem to perform.

Fig. 98 - Illustration of the resonance mechanism that may happen in the PN of the HC. Looping synaptic circuitry may produce a phase-locked loop (PLL) like resonance that “ties” two PN together during activity propagation. The symbolism used is depicted on the right. This resonance may be stronger or weaker, depending on the synapses’ weights. Note that this resonance could also exist through connections that link the HC with the isocortex [Miller, 1991].

Fig. 99 - Activity propagation example from a goal to a sub-goal and then to a sub-sub-goal. Starting with 100% activity at the goal, it propagates to the sub-goal by means of some kind of resonance mechanism, decaying by the link confidence factor of 0.5. This factor represents the certainty of the link being good (traversable, no danger). The sub-goal receives the propagated activity of 100% * 0.5 = 50%. This activity will be further propagated to the sub-sub-goal with 50% * 0.8 = 40%. This continues until the activity reaches the starting node. This propagation mechanism can also be seen as several wave-fronts which spread in all directions from each node.
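The numeric example of Fig. 99 can be sketched in C as a synchronous relaxation over the link confidences, where each node keeps only the strongest incident wave. This is a simplified software analogue of the resonance mechanism, not the actual AA implementation; the array sizes, names, and fixed iteration count are assumptions.

```c
#define N_NODES 5   /* assumed map size for illustration          */
#define ITERS   16  /* enough sweeps for the activity to settle   */

/* Spread activity from the goal node over the whole map.  Each node
   keeps only the strongest incident wave (winner-take-all), so the
   most active neighbour of any node points towards the goal. */
void spread_activity(double conf[N_NODES][N_NODES],
                     double act[N_NODES], int goal)
{
    int i, j, it;
    for (i = 0; i < N_NODES; i++) act[i] = 0.0;
    act[goal] = 1.0;                        /* "thinking on the goal" */

    for (it = 0; it < ITERS; it++) {
        for (i = 0; i < N_NODES; i++) {
            double best = 0.0;
            if (i == goal) continue;
            for (j = 0; j < N_NODES; j++) {
                double a = act[j] * conf[j][i]; /* decay by confidence */
                if (a > best) best = a;         /* strongest wave wins */
            }
            act[i] = best;
        }
    }
}
```

With the confidences of Fig. 99 (0.5 from goal to sub-goal, 0.8 from sub-goal to sub-sub-goal), the activities settle at 1.0, 0.5 and 0.4 respectively, matching the example. Taking the maximum rather than the sum also means that waves traveling around a closed loop of nodes arrive weakened and die out.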


Fig. 100 - Different activity propagation situations. When spreading from a sub-goal to several sub-sub-goals, normal propagation occurs, where each sub-sub-goal receives an activation that just depends on the confidence value of each link. When focusing, a sub-sub-goal receives different propagation (incident waves) from different sub-goals, where only the strongest one survives in the sub-sub-goal (winner-take-all resonance mechanism, where only the strongest loop remains active and all the others die out). In this case, the better confidence is given by the bolder link. When a wave-front happens to be propagated around a closed loop of nodes, it is clear that it will not survive when it reaches the first node, or even earlier, because it is already weakened when it reaches that node (note that this first node receives more activity from some other one). By this selective propagation mechanism, a resonance path is formed, which represents the final computed path to the desired goal. Note that this path is formed by the surviving chain of resonances, obeying the optimality rule based on the confidence values of each link. Also, before starting any resonance path goal finding, the AA must first know where it is, by recognizing any VL. Again, equal VL can produce initial confusion, which is eliminated as soon as the AA reaches some distinctive VL.

Speculations may be made about the triggering source for successive motor commands. These may be generated by the resonances themselves, in that the achieved nodes and correspondent expectancy peak (explained later) combine to trigger some motor gateway with the displacement information towards the next VL.

VI-6.2 Learning and normal path computation

Fig. 101 - When the AA first explores the environment, the VL are loosely linked to each other (low confidences). In the animal, this produces broadly ranging TR frequencies, maybe because of confusing resonances with difficult-to-lock PLLs [Miller, 1991] due to overwhelming noise. When the environment gets more explored, links get stronger and resonance loops start to consistently form. The now-familiar environment produces a narrow range of TR frequencies in the animal, maybe because resonance loops are now strong and consistent (noise is overridden).


A normal path to the goal is intrinsically computed by the activation propagation mechanism described previously. It arises as the surviving resonance chain between nodes. The AA just has to follow these surviving resonance origin nodes, until it reaches the desired goal. Detours are a side-product of this mechanism like in [Mataric, 1989, 1990b, 1992], as explained later, whereas shortcuts require extra processing.

VI-6.3 Shortcuts and detours

Fig. 102 - Taking the map example in the top-left mesh, there are several possible situations that may occur during mapping. A “normal” path would be the one defined by (S - L2 - L3 - G). The shortest shortcut would be (S - G) as shown. This type of path, since it is not “node-to-node”, requires extra processing as shown next. In case the link (L3 - G) is obstructed, a detour would be (S - L2 - L4 - L5 - G). Note that this detour mechanism does not require any extra processing: if the sub-path (L3 - G) is obstructed by some danger, the AA will simply ignore that link in the next activity propagation. The result is that a detour is inherently generated.

Fig. 103 - For the shortcut computation case, activity propagation also propagates the link vectors to the starting VL. To show the local propagation mechanism, vector (A) is assumed to already be the shortcut to the goal landmark (LN), relative to the reference of (L2). Now, (L2) must propagate this vector to (L1) in such a way that the new shortcut becomes relative to this new (L1) reference. On the right, note how the vectors are quite different from reference to reference. One cannot simply add them as if they belonged to the same reference; reference angles must be compensated. With (B) being the displacement between (L1) and (L2), one wants to compute shortcut (C). While (ρ) is the phase relative to the reference of (L2), (σ) is relative to (L1).


First of all, the partial shortcut (A) is added to the displacement (B), obtaining the shortcut (C_{L2}) which is still relative to the reference of (L2):

\vec{C}_{L2} = \vec{A} + (-\vec{B}_{L2}) = \{ C ; \rho \}    Eq. 20

where

C = \sqrt{A^2 + B^2 + 2AB\cos(\chi - \gamma)}

\rho = \arctan\left(\frac{A\sin\chi + B\sin\gamma}{A\cos\chi + B\cos\gamma}\right)

Note that, in this case, γ is positive and χ is negative, where (χ-γ) gives the angle between vectors (A) and (B). Note also that, since vector (\vec{B}_{L2}) goes to (A), it must be inverted.

To transform this shortcut so that it becomes relative to the reference of (L1), a change must be made to its phase, to take the reference phase difference into account.

Fig. 104 - Vector diagram used for the computation of the vector (CL1) phase (σ).

From the previous diagram, one can extract that the angle from (R2) to the visible (B) is:

\angle(\vec{R}_2, \vec{B}) = \gamma + \pi    Eq. 21

So, the angle difference from (C_{L2}) to this (B) is:

\angle(\vec{C}_{L2}, \vec{B}) = \beta = \gamma + \pi - \rho    Eq. 22

From here, one gets:

\alpha - \sigma = \beta = \gamma + \pi - \rho \;\Leftrightarrow\; \sigma = \rho + \alpha - \gamma - \pi    Eq. 23

Arranging the terms, one may pin out the difference between the references:

\sigma = \rho - \left[\pi - (\alpha - \gamma)\right] = \rho - \angle(\vec{R}_2, \vec{R}_1)    Eq. 24

which is exactly the amount that must be subtracted from the original phase (ρ).


Fig. 105 - Illustration of the other way of looking at this calculation: the angle difference that goes from (R2) to (R1) is exactly γ+π-α = π-(α-γ), as shown in the previous equation.

Fig. 106 - [Touretzky, Redish & Wan] present an extremely interesting concept in terms of a neural basis for polar vector (phasor) rotation mechanisms, which could perform the previously required rotation operations for mapping. They propose an architecture called the sinusoidal array, which seems plausible to exist in the parietal cortex (plausibility for MI storage). An input phasor is neurally rotated by means of gated channels that are driven by some control input. (adapted from [Touretzky, Redish & Wan])

Note that shortcuts should only be taken when the total path confidence is above some threshold. If the shortcut is too risky a step to take at once (it is not guaranteed that the AA will find any VL on that path to reorient itself), then the AA will discard it and take a normal path to the next VL. When it gets nearer the goal, it could eventually choose a shortcut.

VI-6.4 Failure and reacquisition

When the AA fails to find a VL or loses itself in the environment, it must somehow overcome the problem and try to reacquire its current position relative to the internal map.

Fig. 107 - On the left, when the AA fails to find (L3) on the normal path (note that (L3 - G) lacked confidence after the previous detour), it just turns towards the next VL (in this case, (L4)), continuing the planned path. Accumulated trajectory errors are now larger at (L4), but are eliminated as soon as the AA finds it. On the right, after losing itself, the AA tries to find any VL. After finding (L4), it can re-plan its path to the goal. Again, note that equal VL cause confusion. This confusion can only be solved by some distinctive factor, such as a unique VL or some expectancy addition. This latter biological possibility is explained next.


VI-6.5 Expectancy addition

Expecting to see a certain VL when going to a certain place can greatly reduce ambiguities and recognition failures. The expectation idea is simple: when the AA starts its planned way towards a VL, this VL will receive an expectancy excitation as the AA approaches. This is similar to what happens in animals, where really traveled paths make important contributions to PN firing [Hetherington & Shapiro, 1993] [Eichenbaum, Wiener, Shapiro & Cohen, 1989] [McNaughton, Barnes & O’Keefe, 1983] [Wiener, Paul & Eichenbaum, 1989] [Muller & Kubie, 1987] [Muller, Kubie & Ranck, 1987] [Foster, Castro & McNaughton, 1989].

In other words, this excitation could be expressed as:

E_{ei} = \frac{1}{1 + |\vec{d}_{Li}|}    Eq. 25

This activity tends towards 0.0 when the AA moves away from the corresponding VL. When the AA approaches the VL, the value tends towards 1.0. Note that only one VL, the next one in the desired path, spreads the most expectancy activity at a time, just like in the biological HC. The others become active when this one is reached, and so on. Seen from a different point of view, imagine that the animal seeks a certain VL. Since it will probably move closer to that particular VL, other expectations will fade out and this particular VL will become active, telling the animal that it is getting closer. If the animal erroneously approaches another VL, then the desired VL neurons will not become as active, telling the animal that it missed the place, even if others become active. Note that this is not a potential field approach. Missed VL can then elicit the motivation to seek another one and shift attention to other VL neurons (PN). The centers of the VL could be used as the centers of the expectancy peaks.
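A sketch of Eq. 25 in C, together with the competition mechanism speculated in Fig. 108 that would mask out all but the strongest expectancy (the function names are illustrative, not from the actual code):

```c
/* Expectancy excitation of a VL (Eq. 25): tends to 1.0 as the AA
   approaches the VL and to 0.0 as it moves away.  `distance` is the
   magnitude of the displacement vector d_Li. */
double expectancy(double distance)
{
    return 1.0 / (1.0 + distance);
}

/* Speculated winner-take-all masking: return the index of the VL
   with the strongest expectancy, so that the other expectancies do
   not interfere with recognition biasing. */
int most_expected(const double dist[], int n)
{
    int i, best = 0;
    for (i = 1; i < n; i++)
        if (expectancy(dist[i]) > expectancy(dist[best]))
            best = i;
    return best;
}
```

Since Eq. 25 is strictly decreasing in distance, the masking simply selects the nearest VL on the planned path, while its output still varies smoothly with the real motor displacements.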

[Plot for Fig. 108: expectancy activity (0 to 1.5) as a function of radial distance]

Fig. 108 - Illustration of the expectancy mechanism in operation when the AA travels in the VL sequence (1 - 2 - 3 - 4) (on the left). Expectancy activity can be imagined as having a circular fading shape, with the peak centered on the VL in question. When the AA approaches VL (2), the displacement vector (\vec{d}_{L2}) takes successively lower magnitudes and (E_{e2}) grows. This activity can then be used to somehow facilitate the VL recognition, by biasing it relative to the others. On the bottom right, a sample curve is shown, according to the previous expectancy activity equation. Note that the activity is never zero. Note also that, if the AA happens to miss a VL it was heading to, the expectancy mechanism for all the others works independently from that failure, allowing the AA to try to head towards the next one. This is illustrated on the top right, where all VL have some distance to the AA at all times, which corresponds to different expectancy values. Eventual competition mechanisms could even mask out all but the most active expectancy, to avoid the others from interfering.

Since this expectancy mechanism depends on the distance to a VL, it is the same as saying that it depends on real motor movements performed by the AA. This is similar to the observed physiological behavior of animals. Furthermore, one could speculate about the function of vision and motor activity: visual VL contact could be the learning trigger for the motor displacements that led to that VL. Then, PN continue to fire even without visual contact, because of the motor association, but without updating possible changes in the VL position. Only visual activity can trigger this updating as well.

Note that this expectancy mechanism gives the AA a feeling of context in its environment. Context is a key feature that allows Humans to disambiguate between situations, without even noticing the ambiguity that would arise without context. Just as memory is used to improve the efficiency and survival rate of reactive systems, here context information is used to greatly improve the efficiency of the higher-level mapping system.

VI-6.6 Final aspects

During activity propagation, confidence values are considered for the spreading wave-fronts. These confidences depend on the distances between VL and on the animal’s experience of those links. For example, the confidence due to distance, called intrinsic confidence, could be defined as follows for each link:

C_{di} = \frac{1}{d_i}    Eq. 26

In words, the intrinsic confidence of each link decreases with increasing distance, which is logical to consider, since higher distances generally mean larger accumulated errors during travel. The total intrinsic confidence for a path would be:

C_T = \frac{1}{\sum_{j=1}^{N} d_j}    Eq. 27

where N is the number of links that form the path. Note that the total confidence is the inverse of the total distance, as it should be, to allow correct path comparisons in terms of distance.

The other type of confidence, called experimental confidence, is a value that ranges from 1.0 (high confidence) to 0.0 (no confidence). It depends on the animal’s own experience, like dangers, punishments, ease of travel, etc., that exist for each explored link. This confidence will be denoted by Ce.

To compute the overall confidence value of a certain shortcut, detour or normal path, it is just necessary to add up all distances and then apply the experimental confidences as well:

C = \frac{1}{\sum_i d_i} \cdot \prod_j C_{ej}    Eq. 28

This is not very interesting, because each link should have its confidence value propagated, instead of propagating and summing up the distances for an inversion at the end. It sounds neither biologically plausible nor elegant, because it is not continuous. Instead, one has to find a simple formulation that allows a continuous computation.

First, the total link confidence will be defined as follows:

C_i = \frac{C_{ei}}{d_i} = \frac{1}{d_i / C_{ei}} = \frac{1}{d_{vi}}    Eq. 29


In words, because Ce lies between 0.0 and 1.0, it acts by enlarging the actual link distance, giving rise to the virtual distance (dv). Now, the successive total path confidences CT are given by:

C_{T1} = \frac{C_{e1}}{d_1} = \frac{1}{d_{v1}}

C_{T2} = \frac{1}{\frac{d_1}{C_{e1}} + \frac{d_2}{C_{e2}}} = \frac{1}{d_{v1} + d_{v2}}

C_{T3} = \frac{1}{\frac{d_1}{C_{e1}} + \frac{d_2}{C_{e2}} + \frac{d_3}{C_{e3}}} = \frac{1}{d_{v1} + d_{v2} + d_{v3}}

Generally, we will have:

C_{Ti} = \frac{1}{\sum_{j=1}^{i} d_{vj}}    Eq. 30

To compute the next confidence C_{Ti+1} from this one, one has to relate both:

C_{Ti+1} = \frac{1}{\sum_{j=1}^{i+1} d_{vj}} = \frac{1}{\sum_{j=1}^{i} d_{vj} + d_{v,i+1}} = \frac{1}{\frac{1}{C_{Ti}} + d_{v,i+1}} = \frac{C_{Ti}}{1 + d_{v,i+1} \cdot C_{Ti}}    Eq. 31

So, to propagate the wave-front strength (confidence), each node has to compute it from the incoming wave-front strength, leading to an iterative computation method:

C_{Ti+1} = \frac{C_{Ti}}{1 + d_{v,i+1} \cdot C_{Ti}}    Eq. 32
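Under this formulation, each node only needs the incoming wave-front strength and its own virtual link distance, which can be sketched in a few lines of C (the function names are illustrative assumptions):

```c
/* Virtual distance of a link (Eq. 29): the experimental confidence
   Ce in (0, 1] enlarges the actual link distance d. */
double virtual_distance(double d, double ce)
{
    return d / ce;
}

/* One propagation step (Eq. 32): the outgoing total confidence is
   computed from the incoming one and the next virtual distance,
   with no global sum of distances ever needed. */
double propagate_confidence(double c_prev, double dv_next)
{
    return c_prev / (1.0 + dv_next * c_prev);
}
```

For a two-link path with d_v1 = 2 and d_v2 = 3, starting from C_T1 = 1/d_v1 = 0.5, one step gives 0.5/(1 + 3·0.5) = 0.2, which equals the direct value 1/(2 + 3) of Eq. 30.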

Some factors that may influence each link’s experimental confidence could be the following:

• Successful link traversals increase the confidence.

• Untraversable obstacles or unsuccessful traversals lower the confidence.

• Dangers such as holes, punishments (bumping into a black object that the IR sensors do not see), and other hazards, lower the confidence.


VI-6.7 Comments

Activation spreading starting at the desired goal node was chosen because it appears to be the most plausible. This mechanism seems to explain some interesting Human and animal behavioral observations. Since activity propagation is made at every intermediate VL the AA reaches, this mechanism is opportunistic and flexible to changes, i.e., it is not a fixed plan that the AA must follow to the end. Theoretically, this mapping strategy does not need VL to be “edge-connected” as in [Doty, Caselli, Harrison & Zanichelli] [Doty & Seed, 1994]. Knowing the reference, the AA may turn in any direction in a controlled manner.

Expectancy activity may be an indispensable tool to give the AA the notion of context in an environment. It may solve most ambiguities that arise when trying to recognize a VL. This concept of context was also important in the VL recognition ANN.

This map construction, update and usage mechanism was implemented in the present AA, but never tested due to lack of time and VL-recognition-related problems.

In a possibly far future, the efficient use of cameras for VL recognition would potentially eliminate the need for VL circumvention, allowing an AA to travel more freely in open space, instead of having to make contact with VL to orient itself.

CHAPTER V

“To dream is beautiful, but if it does not convert into reality, it is useless” - Anonymous

V - A real implementation


After all the previous considerations, studies and expositions, it’s now time to proceed to the steps of a real implementation of an AA with advanced navigational capabilities. Since this work was done incrementally, step-by-step, the most important breakthroughs and failures will be exposed here. The aim of this sequencing is to allow the reader to follow the path taken by the author to solve the different problems.

VI-1. THOMAS

The AA will be based on a previously assembled platform called “THOMAS” [Snow, 1995]. THOMAS was an AA implemented with the purpose of landmark circumvention and recognition.

VI-1.1 Initial platform

When THOMAS was handed in for the improvements made throughout this work, it already had several hardware features:

• 7 IR sensors - 2 on the left side, 2 on the right side, 1 at the front center, and 2 front cross-eyed.
• 1 Sonar - 1 sonar pair mounted on a tilt/rotate head on the front.
• 2 Motors - 2 motors at the front and 1 caster wheel at the back.
• 2 Shaft-encoders - 1 shaft-encoder per wheel.
• CPU board - 68HC11 microcontroller [Motorola] with 64 Kbyte of RAM and all necessary drivers.

Fig. 109 - Seven IR sensors are distributed around the platform. Three sensors at the front and two sensors on each side make up a good sensor suite for wall-following and obstacle-avoidance.


The 68HC11 microcontroller is a very versatile and powerful unit, and there is a variety of books about it and its applications [Driscoll, Coughlin & Villanucci, 1994] [Jones & Flynn, 1993] [Miller, 1993] [Motorola, 1991] [Peatman, 1989].

Very simple and low-cost IR sensors were used, instead of more complicated and unnecessary sensor processing like optic flow [Branco, Costa & Kulzer, 1995] [Lappe & Rauschecker, 1992] [Wang & Mathur, 1989]. Simplifications like this are fundamental, due to limited processing power.

Both motors were driven by Pulse Width Modulation (PWM), which allows the motors also to rotate at low speeds with high torque. It is important for these motors to overcome local floor irregularities. This mechanism is similar to the neural mechanism presented for the same purpose [DeWeerth, et al, 1991].

One must be very careful with the information the sensors give, because wrong or deficient operation can lead to hard-to-trace problems. After some experiments, [Rossey, 1996] concluded that his AA had trouble with the information that the cross-eyed sensors were giving: there was phantom obstacle detection due to interference between sensors. Collimating the sensors seems to be the best solution. Another serious problem is the saturation and reversal of some of those IR sensors, where the analog reading may even decrease with very near obstacles. Problems of this kind may seriously confuse the built-in behaviors and lead to unexplainable behavior.

VI-1.2 Preliminary improvements

THOMAS received several improvements before proceeding to implement the objectives of this work. The aim of these improvements was to enhance debugging and communication capabilities; they will be coarsely explained next.

• Liquid crystal display and keyboard - an LCD was mounted on top of the existing 68HC11 board, for debugging and messaging purposes. Also, for command input and debugging, a keyboard with 3 x 4 keys was mounted.

• RF transceiver - one RF emitter and one RF receiver were mounted on the AA for a Personal Computer (PC) link option. The purpose of this link is to enable on-screen debugging, command input and graphical visualization of internal mapping structures. The transmitter and receiver modules are manufactured by [Aurorel]. Their operating frequency of 433 MHz at 10 mW works fine in Portugal, and respects the requirements of the official telecommunications authorities for unlicensed RF operations. They work well at a speed of 9600 Baud, over distances that can reach more than 10 meters inside a house.

• Downloader - a new high-speed downloader was implemented in Delphi [Borland, 1996a-e], which allows downloads to be made at a speed of 115,200 Baud over a physical line. Downloads can also be executed over the RF link, but only at 9600 Baud.

• Auto-start - to avoid having to press buttons and using jumpers to select between download mode and reset, a very simple circuit and software patch were added to control this task. A single press on the reset button causes the CPU to place itself in download mode. When the program is fully loaded, the software patch causes a “controlled reset” by forcing a COP failure [Motorola, 1991]. After this artificial reset, the CPU will run the user’s initialization and main program.


Fig. 110 - Top view and photograph of the AA with the various visible enhancements.

VI-2. Implementation and simulations

The real implementations of the different modules now follow. Some simulations are also made to fine-tune the parameters of the modules.

VI-2.1 Programming platform considerations

All the code for the AA has been written with the ICC11 v3.5 C compiler / assembler package from ImageCraft [ImageCraft, 1996], in preference to the relatively slow IC (Interactive C) [Sargent & Wright, 1994], which is interpreted. Furthermore, the ICC11 compiler is much more powerful at the language level. The major features of this package are the following:

• C cross compiler for the Motorola family of HC11 microcontrollers [Motorola].

• Very powerful command-line tools for the DOS operating system, and an IDE - Integrated Development Environment for Windows.

• Full ANSI C compliance, except long types, which are still only 2 bytes wide.

• Standard library functions available, except those which do not make sense in an HC11 embedded application.

• Full floating point math support. These functions are non-reentrant, thus one must be careful when using them in interrupts and interruptible code.

The major guidelines during development were the following:

• Write code as readable as possible, using appropriate type definitions, variable and function names. Furthermore, appropriate data structures were used, which encapsulated self-contained data items for recognition and mapping. In other words, there are no external variables lying around that should be inside those structures.

• Most modules are written in C language, while the more time dependent modules are written in assembly. Also, C language was preferred in intermediate situations, where readability was important.


VI-2.2 Landmark discrimination and recognition

As explained in the previous chapter, this module will contain a 1D ring network with temporal activity propagation, and an averaging motor-cortex as a preprocessor for incoming motor information.

VI-2.2.1 Simulations

Simulations were made in MatLab© v4.2b [MathWorks, 1994] with a small difference in the motor-cortex: instead of having different tuning-curves centered around a discrete set of equidistant turning angles, the tuning-curve is centered around the tested synaptic circuitry’s learned turning angle. This simplifies the necessary activity propagation code to a single line:

Match = \max\left[0, \cos\left(\alpha \cdot (Link - Input)\right)\right]

Out = Match \cdot Wavefront    Eq. 33

where Out is the propagated neuron activity (output), which is always non-negative, Link is the previously learned (stored) turning angle in the synaptic circuitry, Input is the current motor input (turning angle), and Wavefront is the incoming wavefront activity. The constant α is a gain factor that determines the width of the tuning-curves. Almost all simulations proved α=0.6 to be a good approximation.

This approximation should give a good perspective of what will happen in the AA. Note that these simulations only provide a coarse feeling for the operation of the 1-D activity-propagating network.

To simplify visualization, the wavefront activity progress graphs will display each wavefront as a line, instead of displaying each neuron's activity, which is diagonally peaked as seen previously.

Fig. 111 - The left topmost picture shows two test VL. One is slightly offset from the other, to allow easy visualization. The principal samples have length 1. The right topmost picture shows the corresponding turning angles for each shape. These are the learned ones, assuming that the learning circumvention path started at the bottom left corner of the two VL. Since the starting sample is perfectly synchronized for this first simulation, the circumvention turning angles are also perfectly overlapped with the turning angle templates. Here, the principal samples are denoted by (+) signs. Note that the first differentiating corner appears only after circumventing more than half of the VL. One VL has 8 samples and the other has 9. There is no noise, deviation or sub-sampling yet. The middle and bottom graphs show the different wavefront activity values over time (as the AA circumvents the black VL) for each VL (left graphs for the black VL and right graphs for the dotted one). As one can see, while for the first network there is one unique wavefront that survives, in the other all of them die out. Note that while the AA is circumventing the VL, both principal wavefronts remain at maximum activity. Only when the AA passes the differentiating corners does the principal wavefront of the second network die out, as expected. This works exactly as with the blind person, who assumes that the circumvented VL could be either template until she reaches the differentiating features. Note also that the surviving principal wavefront is the leftmost one, that is, the one that started with the first learned corner (bottommost picture).


Fig. 112 - This is the view of the firing sequence of the neurons, just like in the first simulation by hand. The peak of the principal wavefront travels along the entire circular ANN.

Fig. 113 - A zoomed simulation for 2 cycles clearly shows the death of the second network's wavefront after the differentiating corners. This means that, after less than one complete circumvention, the VL are discriminated and recognition can be considered completed.

Fig. 114 - The same as before, but with a principal sampling deviation of 4. In other words, the AA started sampling 4 samples ahead, in relation to the previously learned sequence. There is still a principal wavefront with the same characteristics, but with a different index. Now, the fourth wavefront starts off synchronized, and that is why only this one survives. The other network behaves similarly to the previous case. The only difference is that its wavefronts also have a shift in index. These shifts can be easily seen in the bottom 3-D graphs.


Fig. 115 - Same as the first case, but now with refinement. No deviations yet, however. This is only to show that the mechanism still works the same way, with one principal wavefront clearly indicating that the first VL has been recognized. There are two "side-effect" wavefronts, however, that progress laterally to the principal wavefront. Because of this side-effect, the activity ratio shown at the bottom is not as good as in previous examples. Still, it allows excellent discrimination. A sub-sampling rate of four (refinement between principal samples) should be adequate for sampling distances of the length of the AA, since it does not turn much over this length.

Fig. 116 - These graphs are transversal cuts of the previous bottom graphs. Here one can better see the progress of each wavefront. Each graph contains the progress for 5 sub-sample instants, with the following indexes, from left to right, top to bottom: 1-5, 6-10, 11-15, 16-20, 21-25, 31-35, 61-65. Even when all non-principal wavefronts have died out, there still remain at least two lateral "phantom" non-principal wavefronts around the principal one. This is simply because the sub-sampling refined the transitions between pairs of turn angles, which allows lateral wavefronts to still catch some matching. This effect can play a major role in path tolerance, when the AA globally deviates from the original path. In other words, the lateral wavefronts catch small local deviations, allowing large global deviations.


Fig. 117 - Here is an example of a circumvention path where the AA started approximately at the middle of the previously learned principal samples, further including a deviation of two sub-samples between principal ones. Still, as expected, the performance did not change significantly.

Fig. 118 - The final tests: noise added to each turn with uniform distribution (the worst case) and a maximum amplitude of 22.5°. Even in this bad case, where the turn angles are twisted, discrimination remains visibly acceptable. The only problems that can arise are related to the high activity of the lateral phantom wavefronts, which can confuse the AA about where it finds itself on the recognized VL.

Fig. 119 - Two different cases for one cycle only. While in the first one there is an activity ratio (discrimination factor) of over 2.0 (0.9502 / 0.467), in the second one this ratio is about 1.77 (0.8268 / 0.4676), but the AA can still distinguish between the two VL. Note that the noise values are very large. Note also that the left 6 graphs refer to one experiment and the right 6 to another, each group testing the two VL as in previous experiments.


VI-2.2.2 Particular cases

Fig. 120 - When a VL has symmetry, as in this example, the AA can still recognize it as before, but is now unable to tell where it is. In other words, the AA does not know at which of the symmetric positions it finds itself. The same would happen to a blind person. The only way around this is to have an extra information source, like another sensor to detect differences between those symmetric positions. As expected, the ANN forms four different principal wavefronts for a square VL, each one representing one possible position. The distance between positions equals the distance between two corners. Note that the symmetry is related to the -90° turn at each corner. The left picture is a cut of the bottom left one, where one can see the principal and lateral wavefronts. An extreme case of symmetry would be a circular column, where the AA would have no idea of its position around that VL.

Fig. 121 - In this example, where noise is large, the principal wavefront did not manage to always stay above the others. At the beginning of the circumvention, the other wavefronts surpassed it, but then died out more quickly. After a while, the principal wavefront (top line) stayed more active than all others.

Fig. 122 - Two very similar VL can cause trouble for the AA when noise is large. In the left picture, the AA happened to recognize the incorrect second VL, whereas in the right one it recognized the correct first one.


VI-2.2.3 Final experiments

Finally, some experiments are going to be presented, to show some possible ways of improving the biological plausibility of these activity propagation ANN. These guidelines would require further research to achieve the desired system.

Fig. 123 - In these cases, the propagation mechanism was modified to include some extra local neuron activity: in the left case, the neuron's own matching activity is added, while in the right case only the neuron's own noise is added. In both cases the neurons start with zero activity, instead of maximum initial activity. The principal wavefront now forms by gaining activity over time. These activities tend to saturate over a long period of time. The problem with this kind of mechanism is that one must be very careful when dimensioning the gain values. For different VL there is the need to retune those values, to avoid undesired saturation, oscillation or death of multiple wavefronts. It is very brittle, but with some more research it could eventually lead to a better and more biologically plausible model for these ANN.

In the above cases, the equations used were the following:

Neuron = 0.5 · max[0, cos(1.0 · (Link − Input))]
Match = max[0, cos(0.5 · (Link − Input))]
Out = Neuron + Match · Wavefront          Eq. 34 - Left above case

and

Neuron = 0.01 · noise
Match = max[0, cos(0.6 · (Link − Input))]
Out = Neuron + Match · Wavefront          Eq. 35 - Right above case.

All meanings remain the same. The only difference between them is in the source of the neuron’s own activity contribution. While in the first it also comes from the matching value, in the second it is simply noise. Both modifications lead to similar results.
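As a rough sketch, the two modified propagation rules (Eq. 34 and Eq. 35) can be transcribed as follows. This is Python rather than the original MatLab, the function names are illustrative, and the noise term is modelled here as a small uniform random value, which is an assumption:

```python
import math
import random

def step_matching(link, inp, wavefront):
    """Left case (Eq. 34): the neuron's own matching activity is added."""
    neuron = 0.5 * max(0.0, math.cos(1.0 * (link - inp)))
    match = max(0.0, math.cos(0.5 * (link - inp)))
    return neuron + match * wavefront

def step_noise(link, inp, wavefront, rng=random.Random(0)):
    """Right case (Eq. 35): only the neuron's own noise is added
    (modelled here as a scaled uniform random term -- an assumption)."""
    neuron = 0.01 * rng.random()
    match = max(0.0, math.cos(0.6 * (link - inp)))
    return neuron + match * wavefront

# Under a persistent slight mismatch, the left-case activity converges to the
# fixed point Neuron / (1 - Match), which can greatly exceed 1.0 -- one reason
# the gain values must be retuned per VL to avoid undesired saturation.
w = 0.0
for _ in range(500):
    w = step_matching(0.5, 0.0, w)
```

The final value of `w` illustrates the brittleness discussed above: the saturation level depends entirely on the gain constants and on how closely the turns keep matching.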

Fig. 124 - Depending on noise, the ANN saturates to different values and at different speeds.


[Grf. 3 panels: "Ratio vs. Error", "Peak Activity vs. Error", and "Peak Activity vs. Ratio"]

Grf. 3 - On the left is the plot of the resulting discrimination ratios with increasing angular error. The ratio starts at 20 and then scatters for increasing errors. Note that all three graphs are relative to simulations for 2 cycles, a refinement of 4 and a maximum uniform angular error in radians. In the center, the peak activity of the correctly recognized VL decreases from 1.0 and scatters at the same time. The trend seems to be a general decrease, as expected. On the right, the peak activity vs. ratio trend seems to be a line starting at (1, 20) and ending at (0, 0), also with large scatter.

VI-2.3 Map construction

[Cht. 10 diagram blocks: shaft-encoders, compass, IR sensors, path integration, motor cortex, working memory, landmark storage, map storage, landmark behaviors, goal behaviors, high-level planning, motors]

Cht. 10 - Block diagram of the overall structure that leads to map building inside the AA. The whole structure was implemented in software on the 68HC11, but never tested, due to time-consuming problems with the circumvention and VL recognition processes. "Landmark behaviors" refers to the finding, docking and circumvention behaviors, while "goal behaviors" refers to the directed VL search. The "working memory" refers to the temporary ANN that is constructed while circumventing a VL. If this VL is new, it is stored; otherwise it is discarded. The "high-level planning" is an entity that controls the sequence of actions taken by the AA (currently controlled by the designer).


VI-3. Real tests, calibrations and results

Now, some tests, calibrations and experimental results are going to be presented, showing the real performance of the implemented modules.

VI-3.1 Basic reaction mechanism

The obstacle-avoidance mechanism worked well, although it is probably not going to be used. As in [Branco & Kulzer, 1995], it was smooth and accurate.

The wall-following mechanism worked very smoothly on straight walls and concave corners, but it did not perform well on convex corners, mainly because of a sensor disposition limitation. To allow the AA to go on to the next tests, a little help was given in these cases.

VI-3.2 Internal compass and path integration

First of all, the compass constants must be calibrated. Since it was shown that the calibration of distance measurement is not necessary, this calibration is only going to be made to show the accuracy of the overall shaft-encoder based dead-reckoning system.

Measured distance (mm): 595  606  596  604  605  613  610  596

Tab. 7 - Setting the compass distance constant at 1.4 mm per shaft-encoder tick, these were the experimental results of the accumulated traveled distance over a pattern distance of 600 mm. This gives a mean distance of 603 mm and a standard deviation of 6 mm. Since this is not critical, and keeping in mind that the AA stopping point is difficult to determine, it is going to be left like this. Note that, from now on, all distances will be expressed in millimeters, whereas angles will be in degrees.

Fig. 125 - Illustration of the X-Y coordinate system and quadrant disposition in relation to the AA. Whenever the AA travels along a path, the resulting path integration displacement vector is taken from the origin, where the angle counts positive in counter-clockwise direction. (A) is the displacement vector magnitude, whereas (β) is the corresponding angle.
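The path integration itself can be sketched as classic differential-drive dead reckoning. The following Python fragment is illustrative only (the real implementation runs in C on the 68HC11, and the function name is an assumption); it uses the distance-per-tick constant of Tab. 7 and the wheel distance constant reported in the experiments:

```python
import math

TICK_MM = 1.4          # distance per shaft-encoder tick (calibration, Tab. 7)
WHEEL_BASE_MM = 166.0  # wheel distance constant for near-zero compass error

def integrate_path(ticks):
    """Accumulate heading from the left/right tick difference and the
    displacement vector from the averaged traveled distance."""
    x = y = theta = 0.0
    for left, right in ticks:
        d_left, d_right = left * TICK_MM, right * TICK_MM
        theta += (d_right - d_left) / WHEEL_BASE_MM  # heading change (radians)
        d = 0.5 * (d_left + d_right)                 # centre-point distance
        x += d * math.cos(theta)
        y += d * math.sin(theta)
    return x, y, math.degrees(theta)  # displacement in mm, heading in degrees

# 100 equal ticks on both wheels: a straight run of 140 mm with no turn.
x, y, heading = integrate_path([(1, 1)] * 100)
```

The accumulated (x, y) corresponds to the displacement vector of Fig. 125, and theta to the internal compass heading.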

Fig. 126 - When the AA turned a complete circle of 360° or circumvented a VL with about four meters of perimeter, starting with zero displacement and angle, the final observed displacement error is measured. This experiment assumes a wheel distance constant such that the compass angular error is practically zero. With this constraint, the (X) and (Y) errors are measured externally and compared with the internal values. The results are shown in the table below.


Trial             1     2     3     4     5     6     7   Average
Xinternal (mm)  -58  -111   +38  -139   -88  -123   -81     -80    internal displ. X value
Yinternal (mm)  -90   -45   -54   -22   -45  -101   -65     -60    internal displ. Y value
Dinternal (mm)  108   119    66   140    98   159   104     113    internal displ. magnitude
βinternal (°)  -122  -157   -54  -170  -152  -140  -141    -133    internal displ. angle
Xreal (mm)      -90  -130   +10  -160   -20   -60  -140     -84    external measured X displ.
Yreal (mm)      -90   -40   -70   -30   -50   -90   -60     -61    external measured Y displ.

Tab. 8 - These are the experimental results of the internal compass and path integration mechanisms operating together, with a wheel distance constant of 166 mm to achieve zero compass error. As shown in this table, although the AA knows that it rotated 360° and stopped, it is actually not at the starting point. There is always a residual displacement, reflected by (Xreal) and (Yreal), externally observable. Besides knowing the compass heading, the AA also knows approximations of that residual displacement via the (Xinternal) and (Yinternal) internal path integration values. It is astonishing how close the average of these last ones is to the average of the external measurements. Relative to the external average values, the internal values had an average error of 5% for the (X) value, and 1.7% for the (Y) value. Relative to the total path of nearly 4000 mm, these errors go down to 0.025% and 0.1% on average, respectively. The maximum relative errors are 1.7% and 0.4%, respectively. These last percentages are the really useful ones, and are not bad at all. Note that the residual displacements are not errors, since the AA knows they exist.

Trial      1    2    3    4    5    6    7    8    9   10   Average
∆α (°)    -5   -7  -20   +1  +10   -8  +13  -16  -12  -12     -5.6
D (mm)   341  122  209  172  153  128  421  171  472  128      232

Tab. 9 - Results from 10 trials of a rectangular VL circumvention, with a perimeter of about 4 m. The maximum compass error lies below 6%, while the maximum displacement error lies below 12%. These results seem to be better than in [Seed, 1994].

Fig. 127 - When the AA starts from a "Home" position (X) and is left free to avoid obstacles (*), as soon as the command to go home is issued through the keyboard, it returns home in a more or less straight line. The final displacement errors (∆d) are shown in the table below.

Trial                         1    2    3    4    5    6    7    8    9   10   11
Total distance (m)           3.8  3.0  3.6  2.0  2.1  2.7  4.4  5.0  5.7  3.3  4.0
Total turns Σ|∆αi| (°)       240  260  220  180  180  180  320  420  200  650  700
Displacement error ∆d (cm)    25   19   27   20   12    6   21   50   30   60    9

Tab. 10 - After different path lengths and total absolute turn angles, the AA stopped near the home starting point, with variable displacement error. Since this error depends much more on the total absolute turn angle, it makes more sense to analyze it relative to this total angle. The worst relative homing displacement error was about 20%, for the 650° turn angle case. With normal turn angles of less than 250°, the error was no more than 10%. By "normal" it is meant that, usually, the most the AA would do is simply turn around and go straight home, or until it finds a VL, which it then circumvents. For the really common cases, where the AA just travels from one VL to another without turning more than 90° to get off the current VL, the errors go well below 4%.


VI-3.3 Landmark discrimination and recognition

Some tests are going to be performed on two VL, to demonstrate the discrimination, recognition and robustness of the designed neural mechanisms.

VI-3.3.1 Almost square landmark shape

Fig. 128 - After the previous calibrations, some test circumventions were executed around a rectangular (almost square) VL with side lengths of 72 and 82 cm, which give AA side paths of about 102 and 112 cm respectively (see above picture). The motor cortex was set to have a resolution of 22.5°, thus being able to discriminate between the angles -90°, -67.5°, -45°, -22.5°, 0°, 22.5°, 45°, 67.5° and 90°. This means that, when an angle falls between two classes, the nearest one will be stored (winner). The distance between stored turn angles was set to 250 mm, slightly more than the AA's length. The refinement (angular samples between each turn storage) was set to 4.
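The winner selection described above can be sketched as a nearest-class quantization. This Python fragment is a simplification of the neural tuning-curve mechanism (the function name is an assumption), shown only to make the stored-angle classes concrete:

```python
def quantize_turn(angle_deg, resolution=22.5, max_angle=90.0):
    """Snap a measured turn angle to the nearest motor-cortex class (the
    'winner') in the range -90 to +90 degrees, in 22.5-degree steps."""
    k = round(angle_deg / resolution)
    return max(-max_angle, min(max_angle, k * resolution))

print(quantize_turn(70.0))    # -> 67.5
print(quantize_turn(-100.0))  # -> -90.0 (clamped to the angle range)
```

This limited resolution is why, in the runs below, a 90° corner is sometimes stored as 67.5° or spread over several smaller angles.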

Trial Turn angle sequence Count Observation

Run 1 67.5 - 0 - 0 - 90 - 22.5 - 0 - 0 - 45 - 22.5 - 0 - 67.5 - (-22.5) - 0 - 22.5 14 Stopped normally

Run 2 67.5 - 22.5 - 0 - 45 - 45 - 0 - 0 - 67.5 - 22.5 - 0 - 22.5 - 67.5 - 0 13 Stopped too early

Run 3 67.5 - 0 - 0 - 67.5 - 0 - 0 - 45 - 45 - 22.5 - 22.5 - 45 11 Stopped too early

Run 4 90 - 0 - 0 - 45 - (-45) - 0 - 67.5 - 22.5 - 0 - 45 - (-22.5) - 0 - 0 - 90 14 Did not stop

Run 5 67.5 - 22.5 - 0 - 22.5 - 90 - 0 - 0 - 22.5 - 45 - 0 - 0 - 67.5 - 0 - 0 - 0 - 67.5 16 Did not stop

Run 6 67.5 - 22.5 - 0 - 45 - 45 - 0 - 0 - 45 - (-22.5) - (-22.5) - 45 - (-22.5) - 0 - 0 - 45 15 Did not stop

Run 7 90 - 22.5 - 0 - 0 - 67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 0 - 90 - 0 - 0 - 67.5 - 22.5 16 Did not stop

Run 8 22.5 - 0 - 0 - 45 - (-22.5) - 0 - 67.5 - 22.5 - 0 - 22.5 - 67.5 - 0 - 0 - 22.5 - 67.5 - 0 16 Did not stop

Run 9 67.5 - 22.5 - 0 - 45 - 22.5 - 0 - 0 - 67.5 - 0 - 0 - 67.5 - 22.5 - 0 13 Stopped normally

Run 10 67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 22.5 - 22.5 - 0 - 0 - 90 - 0 - 0 13 Did not stop

Run 11 67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 0 - 67.5 - 0 - 0 - 45 - 22.5 - 0 - 0 - 45 15 Stopped normally

Run 12 45 - 0 -0 - 22.5 - 67.5 - 0 - 0 - 45 - 0 - 0 - 22.5 - 67.5 - 0 - 22.5 14 Did not stop

Run 13 90 - 0 - 0 - 45 - 45 - 0 - 22.5 - 45 - 0 - 0 - 45 - 45 - 0 - 0 14 Stopped normally

Tab. 11 - Stored angle sequences for the test runs, and corresponding stopping conditions. Many circumventions did not even stop, due to large dead-reckoning errors (more than 200 mm over a total perimeter of 4 m). The angles are given in clockwise direction. As can be seen from this table, at each corner of the VL the AA turned a sum of approximately 90°, as expected. However, due to the limited resolution of the motor-cortex, sometimes it is 67.5° or even 112.5°. Also, the sides are stored as having a 0° turn rate. Generally, a great diversity can be observed, which would certainly be very difficult or even impossible to cope with in conventional recognition mechanisms. These test results show that the stopping criterion is very brittle, and that something better must be implemented in future work. Note that the AA started at different points of the VL, which does not influence the recognition, as demonstrated in the previous simulations.


Fig. 129 - On the left are the reconstructed VL from the stored angular turn sequences shown in the previous data table. The numbers correspond to the trials. The asterisk shows the sample starting point. Note that, since these reconstructions inherently accumulate global errors, they clearly show that the final accumulated error is huge and inappropriate for recognition by mechanisms that rely on global data. Recall also that the neural recognition mechanism applied herein relies purely on local relative data. In the right picture, the ideal stored VL is shown, from the observer's reconstruction point of view.

Now, to test the recognition power of the AA, some other circumventions will be made, where six sequences of angular refined samples were recorded and are shown:


Grf. 4 - These are the angular sequences of six recognition circumventions, i.e. refined angular turns. Here, the angles are in the correct anti-clockwise format. One can easily differentiate between the sides and corners. Also, many corners do not reach the 90° detection at all, spreading out into a sequence of smaller angles. As an example, the angle sequence of the left topmost graph is as follows (positive angles): 22.5 - 22.5 - 45 - 67.5 - 67.5 - 67.5 - 45 - 22.5 - 0 - 0 - 0 - 0 - 0 - 22.5 - 22.5 - 45 - 45 - 67.5 - 45 - 22.5 - 22.5 - 0 - 0 - 0 - 0 - 0 - 22.5 - 22.5 - 22.5 - 45 - 67.5 - 67.5 - 67.5 - 45 - 45 - 22.5 - 22.5 - 22.5 - 22.5 - 22.5 - 22.5 - 22.5 - 22.5 - 45 - 67.5 - 67.5 - 67.5 - 45 - 22.5 - 0 - 0 - 0 - 0 - 0 - 0 - 0 - 22.5 - 22.5 - 45 - 67.5 - 67.5 - 45 - 22.5 - 22.5 - 0.

        Recognition 1  Recognition 2  Recognition 3  Recognition 4  Recognition 5  Recognition 6
Run1        0.75           0.76           0.80           0.87           0.90           0.82
Run2        0.62           0.56           0.59           0.62           0.49           0.70
Run6        0.52           0.74           0.65           0.56           0.69           0.51
Run12       0.90           0.82           0.90           0.80           0.82           0.80

Tab. 12 - Experimental recognition results for the previous graph sequences: activities of the principal wavefront after one recognition circumvention around the VL. The smallest activity was 0.49 and the largest was 0.90. To correctly discriminate between VL, 0.49 should still be sufficient. Further tests will eventually confirm this expectation. Note that these values can be regarded as certainty values, since they vary between 1.0 (100% surely this VL) and 0.0 (100% surely not this VL). Note, for example, that RUN1 presents consistently better values than RUN2. Observing in the previous shape reconstruction that the second one is more similar to the original VL than the first one, it is clear that the difference lies in the "stopped too early" problem observed. Because there is a continuity gap in the resulting turn angle sequence, there will also be a major mismatch somewhere in the first side of the VL.
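As an illustration of how such certainty values translate into a decision, the following hypothetical Python helper picks the network with the most active principal wavefront and reports the discrimination ratio, using the activity values quoted earlier in Fig. 119:

```python
def recognize(final_activities):
    """Given the final principal-wavefront activity of each candidate VL
    network, pick the winner and report the discrimination ratio
    (winner activity / runner-up activity)."""
    ranked = sorted(final_activities.items(), key=lambda kv: kv[1], reverse=True)
    (best_vl, best), (_, runner_up) = ranked[0], ranked[1]
    return best_vl, best / runner_up

# Activities taken from the first case of Fig. 119:
vl, ratio = recognize({"VL 1": 0.9502, "VL 2": 0.467})
print(vl, round(ratio, 2))  # -> VL 1 2.03
```

The larger the ratio, the more confident the AA can be that the winning VL is the circumvented one.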


Grf. 5 - Experimental results graphs relative to the stored RUN12 and the RECOGNITION1 recognition sequence. On the top left graph, all the wavefronts are shown, where one can see that 4 or 5 have the largest activity. Note that, since the VL was almost a square, and according to the previous simulations' section with symmetric VL, it is expected that four principal wavefronts will arise. On the bottom left graph, the front view of the 3-D wave progression is shown. There, the four expected wavefronts are clearly visible. The interesting thing is that two have more activity than the other two. This means that the AA was actually able to discriminate between the two pairs of symmetric positions in the rectangular (almost square) VL. In other words, it was able to recognize it as really being a rectangle. On the right graph sequence, the wavefront activity progression is shown through time. From top to bottom, and from left to right, the activities were taken at the following sub-sample points: 3, 5, 7, 10, 20 and 55, the last of which corresponded to the stopping point. It is clear that many wavefronts still exist at the beginning, and that they die out as the AA gets more sure of where it is on the VL. After less than a fifth of the circumvention, the AA already displays the two principal wavefronts that correspond to one of two possible positions on the VL. At the end, this discrimination gets even better. Note how similar each pair of wavefronts is, with only little differences. This shows that VL symmetries are faithfully represented in the AA. Note that the bottom left and right graphs are shown as a cut through the progression of the wavefronts that come out of the paper. While the vertical axis shows relative activity values, the horizontal one shows the wavefronts' indexes (PN indexes, also VL PF indexes).

Grf. 6 - Same as above, but for the stored RUN1 and the RECOGNITION1 recognition sequence. Here, there was a slightly greater difficulty in determining where the AA was on the VL, since at almost half the VL the AA still had some false principal wavefront peaks. Also, in this and the previous experiments, one can clearly see (on the bottom left composed graph) that the principal wavefronts tend to adjust themselves. In other words, they shift slightly as the AA goes, because of early misadjustments. This is exactly the expected phenomenon mentioned in a previous section, where lateral wavefronts (principal wavefronts' neighbors) tend to compensate for slight deviations, in that they take over as principal wavefronts.


Grf. 7 - On the left, the recognition graphs for stored RUN1 and sequences RECOGNITION1 through RECOGNITION6 are shown. On the right, the recognition graphs for the stored RUN12 are shown. On all of them one can clearly see the four largest wavefronts. In some, there still exists confusion between the two pairs of possible positions on the VL, but others display two clear pairs of principal wavefronts, where the largest activity pair corresponds to the two possible positions of the AA on the “rectangular” VL.

VI-3.3.2 Second landmark

Fig. 130 - Second VL used for testing. It is composed of the first one, with an extra square added on one side. On the right, the real sofa-plus-chair shaped VL as the AA goes around it. Without the chair it is VL (1), and with it, VL (2).

Trial Turn angle sequence Count Observation

Run 1 67.5 - 45 - 22.5 - 22.5 - 0 - 0 - 0 - 67.5 - 22.5 - 45 - (-45) - 45 - 0 - 0 - 22.5 - 67.5 16 Did not stop

Run 2 45 - 45 - 0 - 0 - 0 - 22.5 - 45 - 45 - (-45) - 45 - 22.5 - 0 - 0 - 90 - 22.5 - 0 16 Did not stop

Run 3 67.5 - 22.5 - 67.5 - (-45) - 45 - 22.5 - 0 - 22.5 - 90 - 0 - 0 - 45 - 0 - 0 - 0 - 0 - 90 17 Did not stop

Run 4 0 - 67.5 - 22.5 - 67.5 - (-67.5) - 67.5 - 0 - 0 - 0 - 67.5 - 22.5 - 0 - 67.5 - 22.5 - 0 - 0 - 0 17 Did not stop

Run 5 90 - 67.5 - 22.5 - (-67.5) - 67.5 - 0 - 0 - 22.5 - 67.5 - 0 - 0 - 67.5 - 22.5 - 0 - 0 15 Stopped too early

Run 6 0 - 22.5 - 67.5 - 45 - (-45) - 45 - 22.5 - 0 - 0 - 67.5 - 22.5 - 22.5 - 45 13 Stopped too early

Tab. 13 - Stored angle sequences for the test runs on the second VL. Again, most of the time, the AA did not detect the stopping point or stopped too early. Note that each sequence now has one negative angle, corresponding to the small new concave corner on the VL*. Again, the sequences were taken starting at different points. Subsequent recognition sequences RECOGNITION1 and RECOGNITION2 were also performed.

* Note that these problems exist only for the learning phase, where correct data should be stored. In the recognition phase, however, this is not a problem, since the AA keeps circumventing a VL until it discriminates and recognizes it.


Fig. 131 - Again, a large variety of reconstructions can be seen. And again, the recognition power of the neural mechanism is going to be shown, even under these apparently bad conditions.

        Recognition 1  Recognition 2
Run1        0.71           0.66
Run2        0.84           0.69
Run5        0.82           0.69
Run6        0.61           0.53

Tab. 14 - Experimental recognition results for the recognition sequences on the second VL.

Grf. 8 - Same as for the first VL. Here, however, only one principal wavefront is observed, as expected. Although the VL is not symmetric, there is still a lateral wavefront that may be catching local symmetries. On the top left graph, note how the principal wavefront and its neighboring one are now better discriminated from the others. This is more visible below, on the activity progression. Now it is perfectly possible for the AA to determine its position within the VL, since there is only one PN most active at a time, whereas all others are less active. This peak will "walk" around the VL as the AA circumvents it (as explained earlier, these graphs suffered a deviation correction, so that each PN appears at the same index line that comes out of the paper). Note that this feature is a side-effect of the neural mechanism used, since no additional programming is needed to achieve it. On the right, the sequences were taken at the sub-sample points 3, 7, 10, 20, 30 and 60.

Grf. 9 - Same as for the first VL. Sometimes the principal wavefront is more discriminative than others. Still, there is a remnant of the four wavefronts that corresponded to a rectangle.


VI-3.3.3 Third landmark

Fig. 132 - Third VL configuration. It has curves as well as segments. The recognition method will map curves just as it maps corners; there is no special distinction.

Grf. 10 - As with the second VL, because this one is not symmetric either, there will be one major wavefront, indicating that the position on the VL is successfully extracted. Note that now only three major wavefronts appear, instead of the previous four. It seems that the AA detects combinations of side+corner with preference.

VI-3.3.4 Presence of both landmarks 1 & 2

Now, the decisive tests are going to be performed, where both VL are compared.

                  LA1/LB1   LA1/LB1   LA2/LB2   LA2/LB2   LA6/LB6   LA6/LB6   LA12/LB5  LA12/LB5
                    S1        S2        S1        S2        S1        S2         S1        S2
Landmark A (1)     1.51      1.65      0.98      1.13      0.90      1.23       1.61      1.99
Landmark B (2)     1.34      1.37      1.40      1.32      1.22      0.90       1.35      1.20

Tab. 15 - These results show the recognition ratios for different stored links and recognition sequences. For example, (LA12/LB5 - S1) means that stored links (12) from VL A(1) and (5) from VL B(2) were used, and tested with recognition sequence (1). The leftmost column indicates which VL generated this sequence, either (A) or (B). For example, a ratio of 1.51 in the first table cell indicates that the activity from ANN (A) was 1.51 times larger than the one from ANN (B) (recognition successful). There are some bad recognitions (bold), essentially due to stopping point problems, giving an 18.8% error rate.

Fig. 133 - These are the recognition ratio progressions as the AA circumvents the VL and feeds the ANN (referring to the ratios in the previous table). Note how the ratio only starts increasing after a while. This is due to the similarity between both VL up to a certain point, which depends on the circumvention start position. Observe also that the ratio increases up to a certain point, where it then decreases very much. By inspecting the corresponding stored turns and recognition sequences, it was concluded without a doubt that this is due to the length mismatch between the two. In other words, wrong stopping points cause strong mismatches in the recognition. One partial solution to minimize this problem is to accept a recognition as soon as the ratio becomes "large enough". This can also cause errors where the ratio starts going down and then recovers.
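The "large enough" acceptance rule can be sketched as follows; the threshold and the number of consecutive samples required are illustrative assumptions, not values from the experiments:

```python
def early_accept(ratio_stream, threshold=1.5, patience=3):
    """Accept a recognition as soon as the running discrimination ratio has
    stayed above the threshold for `patience` consecutive samples, instead
    of relying on the brittle stopping point."""
    streak = 0
    for i, r in enumerate(ratio_stream):
        streak = streak + 1 if r >= threshold else 0
        if streak >= patience:
            return i  # sample index at which recognition is accepted
    return None       # never confident enough

print(early_accept([1.0, 1.2, 1.6, 1.7, 1.8]))  # -> 4
print(early_accept([1.0, 1.6, 1.1, 1.6, 1.2]))  # -> None
```

Requiring several consecutive samples above the threshold guards against transient ratio spikes, at the cost of a slightly later decision; it does not, however, remove the failure mode where the ratio dips and then recovers.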


Grf. 11 - After the first circumvention around the rectangular VL (top left), the ANN is still working well, with the four expected wavefronts. However, when the AA tries the second circumvention, the wavefronts die out rather quickly (bottom left). This is due to the frontier effects produced by the misadjustment of the end and beginning of the stored turn angle sequence. Remember that the stopping mechanism of the AA did not work well. This misadjustment causes a strong and persistent mismatch when the AA starts the second circumvention, because now all turns will be persistently too much out of phase (lateral wavefronts do not catch this deviation any more). On the right, two examples of wavefront progression are shown, where time goes from top to bottom (4 circumventions). These examples were made with additive activity increase (similar to [Euliano & Príncipe, 1995]) since it is the only way to allow activity to increase again after a failure, to see where it appears. It is easy to observe that activity wavefronts die out and appear in deviated positions, either to the right or to the left, depending on whether the stored sequence was too long or too short.

[Graphs: recognition ratio of landmark 1 versus tuning-curves aperture. Left panel: logarithmic ratio axis (1 to 10000), apertures 40 to 120 degrees. Right panel: ratio axis 0.8 to 1, apertures 0 to 400 degrees.]

Grf. 12 - These are the recognition ratios obtained for different tuning-curves aperture angles. The apertures are in degrees and correspond to the angular range from the peak to the zero response. An aperture of 0 means an infinitely narrow tuning-curve, whereas one of infinity means a flat line (everything passes). The left graph shows the case of a wrong recognition, and the right graph shows a correct recognition. While the right one never works well, the left one works even better with narrower tuning-curves. There is a limit, however: as soon as the tuning-curve is too small to catch even the smallest deviations, the corresponding wavefront will die out.
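The aperture definition above (angular range from the peak to the zero response) can be sketched as a triangular tuning curve. This is a hedged illustration: the function name is hypothetical, and angular wraparound at 360° is ignored for simplicity.

```python
def tuning_response(angle, preferred, aperture):
    """Triangular tuning curve: response is 1 at the preferred turn
    angle and falls linearly to 0 at `aperture` degrees away from it
    (the 'peak to zero response' range of Grf. 12). An aperture of 0
    degenerates to an infinitely narrow curve."""
    deviation = abs(angle - preferred)
    if aperture == 0:
        return 1.0 if deviation == 0 else 0.0
    return max(0.0, 1.0 - deviation / aperture)

print(tuning_response(90, 90, 60))   # -> 1.0 (at the peak)
print(tuning_response(120, 90, 60))  # -> 0.5 (halfway to zero)
print(tuning_response(200, 90, 60))  # -> 0.0 (outside the aperture)
```

The limit mentioned in the caption corresponds to making `aperture` so small that even a tiny deviation of `angle` from `preferred` yields a response of zero, killing the wavefront.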

VI-3.3.5 Presence of landmarks 1/3 and 2/3

                        LA1/LC1  LA1/LC1  LB1/LC1  LB1/LC1  LA2/LC2  LA2/LC2  LB2/LC2  LB2/LC2
                          S1       S2       S1       S2       S1       S2       S1       S2
Landmark A(1) / B(2)     2.78     2.5      1.37     2.18     0.97     0.86     1.51     1.42
Landmark C (3)           1.05     1.33     1.3      1.25     1.4      1.77     1.3      1.39

Tab. 16 - These results show the recognition ratios for either the (1/3) or the (2/3) VL combination. Here, an error rate of 12.5% was achieved. Also, the ratios are generally higher than in the previous (1/2) VL combination. This suggests that this third VL is easier to discriminate from the others.


VI-3.3.6 Last experiments

Grf. 13 - When the first VL is recognized with increasing angular apertures, the wavefronts get increasingly focused and the lateral ones get weaker.

Grf. 14 - By introducing WTA competition and a neighborhood [Kohonen, 1988], one can minimize the propagation of the lateral and other smaller wavefronts. On the left, normal propagation was used, whereas on the right, competition was added. Only the principal wavefronts remain high. This gave much better results for some sequences, whereas others had much worse results. The problem with this approach is that it seems absolutely necessary for the recognition ANN to allow multiple principal wavefronts to appear, even smaller ones, thus adapting to the uncertainty about the current position of the AA on the VL at a later time (smaller wavefronts can still take over correctly). ANN that employ WTA mechanisms suffer from the problem of initially choosing the wrong wavefront and then being unable to recover relative to other ANN. Competition among ANN is not beneficial either, since there could be an ANN that starts out winning and does not allow the correct one to recover in time. These are the reasons why competition was not used.
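The mechanism that was tried and then discarded can be sketched as follows. This is only an illustrative toy (the function name and values are hypothetical): the strongest unit and its neighbors within a Kohonen-style neighborhood keep their activity, while all other (lateral, smaller) wavefronts are suppressed.

```python
def wta_with_neighborhood(activities, radius=1):
    """Winner-take-all with a neighborhood: keep the strongest unit
    and its neighbours within `radius`; zero out everything else."""
    winner = max(range(len(activities)), key=activities.__getitem__)
    return [a if abs(i - winner) <= radius else 0.0
            for i, a in enumerate(activities)]

print(wta_with_neighborhood([0.1, 0.2, 0.9, 0.4, 0.1]))
# -> [0.0, 0.2, 0.9, 0.4, 0.0]
```

The failure mode described in the caption is visible in this sketch: once a wrong unit wins, the correct (initially smaller) wavefront is zeroed out and can never take over.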

Grf. 15 - This is the effect of letting the AA circumvent the VL more than once (1.75x) in the store phase. The result is that more than four wavefronts appear (redundant ones). This makes it difficult to know where the AA is on the VL, since it does not have the notion of beginning and end. This attempt at storing a redundant ANN was made to overcome the recognition problems that arose because of the incorrect stopping points. It was expected that the AA would behave better, but it did not: the recognition results stayed approximately the same. On the left side, VL 1 still had two major peaks, while on the right, VL B had already converged to a single one. It would be interesting to implement a mechanism that detects the point where redundancy begins (the AA starting its 2nd circumvention) and prunes the ANN so that a perfect ring of the VL is stored.


VI-3.3.7 Comments

All the work done on this issue of VL recognition was developed in order to present a careful study of the operating characteristics of this novel neural mechanism. With many graphs and pictures, it intends to give good insight, allowing directed improvements in the future.

There is still much work to be done in modifying this architecture so that it works better and more reliably. It has been shown here that motor turn information alone may be insufficient to allow good and reliable discrimination between landmarks. To achieve better results, one could try using sensory constellations, i.e., more sensors whose combined outputs somehow reflect specific sensory cue configurations in space. Something similar to [O'Keefe & Nadel, 1978] could be attempted, especially the sensory cortex output lines. Hopefully, places around the VL would then be more discriminative and lead to more easily discriminable sensory constellation sequences.

CHAPTER VI

VI - Final words

124

VI-1. Comments

Little importance is given throughout the literature to the temporal sequencing aspects of neural mechanisms. "Thinking temporally" could eventually shed further light on experimental data as well as on plausible neural systems. Most of the available experimental data only shows static and local neural phenomena related to navigation aspects.

Memory usage and the addition of context information still lack interest as well, although they are virtually indispensable for building really robust and efficient AA.

Little importance is also sometimes given to trying to interconnect initially unrelated theories. Here, some work was done with the purpose of achieving one global theory of mapping (hippocampal, sensory, etc.) mechanisms. When one starts to integrate all the theories and data, many things come together, and holes in each individual theory get filled.

This report is relatively extensive, one reason being the concern with transmitting exactly what the thoughts were. Hitting the same key more than once, in different ways, might have accomplished this purpose. On the other hand, showing the key development efforts and failures from the beginning helps to avoid unnecessary repetitions of errors in eventual future work. It is very important to show the whole path in order to understand the goal.

The most important contributions of this work were:
• Landmark may be of any shape - not only does the mechanism work for polygonal VL, but also for round VL or for VL with rounded sections.
• Tolerant landmark recognition - the recognition mechanism is based on a distributed structure that adapts itself automatically to give the best performance possible.
• Position of AA on landmark extractable - unless the VL is symmetric relative to both Cartesian axes, it is easy to know where the AA currently is on the VL.
• Biological plausibility - care was taken when choosing the neural mechanisms to keep some biological plausibility, through the observation of, and relation to, experimental data.
• Flexible shortcuts and detours are trivial - the mapping mechanism produces normal paths as well as shortcuts and detours in a simple and direct way.
• Low-cost and low-power robotic platform.


Generally, I believe I have shown a plausible neural way of implementing autonomous mapping and navigation mechanisms. I also presented some innovative ideas, such as the motor-cortex and the way of propagating activity through the PN which have corresponding PZ around the VL.

The main thrust of this work was generated by the need to think about new neural solutions for the problems of mapping and navigating with a map. Previous research was only a starting point to see where other mechanisms fail and where they really explain biological data. During the real implementation of the present AA, I tried to stay as faithful as possible to biological experimental data, also arriving at necessary solutions that, surprisingly, appear to be present in animals.

VI-2. Lessons taken from this thesis
• Bibliographic research turned out to be the most important part of this thesis, since it allowed me to get a global and detailed idea of what has been done and what the results have been. Without this, a future practical implementation may be of little use, redundant, non-contributing, and even wrong.
• Novel theories, explanations and breakthroughs are achieved by means of many hours of serious sequential thought. If one tries to solve a theoretical problem, one must stick with it tenaciously.
• Sometimes, as in Backpropagation, exploration and reinforcement learning, etc., one has to "climb a hill" in order to afterwards descend deeper than the previous "knowledge valley". Sometimes it is necessary to complicate something to get a different theory that is more simplifiable than the previous one. Simplifying existing theories and methods does not always lead to the best minimum.

VI-3. Future work and improvements
• Dead-reckoning mechanism - the shaft-encoder counters overflow if the AA is left running for more than 15 minutes at full speed. This is not a serious problem if the AA resets those counters every time it finds a VL, and having in mind that 15 minutes of accumulated error are destructive anyway. Nevertheless, this could be enhanced by giving the counters more resources or by implementing some mechanism that keeps them at low values.
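One standard way to keep such counters at low values is wrap-aware differencing: instead of accumulating raw counts, read the counter periodically and fold the overflow out of each difference. The sketch below is a hypothetical illustration (the 16-bit counter width and function name are assumptions, not the actual NAVBOT hardware); it is correct as long as fewer than half a counter range of ticks occurs between two reads.

```python
COUNTER_BITS = 16
MODULUS = 1 << COUNTER_BITS            # counter wraps at 2**16

def tick_delta(prev, curr):
    """Wrap-aware difference between two successive shaft-encoder
    readings. The modular subtraction cancels any overflow, so the
    hardware counter can stay small while the accumulated position
    is kept elsewhere in wider arithmetic."""
    delta = (curr - prev) % MODULUS
    if delta >= MODULUS // 2:          # counter moved backwards
        delta -= MODULUS
    return delta

print(tick_delta(65530, 4))   # -> 10  (wrapped forwards)
print(tick_delta(4, 65530))   # -> -10 (wrapped backwards)
```

With this scheme, resetting the counters at each VL (as suggested above) becomes an optimization rather than a correctness requirement.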

• Automatic compass constants - instead of calibrating the compass constants by hand, the AA could do this itself through some direction-matching mechanism. At the beginning, the AA could perform some compass auto-calibration maneuvers. It could eventually also update this calibration as it explores the environment. The question is how to make this possible. It would certainly give more flexibility to the AA, since it could auto-correct slightly damaged sensors.

• Recognition ANN - it would be nice to incorporate all recognition networks into only one global ANN, where different activity paths would exist for each VL. This would also keep memory requirements to a minimum, since neurons with similar synaptic circuitry (similar turning angles) could eventually be shared among different paths. The biggest problem would certainly be related to the process of extracting the different VL recognized. One could also implement something like in [Euliano & Príncipe, 1995], where different sequences occupy non-overlapping neural map space.

• Stopping method - another, more robust and biological stopping method was suggested in conversations with Neil Euliano, where the emergence of a second principal wavefront when circumventing the VL for the second time would be used to determine the stopping point and the corresponding pruning of the stored sequence. This would substitute the brittle and unreliable dead-reckoning stopping criterion. It thus seems inevitable to have the AA circumvent a VL at least twice, also for the sake of better recognition ratios.

• ANN adapting - as observed in the HC, Hebbian and anti-Hebbian learning could eventually be used to efficiently adapt the synaptic circuitry of the already learned ANN. This is not a trivial issue, but it would be nice for the AA to adapt to slight changes in the recognized VL in order to achieve some average values (the centroid of the circumvention data).

• Sensors’ robustness and calibration - in this work, the IR sensors were very brittle to use, since they depend on the reflecting surface. It would be nice to develop a sensor suite as invariant as possible to the surfaces’ characteristics. This would also solve the calibration problem, where one must adjust optimum threshold values for a given situation. One could also implement some sort of auto-calibration mechanism.

• Motors’ linearization - the motors of this AA were not linearized and matched. Therefore, tweaking of the given motor speeds had to be done. It would be interesting to implement an automatic linearization mechanism, which would continuously update one transfer function for each motor, allowing motor commands to be transformed into the corrected speeds. Some system with low impact on processing time would be best, feedback being out of the question. A more biological model, as in [Grossberg & Kupperstein, 1989], would be nice, since motors do not change their characteristics much over time.
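A per-motor transfer function of the kind suggested above can be sketched as a lookup table of measured (command, speed) pairs, inverted by linear interpolation. The calibration data and names below are hypothetical; the point is that the inversion is pure feed-forward table lookup, with the low processing impact the text asks for.

```python
# Hypothetical measured (command, actual speed) pairs for one motor;
# the table could be updated continuously as new measurements arrive.
CALIBRATION = [(0, 0.0), (64, 0.20), (128, 0.55), (192, 0.85), (255, 1.0)]

def linearized_command(target_speed):
    """Invert the motor's measured transfer function by linear
    interpolation: map a requested speed to the command that actually
    produces it (feed-forward, no feedback loop)."""
    for (c0, s0), (c1, s1) in zip(CALIBRATION, CALIBRATION[1:]):
        if s0 <= target_speed <= s1:
            frac = (target_speed - s0) / (s1 - s0)
            return round(c0 + frac * (c1 - c0))
    raise ValueError("target speed outside calibrated range")

print(linearized_command(0.55))  # -> 128
```

Matching the two motors then reduces to sending both the same target speed and letting each motor's own table supply its corrected command.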

• Map use - all the mapping code has already been developed, aside from some minor but important details (VL exit point, VL size compensation, etc.).

• Dedicated CPU - since the dead-reckoning path integration mechanism needs high processor power all the time, it would be interesting to see how much improvement would be attained by dedicating one CPU to that task.

• Use of genetic and reinforcement learning - to get a really adapted AA, it would be interesting to implement some genetic selection of sensor suite, processing center, recognition and mapping strategies. This is, of course, not yet possible in the real sense of physical survival with a single AA. On the other hand, reinforcement learning can be implemented on a single AA to find policies and “constant” values for the sensors, the compass, and eventually VL recognition. A bumper mechanism, at least at the front of the robot, would be an indispensable add-on that would give the AA a measure of “pain” with which to tune its strategies.

• Use of vision and other sensors - in a possibly distant future, the efficient use of cameras for VL recognition would potentially eliminate the need for VL circumvention, allowing an AA to travel more freely in open space instead of having to make contact with VL to orient itself. Also, since it was shown that motor turn information alone might not be enough to reliably discriminate between VL, it would be highly recommended to try using more complex sensory constellations. These are simply outputs derived from other sensors, adding to the already available information, to give more differentiable sensory data sets for use in the analog shift-register ANN.

• Generalized place learning - since VL circumvention is like micro-mapping at the VL level, with the added capability of recognizing sequences of places, and normal mapping can do shortcuts and detours, it would be very interesting to generalize mapping into only one complex mechanism. As in the HC, a place is a place, whether it lies on a VL (micro-map) or represents a VL constellation (macro-map). Being able to build an AA capable of fusing shortcut and detour mechanisms with place-sequence detection mechanisms would probably result in a much more robust system with overall capabilities similar to the biological HC. This would also solve the problem of finding places between VL. In other words, it does not make much sense to treat places around a VL and places in a map differently.


• Possible application - the developed AA, with the above-stated improvements, could eventually be useful for understanding physiological phenomena in animals and humans. Since it still lies very much at the experimental level, real applications would require a more stable platform and more robust and autonomous overall behavior.

REFERENCES

“Some books should be tasted, others swallowed, and a few should be chewed and digested” - Francis Bacon

References

a

Agre, P.; Chapmann, D. (1987), “What are plans for?”, Designing Autonomous Agents, MIT/Elsevier, 17-34.
Angeline, P.; Saunders, G.; Pollack, J. (1994), “An evolutionary algorithm that constructs recurrent neural networks”, IEEE Transactions on Neural Networks, 5, 54-65.
Arkin, R. (1987), “Motor schema based navigation for a mobile robot: An approach to programming by behavior”, IEEE Conference on Robotics and Automation, 264-271.
Arkin, R. (1989), “Motor schema-based mobile robot navigation”, The International Journal of Robotic Research, 8.
Arkin, R.; MacKenzie, D. (1994), “Temporal coordination of perceptual algorithms for mobile robot navigation”, IEEE Transactions on Robotics and Automation, 10, 276-286.
Arkin, R.; Murphy, R. (1990), “Autonomous navigation in a manufacturing environment”, IEEE Transactions on Robotics and Automation, 6, 445-454.
Asada, M. (1988), “Building 3-D world model for a mobile robot from sensory data”, Proceedings of the IEEE International Conference on Robotics and Automation, 918-923.
Asada, M. (1990), “Map building for a mobile robot from sensory data”, IEEE Transactions on Systems, Man, and Cybernetics, 37, 1326-1336.
Asada, M.; Fukui, Y.; Tsuji, S. (1990), “Representing a global world of a mobile robot with relational local maps”, IEEE Transactions on Systems, Man, and Cybernetics, 20, 1456-1461.
Bachelder, I.; Waxman, A. (1994), “Mobile robot visual mapping and localization: A view-based neurocomputational architecture that emulates hippocampal place learning”, Neural Networks, 7, 1083-1099.
Barshan, B.; Kuc, R. (1992), “A bat-like sonar system for obstacle localization”, IEEE Transactions on Systems, Man, and Cybernetics, 22, 636-646.
Barto, A.; Sutton, R. (1981), “Landmark learning: an illustration of associative search”, Biological Cybernetics, 42, 1-8.
Barto, A.; Sutton, R.; Anderson, W. (1983), “Neuron-like elements that can solve difficult learning control problems”, IEEE Transactions on Systems, Man, and Cybernetics, 13, 834-846.
Beckerman, M.; Oblow, E. (1990), “Treatment of systematic errors in the processing of wide-angle sonar sensor data for robotic navigation”, IEEE Transactions on Robotics and Automation, 6, 137-145.
Bengio, Y.; et al. (1992), “Global optimization of a neural network-Hidden Markov Model hybrid”, IEEE Transactions on Neural Networks, 3, 252-259.
Blum, K.; Abbott, L. (1996), “A model of spatial map formation in the hippocampus of the rat”, Neural Computation, 8, 85-93.
Borenstein, J.; Koren, Y. (1991), “The vector field histogram - Fast obstacle avoidance for mobile robots”, IEEE Transactions on Robotics and Automation, 7, 278-288.
Branco, A.; Costa, R.; Kulzer, P. (1995), “Robot obstacle avoidance using a behavioural approach”, Non-linear dynamic systems class, University of Aveiro.
Branco, A.; Kulzer, P. (1995), “WHISPER - A mobile robot platform for surveillance purposes”, internal publication of the Departamento de Electrónica e Telecomunicações da Universidade de Aveiro.
Braunegg, D. (1993), “MARVEL: A system for recognizing world locations with stereo vision”, IEEE Transactions on Robotics and Automation, 9, 303-308.
Brooks, R. (1985a), “Visual map-making for a mobile robot”, Readings in Computer Vision, 438-443.
Brooks, R. (1985b), “A layered intelligent control system for a mobile robot”, Proceedings of the ISSR Third International Symposium on Robotics Research.
Brooks, R. (1986), “A robust layered control system for a mobile robot”, IEEE Transactions on Robotics and Automation, 2, 14-23.
Brooks, R. (1987), “Intelligence without representation”, Workshop in Foundations of Artificial Intelligence.
Brooks, R. (1989), “A robot that walks; Emergent behaviors from a carefully evolved network”, Neural Computation, 1, 253-262.
Brooks, R. (1990), “Elephants don’t play chess”, Robotics and Autonomous Systems, 6, 3-15.
Buholtz (1996), “Yellow submarine: an autonomous underwater vehicle”, Machine Intelligence Laboratory, University of Florida.
Bunke, H.; Glauser, T. (1993), “Viewpoint independent representation and recognition of polygonal faces in 3-D”, IEEE Transactions on Robotics and Automation, 9, 457-462.
Burgess, N.; O’Keefe, J.; Recce, M. (1993), “Using hippocampal ‘place cells’ for navigation, exploiting phase coding”, Advances in Neural Information Processing Systems, 5, 929-936.
Burgess, N.; O’Keefe, J.; Recce, M. (1994), “A model of hippocampal function”, Neural Networks, 7, 1065-1081.
Carpenter, G. (1989), “Neural network models for pattern recognition and associative memory”, Neural Networks, 2.
Carpenter, G.; Grossberg, S.; Rosen, D. (1991), “Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system”, Neural Networks, 4, 759-771.


Caselli, S.; Doty, K.; Harrison, R.; Zanichelli, F., “Mobile robot navigation in enclosed large-scale space”, Machine Intelligence Laboratory, University of Florida.
Chapuis, N.; Varlet, C. (1987), “Short cuts by dogs in natural surroundings”, Quarterly Journal of Experimental Psychology, 39B, 52-57.
Chatila, R.; Laumond, J. (1985), “Position referencing and consistent world modeling for mobile robots”, Proceedings of the IEEE International Conference on Robotics and Automation, 138-170.
Chen, S.; Tsai, W. (1991), “Determination of robot locations by common object shapes”, IEEE Transactions on Robotics and Automation, 7, 149-156.
Cho, S.; Kim, J., “An HMM/MLP architecture for sequence recognition”, 359-369.
Collett, T.; Cartwright, B.; Smith, B. (1986), “Landmark learning and visuospatial memories in gerbils”, Journal of Comparative Physiology A, 158, 835-851.
Daily, M.; et al. (1988), “Autonomous cross-country navigation with the ALV”, IEEE Conference on Robotics and Automation.
Dayan, P.; Hinton, G. (1993), “Feudal reinforcement learning”, Advances in Neural Information Processing Systems, 5, 271-278.
Deng, X.; Papadimitriou, C. (1990), “Exploring an unknown graph”, Proceedings of the Annual Symposium on Foundations of Computer Science, 335-361.
DeWeerth, S.; et al. (1991), “A simple neuron servo”, IEEE Transactions on Neural Networks, 2, 248-251.
DeYoe, E. A.; Van Essen, D. C. (1988), “Concurrent processing streams in monkey visual cortex”, Trends in Neuroscience, 11, 219-226.
Dhond, U.; Aggarwal, J. (1989), “Structure from stereo”, IEEE Transactions on Systems, Man, and Cybernetics, 19, 1489-1510.
Dorigo, M.; Schnepf, U. (1993), “Genetics-based machine learning and behavior-based robotics: A new synthesis”, IEEE Transactions on Systems, Man, and Cybernetics, 23, 141-154.
Doty, K.; Caselli, S.; Harrison, R.; Zanichelli, F., “Landmark map construction and navigation in enclosed environments”, Machine Intelligence Laboratory, University of Florida.
Doty, K.; Seed, S. (1994), “Autonomous agent map construction in unknown enclosed environments”, MLC-COLT ‘94 Robot Learning Workshop.
Drumheller, M. (1987), “Mobile robot localization using sonar”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 9, 325-332.
Dudek, G.; Jenkin, M.; Milios, E.; Wilkes, D. (1991), “Robotic exploration as graph construction”, IEEE Transactions on Robotics and Automation, 7, 859-865.
Dunlay, R. T. (1988), “Obstacle avoidance perception processing for the autonomous land vehicle”, Proceedings of the IEEE International Conference on Robotics and Automation, 912-917.
Eichenbaum, H.; Cohen, N. (1988), “Representation in the hippocampus: What do hippocampal neurons code?”, TINS, 11.
Eichenbaum, H.; Wiener, S.; Shapiro, M.; Cohen, N. (1989), “The organization of spatial coding in the hippocampus: A study of neural ensemble activity”, Journal of Neuroscience, 9, 2764-2775.
Elfes, A. (1987), “Sonar-based real-world mapping and navigation”, IEEE Journal of Robotics and Automation, 3, 249-265.
Ellis, R. (1991), “Geometric uncertainties in polyhedral object recognition”, IEEE Transactions on Robotics and Automation, 7, 361-371.
Engelson, S.; McDermott, D. V. (1991), “Image signatures for place recognition and map construction”, Proceedings of SPIE Symposium on Intelligent Robotic Systems: Sensor Fusion, 4, 282-293.
Etienne, A. (1987), “The control of short-distance homing in the golden hamster”, Cognitive Processes in Spatial Orientation in Animal and Man, 223-251.
Etienne, A.; Teroni, E.; Hurni, C.; Portenier, V. (1990), “The effect of a single light cue on homing behaviour of the golden hamster”, Animal Behaviour, 39, 17-41.
Euliano, N.; Príncipe, J. (1995), “Spatio-temporal self-organizing feature maps”, to be published in Proceedings of the ICNN ‘96 Conference, 1900-1905.
Fennema, C.; Hanson, A.; Riseman, E.; Beveridge, J.; Kumar, R. (1990), “Model-directed mobile robot navigation”, IEEE Transactions on Systems, Man, and Cybernetics, 20, 1352-1369.
Feng, D.; Krogh, B. (1990), “Satisfying feedback strategies for local navigation of autonomous mobile robots”, IEEE Transactions on Systems, Man, and Cybernetics, 20, 1383-1395.
Fogel, D. (1994), “An introduction to simulated evolutionary optimization”, IEEE Transactions on Neural Networks, 5, 3-34.
Foster, T.; Castro, C.; McNaughton, B. (1989), “Spatial selectivity of rat hippocampal neurons: Dependence on preparedness for movement”, Science, 244, 1580-1582.
Fujimura, K.; Samet, H. (1989), “A hierarchical strategy for path planning among moving obstacles”, IEEE Transactions on Robotics and Automation, 5, 61-69.
Fujita, O. (1993), “Trial-and-error correlation learning”, IEEE Transactions on Neural Networks, 4, 720-722.


Gallistel, C. (1989), “Animal cognition: The representation of space, time and number”, Annual Review of Psychology, 40, 155-189.
Gat, E.; Desai, R.; Ivlev, R.; Loch, J.; Miller, D. (1994), “Behavior control for robotic exploration of planetary surfaces”, IEEE Transactions on Robotics and Automation, 10, 490-503.
Goebel, R. (1993), “Perceiving complex visual scenes: An oscillator neural network model that integrates selective attention, perceptual organization, and invariant recognition”, Advances in Neural Information Processing Systems, 5, 903-910.
Gonçalves, J.; Ribeiro, J.; Kulzer, P.; Vaz, F. (1996), “Navegação de veículos autónomos”, Senior Project, Universidade de Aveiro.
Griswold, N.; Eem, J. (1990), “Control for mobile robots in the presence of moving objects”, IEEE Transactions on Robotics and Automation, 6, 263-268.
Grossberg, S.; Kuperstein, M. (1989), Neural Dynamics of Adaptive Sensory-Motor Control, Pergamon Press.
Guldner, J.; Utkin, V. (1995), “Sliding mode control for gradient tracking and robot navigation using artificial potential fields”, IEEE Transactions on Robotics and Automation, 11, 247-254.
Hebert, M.; Kanade, T. (1986), “Outdoor scene analysis using range data”, Proceedings of the IEEE International Conference on Robotics and Automation, 1426-1432.
Herman, M.; Albus, J. (1988), “Overview of the multiple autonomous underwater vehicles (MAUV) project”, Proceedings of the IEEE International Conference on Robotics and Automation, 618-620.
Herman, M.; et al. (1988), “Planning and world modeling for autonomous undersea vehicles”, Proceedings of the IEEE International Symposium on Intelligent Control.
Hetherington, P.; Shapiro, M. (1993), “A simple network model simulates hippocampal place fields: II. Computing goal-directed trajectories and memory fields”, Behavioral Neuroscience, 107, 434-443.
Holland, J. (1985), “Properties of the bucket brigade algorithm”, Proceedings of an International Conference on Genetic Algorithms and their Applications, 1-7.
Hu, T.; Kahng, A.; Robins, G. (1993), “Optimal robust path planning in general environments”, IEEE Transactions on Robotics and Automation, 9, 775-784.
Hubel, D.; Wiesel, T. (1974), “Sequence regularity and geometry of orientation columns in the monkey striate cortex”, Journal of Computational Neurology, 163, 267-294.
Hubel, D.; Wiesel, T. (1977), “Functional architecture of macaque monkey visual cortex”, Proceedings of the Royal Society of London B, 198, 1-59.
Hughey, D. (1985), “Temporal discontiguity: Alternative to, or component of, existing theories of hippocampal function”, Behavioral and Brain Science, 8, 501-502.
Hwang, Y.; Ahuja, N. (1992), “A potential field approach to path planning”, IEEE Transactions on Robotics and Automation, 8, 23-32.
Jarvis, R.; Byrne, J. (1986), “Robot navigation: Touching, seeing and knowing”, Proceedings of the 1st Australian Conference on Artificial Intelligence.
Kaelbling, L. (1986), “An architecture for intelligent reactive systems”, Technical report, Artificial Intelligence Center, SRI International.
Kaplan, S. (1973), “Cognitive maps, human needs and the designed environment”, Environmental Design Research, 1, 275-283.
Kim, J.; Khosla, P. (1992), “Real-time obstacle avoidance using harmonic potential functions”, IEEE Transactions on Robotics and Automation, 8, 338-349.
Kohonen, T. (1982), “Self-organized formation of topographically correct feature maps”, Biological Cybernetics, 43, 59-69.
Korning, P. (199X), “Training neural networks by means of genetic algorithms working on very long chromosomes”, Computer Science Department, Aarhus University.
Kortenkamp, D.; Chown, E. (1993), “A directional spreading activation network for mobile robot navigation”, Proceedings of the 2nd International Conference on Simulation of Adaptive Behavior, 218-224.
Kriegman, D.; Triendl, E.; Binford, T. (1989), “Stereo vision and navigation in buildings for mobile robots”, IEEE Transactions on Robotics and Automation, 5, 792-803.
Kubie, J.; Ranck, J. (1983), “Sensory-behavioral correlates in individual hippocampus neurons in three situations: space and context”, Neurobiology of the Hippocampus, 433-447.
Kuipers, B. (1978), “Modeling spatial knowledge”, Cognitive Science, 2, 129-153.
Kuipers, B.; Byun, Y. (1987), “A qualitative approach to robot exploration and map learning”, Proceedings of the Workshop on Spatial Reasoning and Multisensor Fusion, 390-404.
Kuipers, B.; Levitt, T. (1988), “Navigation and mapping in large scale space”, Advances in Spatial Reasoning, 2, 207-251.
Kulzer, P.; Branco, A. (1994), “Reconhecimento automático de caracteres”, internal publication of the University of Aveiro.
Kulzer, P. (1995), notes from the Mobile Robotics course taught by Prof. Dr. Keith Doty at the Departamento de Electrónica e Telecomunicações da Universidade de Aveiro.


Kung, S.; Hwang, J. (1989), “Neural network architectures for robotic applications”, IEEE Transactions on Robotics and Automation, 5, 641-657.
Lappe, M.; Rauschecker, J. (1992), “A neural network for the processing of optic flow from ego-motion in man and higher mammals”, Neural Networks, 375-391.
Laumond, J. (1983), “Model structuring and concept recognition: Two aspects of learning for a mobile robot”, Proceedings of the International Joint Conference on Artificial Intelligence, 839-841.
Leonard, J.; Durrant-Whyte, H. (1991), “Mobile robot localization by tracking geometric beacons”, IEEE Transactions on Robotics and Automation, 7, 376-382.
Levitt, T.; Lawton, D.; Chelberg, D.; Nelson, P. (1987), “Qualitative navigation”, Proceedings of the DARPA Image Understanding Workshop, 447-465.
Levitt, T.; Lawton, D.; Chelberg, D.; Koitzsch (1988), “Qualitative navigation 2”, Proceedings of the DARPA Image Understanding Workshop, 319-326.
Levitt, T.; Lawton, D. (1990), “Qualitative navigation for mobile robots”, Artificial Intelligence, 44, 305-360.
Lippman, R. (1987), “An introduction to computing with neural nets”, IEEE ASSP Magazine.
Lumelsky, V.; Mukhopadhyay, S.; Sun, K. (1990), “Dynamic path planning in sensor-based terrain acquisition”, IEEE Transactions on Robotics and Automation, 6, 462-472.
Malkin, P.; Addanki, S. (1990), “LOGnets: A hybrid spatial representation for robot navigation”, Proceedings of the AAAI 8th National Conference on Artificial Intelligence, 1045-1050.
Maniezzo, V. (1994), “Genetic evolution of the topology and weight distribution of neural networks”, IEEE Transactions on Neural Networks, 5, 39-53.
Mataric, M. (1989), “Qualitative sonar based environment learning for mobile robots”, SPIE Mobile Robots.
Mataric, M. (1990a), “A model for distributed mobile robot environment learning and navigation”, MIT M.S. Thesis in Electrical Engineering and Computer Science.
Mataric, M. (1990b), “Navigating with a rat brain: A neurobiologically-inspired model for robot spatial”, Proceedings of the 1st International Conference on Simulation of Adaptive Behavior, 169-179.
Mataric, M. (1991), “A comparative analysis of reinforcement learning methods”, AI Memo 1322, MIT Artificial Intelligence Laboratory.
Mataric, M. (1992), “Integration of representation into goal directed behavior”, IEEE Transactions on Robotics and Automation, 8, 304-312.
McDermott, D.; Davis, E. (1984), “Planning routes through uncertain territory”, Artificial Intelligence, 22, 107-156.
McNaughton, B. (1988), “Neuronal mechanisms for spatial computation and information storage”, Neural Connections, Mental Computation, 285-350.
McNaughton, B.; Barnes, C.; O’Keefe, J. (1983), “The contributions of position, direction, and velocity to single unit activity in the hippocampus of freely-moving rats”, Experimental Brain Research, 52, 41-49.
McNaughton, B.; Chen, L.; Markus, E. (1991), “Dead reckoning, landmark learning, and the sense of direction: A neurophysiological and computational hypothesis”, Journal of Cognitive Neuroscience, 3, 190-202.
McNaughton, B.; Leonard, B.; Chen, L. (1989), “Cortical-hippocampal interactions and cognitive mapping: A hypothesis based on reintegration of the parietal and inferotemporal pathways for visual processing”, Psychobiology, 17, 236-246.
Mehrotra, R.; Grosky, W. (1989), “Shape matching utilizing indexed hypotheses generation and testing”, IEEE Transactions on Robotics and Automation, 5, 70-77.
Mishkin, M.; Ungerleider, L.; Macko, K. (1983), “Object vision and spatial vision: Two cortical pathways”, Trends in Neuroscience, 6, 414-417.
Montague, P.; et al. (1993), “Using aperiodic reinforcement for directed self-organization during development”, Advances in Neural Information Processing Systems, 5, 969-976.
Moore, A.; Atkeson, C. (1993), “Memory-based reinforcement learning: Efficient computation with prioritized sweeping”, Advances in Neural Information Processing Systems, 5, 263-270.
Moravec, H. (1981), Robot Rover Visual Navigation, Ann Arbor, MI: UMI Research Press.
Morris, R. (1981), “Spatial localization does not require the presence of local cues”, Learning and Motivation, 12, 239-260.
Muller, R.; Kubie, J. (1987), “The effects of changes in the environment on the spatial firing of hippocampal complex-spike cells”, Journal of Neuroscience, 7, 1951-1968.
Muller, R.; Kubie, J. (1989), “The firing of hippocampal place cells predicts the future position of freely moving rats”, Journal of Neuroscience, 9, 4101-4110.
Muller, R. U.; Kubie, J.; Ranck, J. (1987), “Spatial firing patterns of hippocampal complex-spike cells in a fixed

environment”, Journal of Neuroscience, 7, 1935-1950. Muller, R.; Kubie, J.; Bostock, E.; Taube, J.; Quirk, G. (1991), “Spatial firing correlates of neurons in the

hippocampal formation of freely moving rats”, Brain and Space, 296-333. Nehmzow, U.; Smithers, T. (1990), “Mapbuilding using self-organizing networks in really useful robots”,

Proceedings of the 1st International Conference on Simulation of Adaptive Behavior, 152-159.

References


Nolfi, S.; Floreano, D.; Miglino, O.; Mondada, F. (1996), “How to evolve autonomous robots: Different approaches in evolutionary robotics”, Technical Report, Institute of Psychology, National Research Council, Italy.
Nelson, R. C. (1989), “Visual homing using an associative memory”, Proceedings of the DARPA Image Understanding Workshop, 245-262.
Noreils, F.; Chatila, R. (1995), “Plan execution monitoring and control architecture for mobile robots”, IEEE Transactions on Robotics and Automation, 11, 255-266.
O’Keefe, J. (1976), “Place units in the hippocampus of the freely moving rat”, Experimental Neurology, 51, 78-109.
O’Keefe, J. (1989), “Computations the hippocampus might perform”, Neural Connections, Mental Computation, 225-284.
O’Keefe, J. (1990), “A computational theory of the hippocampal cognitive map”, Progress in Brain Research, 83.
O’Keefe, J. (1991), “The hippocampal cognitive map and navigational strategies”, Brain and Space, 273-295.
O’Keefe, J.; Dostrovsky, J. (1971), “The hippocampus as a spatial map: Preliminary evidence from unit activity in the freely moving rat”, Brain Research, 34, 171-175.
O’Keefe, J.; Recce, M. (1993), “Phase relationship between hippocampal place units and the EEG theta rhythm”, Hippocampus, 3, 317-330.
O’Keefe, J.; Speakman, A. (1987), “Single unit activity in the rat hippocampus during a spatial memory task”, Experimental Brain Research, 68, 1-27.
Olton, D.; Samuelson, R. (1976), “Remembrance of places passed: Spatial memory in rats”, Journal of Experimental Psychology: Animal Behavior Processes, 2, 97-116.
Pavlides, C.; Greenstein, Y.; Grudman, M.; Winson, J. (1988), “Long-term potentiation in the dentate gyrus is induced preferentially on the positive phase of θ-rhythm”, Brain Research, 439, 383-387.
Payton, D. (1990), “Internalized plans: A representation for action resources”, Robotics and Autonomous Systems, 6, 89-103.
Payton, D.; Rosenblatt, J.; Keirsey, D. (1990), “Plan guided reaction”, IEEE Transactions on Systems, Man, and Cybernetics, 20, 1370-1382.
Pearlmutter, B. (1995), “Gradient calculations for dynamic recurrent neural networks”, IEEE Transactions on Neural Networks, 6, 1212-1228.
Penna, M.; Wu, J. (1993), “Models for map building and navigation”, IEEE Transactions on Systems, Man, and Cybernetics, 23, 1276-1301.
Pick, H.; Rieser, J. (1982), “Children’s cognitive mapping”, Spatial Abilities: Development and Physiological Foundations.
Pieraccini, E. (1993), “Planar Hidden Markov Modeling: from speech to optical character recognition”, Advances in Neural Information Processing Systems, 5, 731-738.
Poucet, B. (1993), “Spatial cognitive maps in animals: New hypotheses on their structure and neural mechanisms”, Psychological Review, 100, 163-182.
Prescott, T.; Mayhew, J. (1993), “Building long-range maps using local landmarks”, Proceedings of the 2nd International Conference on Simulation of Adaptive Behavior, 233-242.
Quirk, G.; Muller, R.; Kubie, J. (1990), “The firing of hippocampal place cells in the dark depends on the rat’s recent experience”, Journal of Neuroscience, 10, 2008-2017.
Rao, N.; Fuentes, O., “Perceptual homing by an autonomous mobile robot using sparse self-organizing sensory-motor maps”, Department of Computer Science, University of Rochester.
Rao, N.; Iyengar, S. (1990), “Autonomous robot navigation in unknown terrain: learning and environmental exploration”, IEEE Transactions on Systems, Man, and Cybernetics, 20, 1443-1449.
Recce, M.; O’Keefe, J. (1989), “The tetrode: A new technique for multi-unit extra-cellular recording”, Society of Neuroscience Abstracts, 15, 1250.
Redish, A. (1995), “The Hippocampus as a cognitive map II”, Computational Models of Neural Systems.
Rimon, E.; Koditschek, D. (1992), “Exact robot navigation using artificial potential functions”, IEEE Transactions on Robotics and Automation, 8, 501-518.
Rolls, E.; Treves, A. (1993), “Neural networks in the brain involved in memory and recall”, Proceedings of the 1993 International Conference on Neural Networks, 9-14.
Rossey, L. (1996), “Neuron based behavior for obstacle avoidance”, Machine Intelligence Laboratory, University of Florida.
Schmajuk, N. (1990), “Roles of the hippocampus in temporal and spatial navigation: An adaptive neural network”, Behavioural Brain Research, 39, 205-229.
Schmajuk, N.; Blair, H. (1993), “Place learning and the dynamics of spatial navigation: A neural network approach”, Adaptive Behavior, 1, 353-385.
Schmajuk, N.; DiCarlo, J. (1991), “A neural network approach to hippocampal function in classical conditioning”, Behavioral Neuroscience, 105, 82-110.


Schöne, H. (1984), “Spatial orientation: The spatial control of behavior in animals and man”, Princeton Series in Neurobiology and Behavior, Princeton University Press.
Schöner, G.; Dose, M. (1992), “A dynamical systems approach to task-level system integration used to plan and control autonomous vehicle motion”, Robotics and Autonomous Systems, 10, 253-267.
Schöner, G. (1995), “Dynamics of behavior: theory and applications for autonomous robot architectures”, Institut für Neuroinformatik, Ruhr-Universität.
Schwartz, J.; Sharir, M. (1983), “On the piano movers problem II: General techniques for computing topological properties of real algebraic manifolds”, Advances in Applied Mathematics, 298-351.
Seed, S. (1994), “Mobile robot map building using landmark recognition in a behavior-based system”, M.S. Thesis, Machine Intelligence Laboratory, University of Florida.
Seibert, M.; Waxman, A. (1989), “Spreading activation layers, visual saccades and invariant representations for neural pattern recognition systems”, Neural Networks, 2, 9-27.
Seibert, M.; Waxman, A. (1992), “Adaptive 3D object recognition from multiple views”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 14, 107-124.
Shapiro, M.; Hetherington, P. (1993), “A simple network model simulates hippocampal place fields: I. Parametric analyses and physiological predictions”, Behavioral Neuroscience, 107, 34-50.
Shapiro, M. et al. (1990), “A simple PDP model simulates spatial correlates of hippocampal neuronal activity”, Society of Neuroscience Proceedings Abstracts, 16, 473.
Sharp, P. (1991), “Computer simulation of hippocampal place cells”, Psychobiology, 19, 103-115.
Snow, K. (1995), “THOMAS - midterm report”, Machine Intelligence Laboratory, University of Florida.
Speakman, A.; O’Keefe, J. (1990), “Hippocampal complex spike cells do not change their place fields if the goal is moved within a cue controlled environment”, European Journal of Neuroscience, 2, 544-555.
Stanton, P.; Sejnowski, T. (1989), “Storing covariance by the associative long-term potentiation and depression of synaptic strengths in the hippocampus”, Advances in Neural Information Processing Systems, 1, 394-401.
Sutherland, K.; Thompson, W. (1994), “Localizing in unstructured environments: Dealing with the errors”, IEEE Transactions on Robotics and Automation, 10, 740-754.
Takahashi, O.; Schilling, R. (1989), “Motion planning in a plane using generalized Voronoi diagrams”, IEEE Transactions on Robotics and Automation, 5, 143-150.
Tani, J.; Fukumura, N. (1994), “Learning goal-directed sensory-based navigation of a mobile robot”, Neural Networks, 7, 553-563.
Taube, J.; Muller, R.; Ranck, J. (1990a), “Head direction cells recorded from the postsubiculum in freely moving rats: I. Description and quantitative analysis”, Journal of Neuroscience, 10, 420-435.
Taube, J.; Muller, R.; Ranck, J. (1990b), “Head direction cells recorded from the postsubiculum in freely moving rats: II. Effects of environmental manipulations”, Journal of Neuroscience, 10, 436-447.
Thompson, R., “Psicologia Fisiológica”, Scientific American.
Tolman, E. (1948), “Cognitive maps in rats and men”, Psychological Review, 55, 189-208.
Touretzky, D. S.; Redish, A. D.; Wan, H. S. (), “Neural representation of space using sinusoidal arrays”, 869-884.
Tsuji, S.; Zheng, J.; Asada, M. (1986), “Stereo vision of a mobile robot: world constraints for image matching and interpretation”, Proceedings of the IEEE International Conference on Robotics and Automation, 1594-1599.
Vanderwolf, C. (1969), “Hippocampal electrical activity and voluntary movement in the rat”, Electroencephalography and Clinical Neurophysiology, 26, 407-418.
Wang, H.; Mathur, B. (1989), “Computing optical flow in the primate visual system”, Neural Computation, 1, 92-103.
Wang, Y.; Aggarwal, J. (1989), “Integration of active and passive sensing techniques for representing three-dimensional objects”, IEEE Transactions on Robotics and Automation, 5, 460-471.
Watanabe, S.; Yoneyama, M. (1992), “An ultrasonic visual sensor for three-dimensional object recognition using neural networks”, IEEE Transactions on Robotics and Automation, 8, 240-249.
Watkins, C. (1989), “Learning from delayed rewards”, Ph.D. thesis, Cambridge University.
Wehner, R.; Räber, F. (1979), “Visual spatial memory in desert ants, Cataglyphis bicolor (Hymenoptera: Formicidae)”, Experientia, 35, 1569-1571.
Whitehead, S.; Ballard, D. (1990), “Active perception and reinforcement learning”, Neural Computation, 2, 409-419.
Wiener, S. I.; Paul, C. A.; Eichenbaum, H. (1989), “Spatial and behavioral correlates of hippocampal neuronal activity”, Journal of Neuroscience, 9, 2737-2763.
Wilson, S. (1987), “Classifier systems and the animat problem”, Machine Learning, 2, 199-228.
Winson, J. (1978), “Loss of hippocampal theta rhythm results in spatial memory deficit in the rat”, Science, 201, 160-163.
Wu, J.; Penna, M. (1993), “An ANN for qualitative map building”, Proceedings of the World Congress on Neural Networks, 2, 135-143.


Yang, J.; Xu, Y.; Chen, C. (1994), “Hidden Markov model approach to skill learning and its application to telerobotics”, IEEE Transactions on Robotics and Automation, 10, 621-631.
Zaharakis, S.; Guez, A. (1990), “Time optimal robot navigation via the slack method”, IEEE Transactions on Systems, Man, and Cybernetics, 20, 1396-1407.
Zelinsky, A. (1992), “A mobile robot exploration algorithm”, IEEE Transactions on Robotics and Automation, 8, 707-717.
Zhang, Z.; Faugeras, O. (1992), “A 3D world model builder with a mobile robot”, The International Journal of Robotics Research, 11, 269-285.
Zheng, J.; Tsuji, S. (1992), “Panoramic representation for route recognition by a mobile robot”, International Journal of Computer Vision, 9, 55-76.
Zhu, Q. (1991), “Hidden Markov model for dynamic obstacle avoidance of mobile robot navigation”, IEEE Transactions on Robotics and Automation, 7, 390-397.
Zipser, D. (1985), “A computational model of hippocampal place fields”, Behavioral Neuroscience, 99, 1006-1018.
Zipser, D. (1986), “Biologically plausible models for place recognition and goal location”, Parallel Distributed Processing, 2, 432-470.

BIBLIOGRAPHY

Aleksander, I.; Morton, H. (1991), An Introduction to Neural Computing, Chapman & Hall.
Borland (1996a), Delphi User’s Guide v2, Borland International.
Borland (1996b), Delphi Object Pascal Language Guide v2, Borland International.
Borland (1996c), Delphi Component Writer’s Guide v2, Borland International.
Borland (1996d), Delphi Reference Library Guide v2, Borland International.
Carpenter, G.; Grossberg, S. (1991), Pattern Recognition by Self-Organizing Neural Networks (Part III: Adaptive Resonance Theory), Cambridge, MA: MIT Press.
Connell, J. (1990), Minimalist Mobile Robotics - A Colony-Style Architecture for an Artificial Creature, Academic Press.
Desmond, M. (1986), Guia Essencial do Gato, Arte de Viver, Publicações Europa-América.
Driscoll, F.; Coughlin, R.; Villanucci, R. (1994), Data Acquisition and Process Control with the M68HC11 Microcontroller, MacMillan Publishing Company.
Fu, K.; Gonzalez, R.; Lee, C. (1987), Robotics - Control, Sensing, Vision, and Intelligence, McGraw-Hill.
Goldberg, D. (1989), Genetic Algorithms in Search, Optimization & Machine Learning, Addison-Wesley Publishing Company.
Haykin, S. (1994), Neural Networks - A Comprehensive Foundation, MacMillan.
ImageCraft (1996), ICC11 ImageCraft 68HC11 C Compiler and REXIS User Manual Version 3.0.
Jones, J.; Flynn, A. (1993), Mobile Robots - Inspiration to Implementation, AK Peters.
Kohonen, T. (1988), Self-Organization and Associative Memory, Springer-Verlag.
Miller, G. (1993), Microcomputer Engineering, Prentice Hall.
Miller, R. (1991), Cortico-Hippocampal Interplay and the Representation of Contexts in the Brain, Springer-Verlag.
Motorola (1991), M68HC11 Reference Manual.
O’Keefe, J.; Nadel, L. (1978), The Hippocampus as a Cognitive Map, Clarendon Press.
Peatman, J. (1989), Design with Microcontrollers, McGraw-Hill.
Rabiner, L.; Juang, B. (1993), Fundamentals of Speech Recognition, Prentice Hall.
Waterman, T. (1989), Animal Navigation, New York: Scientific American Library.

SOFTWARE & HARDWARE

Borland (1995), Borland C++ 4.02, http://www.borland.com.
Borland (1996e), Delphi Developer 2.0, http://www.borland.com.
ImageCraft (1996), ICC11 - C compiler v3.5 for the 68HC11, http://www.imagecraft.com.
MathWorks (1994), MatLab v4.2b for Windows.
Sargent, R.; Wright, A. (1994), IC - Interactive C for the 68HC11.
Motorola, 68HC11A0 Microcontroller.
Aurorel, 433 MHz ASK transmitter (TX 433) / receiver (STD 433 DIL), Italy.

AUTHOR

Pedro Kulzer was born in Anadia, Portugal, in 1970. He received the Licenciatura degree in Electronics and Telecommunications Engineering from the University of Aveiro, Portugal, in 1994. His primary research interest is in the field of autonomous agent navigation strategies with neural networks, employing mapping mechanisms.

Email: [email protected], [email protected]. Web page: http://www.geocities.com/CollegePark/7449

Aveiro, November 23rd, 1996