A Generic Model of Motivation in Artificial Animals Based on Reinforcement Learning

Abstract
This thesis is part of a broader research project at Chalmers University of Technology focused on simulating ecosystems with reinforcement-learning artificial animals, called animats. The scope of this project is to provide animats with a reward signal that ultimately drives their learning towards adaptation to their environment. We introduce a framework based on basic biological mechanisms of homeostatic regulation, i.e. the regulation of physiological conditions, to reward animats for maintaining their optimal homeostatic state, i.e. for maintaining homeostasis. As such, homeostasis is each animat's objective. Previous theoretical work adopting homeostatic regulation as a mechanism of reward generation lacks the ability to regulate the relative importance of needs and their interactions and, as our results show, fails in environments where animats eventually die. We extend these previous theoretical efforts at modeling homeostatic regulation by defining the animat's happiness as a function of its needs through several simple univariate utility functions. Modeling the utility of each need separately allows high flexibility in design and easily configurable interactions between different needs. Moreover, in this framework vital needs have priority over non-vital or sensory needs. We show that this framework can be used to elicit six important animat behaviours that emulate real animal behaviours, and in particular to recreate typical behaviours observed in free-living planktonic copepods, such as quick escape reactions from fast-approaching predators and diel vertical migration. We compare two models for reward generation, using different happiness functions, against previous theoretical work and against a generalization of that work in a diverse array of environments, showing that one model of motivation is superior in all tested environments and allows animats to learn the six objective behaviours. The models are also compared against a baseline reward that rewards simply staying alive. We show that the proposed models outperform the baseline model, indicating that motivational models based on homeostatic regulation are a good choice for reward generation for animats. Finally, we test the models in a more general marine environment, showing that with this framework animats can learn copepod behaviour.
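The abstract describes reward generation as a happiness function built from simple univariate per-need utilities, with vital needs weighted above sensory ones. The sketch below is only an illustration of that general idea, not the thesis's actual method: the names (Need, happiness, reward), the bell-shaped utility, and the weights are all assumptions made for this example.

```python
from dataclasses import dataclass
import math


@dataclass
class Need:
    """One physiological need with its own univariate utility (hypothetical names)."""
    name: str
    setpoint: float      # ideal homeostatic value for this need
    weight: float        # priority: vital needs get larger weights than sensory ones
    vital: bool = False

    def utility(self, value: float) -> float:
        # One possible simple univariate utility: bell-shaped, peaking at the setpoint.
        return math.exp(-(value - self.setpoint) ** 2)


def happiness(needs: list[Need], state: dict[str, float]) -> float:
    """Happiness as a weighted sum of per-need utilities; vital needs dominate via weights."""
    return sum(n.weight * n.utility(state[n.name]) for n in needs)


def reward(needs: list[Need], prev_state: dict[str, float], next_state: dict[str, float]) -> float:
    # Reward the change in happiness, so learning is driven back toward homeostasis.
    return happiness(needs, next_state) - happiness(needs, prev_state)


# Example usage with made-up needs and physiological values.
needs = [Need("energy", setpoint=1.0, weight=10.0, vital=True),
         Need("curiosity", setpoint=0.5, weight=1.0)]
r = reward(needs,
           {"energy": 0.4, "curiosity": 0.5},
           {"energy": 0.7, "curiosity": 0.5})
print(r)  # positive: the animat moved closer to its energy setpoint
```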
Degree level
Student essay
URL:
https://hdl.handle.net/2077/71555
Collections
  • Masteruppsatser
File(s)
CSE 21-08 Ferrari Kleve.pdf (7.488Mb)
Date
2022-05-06
Authors
Kleve, Birger
Ferrari, Pietro
Keywords
reinforcement learning
reward shaping
animat
homeostasis
ecosystem
motivation
Language
eng
