
A Generic Model of Motivation in Artificial Animals Based on Reinforcement Learning

Abstract
This thesis is part of a broader research project at Chalmers University of Technology focused on simulating ecosystems with reinforcement-learning artificial animals, called animats. The aim of this project is to provide animats with a reward signal that ultimately drives their learning towards adaptation to their environment. We introduce a framework based on basic biological mechanisms of homeostatic regulation, i.e. the regulation of physiological conditions, to reward animats for maintaining their optimal homeostatic state, i.e. for maintaining homeostasis. As such, homeostasis is each animat's objective. Previous theoretical work adopting homeostatic regulation as a mechanism of reward generation lacks the ability to regulate the relative importance of needs and the interactions between them, and, as our results show, fails in environments where animats eventually die. We extend previous theoretical efforts to model homeostatic regulation by defining the animat's happiness as a function of its needs through several simple univariate utility functions. Modelling the utility of each need individually enables high flexibility in design and easily configurable interactions between different needs. Moreover, in this framework vital needs have priority over non-vital or sensory needs. We show that this framework can be used to elicit six important animat behaviours that emulate real animal behaviours, and in particular can be used to recreate typical behaviours observed in free-living planktonic copepods, such as quick escape reactions from fast-approaching predators and diel vertical migration. We compare two models for reward generation, each using a different happiness function, to previous theoretical work and to a generalisation of that work in a diverse array of environments, showing that one model of motivation is superior in all tested environments and allows animats to learn the six target behaviours.
The models are also compared against a baseline reward that simply rewards staying alive. We show that the proposed models outperform the baseline, indicating that motivational models based on homeostatic regulation are a good choice for reward generation for animats. Finally, we test the models in a more general marine environment, showing that with this framework animats can learn copepod behaviour.
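The abstract's core idea, a happiness function built from simple univariate utilities over physiological needs, with the reward given by the change in happiness, can be sketched as follows. This is only an illustrative assumption-laden sketch: the function names, the Gaussian utility shape, the weights, and the drive-reduction reward are hypothetical choices, not the thesis's actual equations.

```python
import math

def utility(level, setpoint=0.5, steepness=10.0):
    """Univariate utility for one need: peaks when the need's
    level sits at its homeostatic setpoint (Gaussian shape is an
    illustrative assumption)."""
    return math.exp(-steepness * (level - setpoint) ** 2)

def happiness(needs, weights):
    """Happiness as a weighted sum of per-need utilities; giving
    vital needs larger weights is one simple way to let them
    dominate non-vital or sensory needs."""
    return sum(weights[n] * utility(v) for n, v in needs.items())

def reward(prev_needs, needs, weights):
    """Drive-reduction-style reward: the change in happiness
    between consecutive states."""
    return happiness(needs, weights) - happiness(prev_needs, weights)

# Example: eating moves 'energy' toward its setpoint, so the
# animat receives a positive reward for the transition.
w = {"energy": 2.0, "curiosity": 0.5}          # vital need weighted higher
before = {"energy": 0.20, "curiosity": 0.5}
after = {"energy": 0.45, "curiosity": 0.5}
r = reward(before, after, w)
assert r > 0
```

Because each need has its own utility function, the designer can reshape one need's contribution (setpoint, steepness, weight) without touching the others, which is the flexibility the abstract attributes to modelling utilities individually.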
Degree
Student essay
URI
https://hdl.handle.net/2077/71555
Collections
  • Masteruppsatser
View/Open
CSE 21-08 Ferrari Kleve.pdf (7.488Mb)
Date
2022-05-06
Author
Kleve, Birger
Ferrari, Pietro
Keywords
reinforcement
learning
reward
shaping
animat
homeostasis
ecosystem
motivation
Language
eng

DSpace software copyright © 2002-2016  DuraSpace