
Mixed Memory Q-Learner: An adaptive reinforcement learning algorithm for the Iterated Prisoner’s Dilemma

Abstract
The success of future societies is likely to depend on cooperative interactions between humans and artificial agents, so it is important to investigate how machines can learn to cooperate. Studying how machines handle complex social situations, so-called social dilemmas, can reveal which components artificial agents need in order to cooperate. In this study, a reinforcement learning algorithm was used to study the Iterated Prisoner’s Dilemma (IPD), a common social dilemma game. A reinforcement learning agent can make decisions in the IPD by considering a given number of its opponent’s most recent actions; this number represents the agent’s memory length. This study investigated the effect of memory length on the agent’s performance in the IPD. The results showed that the preferable memory length depends on the opponent. A new algorithm, the Mixed Memory Q-Learner (MMQL), was created; it can switch memory length during play to adapt to its opponent. It can also recognise its opponent between games and thus continue learning across several interactions. MMQL performed better against certain opponents in the IPD but did not learn to cooperate with cooperative players. Further capabilities might therefore be added to the algorithm to invite cooperation, or the environment could be manipulated. The results suggest that flexibility in how a situation is represented and the ability to recognise opponents are important capabilities for artificial agents in social dilemmas.
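To make the idea of memory length concrete, the following is a minimal sketch of a tabular Q-learner whose state is the opponent's last few actions. It is not the thesis implementation and it does not reproduce MMQL's memory switching or opponent recognition. The class name `MemoryQLearner`, the helper `play_vs_tit_for_tat`, the hyperparameter values, the assumed cooperative starting history, and the payoff values (the conventional T=5, R=3, P=1, S=0) are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict, deque

# Assumed payoffs for the learning agent: (own action, opponent action) -> reward.
# These are the conventional T=5, R=3, P=1, S=0 values, not taken from the thesis.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


class MemoryQLearner:
    """Hypothetical tabular Q-learner; state = opponent's last `memory` actions."""

    def __init__(self, memory=1, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.memory = memory
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value
        # Assumption: before the first round the opponent is treated as
        # having cooperated `memory` times.
        self.history = deque(["C"] * memory, maxlen=memory)

    def state(self):
        return tuple(self.history)

    def act(self):
        # Epsilon-greedy choice over the two actions.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        s = self.state()
        return max(ACTIONS, key=lambda a: self.q[(s, a)])

    def update(self, action, opponent_action):
        # Standard one-step Q-learning update.
        s = self.state()
        reward = PAYOFF[(action, opponent_action)]
        self.history.append(opponent_action)  # oldest action drops out
        s_next = self.state()
        best_next = max(self.q[(s_next, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(s, action)] += self.alpha * (target - self.q[(s, action)])
        return reward


def play_vs_tit_for_tat(agent, rounds=1000):
    """Play against Tit-for-Tat and return the agent's average payoff."""
    total = 0
    agent_previous = "C"  # Tit-for-Tat opens by cooperating
    for _ in range(rounds):
        action = agent.act()
        opponent_action = agent_previous  # Tit-for-Tat copies the agent's last move
        total += agent.update(action, opponent_action)
        agent_previous = action
    return total / rounds


if __name__ == "__main__":
    for m in (1, 2, 3):
        avg = play_vs_tit_for_tat(MemoryQLearner(memory=m))
        print(f"memory={m}: average payoff {avg:.2f}")
```

A longer memory gives the agent a richer description of the opponent but multiplies the number of states (2 to the power of `memory`) that must be visited before the value estimates are useful, which is one way to see why no single memory length is best against every opponent.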
Degree level
Bachelor thesis
URL:
http://hdl.handle.net/2077/69664
Collections
  • Bachelor theses / Department of Applied Information Technology
File(s)
Thesis (1.352Mb)
Date
2021-09-21
Author
Dollbo, Anna
Keywords
Machine learning
reinforcement learning
game theory
iterated prisoner’s dilemma
state representation
Q-learning
Series/Report no.
2021:081
Language
eng

DSpace software copyright © 2002-2016  DuraSpace
gup@ub.gu.se | Technical support
Theme by Atmire NV