MIXED MEMORY Q-LEARNER: An adaptive reinforcement learning algorithm for the Iterated Prisoner’s Dilemma

Dollbo, Anna

dc.contributor.author	Dollbo, Anna
dc.date.accessioned	2021-09-21T07:35:20Z
dc.date.available	2021-09-21T07:35:20Z
dc.date.issued	2021-09-21
dc.identifier.uri	http://hdl.handle.net/2077/69664
dc.description.abstract	The success of future societies is likely to depend on cooperative interactions between humans and artificial agents. As such, it is important to investigate how machines can learn to cooperate. By looking at how machines handle complex social situations, so-called social dilemmas, knowledge about the components necessary for cooperation in artificial agents can be acquired. In this study, a reinforcement learning algorithm was used to study the Iterated Prisoner’s Dilemma (IPD), a common social dilemma game. A reinforcement learning algorithm can make decisions in the IPD by considering a given number of its opponent’s last actions; this number represents the agent’s memory length. This study investigated the effect of different memory lengths on the agent’s performance in the IPD. The results showed that different memory lengths are preferable depending on the opponent. A new algorithm, the Mixed Memory Q-Learner (MMQL), was created, which could switch memory length during play to adapt to its opponent. It could also recognise its opponent between games, thus continuing its learning over several interactions. MMQL performed better against certain opponents in the IPD but did not learn to cooperate with cooperative players. Further capabilities might therefore be added to the algorithm to invite cooperation, or the environment could be manipulated. The results suggest that flexibility in how a situation is represented and the ability to recognise opponents are important capabilities for artificial agents in social dilemmas.	sv
dc.language.iso	eng	sv
dc.relation.ispartofseries	2021:081	sv
dc.subject	Machine learning	sv
dc.subject	reinforcement learning	sv
dc.subject	game theory	sv
dc.subject	iterated prisoner’s dilemma	sv
dc.subject	state representation	sv
dc.subject	Q-learning	sv
dc.title	MIXED MEMORY Q-LEARNER: An adaptive reinforcement learning algorithm for the Iterated Prisoner’s Dilemma	sv
dc.type	Text	eng
dc.setspec.uppsok	Technology
dc.type.uppsok	M2
dc.contributor.department	Institutionen för tillämpad informationsteknologi	swe
dc.contributor.department	Department of Applied Information Technology	eng
dc.type.degree	Kandidatuppsats	swe
dc.type.degree	Bachelor thesis	eng
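
The abstract describes an agent whose state is a fixed number of its opponent's most recent actions. The following is a minimal Python sketch of that idea only, not the thesis's MMQL implementation: it shows a tabular Q-learner with a single fixed memory length playing against tit-for-tat. The payoff values (T=5, R=3, P=1, S=0), the hyperparameters, and all names (`MemoryQLearner`, `play_vs_tit_for_tat`) are assumptions made for illustration; this record does not state the values or opponents used.

```python
import random
from collections import defaultdict

# Assumed standard IPD payoffs; keys are (own action, opponent action).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")


class MemoryQLearner:
    """Hypothetical tabular Q-learner; state = opponent's last `memory` actions."""

    def __init__(self, memory, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.memory = memory
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.state = ()              # empty history before the first round

    def act(self):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(self.state, a)])

    def learn(self, action, opponent_action):
        reward = PAYOFF[(action, opponent_action)]
        # Slide the window: keep only the opponent's most recent `memory` actions.
        if self.memory:
            next_state = (self.state + (opponent_action,))[-self.memory:]
        else:
            next_state = ()
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        key = (self.state, action)
        # Standard one-step Q-learning update.
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
        self.state = next_state
        return reward


def play_vs_tit_for_tat(agent, rounds=1000):
    """Tit-for-tat cooperates first, then copies the agent's previous action."""
    total, opponent_action = 0, "C"
    for _ in range(rounds):
        action = agent.act()
        total += agent.learn(action, opponent_action)
        opponent_action = action
    return total
```

Calling `play_vs_tit_for_tat(MemoryQLearner(memory=1))` and `play_vs_tit_for_tat(MemoryQLearner(memory=3))` gives two totals that can be compared to see how memory length affects the score against one opponent. Switching memory length during play and recognising opponents between games, as the abstract describes for MMQL, are not covered by this sketch.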

Files in this item

Name: gupea_2077_69664_1.pdf
Size: 1.352Mb
Format: PDF
Description: Thesis

This item appears in the following collection(s)

Kandidatuppsatser/Bachelor theses / Institutionen för tillämpad informationsteknologi