dc.contributor.author | Dollbo, Anna | |
dc.date.accessioned | 2021-09-21T07:35:20Z | |
dc.date.available | 2021-09-21T07:35:20Z | |
dc.date.issued | 2021-09-21 | |
dc.identifier.uri | http://hdl.handle.net/2077/69664 | |
dc.description.abstract | The success of future societies is likely to depend on cooperative interactions
between humans and artificial agents. As such, it is important to investigate how
machines can learn to cooperate. By looking at how machines handle complex
social situations, so-called social dilemmas, knowledge about the components
necessary for cooperation in artificial agents can be acquired. In this study, a
reinforcement learning algorithm was used to study the Iterated Prisoner’s
Dilemma (IPD), a common social dilemma game. A reinforcement learning
algorithm can make decisions in the IPD by considering a fixed number of its
opponent’s most recent actions; this number is the agent’s memory length. This study
investigated the effect of different memory lengths on the performance of the agent
in the IPD. The results showed that different memory lengths are preferable
depending on the opponent. A new algorithm, the Mixed Memory
Q-Learner (MMQL), was created, which could switch memory length during play to adapt to its
opponent. It could also recognise its opponent between games, thus continuing its
learning over several interactions. MMQL performed better against certain
opponents in the IPD but did not learn to cooperate with cooperative players.
Further capabilities might therefore be added to the algorithm to invite cooperation,
or the environment could be manipulated. The results suggest that flexibility in how a
situation is represented and the ability to recognise opponents are important
capabilities for artificial agents in social dilemmas. | sv |
dc.language.iso | eng | sv |
dc.relation.ispartofseries | 2021:081 | sv |
dc.subject | Machine learning | sv |
dc.subject | reinforcement learning | sv |
dc.subject | game theory | sv |
dc.subject | iterated prisoner’s dilemma | sv |
dc.subject | state representation | sv |
dc.subject | Q-learning | sv |
dc.title | MIXED MEMORY Q-LEARNER: An adaptive reinforcement learning algorithm for the Iterated Prisoner’s Dilemma | sv |
dc.type | Text | eng |
dc.setspec.uppsok | Technology | |
dc.type.uppsok | M2 | |
dc.contributor.department | Institutionen för tillämpad informationsteknologi | swe |
dc.contributor.department | Department of Applied Information Technology | eng |
dc.type.degree | Kandidatuppsats | swe |
dc.type.degree | Bachelor thesis | eng |