dc.contributor.author | Koivisto, Marco | |
dc.contributor.author | Crockett, Philip | |
dc.contributor.author | Spångberg, Axel | |
dc.date.accessioned | 2019-11-12T10:40:01Z | |
dc.date.available | 2019-11-12T10:40:01Z | |
dc.date.issued | 2019-11-12 | |
dc.identifier.uri | http://hdl.handle.net/2077/62440 | |
dc.description.abstract | Artificial intelligence can be trained with a trial-and-error
approach. In environments where a catastrophe cannot be accepted,
a human overseer can be used, but this may lower the efficiency
of the learning. The study includes the implementation of an
artifact, a blocker, intended to replace the human overseer when
training an AI in simulated unsafe environments. Testing of the
implemented blocker shows that it can be used to avoid catastrophes
and find a path to the goal in 17 out of 18 runs. The single failed
run shows that the implemented blocker needs improvement in terms
of data efficiency. Shaping rewards solely to reduce the number of
steps and catastrophes for a reinforcement learning agent has been
done successfully to some degree, but further steps can be taken
to lower the number of catastrophes and steps. | sv |
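To illustrate the blocker idea described in the abstract, the following is a minimal, self-contained Python sketch and not the authors' implementation: an overseer-style check sits between the agent and the environment, vetoes any action that would cause a catastrophe, and the reward is shaped to discourage both extra steps and blocked actions. The toy corridor, the is_catastrophic() check, and the penalty values are hypothetical assumptions for illustration only.

import random

GOAL, LAVA, EMPTY = "goal", "lava", "empty"

# 1-D corridor: index 0 is lava (a catastrophe), index 4 is the goal.
GRID = [LAVA, EMPTY, EMPTY, EMPTY, GOAL]
ACTIONS = (-1, +1)  # move left or right


def is_catastrophic(pos, action):
    """Blocker check: would this action move the agent onto a lava cell?"""
    nxt = max(0, min(len(GRID) - 1, pos + action))
    return GRID[nxt] == LAVA


def blocked_step(pos, action, step_penalty=-0.01, block_penalty=-0.1):
    """Apply the action unless the blocker vetoes it; return (pos, reward, done)."""
    if is_catastrophic(pos, action):
        # The blocker replaces the unsafe action with a no-op and the shaped
        # reward adds an extra penalty for having proposed it.
        return pos, step_penalty + block_penalty, False
    pos = max(0, min(len(GRID) - 1, pos + action))
    if GRID[pos] == GOAL:
        return pos, 1.0, True
    return pos, step_penalty, False


if __name__ == "__main__":
    random.seed(0)
    pos, done, total = 2, False, 0.0
    while not done:
        action = random.choice(ACTIONS)  # stand-in for the learning agent
        pos, reward, done = blocked_step(pos, action)
        total += reward
    print("episode finished with return", round(total, 2))

In the thesis setting the environment would be a Gym Mini Grid / Baby AI Game level and the agent a reinforcement learning policy rather than a random choice; the sketch only shows where a blocker and reward shaping would sit in the step loop.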
dc.language.iso | eng | sv |
dc.subject | Artificial Intelligence | sv |
dc.subject | Reinforcement learning | sv |
dc.subject | Safe exploration | sv |
dc.subject | Blocker | sv |
dc.subject | Machine Learning | sv |
dc.subject | Baby AI Game | sv |
dc.subject | Gym Mini Grid | sv |
dc.title | AI Safe Exploration: Reinforced learning with a blocker in unsafe environments | sv |
dc.type | text | |
dc.setspec.uppsok | Technology | |
dc.type.uppsok | M2 | |
dc.contributor.department | Göteborgs universitet/Institutionen för data- och informationsteknik | swe |
dc.contributor.department | University of Gothenburg/Department of Computer Science and Engineering | eng |
dc.type.degree | Student essay | |