dc.contributor.author | De Biase, Andres | |
dc.contributor.author | Namgaudis, Mantas | |
dc.date.accessioned | 2019-11-12T11:20:08Z | |
dc.date.available | 2019-11-12T11:20:08Z | |
dc.date.issued | 2019-11-12 | |
dc.identifier.uri | http://hdl.handle.net/2077/62445 | |
dc.description.abstract | We adapt Goal-Oriented Action Planning, a decision-making architecture common in video games, to the machine learning setting with the objective of creating safer artificial intelligence. We evaluate it in randomly generated 2D grid-world scenarios and show that this adaptation can create a safer AI that also learns faster than conventional methods. | sv
dc.language.iso | eng | sv |
dc.title | Creating safer reward functions for reinforcement learning agents in the gridworld | sv |
dc.type | text | |
dc.setspec.uppsok | Technology | |
dc.type.uppsok | M2 | |
dc.contributor.department | Göteborgs universitet/Institutionen för data- och informationsteknik | swe |
dc.contributor.department | University of Gothenburg/Department of Computer Science and Engineering | eng |
dc.type.degree | Student essay | |