dc.contributor.author | De Biase, Andres | |
dc.contributor.author | Namgaudis, Mantas | |
dc.date.accessioned | 2019-11-12T11:20:08Z | |
dc.date.available | 2019-11-12T11:20:08Z | |
dc.date.issued | 2019-11-12 | |
dc.identifier.uri | http://hdl.handle.net/2077/62445 | |
dc.description.abstract | We adapt Goal-Oriented Action Planning, a decision-making architecture common in video games, to the machine learning setting with the objective of creating safer artificial intelligence. We evaluate it in randomly generated 2D grid-world scenarios and show that this adaptation can create a safer AI that also learns faster than conventional methods. | sv
dc.language.iso | eng | sv |
dc.title | Creating safer reward functions for reinforcement learning agents in the gridworld | sv |
dc.type | text | |
dc.setspec.uppsok | Technology | |
dc.type.uppsok | M2 | |
dc.contributor.department | Göteborgs universitet/Institutionen för data- och informationsteknik | swe |
dc.contributor.department | University of Gothenburg/Department of Computer Science and Engineering | eng |
dc.type.degree | Student essay | |