Creating safer reward functions for reinforcement learning agents in the gridworld

De Biase, Andres; Namgaudis, Mantas

Abstract

We adapted Goal-Oriented Action Planning, a decision-making architecture common in video games, to machine learning with the objective of creating safer artificial intelligence. We evaluate it in randomly generated 2D grid-world scenarios and show that this adaptation can produce a safer AI that also learns faster than conventional methods.

Degree

Student essay

Date

2019-11-12

Author

De Biase, Andres

Namgaudis, Mantas

Language

eng
