AI Safe Exploration: Reinforced learning with a blocker in unsafe environments
Abstract
Artificial intelligence can be trained with a trial-and-error
approach. In an environment where a catastrophe cannot be
accepted, a human overseer can be used, but this might lower
the efficiency of the learning. The study includes the
implementation of an artifact meant to replace the human
overseer when training an AI in simulated unsafe environments.
The results of testing the implemented blocker show that it can
be used to avoid catastrophes and find a path to the goal in
17 out of 18 runs. The single failed run shows that the
implemented blocker needs improvement in terms of data
efficiency. Shaping rewards solely to reduce the number of
steps and catastrophes for a reinforcement learning agent was
successful to some degree, but further steps can be taken to
lower the number of catastrophes and steps.
Degree level
Student essay
Collections
Date
2019-11-12
Authors
Koivisto, Marco
Crockett, Philip
Spångberg, Axel
Keywords
Artificial Intelligence
Reinforcement learning
Safe exploration
Blocker
Machine Learning
Baby AI Game
Gym Mini Grid
Language
eng