Improving sample-efficiency of model-free reinforcement learning algorithms on image inputs with representation learning
Abstract
Reinforcement learning struggles to solve control tasks directly on images. Performance
on identical tasks with access to the underlying states is much better. One
avenue to bridge the gap between the two is to leverage unsupervised learning as a
means of learning state representations from images, thereby resulting in a
better-conditioned reinforcement learning problem. Through investigation of related work,
characteristics of successful integration of unsupervised learning and reinforcement
learning are identified. We hypothesize that joint training of state representations
and policies results in higher sample-efficiency if adequate regularization is provided.
We further hypothesize that representations which correlate more strongly with the
underlying Markov decision process result in additional sample-efficiency. These hypotheses
are tested through a simple deterministic generative representation learning
model (autoencoder) trained with image reconstruction loss and additional forward
and inverse auxiliary losses. While our algorithm does not reach state-of-the-art
performance, its modular implementation, integrated into the reinforcement learning
library Tianshou, makes it easy for reinforcement learning practitioners to use and thus
also accelerates further research. We also identify which aspects of our solution are
most important and use them to formulate promising research directions. In our
tests we limited ourselves to Atari environments and primarily used Rainbow as the
underlying reinforcement learning algorithm.
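The combination of losses described above can be illustrated with a minimal sketch. It is not the thesis implementation and does not use Tianshou: the linear encoder, decoder, forward and inverse models, the dimensions, the loss weights `w_fwd` and `w_inv`, and the function name `representation_losses` are all assumptions chosen for illustration. It only shows how a reconstruction loss, a forward (next-latent prediction) loss and an inverse (action prediction) loss might be summed into one representation-learning objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened observation, latent state, discrete actions, batch.
OBS_DIM, LATENT_DIM, N_ACTIONS, BATCH = 64, 8, 4, 16

# Hypothetical linear stand-ins for the encoder, decoder, forward and inverse models.
W_enc = rng.normal(scale=0.1, size=(OBS_DIM, LATENT_DIM))
W_dec = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))
W_fwd = rng.normal(scale=0.1, size=(LATENT_DIM + N_ACTIONS, LATENT_DIM))
W_inv = rng.normal(scale=0.1, size=(2 * LATENT_DIM, N_ACTIONS))


def one_hot(actions, n):
    """Turn integer actions of shape (B,) into one-hot rows of shape (B, n)."""
    return np.eye(n)[actions]


def log_softmax(logits):
    """Numerically stable row-wise log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def representation_losses(obs, actions, next_obs, w_fwd=1.0, w_inv=1.0):
    """Return reconstruction, forward, inverse and weighted total losses."""
    z = obs @ W_enc            # latent state for the current observation
    z_next = next_obs @ W_enc  # latent state for the next observation

    # Reconstruction: decode the latent state back to the observation.
    recon = np.mean((z @ W_dec - obs) ** 2)

    # Forward model: predict the next latent state from (latent state, action).
    z_pred = np.concatenate([z, one_hot(actions, N_ACTIONS)], axis=1) @ W_fwd
    forward = np.mean((z_pred - z_next) ** 2)

    # Inverse model: predict the action from (latent state, next latent state).
    logits = np.concatenate([z, z_next], axis=1) @ W_inv
    inverse = -np.mean(log_softmax(logits)[np.arange(len(actions)), actions])

    total = recon + w_fwd * forward + w_inv * inverse
    return {"reconstruction": recon, "forward": forward,
            "inverse": inverse, "total": total}


# Random stand-in transitions; real inputs would be image frames from Atari.
obs = rng.random((BATCH, OBS_DIM))
next_obs = rng.random((BATCH, OBS_DIM))
actions = rng.integers(0, N_ACTIONS, size=BATCH)
losses = representation_losses(obs, actions, next_obs)
```

In a joint-training setup of the kind the abstract hypothesizes about, a total of this form would presumably be added to the reinforcement learning loss so that the encoder receives gradients from both; how the thesis weights and schedules these terms is not stated here.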
Degree level
Student essay
Collections
Date
2022-10-14
Author
Guberina, Marko
Desta, Betelhem Dejene
Keywords
sample-efficient reinforcement learning
state representation learning
unsupervised learning
autoencoder
Language
eng