  •   Home
  • Student essays / Studentuppsatser
  • Department of Computer Science and Engineering / Institutionen för data- och informationsteknik
  • Masteruppsatser

Improving sample-efficiency of model-free reinforcement learning algorithms on image inputs with representation learning

Abstract
Reinforcement learning struggles to solve control tasks directly from images. Performance on identical tasks with access to the underlying states is much better. One avenue for bridging the gap between the two is to use unsupervised learning to learn state representations from images, which yields a better-conditioned reinforcement learning problem. Through an investigation of related work, we identify the characteristics of successful integrations of unsupervised learning and reinforcement learning. We hypothesize that joint training of state representations and policies results in higher sample-efficiency if adequate regularization is provided. We further hypothesize that representations which correlate more strongly with the underlying Markov decision process yield additional sample-efficiency. These hypotheses are tested with a simple deterministic generative representation learning model (an autoencoder) trained with an image reconstruction loss and additional forward and inverse auxiliary losses. While our algorithm does not reach state-of-the-art performance, its modular implementation, integrated in the reinforcement learning library Tianshou, makes it easy for reinforcement learning practitioners to use and thus also accelerates further research. We also identify which aspects of our solution are most important and use them to formulate promising research directions. In our tests we limited ourselves to Atari environments and primarily used Rainbow as the underlying reinforcement learning algorithm.
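
To make the three loss terms named in the abstract concrete, the following is a minimal NumPy sketch of how a reconstruction loss can be combined with forward and inverse auxiliary losses. It is an illustration under stated assumptions, not the thesis implementation: the linear maps stand in for the convolutional networks an image-based agent would use, and all names (`W_enc`, `auxiliary_losses`, `total_loss`), the dimensions, and the loss weights are hypothetical. Joint training with the reinforcement learning objective and the Tianshou integration are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: flattened observation, latent state, discrete action count.
OBS_DIM, LATENT_DIM, N_ACTIONS = 16, 4, 3

# Assumption: linear maps replace the encoder, decoder, and dynamics networks.
W_enc = rng.normal(scale=0.1, size=(OBS_DIM, LATENT_DIM))
W_dec = rng.normal(scale=0.1, size=(LATENT_DIM, OBS_DIM))
W_fwd = rng.normal(scale=0.1, size=(LATENT_DIM + N_ACTIONS, LATENT_DIM))
W_inv = rng.normal(scale=0.1, size=(2 * LATENT_DIM, N_ACTIONS))


def one_hot(actions, n):
    """Encode integer actions as one-hot rows."""
    return np.eye(n)[actions]


def auxiliary_losses(obs, actions, next_obs):
    """Return (reconstruction, forward, inverse) losses for a batch of transitions."""
    z = obs @ W_enc            # latent state for the current observation
    z_next = next_obs @ W_enc  # latent state for the next observation

    # Reconstruction: decode the latent state back to the observation (MSE).
    recon = np.mean((z @ W_dec - obs) ** 2)

    # Forward model: predict the next latent state from (latent state, action).
    z_pred = np.concatenate([z, one_hot(actions, N_ACTIONS)], axis=1) @ W_fwd
    forward = np.mean((z_pred - z_next) ** 2)

    # Inverse model: predict the action from consecutive latent states
    # (softmax cross-entropy, computed in a numerically stable way).
    logits = np.concatenate([z, z_next], axis=1) @ W_inv
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    inverse = -np.mean(log_probs[np.arange(len(actions)), actions])

    return recon, forward, inverse


def total_loss(obs, actions, next_obs, w_forward=1.0, w_inverse=1.0):
    """Weighted sum of the three terms; the weights are illustrative only."""
    recon, forward, inverse = auxiliary_losses(obs, actions, next_obs)
    return recon + w_forward * forward + w_inverse * inverse
```

The forward and inverse terms tie the latent space to the transition structure of the environment, which is one way to read the abstract's hypothesis about representations that correlate with the underlying Markov decision process. In a trainable version the weights would be updated by gradient descent on `total_loss`, possibly together with the policy loss.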
Degree
Student essay
URI
https://hdl.handle.net/2077/73890
Collections
  • Masteruppsatser
View/Open
CSE 22-31 Guberina Desta.pdf (1.473Mb)
Date
2022-10-14
Author
Guberina, Marko
Desta, Betelhem Dejene
Keywords
sample-efficient reinforcement learning
state representation learning
unsupervised learning
autoencoder
Language
eng

