dc.contributor.author | Lööf, Emelie | |
dc.date.accessioned | 2022-06-28T13:11:52Z | |
dc.date.available | 2022-06-28T13:11:52Z | |
dc.date.issued | 2022-06-28 | |
dc.identifier.uri | https://hdl.handle.net/2077/72386 | |
dc.description.abstract | The project presents an allocation strategy for the stochastic multi-armed bandit
when considering instances with a clustered structure. Using the architecture
of the KL-UCB policy as a source of inspiration, an algorithm that exploits
the clustered structure is derived. Firstly, motivated by previous
work related to the subject, a multi-level structure approach will serve as an initial
examination. Secondly, the Cluster KL-UCB policy will be derived and evaluated
considering three different approaches. It will be shown, both theoretically and empirically,
that adapting to a clustered environment improves the performance compared
to its non-cluster-adapting ancestor. Both upper and lower bounds on the regret will
be provided in order to theoretically guarantee the performance of the algorithm. Lastly,
a number of empirical experiments will be performed in order to further assess the
performance and validate the theoretical results. | en_US |
dc.language.iso | eng | en_US |
dc.title | Cluster KL-UCB: Optimism for the Best, Pessimism for the Rest | en_US |
dc.title.alternative | An improvement and extension of the KL-UCB algorithm in a clustered multi-armed bandit setting | en_US |
dc.type | text | |
dc.setspec.uppsok | PhysicsChemistryMaths | |
dc.type.uppsok | H2 | |
dc.contributor.department | University of Gothenburg/Department of Mathematical Science | eng |
dc.contributor.department | Göteborgs universitet/Institutionen för matematiska vetenskaper | swe |
dc.type.degree | Student essay | |