Cluster KL-UCB: Optimism for the Best, Pessimism for the Rest

Lööf, Emelie

dc.contributor.author	Lööf, Emelie
dc.date.accessioned	2022-06-28T13:11:52Z
dc.date.available	2022-06-28T13:11:52Z
dc.date.issued	2022-06-28
dc.identifier.uri	https://hdl.handle.net/2077/72386
dc.description.abstract	This project presents an allocation strategy for the stochastic multi-armed bandit problem on instances with a clustered structure. Taking the architecture of the KL-UCB policy as inspiration, an algorithm that exploits the clustered structure is derived. First, motivated by previous work on the subject, a multi-level structure approach serves as an initial examination. Second, the Cluster KL-UCB policy is derived and evaluated under three different approaches. It is shown, both theoretically and empirically, that adapting to a clustered environment improves performance compared with its non-cluster-adapting ancestor. Both upper and lower bounds on the regret are provided as theoretical guarantees on the algorithm's performance. Finally, a number of empirical experiments are performed to further assess the performance and validate the theoretical results.	en
dc.language.iso	eng	en
dc.title	Cluster KL-UCB: Optimism for the Best, Pessimism for the Rest	en
dc.title.alternative	An improvement and extension of the KL-UCB algorithm in a clustered multi-armed bandit setting	en
dc.type	text
dc.setspec.uppsok	PhysicsChemistryMaths
dc.type.uppsok	H2
dc.contributor.department	University of Gothenburg/Department of Mathematical Science	eng
dc.contributor.department	Göteborgs universitet/Institutionen för matematiska vetenskaper	swe
dc.type.degree	Student essay

Files in this item

Name: Master_Thesis_Emelie_Lööf_20 ...
Size: 1.084 MB
Format: PDF


This item appears in the following Collection(s)

Masteruppsatser (Master's theses)
