Author Archives: Josep Lumbreras
Bandits roaming Hilbert Space
The (classical) multi-armed bandit problem
The multi-armed bandit problem is a simple model of decision-making under uncertainty that belongs to the class of classical reinforcement learning problems. Given a set of arms, a learner interacts sequentially with these arms, sampling …