Author Archives: Josep Lumbreras

Bandits roaming Hilbert Space

The (classical) multi-armed bandit problem

The multi-armed bandit problem is a simple model of decision-making under uncertainty that belongs to the class of classical reinforcement learning problems. Given a set of arms, a learner interacts sequentially with these arms sampling … Continue reading
