The explore-exploit dilemma in ranking models

The balance between exploration and exploitation explained

Nov 04, 2022

A common problem in many contexts is balancing the exploitation of something known to be good against the exploration of the unknown in the hope of finding something better. In data science, this tension between exploration and exploitation is often explained using the well-known multi-armed bandit problem. The objective is to divide a fixed, limited set of resources among competing choices so as to maximize the expected gain, given that the properties of each choice are only partially known at the time of allocation.
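To make the trade-off concrete, here is a minimal sketch of a bandit simulation using an epsilon-greedy strategy. This is one common textbook approach and not necessarily the method used in the full article. The function name `run_epsilon_greedy`, the reward probabilities, and the parameter values are all illustrative assumptions.

```python
import random


def run_epsilon_greedy(true_rates, n_rounds, epsilon, seed=0):
    """Simulate a Bernoulli multi-armed bandit with an epsilon-greedy policy.

    true_rates: hypothetical success probability of each arm (unknown to the policy).
    epsilon: probability of exploring a random arm instead of exploiting.
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms          # how often each arm was chosen
    estimates = [0.0] * n_arms    # running mean reward observed per arm
    total_reward = 0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda i: estimates[i])

        # Draw a 0/1 reward from the arm's true (hidden) success rate.
        reward = 1 if rng.random() < true_rates[arm] else 0

        # Update the running mean for the chosen arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    rates = [0.1, 0.5, 0.8]  # assumed values, for illustration only
    print(run_epsilon_greedy(rates, n_rounds=1000, epsilon=0.1, seed=42))
    print(run_epsilon_greedy(rates, n_rounds=1000, epsilon=0.0, seed=42))
```

With `epsilon=0.0` the policy never explores, so in this sketch it keeps pulling the first arm and never discovers that another arm pays better. A small positive `epsilon` spends some pulls on arms that look worse in exchange for a chance of finding a better one.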

Read the full article on our tech blog

trivago tech newsletter
