Explore-exploit dilemma in ranking models
The balance between exploration and exploitation explained
A common problem that appears in many contexts is striking a balance between exploiting something known to be good and exploring the unknown in the hope of finding something better. In the data science world, the tension between exploration and exploitation is often explained using the well-known multi-armed bandit problem. The objective is to divide a fixed, limited set of resources among competing choices so as to maximize the expected gain, given that the properties of each choice are only partially known at the time of allocation and can be learned better only by allocating resources to that choice.
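To make the trade-off concrete, here is a minimal sketch of one simple bandit strategy, epsilon-greedy, written in Python. It is only an illustration under stated assumptions, not the approach of any particular ranking system: the function name `run_epsilon_greedy`, the three reward rates, and the values of `epsilon` and `n_rounds` are all made up for the example. Each "arm" pays out 1 with a fixed hidden probability, and the agent has a fixed budget of pulls to spend.

```python
import random


def run_epsilon_greedy(true_rates, n_rounds=10_000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli multi-armed bandit.

    true_rates: hidden success probability of each arm (illustrative values).
    epsilon:    fraction of rounds spent exploring a random arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms        # how many times each arm was chosen
    estimates = [0.0] * n_arms  # running mean reward observed per arm
    total_reward = 0

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: try an arm at random, even if it looks worse so far.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # The environment pays 1 with the arm's hidden probability, else 0.
        reward = 1 if rng.random() < true_rates[arm] else 0

        # Update the running mean for the chosen arm.
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward

    return pulls, estimates, total_reward


# Hypothetical reward rates for three competing choices.
pulls, estimates, total_reward = run_epsilon_greedy([0.02, 0.05, 0.10])
print("pulls per arm:", pulls)
print("estimated rates:", [round(e, 3) for e in estimates])
print("total reward:", total_reward)
```

Setting `epsilon` to 0 makes the agent purely exploitative, so it can lock onto whichever arm happened to pay out first, while setting it to 1 makes it purely exploratory, so it never uses what it has learned. Values in between trade a little short-term reward for better estimates of every arm.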