Expanding on Repeated Consumer Search Using Multi-Armed Bandits and Secretaries
We take a different approach to deriving the optimal search policy for the repeated consumer search model of Fishman & Rob (1995), with the main motivation of dropping the assumption that the price distribution F(p) is known in each period. We do this by incorporating the multi-armed bandit (MAB) problem. We first modify the MAB framework to fit the setting of the repeated consumer search model and formulate the objective as a dynamic optimization problem. Then, given any exploration sequence, we assign a value to each store in that sequence using Bellman equations. We next decompose the problem into an optimal stopping problem for each period, which coincides with the framework of the classical secretary problem, and derive the optimal stopping policy for it. Finally, we show that implementing this optimal stopping policy in each period solves the original dynamic optimization problem by a 'forward induction' argument.
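As a self-contained illustration (not taken from the paper), the classical secretary problem's stopping rule, observe roughly the first n/e candidates and then stop at the first candidate better than all of them, can be sketched as follows. The function name and the Monte Carlo setup are our own for exposition; the paper's per-period stopping problems concern prices rather than abstract scores.

```python
import math
import random

def secretary_stop(values):
    """Classical secretary rule: skip the first ~n/e candidates,
    then accept the first one better than everything seen so far.
    Returns the index at which we stop."""
    n = len(values)
    k = max(1, round(n / math.e))  # length of the observation phase
    best_seen = max(values[:k])
    for i in range(k, n):
        if values[i] > best_seen:
            return i  # first candidate beating the observation phase
    return n - 1  # forced to take the last candidate

# Monte Carlo check: the rule picks the overall best with
# probability approaching 1/e ≈ 0.368 as n grows.
random.seed(0)
n, trials = 100, 20000
hits = 0
for _ in range(trials):
    vals = [random.random() for _ in range(n)]
    if secretary_stop(vals) == vals.index(max(vals)):
        hits += 1
print(hits / trials)  # empirically close to 1/e
```

In the consumer search setting the analogous rule would compare prices (stopping at the first price lower than all those observed), but the structure of the policy is the same.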