Stochastic Multiplicative Weights Updates in Zero-Sum Games

We study agents competing against each other in a repeated network zero-sum game while applying the multiplicative weights update (MWU) algorithm with fixed learning rates. In our implementation, agents select their strategies probabilistically in each iteration and update their weights/strategies using the realized payoff vector of all strategies, i.e., stochastic MWU with full information. We show that the system results in an irreducible Markov chain where agent strategies diverge from the set of Nash equilibria. Further, we show that agents will play pure strategies with probability 1 in the limit.

Figure 1: The Game Matching Pennies Updated with Stochastic MWU. Panels: (a) 10 Iterations, (b) 100 Iterations, (c) 250 Iterations, (d) 500 Iterations. The x-axis and y-axis Show Agent 1’s and Agent 2’s Probabilities of Playing “Heads” Respectively. The Opaqueness of Each Rectangle is Proportional to the Probability that the Agents’ Strategies Appear in the Region. The Four Figures Demonstrate th...
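
To make the setting concrete, the following is a minimal sketch of stochastic MWU with full information on Matching Pennies, the game shown in Figure 1. It rests on several assumptions that are not taken from the text above: the exponential form of the update (weights multiplied by exp(epsilon * payoff)), the learning rate `epsilon = 0.1`, the uniform initial strategies, and the names `stochastic_mwu` and `A`, all of which are hypothetical and chosen for illustration only. It is not a reproduction of the authors' implementation.

```python
import numpy as np

# Agent 1's payoff matrix in Matching Pennies.
# Rows: agent 1 plays Heads/Tails; columns: agent 2 plays Heads/Tails.
# The game is zero-sum, so agent 2's payoff is the negation.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])


def stochastic_mwu(A, iterations, epsilon=0.1, seed=0):
    """Simulate one sample path of stochastic MWU (assumed exponential form).

    Returns an array of shape (iterations + 1, 2) whose rows hold
    (agent 1's probability of Heads, agent 2's probability of Heads).
    """
    rng = np.random.default_rng(seed)
    x = np.full(A.shape[0], 1.0 / A.shape[0])  # agent 1's mixed strategy
    y = np.full(A.shape[1], 1.0 / A.shape[1])  # agent 2's mixed strategy
    history = [(x[0], y[0])]
    for _ in range(iterations):
        # Each agent samples a pure strategy from its current mixed strategy.
        i = rng.choice(len(x), p=x)
        j = rng.choice(len(y), p=y)
        # Full information: each agent sees the payoff of every one of its
        # strategies against the opponent's realized pure strategy.
        payoff_1 = A[:, j]
        payoff_2 = -A[i, :]
        # Multiplicative update with a fixed learning rate, then renormalize.
        x = x * np.exp(epsilon * payoff_1)
        x = x / x.sum()
        y = y * np.exp(epsilon * payoff_2)
        y = y / y.sum()
        history.append((x[0], y[0]))
    return np.array(history)
```

Calling `stochastic_mwu(A, 500)` and inspecting the last row gives one realization of the agents' strategies after 500 iterations; repeating the call over many seeds and binning the final rows would approximate the kind of density plot the figure caption describes.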