Randomized Sampling for Large Zero-Sum Games I,II

Randomized sampling for large zero-sum games

2010

This paper addresses the solution of large zero-sum matrix games using randomized methods. We formalize a procedure, termed the sampled saddle point (SSP) procedure, by which a player can compute mixed policies that, with high probability, are security policies against an adversary who plays the same game and also uses the SSP procedure. The computational savings result from solving stochastically sampled subgames that are much smaller than the original game. We provide two methodologies and determine how large the subgames should be to guarantee the desired high probability. The first methodology provides a game-independent bound on the size of the subgames that can be computed a priori. The second methodology is useful when computational limitations prevent a player from satisfying the first, game-independent bound; it provides a high-probability bound on how much the outcome of the game can violate the precomputed security level. We also analyze the effect of mismatch between the distributions used by the two players to obtain their respective subgames using the SSP procedure, and extend the previous bounds based on the distributions used by the players. Finally, we demonstrate the usefulness of these results in solving a hide-and-seek game that is known to exhibit exponential complexity.
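The sampling idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: the row player minimizes and the column player maximizes, actions are sampled uniformly with replacement, and the subgame is solved approximately with multiplicative weights rather than with an exact linear program. The function names `approx_saddle_point` and `sampled_saddle_point` are hypothetical.

```python
import math
import random


def approx_saddle_point(A, iters=2000):
    """Approximate mixed saddle point of matrix game A (rows minimize,
    columns maximize) using multiplicative weights for both players.
    This is a swapped-in approximate solver, not an exact LP."""
    m, n = len(A), len(A[0])
    lo = min(min(r) for r in A)
    hi = max(max(r) for r in A)
    scale = (hi - lo) or 1.0  # avoid division by zero for constant games
    eta = math.sqrt(math.log(max(m, n, 2)) / iters)
    y = [1.0 / m] * m  # current row-player mixed policy
    z = [1.0 / n] * n  # current column-player mixed policy
    avg_y = [0.0] * m
    avg_z = [0.0] * n
    for _ in range(iters):
        for i in range(m):
            avg_y[i] += y[i] / iters
        for j in range(n):
            avg_z[j] += z[j] / iters
        # Expected payoffs, normalized to [0, 1], against the opponent's policy.
        row_cost = [sum((A[i][j] - lo) / scale * z[j] for j in range(n))
                    for i in range(m)]
        col_gain = [sum((A[i][j] - lo) / scale * y[i] for i in range(m))
                    for j in range(n)]
        # Rows shift weight away from costly actions, columns toward gainful ones.
        y = [p * math.exp(-eta * c) for p, c in zip(y, row_cost)]
        z = [p * math.exp(eta * g) for p, g in zip(z, col_gain)]
        sy, sz = sum(y), sum(z)
        y = [p / sy for p in y]
        z = [p / sz for p in z]
    return avg_y, avg_z


def sampled_saddle_point(A, n_rows, n_cols, rng):
    """Sample a subgame (uniform sampling is an assumption), solve it, and
    return the row player's policy on the full action set together with the
    security level computed on the sampled columns only."""
    m, n = len(A), len(A[0])
    rows = sorted({rng.randrange(m) for _ in range(n_rows)})
    cols = sorted({rng.randrange(n) for _ in range(n_cols)})
    sub = [[A[i][j] for j in cols] for i in rows]
    y_sub, _ = approx_saddle_point(sub)
    y = [0.0] * m
    for i, p in zip(rows, y_sub):
        y[i] = p
    level = max(sum(y_sub[a] * sub[a][b] for a in range(len(rows)))
                for b in range(len(cols)))
    return y, level
```

Because the sampled security level only accounts for the sampled columns, the worst case over all columns of the full game can exceed it. The bounds discussed in the abstract concern how likely and how large such a violation can be; the sketch does not compute those bounds.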

Computing Optimal Randomized Resource Allocations for Massive Security Games

Algorithms, Deployed Systems, Lessons Learned, 2009

Predictable allocations of security resources such as police officers, canine units, or checkpoints are vulnerable to exploitation by attackers. Recent work has applied game-theoretic methods to find optimal randomized security policies, including a fielded application at the Los Angeles International Airport (LAX). This approach has promising applications in many similar domains, including police patrolling for subway and bus systems, randomized baggage screening, and scheduling for the Federal Air Marshal Service (FAMS) on commercial flights. However, the existing methods scale poorly when the security policy requires coordination of many resources, which is central to many of these potential applications.

Security games with decision and observation errors

Proceedings of the 2010 American Control Conference, 2010

We study two-player security games which can be viewed as sequences of nonzero-sum matrix games played by an Attacker and a Defender. The evolution of the game is based on a stochastic fictitious play process. Players do not have access to each other's payoff matrix. Each player observes the other's actions up to the present and plays the action generated by the best response to these observations. However, when the game is played over a communication network, several practical issues need to be taken into account. First, the players may make random decision errors from time to time. Second, the players' observations of each other's previous actions may be incorrect. The players will try to compensate for these errors based on the information they have. We examine the convergence properties of the game in such scenarios, and establish convergence to the equilibrium point under some mild assumptions when both players are restricted to two actions.

Security Games with Incomplete Information

2009 IEEE International Conference on Communications, 2009

We study two-player security games which can be viewed as sequences of nonzero-sum matrix games where, at each stage of the iterations, the players make imperfect observations of each other's previous actions. The players are the Attacker and the Defense System, each of which has two possible actions at its disposal. For the former, the two actions are "attack" and "not to attack", and for the latter they are "defend" and "not to defend". The underlying decision process can be viewed as a fictitious play (FP) game, but what differentiates this class from the standard one is that the communication channels that carry action information from one player to the other, or the sensor systems, are error prone. Two possible scenarios are addressed in the paper: (i) the error probabilities associated with the sensor systems are known to the players, in which case our analysis provides guidelines for each player to reach the Nash equilibrium (NE), which is related to the NE of the underlying static game; (ii) the error probabilities are unknown to the players, in which case we study the effect of errors in the observations on the convergence to the NE and the final outcome of the game. We discuss both classical FP and stochastic FP, where for the latter the payoff function of each player includes an entropy term to randomize its own strategy, which can be interpreted as a way of concealing its true strategy.
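The setting in this abstract, two actions per player, error-prone observations, and an entropy term in the payoff, can be simulated with a short Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: observations are flipped independently with probability `eps`, the players do not compensate for the errors, and the entropy-regularized best response is taken to be the standard logit (softmax) response with temperature `tau`. All names and payoff values are hypothetical.

```python
import math
import random


def logit_response(payoffs, tau):
    """Maximizer of expected payoff plus tau times entropy: a softmax."""
    top = max(payoffs)  # subtract the maximum for numerical stability
    w = [math.exp((u - top) / tau) for u in payoffs]
    s = sum(w)
    return [x / s for x in w]


def noisy_fictitious_play(U_att, U_def, eps, tau, steps, rng):
    """Stochastic fictitious play in a 2x2 game with observation errors.
    U_att[a][d] and U_def[a][d] are the payoffs when the Attacker plays a
    and the Defender plays d. Returns each player's empirical action
    frequencies over the run."""
    seen_by_att = [1.0, 1.0]  # Attacker's counts of observed Defender actions
    seen_by_def = [1.0, 1.0]  # Defender's counts of observed Attacker actions
    played_att = [0, 0]
    played_def = [0, 0]
    for _ in range(steps):
        # Beliefs are the empirical frequencies of (possibly wrong) observations.
        q_def = [c / sum(seen_by_att) for c in seen_by_att]
        q_att = [c / sum(seen_by_def) for c in seen_by_def]
        exp_att = [sum(U_att[a][d] * q_def[d] for d in range(2))
                   for a in range(2)]
        exp_def = [sum(U_def[a][d] * q_att[a] for a in range(2))
                   for d in range(2)]
        a = rng.choices([0, 1], weights=logit_response(exp_att, tau))[0]
        d = rng.choices([0, 1], weights=logit_response(exp_def, tau))[0]
        played_att[a] += 1
        played_def[d] += 1
        # Each observation is flipped independently with probability eps.
        obs_d = 1 - d if rng.random() < eps else d
        obs_a = 1 - a if rng.random() < eps else a
        seen_by_att[obs_d] += 1
        seen_by_def[obs_a] += 1
    return ([c / steps for c in played_att],
            [c / steps for c in played_def])
```

Running the function with `eps = 0` and then with a positive `eps` on the same payoff matrices gives a rough feel for how observation errors shift the empirical frequencies. The case where the error probabilities are known and the players correct for them is not modeled here.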

Security Games With Information Leakage: Modeling and Computation

Most models of Stackelberg security games assume that the attacker only knows the defender's mixed strategy, but is not able to observe (even partially) the instantiated pure strategy. Such partial observation of the deployed pure strategy, an issue we refer to as information leakage, is a significant concern in practical applications. While previous research on patrolling games has addressed the attacker's real-time surveillance, we provide a significant advance. More specifically, after formulating an LP to compute the defender's optimal strategy in the presence of leakage, we start with a hardness result showing that a subproblem (more precisely, the defender oracle) is NP-hard even for the simplest of security game models. We then approach the problem from three possible directions: efficient algorithms for restricted cases, approximation algorithms, and better sampling algorithms. Our experiments confirm the necessity of handling information leakage ...

Computing Optimal Mixed Strategies for Security Games with Dynamic Payoffs

2015

Security agencies in the real world often need to protect targets with time-dependent values, e.g., tourist sites where the number of travelers changes over time. Since the values of different targets often change asynchronously, the defender can relocate security resources among targets dynamically to make the best use of limited resources. We propose a game-theoretic scheme to develop dynamic, randomized security strategies that take the adversary's surveillance capability into account. This differs from previous studies on security games by considering varying target values and continuous strategy spaces for the security agency and the adversary. The main challenge is the computational cost that arises from the continuous, and hence infinite, strategy spaces. We propose an optimal algorithm and an arbitrarily near-optimal algorithm to compute security strategies under different conditions. Experimental results show that both algorithms significantly outperform existing approaches.

Security games with interval uncertainty

Adaptive Agents and Multi-Agents Systems, 2013

Security games provide a framework for allocating limited security resources in adversarial domains, and are currently used in deployed systems for LAX, the Federal Air Marshals, and the U.S. Coast Guard. One of the major challenges in security games is finding solutions that are robust to uncertainty about the game model. Bayesian game models have been used to model uncertainty, but algorithms for these games do not scale well enough for many applications. We take an alternative approach based on using intervals to model uncertainty in security games. We present a fast polynomial time algorithm for security games with interval uncertainty, which represents the first viable approach for computing robust solutions to very large security games. We also introduce a methodology for using intervals to approximate solutions to infinite Bayesian games with distributional uncertainty. Our experiments show that intervals can be an effective approach for these more general Bayesian games; our algorithm is faster and results in higher quality solutions than previous methods.

Risk-averse strategies for security games with execution and observational uncertainty

2011

Attacker-defender Stackelberg games have become a popular game-theoretic approach for security, with deployments for the LAX Police, the FAMS, and the TSA. Unfortunately, most of the existing solution approaches do not model two key uncertainties of the real world: there may be noise in the defender's execution of the suggested mixed strategy, and the observations made by an attacker can be noisy. In this paper, we provide a framework to model these uncertainties, and demonstrate that previous strategies perform poorly in such uncertain settings. We also provide RECON, a novel algorithm that computes strategies for the defender that are robust to such uncertainties, and provide heuristics that further improve RECON's efficiency.

The Price of Uncertainty in Security Games

Economics of Information Security and Privacy, 2010

In the realm of information security, lack of information about other users' incentives in a network can lead to inefficient security choices and reductions in individuals' payoffs. We propose and compare three metrics for measuring the price of uncertainty due to the departure from the payoff-optimal security outcomes under complete information. By analogy with other efficiency metrics, such as the price of anarchy, we define the price of uncertainty as the maximum discrepancy between the expected payoff in a complete information environment and the payoff in an incomplete information environment. We consider difference, payoff-ratio, and cost-ratio metrics as canonical nontrivial measurements of the price of uncertainty.

Approximation Algorithm for Security Games with Costly Resources

Lecture Notes in Computer Science, 2011

In recent years, algorithms for computing game-theoretic solutions have been developed for real-world security domains. These games are between a defender, who must allocate her resources to defend potential targets, and an attacker, who chooses a target to attack. Existing work has assumed the set of defender's resources to be fixed. This assumption precludes the effective use of approximation algorithms, since a slight change in the defender's allocation strategy can result in a massive change in her utility. In contrast, we consider a model where resources are obtained at a cost, initiating the study of the following optimization problem: Minimize the total cost of the purchased resources, given that every target has to be defended with at least a certain probability. We give an efficient logarithmic approximation algorithm for this problem.