Satinder Singh | Mdu Rohtak

Papers by Satinder Singh

Between MDPs and Semi-MDPs: Learning, Planning, and Representing Knowledge at Multiple Temporal Scales

Artificial Intelligence, 1998

Strategic Interactions in a Supply Chain Game

Computational Intelligence, 2005

The TAC 2003 supply-chain game presented automated trading agents with a challenging strategic problem. Embedded within a high-dimensional stochastic environment was a pivotal strategic decision about initial procurement of components. Early evidence suggested that the entrant field was headed toward a self-destructive, mutually unprofitable equilibrium. Our agent, Deep Maize, introduced a preemptive strategy designed to neutralize aggressive procurement, perturbing the field to a more profitable equilibrium; it worked. Not only did preemption improve Deep Maize's profitability, it improved profitability for the whole field. Whereas it is perhaps counterintuitive that action designed to prevent others from achieving their goals actually helps them, strategic analysis employing an empirical game-theoretic methodology verifies and provides insight about this outcome.

Predictive State Representations: A New Theory for Modeling Dynamical Systems

Distributed Feedback Control for Decision Making on Supply Chains

ATTac-2000: an adaptive autonomous bidding agent

Policy Gradient Methods for Reinforcement Learning with Function Approximation

Predictive Representations of State

Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System

Journal of Artificial Intelligence Research, 2002

Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems

Reinforcement learning provides a sound framework for credit assignment in unknown stochastic dynamic environments. For Markov environments a variety of different reinforcement learning algorithms have been devised to predict and control the ...

Near-Optimal Reinforcement Learning in Polynomial Time

We present new algorithms for reinforcement learning and prove that they have polynomial bounds on the resources required to achieve near-optimal return in general Markov decision processes. After observing that the number of actions required to approach the optimal ...
