Exploiting structure in policy construction
Related papers
Decision-theoretic planning: Structural assumptions and computational leverage
Journal of Artificial Intelligence Research, 1999
Planning under uncertainty is a central problem in the study of automated sequential decision making, and has been addressed by researchers in many different fields, including AI planning, decision analysis, operations research, control theory and economics. While the assumptions and perspectives adopted in these fields often differ in substantial ways, many planning problems of interest to researchers in these fields can be modeled as Markov decision processes (MDPs) and analyzed using the techniques of decision theory. This paper presents an overview and synthesis of MDP-related methods showing how they provide a unifying framework for modeling many classes of planning problems studied in AI. It also describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans. Planning problems commonly possess structure in the reward and value functions used to describe performance criteria, in the functions used to describe state transitions and observations, and in the relationships among features used to describe states, actions, rewards, and observations. Specialized representations, and algorithms employing these representations, can achieve computational leverage by exploiting these various forms of structure. Certain AI techniques---in particular those based on the use of structured, intensional representations---can be viewed in this way. This paper surveys several types of representations for both classical and decision-theoretic planning problems, and planning algorithms that exploit these representations in a number of different ways to ease the computational burden of constructing policies or plans. It focuses primarily on abstraction, aggregation and decomposition techniques based on AI-style representations.
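For orientation, most of the methods listed below optimize some variant of the standard Bellman optimality criterion for a (discounted) MDP; the notation in this reference equation is the conventional one and is not taken from the paper itself.

```latex
% Bellman optimality equation for a discounted MDP (S, A, P, R, \gamma):
% the optimal value function V^* and a greedy optimal policy \pi^*.
V^*(s) = \max_{a \in A}\Big[\, R(s,a) + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^*(s') \Big], \qquad
\pi^*(s) \in \operatorname*{arg\,max}_{a \in A}\Big[\, R(s,a) + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^*(s') \Big].
```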
Planning and programming with first-order Markov decision processes: insights and challenges
2001
Markov decision processes (MDPs) have become the de facto standard model for decision-theoretic planning problems. However, classic dynamic programming algorithms for MDPs [22] require explicit state and action enumeration. For example, the classical representation of a value function is a table or vector associating a value with each system state; such value functions are produced by iterating over the state space.
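As a concrete reference point for the explicit enumeration described above, here is a minimal sketch of tabular value iteration: the value function is a plain array indexed by state, and every sweep enumerates all states and actions. The three-state MDP is invented purely for illustration.

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.95
# P[a, s, s'] = transition probability, R[s, a] = expected immediate reward.
P = np.zeros((n_actions, n_states, n_states))
P[0] = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.0]]
P[1] = [[0.5, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.0, 1.0]]
R = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

V = np.zeros(n_states)                    # one entry per enumerated state
for _ in range(1000):
    # Bellman backup over every state and every action.
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    done = np.max(np.abs(V_new - V)) < 1e-6
    V = V_new
    if done:
        break
policy = Q.argmax(axis=1)                 # greedy policy from the converged values
print(V, policy)
```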
Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstraction
In this paper, we consider planning in stochastic shortest path (SSP) problems, a subclass of Markov Decision Problems (MDPs). We focus on medium-size problems whose state space can be fully enumerated. This problem has numerous important applications, such as navigation and planning under uncertainty. We propose a new approach for constructing a multi-level hierarchy of progressively simpler abstractions of the original problem. Once computed, the hierarchy can be used to speed up planning by first finding a policy for the most abstract level and then recursively refining it into a solution to the original problem. This approach is fully automated and delivers a speed-up of two orders of magnitude over a state-of-the-art MDP solver on sample problems while returning near-optimal solutions. We also prove theoretical bounds on the loss of solution optimality resulting from the use of abstractions.
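A rough sketch of the "solve the coarsest level first, then refine" idea follows. The hand-picked state clustering, the uniform aggregation weights, the discounted (rather than stochastic-shortest-path) formulation, and the toy MDP are all assumptions made for illustration; the paper's automatic hierarchy construction and its optimality bounds are not reproduced here.

```python
import numpy as np

def value_iteration(P, R, gamma, V0=None, tol=1e-8, iters=10_000):
    V = np.zeros(P.shape[1]) if V0 is None else V0.copy()
    for _ in range(iters):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

def aggregate(P, R, cluster):
    """Build a coarse MDP by averaging dynamics within each state cluster."""
    k, n = cluster.max() + 1, P.shape[1]
    M = np.zeros((n, k)); M[np.arange(n), cluster] = 1.0
    W = M / M.sum(axis=0)                              # uniform within-cluster weights
    P_abs = np.einsum("si,ast,tj->aij", W, P, M)       # aggregated transitions
    R_abs = np.einsum("si,sa->ia", W, R)               # aggregated rewards
    return P_abs, R_abs

# Toy 4-state, 2-action MDP (invented for illustration).
gamma = 0.95
P = np.zeros((2, 4, 4))
P[0] = [[.8, .2, 0, 0], [0, .8, .2, 0], [0, 0, .8, .2], [0, 0, 0, 1]]
P[1] = [[.5, .5, 0, 0], [.5, 0, .5, 0], [0, .5, 0, .5], [0, 0, 0, 1]]
R = np.array([[0, 0], [0, 0], [1, 1], [0, 0]], dtype=float)

cluster = np.array([0, 0, 1, 1])                       # two abstract states
P_abs, R_abs = aggregate(P, R, cluster)
V_abs = value_iteration(P_abs, R_abs, gamma)           # solve the coarse level first
V = value_iteration(P, R, gamma, V0=V_abs[cluster])    # refine with a warm start
print(V_abs, V)
```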
Solving Hybrid Markov Decision Processes
Lecture Notes in Computer Science, 2006
Markov decision processes (MDPs) have become a standard for representing uncertainty in decision-theoretic planning. However, MDPs require an explicit representation of the state space and the probabilistic transition model which, in continuous or hybrid continuous-discrete domains, are not always easy to define. Even when this representation is available, the size of the state space and the number of state variables to consider in the transition function may be such that the resulting MDP cannot be solved using traditional techniques. In this paper a reward-based abstraction for solving hybrid MDPs is presented. In the proposed method, we gather information about the rewards and the dynamics of the system by exploring the environment. This information is used to build a decision tree (C4.5) representing a small set of abstract states with equivalent rewards, and then to learn a probabilistic transition function using a Bayesian network learning algorithm (K2). The system's output is a problem specification ready to be solved with traditional dynamic programming algorithms. We have tested our abstract MDP model approximation in real-world problem domains. We present the results in terms of the models learned and their solutions for different configurations, showing that our approach produces fast solutions with satisfactory policies.
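The pipeline described above might be sketched roughly as follows: sample transitions by exploring, fit a regression tree over rewards (a scikit-learn CART tree stands in for C4.5), treat its leaves as abstract states, estimate an abstract transition model (simple frequency counts stand in for the K2 Bayesian-network learner), and solve the result with value iteration. The one-dimensional toy domain and all parameters are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def step(x, a):
    """Toy continuous dynamics on [0, 1]; reward peaks near x = 0.8 (assumed)."""
    x_next = np.clip(x + (0.05 if a == 1 else -0.05) + rng.normal(0, 0.02), 0, 1)
    reward = 1.0 if 0.75 <= x_next <= 0.85 else 0.0
    return x_next, reward

# 1) Explore: collect (state, action, next_state, reward) samples.
samples, x = [], rng.random()
for _ in range(5000):
    a = rng.integers(2)
    x_next, r = step(x, a)
    samples.append((x, a, x_next, r))
    x = x_next
X = np.array([[s] for s, _, _, _ in samples])
Rw = np.array([r for _, _, _, r in samples])

# 2) Abstract: the leaves of a reward tree become the abstract states.
tree = DecisionTreeRegressor(max_leaf_nodes=6).fit(X, Rw)
leaf_of = lambda xs: tree.apply(np.asarray(xs).reshape(-1, 1))
leaves = np.unique(leaf_of(X))
index = {leaf: i for i, leaf in enumerate(leaves)}
k = len(leaves)

# 3) Learn an abstract transition/reward model by counting (a frequency
#    estimate standing in for the K2 Bayesian-network learner).
P = np.full((2, k, k), 1e-6)
R = np.zeros((k, 2)); counts = np.zeros((k, 2))
for (s, a, s2, r) in samples:
    i, j = index[leaf_of([s])[0]], index[leaf_of([s2])[0]]
    P[a, i, j] += 1
    R[i, a] += r; counts[i, a] += 1
P /= P.sum(axis=2, keepdims=True)
R = np.divide(R, counts, out=np.zeros_like(R), where=counts > 0)

# 4) Solve the small abstract MDP with standard value iteration.
gamma, V = 0.95, np.zeros(k)
for _ in range(500):
    V = (R + gamma * np.einsum("ast,t->sa", P, V)).max(axis=1)
print(V)
```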
Reduction of temporal complexity in Markov decision processes
CONIELECOMP 2012, 22nd International Conference on Electrical Communications and Computers, 2012
In this paper we present a new approach to solving Markov decision processes based on an abstraction technique over the action space, which yields a set of abstract actions. Markov decision processes have been applied successfully to many probabilistic problems, such as process control, decision analysis, and economics. For problems with continuous or high-dimensional domains, however, computational complexity becomes high because the search space grows exponentially with the number of variables. To reduce this complexity, our approach avoids iterating over the entire action domain during value iteration and instead computes over only the abstract actions that actually operate on each state, determined as a function of the state. Our experimental results on a robot path-planning task show a substantial reduction in computational complexity.
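A minimal sketch of value iteration restricted to per-state applicable actions follows: the Bellman backup maximizes only over the actions that actually operate on each state rather than the whole action domain. The applicability map and the random toy dynamics are assumptions for illustration, not the paper's robot path-planning domain.

```python
import numpy as np

n_states, n_actions, gamma = 5, 4, 0.95
rng = np.random.default_rng(1)

# Random toy dynamics and rewards (illustrative only).
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.random((n_states, n_actions))

# applicable[s] = indices of the (abstract) actions that operate on state s.
applicable = [np.array([0, 1]), np.array([1, 2]), np.array([0, 3]),
              np.array([2, 3]), np.array([1])]

V = np.zeros(n_states)
for _ in range(1000):
    V_new = np.empty(n_states)
    for s in range(n_states):
        acts = applicable[s]
        # Back up only over the actions relevant to this state.
        V_new[s] = np.max(R[s, acts] + gamma * P[acts, s, :] @ V)
    if np.max(np.abs(V_new - V)) < 1e-6:
        V = V_new
        break
    V = V_new

policy = [applicable[s][np.argmax(R[s, applicable[s]] +
                                  gamma * P[applicable[s], s, :] @ V)]
          for s in range(n_states)]
print(V, policy)
```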
SPUDD: Stochastic planning using decision diagrams
Proceedings of the Fifteenth …, 1999
Structured methods for solving factored Markov decision processes (MDPs) with large state spaces have recently been proposed to allow dynamic programming to be applied without the need for complete state enumeration. We propose and examine a new value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies, assuming an ADD input representation of the MDP. Dynamic programming is implemented via ADD manipulation. We demonstrate our method on a class of large MDPs (up to 63 million states) and show that significant gains can be had when compared to tree-structured representations (with up to a thirty-fold reduction in the number of nodes required to represent optimal value functions).
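The sketch below illustrates only the core sharing idea behind an ADD-backed value function: identical sub-diagrams are stored exactly once, so states with equal values need not be enumerated individually. The boolean value function and the variable count are invented for illustration, and the full SPUDD backup over ADDs is not reproduced.

```python
N_VARS = 10  # 2**10 = 1024 states, purely illustrative

def value(state):
    """Illustrative value function: depends only on the first two variables."""
    return float(state[0]) + 0.5 * float(state[1])

_unique = {}   # (var, low_id, high_id) or ('leaf', v) -> node id
_nodes = []    # node id -> node description

def _mk(key):
    """Hash-cons a node so identical sub-diagrams share one id."""
    if key not in _unique:
        _unique[key] = len(_nodes)
        _nodes.append(key)
    return _unique[key]

def build(var=0, prefix=()):
    """Build a reduced diagram top-down; redundant tests are skipped."""
    if var == N_VARS:
        return _mk(('leaf', value(prefix)))
    low = build(var + 1, prefix + (0,))
    high = build(var + 1, prefix + (1,))
    if low == high:          # both branches identical: drop the variable test
        return low
    return _mk((var, low, high))

root = build()
print(f"{2**N_VARS} states represented with {len(_nodes)} diagram nodes")
```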
Reducing Computational Complexity in Markov Decision Processes Using Abstract Actions
2008 Seventh Mexican International Conference on Artificial Intelligence, 2008
In this paper we present a new approach to solving Markov decision processes based on an abstraction technique over the action space, which yields a set of abstract actions. Markov decision processes have been applied successfully to many probabilistic problems, such as process control, decision analysis, and economics. For problems with continuous or high-dimensional domains, however, computational complexity becomes high because the search space grows exponentially with the number of variables. To reduce this complexity, our approach avoids iterating over the entire action domain during value iteration and instead computes over only the abstract actions that actually operate on each state, determined as a function of the state. Our experimental results on a robot path-planning task show a substantial reduction in computational complexity.
Integrating planning and execution in stochastic domains
Proceedings of the Tenth Conference on …, 1994
We investigate planning in time-critical domains represented as Markov Decision Processes. To reduce the computational cost of the algorithm, we execute actions as we construct the plan, and we sacrifice optimality by searching to a fixed depth and using a heuristic function to estimate the value of states. Although this paper concentrates on the search procedure, we also discuss ways of constructing heuristic functions that are suitable for this approach.
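A minimal sketch of the interleaved plan-and-execute loop follows: at each step a depth-limited expectimax lookahead over the MDP model, with a heuristic estimate at the search frontier, selects an action that is executed immediately. The toy MDP, the search depth, and the form of the heuristic are all assumptions made for illustration.

```python
import numpy as np

n_states, n_actions, gamma, goal = 8, 2, 0.95, 7
rng = np.random.default_rng(2)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = np.where(np.arange(n_states) == goal, 1.0, 0.0)[:, None].repeat(n_actions, 1)

def heuristic(s):
    """Cheap value estimate used at the search frontier (assumed form)."""
    return 1.0 / (1.0 + abs(goal - s))

def lookahead(s, depth):
    """Depth-limited expectimax value of state s under the MDP model."""
    if depth == 0:
        return heuristic(s)
    return max(R[s, a] + gamma * sum(P[a, s, t] * lookahead(t, depth - 1)
                                     for t in range(n_states))
               for a in range(n_actions))

def best_action(s, depth=3):
    return max(range(n_actions),
               key=lambda a: R[s, a] + gamma * sum(
                   P[a, s, t] * lookahead(t, depth - 1) for t in range(n_states)))

# Act as we plan: pick an action with bounded lookahead, execute, repeat.
s = 0
for step in range(10):
    a = best_action(s)
    p = P[a, s] / P[a, s].sum()
    s = rng.choice(n_states, p=p)          # simulate executing the action
    print(f"step {step}: took action {a}, now in state {s}")
    if s == goal:
        break
```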
Intelligent Planning: A Markov Decision Process (MDP) Approach to Account for the Adversary
The problem of planning under uncertainty is one of the most important elements of a successful operation. In this context, planning that accounts only for a static, preconceived adversary will not suffice. Instead, an analysis of the enemy's evolving centers of gravity and of the available means of attacking those centers is necessary. This latter approach provides better estimates of the enemy's plans and strategic operations; it captures an evolving picture of the interactive and inter-related dynamics of the organization and the adversary.