Reinforcement Learning exploiting state-action equivalence

Model-Based Reinforcement Learning Exploiting State-Action Equivalence

Odalric-Ambrym Maillard

2019

Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning

Phuong Nguyen

Clustering Markov decision processes for continual transfer

Benjamin Rosman, Subramanian Ramamoorthy

Learning in Markov Decision Processes under Constraints

Abhishek Gupta

ArXiv, 2020

Near-optimal Regret Bounds for Reinforcement Learning

Peter Auer

Journal of Machine Learning Research, 2010

Using expert knowledge to construct error bound state-action aggregations for reinforcement learning

Robby Goetschalckx

Proceedings of the 19th Belgian-Dutch …, 2007

Unsupervised Discovery of Decision States for Transfer in Reinforcement Learning

Prithvijit Chattopadhyay

2019

Selecting near-optimal approximate state representations in reinforcement learning

Odalric-Ambrym Maillard

A learning algorithm for Markov decision processes with adaptive state aggregation

John Baras

2000

Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions

Yevgeny Seldin

2013

Efficient Learning in Non-Stationary Linear Markov Decision Processes

Ahmed Touati

ArXiv, 2020

Online Learning in Markov Decision Processes with Changing Cost Sequences

Csaba Szepesvari

2014

Navigating to the Best Policy in Markov Decision Processes

Aurélien Garivier

2021

Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs

Odalric-Ambrym Maillard

2018

Metrics for Finite Markov Decision Processes

Doina Precup

2004

Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes

John Loch

1998

Markov Decision Processes with Long-Term Average Constraints

Mridul Agarwal

ArXiv, 2021

Maximum Expected Hitting Cost of a Markov Decision Process and Informativeness of Rewards

Falcon Dai

2019

Regret Bounds for Restless Markov Bandits

Peter Auer

Lecture Notes in Computer Science, 2012

The effect of eligibility traces on finding optimal memoryless policies in partially observable Markov decision processes

John Loch

Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II, 1999

Learning Successor States and Goal-Dependent Values: A Mathematical Viewpoint

Léonard Blier

2021

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Gaurav Mahajan

2019

The Advantage Regret-Matching Actor-Critic

Mohammad Azar

ArXiv, 2020

Improved Exploration in Factored Average-Reward MDPs

Odalric-Ambrym Maillard

2021

Learning Algorithms for Markov Decision Processes with Average Cost

Dimitri Bertsekas, J. Abounadi

SIAM Journal on Control and Optimization, 2001

A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes

Honghao Wei

arXiv (Cornell University), 2021

A Sliding-Window Algorithm for Markov Decision Processes with Arbitrarily Changing Rewards and Transitions

Peter Auer

2018

Reinforcement Learning with Non-Markovian Rewards

Vaneet Aggarwal

Proceedings of the AAAI Conference on Artificial Intelligence, 2020

A generalized reinforcement-learning model: Convergence and applications

Csaba Szepesvari

MACHINE LEARNING-INTERNATIONAL WORKSHOP THEN CONFERENCE-, 1996

Near-Optimal Regret Bounds for Model-Free RL in Non-Stationary Episodic MDPs

Tamer Basar

2021

Reinforcement Learning and Markov Decision Processes

Marco A. Wiering

Reinforcement Learning, 2012

Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model

Mohammad Azar

Machine Learning, 2013

State Representation Learning for Goal-Conditioned Reinforcement Learning

Lorenzo Steccanella

2022

Hierarchical Representation Learning for Markov Decision Processes

Lorenzo Steccanella

2021

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Gaurav Mahajan

2020
