Eligibility Traces for Off-Policy Policy Evaluation