Recurrent Softmax Delayed Deep Double Deterministic Policy Gradient ($\mathtt{RSD4}$) (https://github.com/hupihe/RSD4) is a data-driven method based on a Partially Observed Markov Decision Process (POMDP) formulation. $\mathtt{RSD4}$ guarantees the resource and delay constraints via the Lagrangian dual and delay-sensitive queues, respectively. It also handles partial observability efficiently through a memory mechanism enabled by a recurrent neural network (RNN). Moreover, it introduces user-level decomposition and node-level merging to support large-scale multihop scenarios. Extensive experiments on simulated and real-world datasets demonstrate that $\mathtt{RSD4}$ is robust to system dynamics and partially observable environments and achieves superior performance over existing methods.
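The abstract does not spell out how the Lagrangian dual is applied inside $\mathtt{RSD4}$. As a generic illustration only, the sketch below runs plain dual ascent on a toy power-allocation problem: each user picks a power level in [0, 1], and a single multiplier prices the shared power budget. The function names, the logarithmic reward, the weights, and the step size are all assumptions made for this example and are not taken from the paper or its repository.

```python
def best_response(weights, lam):
    """Per-user maximizer of w * log(1 + p) - lam * p over p in [0, 1].

    Hypothetical toy reward, chosen only because it has a closed form.
    """
    if lam <= 0.0:
        # With a zero price every user transmits at full power.
        return [1.0 for _ in weights]
    return [min(1.0, max(0.0, w / lam - 1.0)) for w in weights]


def dual_ascent(weights, budget, step=0.05, iters=1000):
    """Adjust the multiplier until total power roughly meets the budget."""
    lam = 0.0
    for _ in range(iters):
        powers = best_response(weights, lam)
        # Raise the price when over budget, lower it when under,
        # and keep the multiplier non-negative.
        lam = max(0.0, lam + step * (sum(powers) - budget))
    return lam, best_response(weights, lam)


if __name__ == "__main__":
    lam, powers = dual_ascent([1.0, 2.0, 3.0], budget=1.5)
    print(lam, powers, sum(powers))
```

In this toy setting the multiplier acts as a price on the shared resource: users with low weight are priced out first, and the total allocation settles at the budget. A learning-based scheduler would replace the closed-form best response with a learned policy; that substitution is not shown here.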

Multi-User Delay-Constrained Scheduling With Deep Recurrent Reinforcement Learning
