Packet Drop Probability-Optimal Cross-layer Scheduling: Dealing with Curse of Sparsity using Prioritized Experience Replay

Title:
Packet Drop Probability-Optimal Cross-layer Scheduling: Dealing with Curse of Sparsity using Prioritized Experience Replay
Journal Title:
2021 IEEE International Conference on Communications Workshops (ICC Workshops)
Publication Date:
09 July 2021
Citation:
Sharma, M. K., Hui, T. P., Kurniawan, E., & Sumei, S. (2021). Packet Drop Probability-Optimal Cross-layer Scheduling: Dealing with Curse of Sparsity using Prioritized Experience Replay. 2021 IEEE International Conference on Communications Workshops (ICC Workshops). doi:10.1109/iccworkshops50388.2021.9473857
Abstract:
In this work, we develop a reinforcement learning (RL) based model-free approach to obtain a policy for joint packet scheduling and rate adaptation, such that the packet drop probability (PDP) is minimized. The developed learning scheme yields an online cross-layer scheduling policy that takes into account the randomness in packet arrivals and wireless channels, as well as the state of the packet buffers. The inherent difference between the time-scales of the packet arrival process and the wireless channel variations leads to sparsity in the observed reward signal. Since an RL agent learns from the reward feedback it receives for its actions, the sample complexity of the RL approach increases exponentially due to the resulting sparsity. Consequently, a basic RL approach, e.g., double deep Q-network (DDQN) based RL, yields a policy with negligible performance gain over state-of-the-art schemes such as shortest processing time (SPT) based scheduling. To alleviate the sparse-reward problem, we leverage prioritized experience replay (PER) and develop a DDQN-based learning scheme with PER. We observe through simulations that the policy learned using the DDQN-PER approach achieves a 3-5% lower PDP compared to both the basic DDQN-based RL policy and the SPT scheme.
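The paper itself includes no code, but the mechanism the abstract describes can be sketched. The snippet below is a minimal, illustrative combination of a proportional prioritized replay buffer (in the style of Schaul et al.'s PER) with a double-DQN target computation; the class and function names (PrioritizedReplayBuffer, double_dqn_target) and the hyperparameters (alpha, beta, gamma) are assumptions for illustration, not taken from the authors' implementation.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Illustrative proportional PER buffer (not the authors' code).

    Transitions with larger TD error are sampled more often, which helps
    when non-zero rewards (here: rare packet-drop events) are sparse.
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities shape the sampling distribution
        self.buffer = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so each is replayed at least once.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.buffer)] ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return [self.buffer[i] for i in idx], idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priority is |TD error| plus a small constant so no transition starves.
        self.priorities[idx] = np.abs(td_errors) + eps


def double_dqn_target(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double DQN: the online net selects the next action, the target net evaluates it."""
    best_actions = q_online_next.argmax(axis=1)
    next_values = q_target_next[np.arange(len(best_actions)), best_actions]
    return rewards + gamma * (1.0 - dones) * next_values
```

The design intuition, under these assumptions: because packet drops are rare relative to channel and arrival events, uniform replay would mostly revisit uninformative zero-reward transitions, whereas priority-proportional sampling replays the high-TD-error drop events far more often, and the importance-sampling weights keep the resulting Q-value updates unbiased in expectation.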
License type:
Publisher Copyright
Funding Info:
This research/project is supported by the Agency for Science, Technology and Research under the RIE2020 Advanced Manufacturing and Engineering (AME) Industry Alignment Fund - Pre-Positioning (IAF-PP), Project 5G-AMSUS.
Grant Reference no.: A20F8a0044
Description:
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
ISSN:
2694-2941
2164-7038
ISBN:
978-1-7281-9441-7
978-1-7281-9442-4
Files uploaded:

sh-pdp-dqn-per-double-column.pdf (306.15 KB, PDF)