Guo, H., Hou, X., & Peng, Q. (2022). CTD: Cascaded Temporal Difference Learning for the Mean-Standard Deviation Shortest Path Problem. IEEE Transactions on Intelligent Transportation Systems, 23(8), 10868–10886. https://doi.org/10.1109/tits.2021.3096829
Abstract:
This paper investigates the reliable shortest path (RSP) planning problem from the reinforcement learning perspective. Different from canonical path planning methods, which require at least the first-order statistic (mean) and second-order statistic (variance) information of the travel time distribution, we target the RSP planning problem without the assumption of knowing any travel time distribution characteristic beforehand, and propose a cascaded temporal difference learning (CTD) method, which simultaneously estimates the mean and variance of the executing path and thereby gradually makes improvements through the generalized policy iteration (GPI) scheme, as the ego vehicle interacts with the environment. Extensive simulation results demonstrate the applicability of the proposed method for RSP learning in various transportation networks.
License type:
Publisher Copyright
Funding Info:
There was no specific funding for the research done