通信学报
• 学术论文 • 上一篇 下一篇
黄 蔚,刘 全,孙洪坤,傅启明,周小科
出版日期:
发布日期:
基金资助:
Online:
Published:
摘要: 提出一种基于拓扑序列更新的值迭代算法,利用状态之间的迁移关联信息,将任务模型的有向图分解为一系列规模较小的强连通分量,并依据拓扑序列对强连通分量进行更新。在经典规划问题Mountain Car和迷宫实验中的结果表明,算法的收敛速度更快,精度更高,且对状态空间的增长有较强的顽健性。
Abstract: In order to improve the convergence performance, an optimized value iteration based on topological sequence backups, VI-TS, is proposed. The key idea of VI-TS is to circumvent the problem of unnecessary backups by dividing an MDP into strongly-connected components and solving these components in topological sequences after detecting the structure of MDP. The experiment results show that VI-TS has a better convergence performance and robustness for state space growth when applied to classical planning experiment scenarios.
黄 蔚,刘 全,孙洪坤,傅启明,周小科. 基于拓扑序列更新的值迭代算法[J]. 通信学报.
0 / / 推荐
导出引用管理器 EndNote|Reference Manager|ProCite|BibTeX|RefWorks
链接本文: https://www.infocomm-journal.com/txxb/CN/
https://www.infocomm-journal.com/txxb/CN/Y2014/V35/I8/8