Journal on Communications ›› 2014, Vol. 35 ›› Issue (8): 56-62. doi: 10.3969/j.issn.1000-436x.2014.08.008

• Academic paper

Optimized algorithm for value iteration based on topological sequence backups

Wei HUANG1, Quan LIU1,2, Hong-kun SUN1, Qi-ming FU1, Xiao-ke ZHOU1

  1. School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Online: 2014-08-25 Published: 2017-06-29
  • Supported by:
    The National Natural Science Foundation of China; The Natural Science Foundation of Jiangsu Province; High School Natural Foundation of Jiangsu Province; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University

Abstract:

To improve the convergence performance of value iteration, an optimized algorithm based on topological sequence backups, VI-TS, is proposed. The key idea of VI-TS is to avoid unnecessary backups: it first detects the structure of the MDP, divides the MDP into strongly connected components, and then solves these components in topological sequence. Experimental results show that, when applied to classical planning scenarios, VI-TS achieves better convergence performance and is more robust to growth of the state space.
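
The idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation of VI-TS. The dictionary-based MDP encoding, the function names, the discount factor `gamma` and the tolerance `eps` are all assumptions made for illustration. Tarjan's algorithm is used here as one standard way to find strongly connected components; the paper's own structure-detection step may differ. The sketch assumes every successor state appears as a key of `mdp` and that terminal states have no actions.

```python
from typing import Dict, Hashable, List, Tuple

State = Hashable
# Assumed encoding: mdp[s][a] = list of (probability, next_state, reward).
# Terminal states map to an empty dict and keep the value 0.
MDP = Dict[State, Dict[str, List[Tuple[float, State, float]]]]


def strongly_connected_components(mdp: MDP) -> List[List[State]]:
    """Tarjan's algorithm (recursive, so suited to small examples only).

    Components are returned in reverse topological order: a component is
    emitted only after every component reachable from it.
    """
    index, low, on_stack = {}, {}, set()
    stack, components = [], []

    def visit(s):
        index[s] = low[s] = len(index)
        stack.append(s)
        on_stack.add(s)
        for outcomes in mdp[s].values():
            for p, t, _ in outcomes:
                if p <= 0.0:
                    continue  # ignore transitions that can never occur
                if t not in index:
                    visit(t)
                    low[s] = min(low[s], low[t])
                elif t in on_stack:
                    low[s] = min(low[s], index[t])
        if low[s] == index[s]:  # s is the root of a component
            comp = []
            while True:
                t = stack.pop()
                on_stack.discard(t)
                comp.append(t)
                if t == s:
                    break
            components.append(comp)

    for s in mdp:
        if s not in index:
            visit(s)
    return components


def backup(mdp: MDP, V: Dict[State, float], s: State, gamma: float) -> float:
    """One Bellman backup for state s (maximising expected return)."""
    if not mdp[s]:
        return 0.0
    return max(
        sum(p * (r + gamma * V[t]) for p, t, r in outcomes)
        for outcomes in mdp[s].values()
    )


def topological_value_iteration(mdp: MDP, gamma: float = 0.95, eps: float = 1e-8):
    """Solve each component to convergence before moving to its predecessors."""
    V = {s: 0.0 for s in mdp}
    backups = 0
    for comp in strongly_connected_components(mdp):
        while True:
            residual = 0.0
            for s in comp:
                new_v = backup(mdp, V, s, gamma)
                residual = max(residual, abs(new_v - V[s]))
                V[s] = new_v
                backups += 1
            if residual < eps:
                break
    return V, backups
```

Because each component is solved only after all components it can reach, the values it depends on are already final, so states outside the current component are never backed up again. Plain value iteration, by contrast, sweeps the whole state space on every iteration.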

Key words: reinforcement learning, value iteration, topological sequence, VI-TS
