基于优先级扫描Dyna结构的贝叶斯Q学习方法
于俊,刘全,傅启明,孙洪坤,陈桂兴
Bayesian Q learning method with Dyna architecture and prioritized sweeping
Jun YU,Quan LIU,Qi-ming FU,Hong-kun SUN,Gui-xing CHEN
通信学报 (Journal on Communications). 2013, (11): 129-139. DOI: 10.3969/j.issn.1000-436x.2013.11.015