基于自适应势函数塑造奖赏机制的梯度下降Sarsa(λ)算法
肖飞,刘全,傅启明,孙洪坤,高龙
Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism
Fei XIAO,Quan LIU,Qi-ming FU,Hong-kun SUN,Long GAO
Journal on Communications (通信学报). 2013, (1): 77-89.
DOI: 1000-436X(2013)01-0077-12