基于自适应势函数塑造奖赏机制的梯度下降Sarsa(λ)算法
肖飞,刘全,傅启明,孙洪坤,高龙
Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism
Fei XIAO,Quan LIU,Qi-ming FU,Hong-kun SUN,Long GAO
通信学报 (Journal on Communications). 2013, (1): 77-89. DOI: 1000-436X(2013)01-0077-12